Predicting frame-level information from song-level labels

Speaker: Jen-Yu Liu (Visiting scientist; National Taiwan University and Academia Sinica, Taiwan)

Date: June 4, 2015


In music information retrieval, we often want labels at the frame level. For example, we may want to know where the guitar is playing in a song, or where the male voice occurs. However, asking people to label songs at the frame level is time-consuming and tedious. My current Ph.D. work investigates the possibility of obtaining frame-level labels from song-level labels alone, using neural networks. We find that this is possible. I will present our results on two tasks: music auto-tagging and singing voice detection.