Date: June 4, 2015
In music information retrieval, we often want frame-level labels. For example, we may want to know where a guitar is playing in a song, or where a male voice appears. However, asking people to label songs at the frame level is time-consuming and tedious. My current Ph.D. work investigates whether frame-level labels can be obtained from song-level labels alone using neural networks. We find that this is possible. I will present our results on two tasks: music auto-tagging and singing voice detection.