The NAR dataset

NAR is a dataset of audio recordings made with the humanoid robot Nao in real-world conditions for sound recognition benchmarking. All the recordings were collected using the robot’s microphone and thus have the following characteristics:

  • recorded with low-quality sensors (300 Hz – 18 kHz bandpass)
  • suffering from typical fan noise from the robot’s internal hardware
  • recorded in multiple real domestic environments (no special acoustic characteristics, reverberations, presence of multiple sound sources and unknown locations)

The dataset is available for download: The NAR dataset (ZIP file, 35 MB). The data are freely accessible for scientific research and for non-commercial applications.

The dataset is organized as follows:

  • Each class is represented by a folder containing all the audio files labeled with the class.
  • The name of a folder is the name of the class attached. The name of an audio file is “foldername$id.wav” where $id is an incremental identifier starting at 1.
  • Each audio file is provided in WAV format (mono signal, 48 kHz sampling rate, 16 bits per sample).
  • 852 sounds in 42 different classes have been recorded and organized into four scenarios:
Scenario    Classes
Kitchen     Eating, Choking, Cuttlery, Fill a glass, Running the tap, Open/close a drawer, Move a chair, Open microwave, Close microwave, Microwave, Fridge, Toaster
Office      Door Close, Open, Key, Knock, Ripped Paper, Zip, (another) Zip
Nonverbal   Fingerclap, Handclap, Tongue Clic
Speech      1, 2, 3, 4, 5, 6, 7, 8, 9, 10, Hello, Left, Right, Turn, Move, Stop, Nao, Yes, No, What
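
The folder layout and file naming described above can be traversed with a few lines of Python. The following is only a minimal sketch using the standard library. It assumes the ZIP archive has been extracted into a single root folder containing one sub-folder per class; the function names index_dataset and matches_documented_format are hypothetical helpers introduced here, not part of the dataset distribution.

```python
import re
import wave
from pathlib import Path


def index_dataset(root):
    """Map each class (folder name) to its WAV files, ordered by numeric id.

    Assumes the documented layout: one folder per class, with files named
    "<foldername><id>.wav" where <id> is an integer starting at 1.
    """
    index = {}
    for folder in sorted(Path(root).iterdir()):
        if not folder.is_dir():
            continue
        # Build the expected file-name pattern for this class folder.
        pattern = re.compile(re.escape(folder.name) + r"(\d+)\.wav")
        files = []
        for path in folder.iterdir():
            match = pattern.fullmatch(path.name)
            if match:
                files.append((int(match.group(1)), path))
        # Sort numerically so that id 10 comes after id 2.
        files.sort(key=lambda item: item[0])
        index[folder.name] = [path for _, path in files]
    return index


def matches_documented_format(path):
    """Return True if the file is mono, 48 kHz, 16 bits per sample."""
    with wave.open(str(path), "rb") as wav:
        return (
            wav.getnchannels() == 1
            and wav.getframerate() == 48000
            and wav.getsampwidth() == 2  # 2 bytes = 16 bits
        )
```

If the extracted archive matches the description above, calling index_dataset on its root folder should yield 42 classes and 852 files in total. Files whose names do not follow the “foldername$id.wav” pattern are silently skipped by this sketch, so a lower count would suggest that the actual naming differs from the documented one.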

Related Papers:

ICRA2014_Janvier.pdf BibTex

IROS.pdf BibTex