🔈 Audio
Make all sounds the same
ffmpeg -i input_file.wav -n -acodec pcm_s16le -ac 1 -ar 16000 output_file.wav
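- -n: Do not overwrite the output file if it already exists
- -acodec pcm_s16le: 16-bit signed little-endian PCM
- -ac 1: Mono
- -ar 16000: 16 kHz sample rate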
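To apply the same conversion to a whole folder, the command can be wrapped in a small script. This is a minimal sketch, assuming ffmpeg is on the PATH; the function names `build_ffmpeg_command` and `convert_folder` and the folder layout are hypothetical, not part of any library.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(src, dst, sample_rate=16000, channels=1):
    """Return the ffmpeg argument list for one file (same flags as the command above)."""
    return [
        "ffmpeg",
        "-i", str(src),
        "-n",                      # never overwrite an existing output file
        "-acodec", "pcm_s16le",    # 16-bit signed little-endian PCM
        "-ac", str(channels),      # 1 = mono
        "-ar", str(sample_rate),   # target sample rate in Hz
        str(dst),
    ]


def convert_folder(in_dir, out_dir):
    """Convert every .wav file in in_dir, writing to out_dir under the same file name."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(in_dir).glob("*.wav")):
        # check=False: with -n, ffmpeg refuses to write when the output already
        # exists, and we do not want that to abort the whole batch.
        subprocess.run(build_ffmpeg_command(src, out_dir / src.name), check=False)
```

Usage would be something like `convert_folder("raw_audio", "audio_16k")`, with both folder names chosen by you.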
1D wave signal
Sample rate: Number of points (samples) per second, in Hz
By the Nyquist–Shannon sampling theorem, the highest frequency that you can capture is sample rate / 2.
| Sample rate | Max freq | Description |
|---|---|---|
| 8,000 Hz | 4 kHz | Used in telephone and walkie-talkie. |
| 11,025 Hz | 5.5125 kHz | Used for lower-quality MPEG and PCM audio. |
| 16,000 Hz | 8 kHz | Used in most VoIP and VVoIP; extension of the telephone band. |
| 22,050 Hz | 11.025 kHz | Used for lower-quality PCM and MPEG audio. |
| 44,100 Hz | 22.05 kHz | Used for audio CDs; also common for MPEG-1 audio (MP3, VCD, SVCD). |
| 48,000 Hz | 24 kHz | Used by digital video equipment and movies. |
| 88,200 Hz | 44.1 kHz | Used by some professional recording equipment. |
| 96,000 Hz | 48 kHz | Used by Blu-ray and HD DVD audio tracks. |
| 176,400 Hz | 88.2 kHz | Used in professional applications for CD and HDCD. |
| 192,000 Hz | 96 kHz | Used on professional video, LPCM DVD, Blu-ray, HD DVD. |
| 352,800 Hz | 176.4 kHz | Used for Super Audio CDs. Digital eXtreme Definition. |
| 384,000 Hz | 192 kHz | Highest sample rate available for common software. |
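The Max freq column is just the sample rate divided by two. A small sketch of that rule, plus what happens to a pure tone above the limit (it folds back, or "aliases", to a lower frequency). The function names are hypothetical helpers, and the aliasing formula assumes an ideal sampler with no anti-aliasing filter.

```python
def nyquist_frequency(sample_rate_hz):
    """Highest frequency that can be represented at this sample rate."""
    return sample_rate_hz / 2


def aliased_frequency(freq_hz, sample_rate_hz):
    """Apparent frequency of a pure tone after sampling (ideal sampler, no filter)."""
    # Fold the tone around the nearest multiple of the sample rate.
    return abs(freq_hz - sample_rate_hz * round(freq_hz / sample_rate_hz))


print(nyquist_frequency(16000))        # 8000.0
print(aliased_frequency(3000, 16000))  # 3000 (below the limit, unchanged)
print(aliased_frequency(9000, 16000))  # 7000 (above the limit, folded back)
```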
Create dataset
youtube-dl --extract-audio --audio-format mp3 https://www.youtube.com/watch?v=TCudWnNMr0s
Amplitude parameter (pixel intensity)
- scale: linear or decibel (dB)
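The two scales are related by a logarithm: for an amplitude, dB = 20 · log10(amplitude / ref). A minimal sketch of the conversion in plain Python; the helper names, the reference value `ref=1.0`, and the `floor` used to avoid log(0) are assumptions for illustration. librosa ships its own `librosa.amplitude_to_db` for arrays.

```python
import math


def amplitude_to_db(amplitude, ref=1.0, floor=1e-10):
    """Convert a linear amplitude to decibels relative to ref."""
    # Clamp to a small floor so that silence (0.0) does not hit log10(0).
    return 20.0 * math.log10(max(abs(amplitude), floor) / ref)


def db_to_amplitude(db, ref=1.0):
    """Inverse conversion: decibels back to a linear amplitude."""
    return ref * 10.0 ** (db / 20.0)


print(amplitude_to_db(1.0))  # 0.0 (an amplitude equal to ref is 0 dB)
```

Dividing the amplitude by ten lowers the level by 20 dB, which is why the decibel scale makes quiet details visible that a linear scale squashes near zero.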
Example
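import librosa
filename = 'my_sound.wav'
# y: waveform as a 1D NumPy array; sr: sample rate (librosa resamples to 22050 Hz unless sr= is given)
y, sr = librosa.load(filename)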
References
- https://fastaudio.github.io/
- https://www.youtube.com/c/ValerioVelardoTheSoundofAI/videos