Really cool stuff! Can't wait to finish the series.
@MrDari88 4 years ago
Amazing material amazingly explained. You are doing great, Valerio. Thanks for making audio processing fun and easy.
@ValerioVelardoTheSoundofAI 4 years ago
I'm happy you like the series!
@wisteriahu4551 2 years ago
I am doing my IB Math final IA on this and had no idea how to code. This really clear and extremely well explained video literally saved my life. Thank you so much!
@shivenlak 1 year ago
Do you have any tips? I am doing my IA on this right now. Just starting, confused, please help!
@kalagaarun9638 4 years ago
Really a life saver... thanks for uploading this, sir 👏🏻 Keep posting great stuff like this 🙌🏻
@ValerioVelardoTheSoundofAI 4 years ago
Thank you Kalaga!
@qingyuliu1176 1 year ago
Good video!!! I've learned a lot from this video and have written a lot of notes and code these days. As an EE student, it helps me understand many courses better, such as Signals & Systems and DSP.
@i_am-ki_m 2 years ago
So intuitive. It starts from the basics and is a nicely made video. Keep going!
@dimitrijemitic497 4 years ago
Hi Valerio, great video as always, you described the topic in an interesting and intuitive way... It would be great if you could explain Mel Spectrograms and MFCCs the same way you explained the Fourier Transform :D
@ValerioVelardoTheSoundofAI 4 years ago
Thank you! I'm planning on doing that after I've tackled the Short-Time Fourier Transform over the next couple of videos.
@hoomansaadatmand829 5 months ago
perfect video. great explanation.
@jimcrowjoe451 2 years ago
Thanks Valerio. The frequency is accurate but how do you get the correct amplitude when doing the FFT?
@Underscore_1234 6 months ago
Definitely useful. I was familiar with the theory but didn't know the libraries. I'm gonna check out your video on spectrograms! (You got a new subscription here!) Although, in theory (if I remember well), the Fourier transform phase is related to the phase shift of each sinusoid. Do you know any application where it is used, or is it always tossed away? (For instance, is the magnitude distribution enough to recognize an instrument, or does the phase help?)
@ValerioVelardoTheSoundofAI 6 months ago
Magnitude is enough for analysis; phase is necessary for audio generation.
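To illustrate the reply above, here is a minimal sketch (not from the video) of why phase matters for generation. The sample rate, frequencies and phase offset are made-up test values. It rebuilds a signal once from magnitude plus phase and once from magnitude alone.

```python
import numpy as np

sr = 1000                      # assumed sample rate in Hz
t = np.arange(sr) / sr         # one second of samples
# Test signal: two sinusoids with different phase offsets (illustrative values)
signal = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 120 * t + np.pi / 3)

ft = np.fft.fft(signal)
magnitude = np.abs(ft)         # how strong each frequency component is
phase = np.angle(ft)           # phase offset of each component, in radians

# Magnitude and phase together rebuild the original signal
rebuilt = np.fft.ifft(magnitude * np.exp(1j * phase)).real
# Magnitude alone (phase set to zero) gives a different waveform
magnitude_only = np.fft.ifft(magnitude).real

print(np.allclose(rebuilt, signal))          # True
print(np.allclose(magnitude_only, signal))   # False
```

Both reconstructions have the same magnitude spectrum, yet only the one that keeps the phase matches the original waveform.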
@НиколайНовичков-е1э 1 year ago
Thank you!
@hardypatel4665 4 years ago
Sir, why does the magnitude spectrum mirror after the Nyquist frequency?
@ValerioVelardoTheSoundofAI 4 years ago
This is a bit of a tricky topic that warrants some time to explain. I suggest you check out this resource, which provides a thorough explanation: www-elsa.physik.uni-bonn.de/~dieckman/DFT/DFT.html In a nutshell, for a real-valued signal the complex numbers on the right side of the spectrum are complex conjugates of those on the left side. This determines the typical mirror symmetry we see in the examples.
@hardypatel4665 4 years ago
@@ValerioVelardoTheSoundofAI Thank you Sir :)
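The conjugate symmetry described in this thread can be checked numerically. The sketch below uses a random real-valued signal as a stand-in for audio (an assumption for illustration) and compares bin `N - k` with the conjugate of bin `k`.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(16)    # any real-valued signal works here
X = np.fft.fft(x)
N = len(x)

# For real input, bin N - k is the complex conjugate of bin k
k = np.arange(1, N)
conjugate_symmetric = np.allclose(X[N - k], np.conj(X[k]))
# Conjugates have equal absolute value, hence the mirrored magnitude spectrum
mirrored_magnitude = np.allclose(np.abs(X[N - k]), np.abs(X[k]))

print(conjugate_symmetric, mirrored_magnitude)   # True True
```

This is also why `np.fft.rfft` exists: for real input it returns only the non-redundant half of the spectrum.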
@likestomeasurestuff3554 2 years ago
Thank you, very helpful!
@hoomansaadatmand829 5 months ago
I am reading about FFT to figure out whether it would work for my thesis. I have about 2 hours of temperature history from 3D printers (metal => DED). I was hoping to use FFT to predict the final residual stress. Do you think FFT could be related to my topic? Thanks
@omidbagheri9159 2 years ago
Thanks for your valuable training. I really appreciate it. I have a question about environmental sound processing. I want to separate signals from the environment and recognize what we heard. Can you show me a roadmap to find the solution? Thanks
@chahinezhigoun1078 4 years ago
Thanks for the video
@ValerioVelardoTheSoundofAI 4 years ago
You're welcome!
@saurabhdeshmukh2182 1 year ago
Valerio, can you please explain why len(magnitude_spectrum_violin) is equal to 59,772? I thought it should be equal to the sample rate, because there should be one complex number for every frequency.
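A short sketch that may help with this question: the FFT returns one complex number per input sample, so the spectrum length follows the number of samples in the clip, not the sample rate. The sample rate of 22050 Hz is assumed from librosa's default, and the zero array is only a placeholder for the violin audio.

```python
import numpy as np

sr = 22050                     # assumed: librosa's default sample rate
n_samples = 59772              # length reported in the comment
signal = np.zeros(n_samples)   # placeholder standing in for the violin audio

ft = np.fft.fft(signal)
duration = n_samples / sr      # clip length in seconds
bin_spacing = sr / n_samples   # Hz between neighbouring frequency bins

print(len(ft) == n_samples)    # True
print(duration, bin_spacing)
```

Under these assumptions the clip is a bit under three seconds long, and the bins are spaced less than 1 Hz apart. The two lengths would only coincide for a clip lasting exactly one second.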
@alpcnar5877 2 years ago
How do you determine the frequency axis, np.linspace(0, sr, len(magnitude_spectrum))? How do you know this is right? np.linspace(1000, 4000, len(magnitude_spectrum)) gives the same, so why do we start at 0 and end at sr?
@fardalakter4395 1 year ago
Sir or anyone, I have a question. At around 9:00, 59772 is not a power of 2, for which the FFT would be more efficient. Is that OK? Why don't we use the DFT?
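On the question above: the FFT is a fast algorithm for computing the DFT, and `np.fft.fft` accepts any input length, not only powers of two. The sketch below uses random data as a stand-in for the audio, with a hypothetical helper `dft_bin`, and spot-checks a few FFT bins against the DFT definition.

```python
import numpy as np

N = 59772                      # not a power of two
rng = np.random.default_rng(1)
x = rng.standard_normal(N)     # random stand-in for the audio signal

X_fft = np.fft.fft(x)          # works for any length N

def dft_bin(signal, k):
    """Evaluate the DFT definition directly for a single bin k."""
    n = np.arange(len(signal))
    return np.sum(signal * np.exp(-2j * np.pi * k * n / len(signal)))

# Spot-check a few bins against the definition
bins = [0, 1, 440, N - 1]
matches = [bool(np.isclose(X_fft[k], dft_bin(x, k))) for k in bins]
print(matches)                 # [True, True, True, True]
```

Evaluating every bin with the direct formula would take on the order of N² operations, which is why the FFT is used even when N is not a power of two.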
@ektabajaj1683 3 years ago
Valerio, can you please explain how to extract the fundamental frequency with the help of the Fourier transform in Python, i.e. how to find the peak values?
@ValerioVelardoTheSoundofAI 3 years ago
I'll definitely cover pitch detection in the future. Stay tuned :)
@ektabajaj1683 3 years ago
@@ValerioVelardoTheSoundofAI thank you.
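Until pitch detection is covered, here is a rough sketch of the peak-picking idea raised in this thread. It is not the author's method. It uses a synthetic tone (assumed values: 440 Hz plus a weaker 880 Hz harmonic) and reads off the frequency of the largest magnitude peak.

```python
import numpy as np

sr = 22050                     # assumed sample rate in Hz
t = np.arange(sr) / sr         # one second of audio
# Synthetic tone: 440 Hz fundamental plus a weaker harmonic at 880 Hz
signal = np.sin(2 * np.pi * 440 * t) + 0.4 * np.sin(2 * np.pi * 880 * t)

magnitude = np.abs(np.fft.rfft(signal))          # non-negative frequencies only
freqs = np.fft.rfftfreq(len(signal), d=1 / sr)   # frequency of each bin in Hz

peak_bin = np.argmax(magnitude)                  # index of the strongest component
peak_frequency = freqs[peak_bin]
print(peak_frequency)
```

Note that the strongest peak is not always the fundamental: in real instrument recordings a harmonic can be louder, so dedicated pitch-detection methods are generally more reliable than a plain `argmax`.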
@michaelswanson3162 3 years ago
Can't you normalize the output of the FFT so that the magnitude spectrum levels match the time signal? Something like this: Xmag = 2 * np.abs(ft) / N # normalization by 2/N so that the magnitude spectrum shows the estimated amplitudes of the input signal, where I think N = length of the signal in samples. I saw this in a different video; maybe you can elaborate? Thanks! Really helpful videos.
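The 2/N scaling suggested in this comment can be tried on a sinusoid of known amplitude. This is a minimal sketch with made-up test values (1000 Hz sample rate, a 50 Hz tone, amplitude 0.7). The frequency is chosen so that it falls exactly on a bin.

```python
import numpy as np

sr = 1000                      # assumed sample rate in Hz
N = 1000                       # number of samples (one second)
t = np.arange(N) / sr
amplitude = 0.7                # known amplitude of the test sinusoid
signal = amplitude * np.sin(2 * np.pi * 50 * t)

ft = np.fft.fft(signal)
x_mag = 2 * np.abs(ft) / N     # scale so peaks read as sinusoid amplitudes

estimated = x_mag[50]          # bin 50 is 50 Hz because sr / N = 1 Hz
print(round(estimated, 3))     # 0.7
```

The factor 2 accounts for the energy being split between a bin and its mirror image, so it does not apply to the DC bin (and the Nyquist bin for even N), which should be scaled by 1/N. When the frequency does not land exactly on a bin, or a window is applied, the peak reads lower than the true amplitude.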
@tuffCOOKIEanimations 3 years ago
Can you show how to do this on signal express?
@rhwood1154 3 years ago
OMG yes that would be amazing!
@godfather_1994 3 years ago
How can I compare two spectra, one from a word that I've recorded and one from sound extracted from a video, and see how many times the word I said occurs in the video? I think coherence is the solution.
@nezardasan5015 4 years ago
Hello Valerio, thank you very much for your useful work, it's really great. My question is: why do you use np.fft.fft for the Fourier transform in NumPy and not just np.fft once?
@zero4433 3 years ago
He explains this at 8:26
@ajayshriram9186 2 years ago
Does anyone know what the number 4 signifies at 7:30 (in 'violin_c4')? The same goes for the number 5 in 'piano_c5'. I am really sorry to ask, I am a complete beginner at coding.
@alpcnar5877 2 years ago
Hey, N is not a power of 2? Can you help?
@canernm 3 years ago
Thank you for the videos. Something confuses me though, and I'd really appreciate some advice: the audio we load with librosa.load() comes back as an array, which I suppose is the audio signal. However, is this a digital signal? Has it already undergone sampling and quantization? Thanks!
@ValerioVelardoTheSoundofAI 3 years ago
Yes, with librosa.load() you get back the waveform and the sample rate. The starting audio file (wav or mp3) is already sampled and quantised. In other words the file has already undergone the process of sampling / quantisation. When using librosa.load() it is possible to re-sample the signal. If you don't pass any sample rate, the signal will be converted to 22050Hz.
@canernm 3 years ago
@@ValerioVelardoTheSoundofAI Thanks a lot. I realize now that what I said was a bit of nonsense. Since we work with a computer, the signal would of course be digital! Thanks a lot
@AminKiany 3 years ago
Hello Valerio, thanks for the useful material. I was trying to follow your code step by step. At 7:25 you explain the dimensions of the Short-Time Fourier Transform. The first dimension is the number of frequency bins = (frame size)/2 + 1, which is in line with what librosa returns. However, for the second dimension I got (174943 - 2048)/512 + 1 = 339, which is not equal to the 342 in the output of the librosa STFT (here 174943 is the number of samples, 2048 is the frame size, and 512 is the hop size). Do you have any comment about this discrepancy? Or am I missing something? Thank you in advance.
@prabhuramnagarajan1893 1 year ago
174943/512 + 1, rounded down, equals 342.
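A likely explanation for the discrepancy in this thread: by default `librosa.stft` uses `center=True`, which pads the signal by half a frame on each side before framing. The sketch below is plain arithmetic under that assumption; the helper `num_frames` is hypothetical and not part of librosa.

```python
def num_frames(n_samples, frame_size, hop_size, center):
    """Number of STFT frames, with or without centred padding."""
    if center:
        # Padding frame_size // 2 on each side adds frame_size samples in total
        n_samples = n_samples + 2 * (frame_size // 2)
    return (n_samples - frame_size) // hop_size + 1

print(num_frames(174943, 2048, 512, center=False))  # 338
print(num_frames(174943, 2048, 512, center=True))   # 342
```

With centring, the padding cancels the frame size in the formula, which leaves `n_samples // hop_size + 1`, the expression given in the reply above.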
@frederiksidenius 1 year ago
Thanks a lot for the great videos! They are well-made and very informative, however, I can’t help noticing that your frequency axis is wrong. It’s a minor inaccuracy but the way you define frequency with ‘np.linspace(0, sr, len(magnitude_spectrum))’ is wrong. The frequency resolution should be ‘sr/len(magnitude_spectrum)’ which you can achieve for example either with ‘np.linspace(0, sr, len(magnitude_spectrum) + 1)[:-1]’ or ‘np.linspace(0, sr - sr/len(magnitude_spectrum), len(magnitude_spectrum))’. In short, the DFT returns N frequency bins from 0 to N-1 and therefore no bin is equal to the sample rate. As I said it is a minor error, however, this would be the accurate way to do it.
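The size of the inaccuracy described in this comment can be measured directly. The sketch assumes a sample rate of 22050 Hz and a spectrum length of 59772 (values mentioned elsewhere in the thread) and compares the axis from the video with the exact bin frequencies `k * sr / N`.

```python
import numpy as np

sr = 22050                     # assumed sample rate (librosa's default)
N = 59772                      # assumed spectrum length, taken from another comment

bin_frequencies = np.arange(N) * sr / N             # bin k sits at k * sr / N
axis_video = np.linspace(0, sr, N)                  # step is sr / (N - 1)
axis_fixed = np.linspace(0, sr, N, endpoint=False)  # step is sr / N

largest_error = np.max(np.abs(axis_video - bin_frequencies))
print(np.allclose(axis_fixed, bin_frequencies))     # True
print(largest_error, sr / N)
```

Passing `endpoint=False` is a third way to get the corrected axis, equivalent to the two expressions in the comment. The error of the original axis grows linearly with the bin index and reaches one bin width (`sr / N`) at the last bin, so it is negligible for plotting but matters when exact bin frequencies are needed.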
@Lhtokbgkmvfknv 4 years ago
Hi Valerio, personally I find your work very helpful, so I want to thank you very much for all that you've been doing. I have a question about librosa.load. I noticed you didn't explicitly set the sampling rate to 44100, which is the sr of the audio files used in the notebook, while the default value for sr in the function is 22050. Did you do that on purpose because it makes no difference, or did you just not pay attention to it?
@ValerioVelardoTheSoundofAI 4 years ago
Good catch Imece :) 22050Hz is a reasonable default to use in audio/music processing, so I didn't bother specifying a custom sampling rate. You should definitely treat this value as a hyper-parameter, when you're optimising ML algorithms.
@piasroy3629 3 years ago
What are frequency bins? Is each one a range, or a single frequency, equally distributed across the (0, sample_rate) range?
@ValerioVelardoTheSoundofAI 3 years ago
It's a range of frequencies.
@piasroy3629 3 years ago
@@ValerioVelardoTheSoundofAI So the size of each frequency bin is len(magnitude_spectrum)?
@PranilPatil-l7t 7 months ago
Hi, please explain the meaning of the notation in the notes alongside the formulae. It would make them easier to understand.
@Bigman74066 4 years ago
I would have liked it if you had manually calculated the first few discrete Fourier samples, just to get a feel for the algorithm...
@ValerioVelardoTheSoundofAI 4 years ago
I think I do something similar in previous videos.
@Bigman74066 4 years ago
@@ValerioVelardoTheSoundofAI Can't remember you did... Anyway, just saying... Excellent work
@ValerioVelardoTheSoundofAI 4 years ago
@@Bigman74066 I thought I did -- but honestly I can't remember :)
Man, just spare us all the typing and copy-pasting and comment on the existing code instead! You want to keep it as short as possible, not add boring stretches... Otherwise I am sure the content is good, but I am leaving less than 5 minutes in!
@ValerioVelardoTheSoundofAI 3 years ago
Thank you for the feedback. Pedagogy-wise, I find there's more value for the learner in typing line-by-line than commenting on already-written code.
@Goblindad91 2 years ago
I disagree with this - hearing what we're doing be explained as each line is typed is super helpful. The pace was good
@yunfan7034 2 years ago
Hi, does anyone know how numpy.fft.fft knows the sampling rate when it calculates the frequency (Hertz) range? For example, if I have 10 data points: 1) for 10 seconds of audio, the sampling rate is 1 sample/sec and the frequency range would be 0, 1/10, 2/10...10/10 Hertz; 2) for 5 seconds of audio, the sampling rate is 2 samples/sec and the frequency range would be 0, 1/5, 2/5, 3/5...10/5 Hertz. How does numpy.fft.fft know the sample rate in order to calculate the frequency range? I checked the documentation and did not find any parameter for a default sampling rate. Thank you
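In short, `np.fft.fft` does not know the sampling rate: it only returns values indexed by bin number, and the conversion to Hertz is a separate step. A minimal sketch using the two cases from the question, with `np.fft.fftfreq` doing the conversion (its `d` argument is the time between samples):

```python
import numpy as np

n = 10                         # ten data points, as in the question
# Case 1: 10 s of audio -> 1 sample per second
freqs_1hz = np.fft.fftfreq(n, d=1 / 1)
# Case 2: 5 s of audio -> 2 samples per second
freqs_2hz = np.fft.fftfreq(n, d=1 / 2)

print(freqs_1hz[:5])           # 0, 0.1, 0.2, 0.3, 0.4 Hz
print(freqs_2hz[:5])           # 0, 0.2, 0.4, 0.6, 0.8 Hz
```

The same ten FFT values therefore map to different frequencies depending on the sampling rate you supply. `fftfreq` labels the upper half of the bins as negative frequencies, which for real signals mirror the lower half.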