20: Phase Vocoder (part 3), C++ Real-Time Audio Programming with Bela

  5,298 views

Bela Platform


Lecture 20 of C++ Real-Time Audio Programming with Bela. This final part of a mini-series on the phase vocoder explains how to manipulate audio signals in the frequency domain. You will make a pitch shifter and other FFT-based effects.
In this lecture:
Section 1: The phase vocoder signal chain 0:00
Section 2: Robotisation 5:50
Section 3: Whisperisation 23:22
Section 4: Pitch shifting 29:01
Section 5: Improving robotisation 50:32
Example code and companion materials:
github.com/BelaPlatform/bela-...
This course is a deep dive into how Bela processes data, and how to implement real-time audio and interaction. If you want to learn or improve your skills with C++ and audio programming, this is a great way to start!
Bela is the open-source platform for creating beautiful interaction. If you’d like to get a Bela system to follow along with these lectures, visit shop.bela.io. Use promo code CREATEATHOME for £10 off Bela and Bela Mini Starter Kits.
Music by Vula Viel (vulaviel.com).

Comments: 26
@sqfx744 3 years ago
Awesome series!! Respect for teaching all this stuff. I like to see people like you making this information more accessible.
@MoXyiD 3 years ago
Woa! THIS IS NEW STUFF! I'll check it out tomorrow ^^
@jeyko666 1 year ago
loved the series, thank you so much!
@FroodyColluphid 2 years ago
Amazing tutorial! It’s worth noting that Antares Auto-Tune actually operates in the time domain using the TD-PSOLA algorithm for pitch shifting.
@Toste1041 3 years ago
The k prime calculated in synthesis step 2 (around 38:45) does not contribute to the output synthesis phase, since it is cancelled out in step 4 by the phi_rs calculated in step 3. I also checked the code and reached the same conclusion. Maybe I missed something? Thanks!
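The cancellation described in this comment can be checked numerically. What follows is a minimal sketch, assuming the common phase vocoder formulation in which the synthesis step subtracts the bin index from the synthesis frequency, scales the deviation by 2π·hop/N, and then adds back the bin-centre phase advance. The function names and the use of fractional-bin units are illustrative assumptions, not taken from the lecture code.

```cpp
#include <cmath>

// Phase advance per hop, computed in two stages:
// deviation from the bin centre first, then the bin-centre term added back.
double phaseAdvanceViaBin(double synthFreqBins, int bin, int hopSize, int fftSize)
{
    const double twoPi = 2.0 * std::acos(-1.0);
    const double deviation = synthFreqBins - bin;             // fractional-bin deviation
    double advance = deviation * twoPi * hopSize / fftSize;   // deviation term
    advance += twoPi * bin * hopSize / fftSize;               // bin-centre term
    return advance;
}

// The same phase advance computed directly from the synthesis frequency.
double phaseAdvanceDirect(double synthFreqBins, int hopSize, int fftSize)
{
    const double twoPi = 2.0 * std::acos(-1.0);
    return synthFreqBins * twoPi * hopSize / fftSize;
}
```

Under these assumptions the two functions agree for any choice of `bin`, so the bin index only decides where the frequency is stored, not what phase is accumulated. If an implementation wraps the phase between the two stages, the results would still agree modulo 2π.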
@wojtekpilwinski 3 years ago
Hello! I have one more question; hopefully somebody can help. The phase vocoder implementation in this video works perfectly, but I am seeing some phase issues in my case. When I move the pitch slider everything seems OK, but when I move it back to the default position (pitch ratio = 1) the sound is no longer the same as the original. It is hard to describe: the sound has the original pitch, but it sounds as if there were phase cancellations or a filter applied. I found that one solution is to zero all lastInputPhases and lastOutputPhases every time I change the pitch slider position. Now, whenever I set the pitch slider back to its default value, the sound seems to be the same as the original. The problem with my solution is that the sound is unpleasant while the slider is moving. So I suppose I should update the last input and output phases in some way other than simply zeroing them, but I can't figure out how to do that properly. Many thanks in advance for any help. Best regards
@boggo3848 3 years ago
That is not surprising, because the moment the effect starts to run you are only updating the phases as an estimate of the real ones, so over time they will drift. If you jump suddenly to a different location all bets are off on the phase so it will probably sound smeared. This is why to get the most out of the phase vocoder you have to pick appropriate times to reset phase back to what it is in the original signal rather than update it in the running state. For the pitch shifter there isn't a really good rule of thumb for when to resync the phase, which is why the phase vocoder pitch shift kind of always sounds a bit smeared.
@wojtekpilwinski 3 years ago
Hello, many thanks for the video. I have an important question. When you implemented the slider for changing the hop size, I was confused about the relationship between hop size, window size and FFT size. For example, when the slider is set to its minimum value of 64, the window length is 64*4=256. But in process_fft, when copying to the unwrapped buffer, you loop over the full FFT size and apply the window over the whole FFT range, which is 1024. So it uses values from outside the window range. How does that work? Is it appropriate? Many thanks in advance for any help. Best regards.
@apm414 3 years ago
If you're talking about the robotisation effect then yes, you do end up with some funny combinations of parameters you wouldn't otherwise encounter in the phase vocoder. That's because the drone-like distortion of those settings is precisely the effect we're trying to create. But as a more general point, it's not unusual to have an FFT that is longer than the window. For example, you could have a window whose length isn't a power of 2, and the FFT size would usually be the next power of 2 above that (for efficiency reasons). You can still reconstruct the original (shorter) window with more FFT bins than you strictly need. The real problem is the other way around, when your FFT is shorter than the window, and that's where the bad distortion in the robotisation effect starts to creep in.
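The "next power of 2 above the window length" rule mentioned in this reply can be sketched as follows. The function name is hypothetical and this is only an illustration of the sizing rule, not code from the lecture materials.

```cpp
#include <cstddef>

// Smallest power of two that is >= windowLength (returns 1 for lengths 0 and 1).
// Note: this sketch does not guard against overflow for extremely large lengths.
std::size_t nextPowerOfTwo(std::size_t windowLength)
{
    std::size_t n = 1;
    while (n < windowLength)
        n <<= 1;
    return n;
}
```

For instance, a 1000-sample window would be paired with a 1024-point FFT, with the remaining 24 input samples set to zero.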
@wojtekpilwinski 3 years ago
@apm414 Many thanks for your answer. Sorry, I forgot to mention that my question was about the whisperisation effect, but it is also relevant to robotisation. And yes, I understand the problem when the FFT is shorter than the window. But my question concerns exactly the situation where the FFT is longer than the window, and I am still not sure I understand it. Let's say we start with the "ideal" sizes, where the FFT is the same length as the window and the hop length is equal to the FFT length divided by 4 (say FFTsize=winSize=1024 and hopSize=256). So we have a precalculated window vector of size 1024, and all members of that vector are filled with relevant values. Now let's shrink hopSize to, say, 128. So we need to update winSize to hopSize*4=512. We then recalculate the data in the window vector, but only for its first 512 members, and the rest of the members still hold data from the 1024-sample window. In your code we still use that old data, which is now out of range. Shouldn't we avoid that in some way? I am asking because I am writing my own code and I want to avoid using an additional thread for the FFT, so for efficiency I want data to be collected in unwrappedBuffer all the time, to avoid an additional fftSize loop in the FFT procedure. But I have run into some problems with windowing, and that is one among several other issues.
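One way to avoid the stale values described in this comment is to zero everything beyond the new window length whenever the window is recalculated. The sketch below assumes a periodic Hann window stored in a buffer whose size equals the FFT size; the function name is hypothetical, and whether the lecture code handles the tail this way is not confirmed by the thread.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Fill the first windowLength entries with a periodic Hann window and zero the rest,
// so that no values from a previous, longer window survive in the buffer.
void recalculateWindow(std::vector<float>& window, std::size_t windowLength)
{
    const double twoPi = 2.0 * std::acos(-1.0);
    if (windowLength > window.size())
        windowLength = window.size();   // the window cannot exceed the buffer
    for (std::size_t n = 0; n < window.size(); ++n) {
        if (n < windowLength)
            window[n] = static_cast<float>(0.5 * (1.0 - std::cos(twoPi * n / windowLength)));
        else
            window[n] = 0.0f;           // zero-padding region
    }
}
```

With a 1024-entry buffer and `windowLength` set to 512, entries 512 to 1023 become zero, so looping over the full FFT size applies the shorter window followed by zero padding.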
@vincentbrigand8440 8 months ago
Hello, great tutorial. Would it be possible for you to explain how to make a time-stretch algorithm using the phase vocoder? Thanks in advance.
@oromoiluig 8 months ago
Timestretching can be achieved by resampling the output of a pitch shifter. E.g., to double the duration, do the following:
- pitch up by one octave (2x frequency)
- upsample the output by 2x
- play back the upsampled output at the input sampling rate
You can achieve arbitrary timestretching ratios by using fractional ratios in the pitch shifter and resampler. Keep in mind that constant-rate timestretching in real time is more or less impossible or unusable, because you either run out of input data if you are speeding up or build up an ever-increasing delay if you are slowing down.
@Nicole_Tanner 2 years ago
Thank you for this. I still can't get my head around why so many other sources require an interpolation step but we don't need one here.
@apm414 2 years ago
There are effectively two ways of pitch shifting with the phase vocoder. One is to manipulate the frequency components directly like this example does. The other is to time stretch the signal, keeping its pitch constant, then resample it to get back to the original speed (but with a different pitch). That approach requires interpolation.
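The first approach, manipulating the frequency components directly, is often written as a bin remapping. The sketch below assumes per-bin magnitudes and per-bin frequency estimates in fractional-bin units have already been computed by the analysis stage; all names are hypothetical, and this is one common formulation rather than a confirmed copy of the lecture code.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Move each analysis bin to the bin nearest (index * pitchRatio) and scale its
// estimated frequency by the same ratio. Bins that land outside the range are dropped.
void shiftBins(const std::vector<float>& analysisMagnitudes,
               const std::vector<float>& analysisFrequencies,
               float pitchRatio,
               std::vector<float>& synthesisMagnitudes,
               std::vector<float>& synthesisFrequencies)
{
    const std::size_t numBins = analysisMagnitudes.size();
    synthesisMagnitudes.assign(numBins, 0.0f);
    synthesisFrequencies.assign(numBins, 0.0f);
    if (pitchRatio <= 0.0f)
        return;                                   // nothing sensible to do
    for (std::size_t k = 0; k < numBins && k < analysisFrequencies.size(); ++k) {
        const std::size_t newBin =
            static_cast<std::size_t>(std::floor(k * pitchRatio + 0.5f));
        if (newBin >= numBins)
            continue;                             // shifted above the top bin
        synthesisMagnitudes[newBin] += analysisMagnitudes[k];
        synthesisFrequencies[newBin] = analysisFrequencies[k] * pitchRatio;
    }
}
```

No interpolation between samples is needed here because the signal length never changes; the output phases still have to be rebuilt from the synthesis frequencies before the inverse FFT.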
@Nicole_Tanner 2 years ago
@apm414 Thank you for your answer. I am wondering why anyone would use the interpolation version; this one seems quite a bit easier to me. Is there any difference in quality?
@sqfx744 3 years ago
Question, for anyone really. I notice that with my pitch shifter, the fundamental frequency of the FFT size (mine is 1024, it's close to an F) becomes audible in the output. Does anyone have any idea why? Every note I play sounds like some frequency modulated note with F, even simple sine waves.
@apm414 3 years ago
You're probably hearing the period of the hop size, rather than the FFT size, though you could check this by changing the hop size to see if the effect changes. There are a lot of reasons that could happen, but basically, something is probably not right in either the phase calculations or the windowing, leading to an effect similar to robotisation where the phase is deliberately reset each hop. In the example, the windowing is done for you, so probably it has something to do with the phase reconstruction. Check that you have a static (or global) array of type float to hold the output phases and that you're updating it properly each hop, following the solution in the video. Then check the implementation of the specific equations. You might see whether the pitch is present when there should be no pitch shift, or only once you try changing the pitch.
@sqfx744 3 years ago
@apm414 Thank you for the reply! I'm still stuck, but you were right that I was hearing the frequency of the hop size: changing it changed the harmonic. So, as you suggested, I revisited my code, and it matches the solution pretty closely. I say "pretty closely" because I am using Visual Studio (not Bela) with a VST setup, so there are some small differences, but I think it should all be pretty much the same. For example, I cannot use the auxiliary task feature, and I may be incrementing the Write Out pointer later than I should (right now I increment it by 1 hop size after I copy from the FFT buffer to the out buffer). To answer your last question, the pitch is present when there should be no pitch shift, and it stays constant regardless of the input (meaning the input is pitched but the bad pitch remains). If I remove the final window (I know I shouldn't) there is no bad pitch present until I touch the pitch knob, and then it sounds worse than before with or without the knob. Does this still sound like a phase problem to you, or more like a window one? I appreciate your time.
@sqfx744 3 years ago
I found that the "solution" to my problem was that my hop size was half the size of the FFT size. My FFT size was 512 and my hop size was 256. When I changed the FFT size to 1024 and hop size to 128 like in the video, the sound became much clearer, at the expense of performance. Thanks for the ideas, Andrew.
@apm414 3 years ago
@sqfx744 Hard to say since the underlying platform (and therefore the threading structure) is different, but what I would do is try doing a straight passthrough in the frequency domain (no pitch shift code at all, just FFT --> IFFT), but with analysis and synthesis windows. That will trace the problem to either the overlap-add code or the phase vocoder code.
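Before running the passthrough test, the overlap-add gain of the window pair can be checked on its own. The sketch below assumes a periodic Hann window applied at both the analysis and synthesis stages and a hop size that divides the FFT size; the function name is hypothetical and the window choice is an assumption, since the thread does not say which window is in use.

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Ripple (max minus min) of the steady-state overlap-added gain when a periodic Hann
// window is applied at both analysis and synthesis. A value near zero means an
// FFT --> IFFT passthrough has constant gain; a large value means the output is
// amplitude-modulated once per hop.
double overlapAddRipple(int fftSize, int hopSize)
{
    if (fftSize <= 0 || hopSize <= 0 || fftSize % hopSize != 0)
        return -1.0;   // this sketch only handles hop sizes that divide the FFT size
    const double twoPi = 2.0 * std::acos(-1.0);
    std::vector<double> gain(hopSize, 0.0);
    for (int n = 0; n < fftSize; ++n) {
        const double w = 0.5 * (1.0 - std::cos(twoPi * n / fftSize));
        gain[n % hopSize] += w * w;   // analysis window times synthesis window
    }
    const auto minmax = std::minmax_element(gain.begin(), gain.end());
    return *minmax.second - *minmax.first;
}
```

Under these assumptions, an FFT size of 1024 with a hop of 128 gives essentially zero ripple, whereas an FFT size of 512 with a hop of 256 gives a ripple of about 0.5. That is consistent with the report earlier in this thread that a hop of half the FFT size produced a tone at the hop rate, although other causes cannot be ruled out from the thread alone.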
@sqfx744 3 years ago
@apm414 Thanks for your reply. I figured it out: my FFT and hop sizes needed to be adjusted, and after that it sounded decent. I have realized that this technique is too computationally expensive and too phase-y sounding for me, so currently I'm working on a time-domain one. I'm trying out a method where I find the period by collecting samples that are approximately 0. Then I change the "size" of the buffer so that when I reach the end I just loop back continuously to the beginning, using some trig stuff. I think my problem is that when I get a new buffer I get clicks (I think the phase is off). Anyway, thanks for the help.
@rec-trick 1 year ago
Does this library have autotune?