Tips: You can transform your device's audio output into a "microphone" on Windows, so you don't need to place your headphones over your microphone. 1. Press Windows key + R -> type "mmsys.cpl" 2. In the Recording tab, enable the Stereo Mix option. Now, "Stereo Mix" is an available microphone option! You can select it as the audio input.
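Once "Stereo Mix" is enabled, a script still has to pick it as its input. A minimal sketch of how that could look with PyAudio, assuming the device shows up with "Stereo Mix" in its name (the exact name varies by sound driver); `find_input_device` and `list_pyaudio_devices` are hypothetical helper names, not part of PyAudio:

```python
from typing import Iterable, Mapping, Optional


def find_input_device(devices: Iterable[Mapping],
                      name_fragment: str = "stereo mix") -> Optional[int]:
    """Return the index of the first input-capable device whose name
    contains name_fragment (case-insensitive), or None if none matches."""
    fragment = name_fragment.lower()
    for info in devices:
        has_input = info.get("maxInputChannels", 0) > 0
        if has_input and fragment in str(info.get("name", "")).lower():
            return int(info["index"])
    return None


def list_pyaudio_devices():
    """Collect the device-info dicts PyAudio reports.

    PyAudio is imported lazily so the helper above works without it.
    """
    import pyaudio

    p = pyaudio.PyAudio()
    try:
        return [p.get_device_info_by_index(i)
                for i in range(p.get_device_count())]
    finally:
        p.terminate()
```

The returned index can then be passed as `input_device_index` to `p.open(...)`, for example `find_input_device(list_pyaudio_devices())`. If it returns `None`, the device is probably still disabled or named differently.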
@weekendmakeit77608 ай бұрын
this really helped me! Thank you!
@aoeu2567 ай бұрын
This is a great idea. I was using Voicemeeter as a virtual audio device and it's complicated to use.
@OliNorwell7 ай бұрын
Epic! - These videos are some of the best stuff on KZbin - love the idea with the image generation at the end
@theraybae8 ай бұрын
This is amazing and inspiring. I love the ending of the video and can’t wait for Wednesday. As a dyslexic person I think you unlocked a new use case for learning.
@MultiBigkush5 ай бұрын
Code:

import os
import time
import wave

import pyaudio
from faster_whisper import WhisperModel

# Define constants
NEON_GREEN = '\033[32m'
RESET_COLOR = '\033[0m'

os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"


# Function for recording an audio chunk
def record_chunk(p, stream, file_path, chunk_length=1):
    """
    Records an audio chunk to a file.

    Args:
        p (pyaudio.PyAudio): PyAudio object.
        stream (pyaudio.Stream): PyAudio stream.
        file_path (str): Path of the file the audio chunk is written to.
        chunk_length (int): Length of the audio chunk in seconds.

    Returns:
        None
    """
    frames = []
    for _ in range(0, int(16000 / 1024 * chunk_length)):
        data = stream.read(1024)
        frames.append(data)

    wf = wave.open(file_path, 'wb')
    wf.setnchannels(1)
    wf.setsampwidth(p.get_sample_size(pyaudio.paInt16))
    wf.setframerate(16000)
    wf.writeframes(b''.join(frames))
    wf.close()


def transcribe_chunk(model, file_path):
    segments, info = model.transcribe(file_path, beam_size=7)
    transcription = ''.join(segment.text for segment in segments)
    return transcription


def main2():
    """
    Main function of the program.
    """
    # Choose the Whisper model
    model = WhisperModel("medium", device="cuda", compute_type="float16")

    # Initialize PyAudio
    p = pyaudio.PyAudio()

    # Open the recording stream
    stream = p.open(format=pyaudio.paInt16, channels=1, rate=16000,
                    input=True, frames_per_buffer=1024)

    # Initialize an empty string to accumulate the transcriptions
    accumulated_transcription = ""

    try:
        while True:
            # Record an audio chunk
            chunk_file = "temp_chunk.wav"
            record_chunk(p, stream, chunk_file)

            # Transcribe the audio chunk
            transcription = transcribe_chunk(model, chunk_file)
            print(NEON_GREEN + transcription + RESET_COLOR)

            # Delete the temporary file
            os.remove(chunk_file)

            # Append the new transcription to the accumulated transcription
            accumulated_transcription += transcription + " "
    except KeyboardInterrupt:
        print("Stopping...")
        # Write the accumulated transcription to a log file
        with open("log.txt", "w") as log_file:
            log_file.write(accumulated_transcription)
    finally:
        print("LOG" + accumulated_transcription)
        # Close the recording stream
        stream.stop_stream()
        stream.close()
        # Shut down PyAudio
        p.terminate()


if __name__ == "__main__":
    main2()
@josequinonez29413 ай бұрын
I LOVE U
@shravanhegde22373 ай бұрын
@@josequinonez2941 can you share the code in English?
@AkashDesai-ef9mk2 ай бұрын
thankss
@saarza9991Ай бұрын
Ok but when I run this it just runs and ends. What to do? I'm new to this plz help
@keeganreeve21 күн бұрын
Thanks a lot! )) Do you have a GitHub? I'd like to find your page.
@filipphenderson63427 ай бұрын
Pulling in people with a flashy thumbnail of a Python code that works and then trying to monetize your code based on a library that is already supposed to be open source is in my opinion bs. it is not fair for beginners that might not know Python or whisper very well. for that I give you a thumbs down!
@christianmccauley7340Ай бұрын
Wow, an AI channel scamming people? Who would’ve ever heard of such a thing! Tired of the fucking grifternet man, how did this happen?
@jbtesla358129 күн бұрын
for real this is a fking scam, the code is on GitHub for free
@ReadyMedia-no8 ай бұрын
There is a product for live video transcription there. Live text services are expensive and do not work in many current languages. Set up a server/service that will ingest an RTMP video source, delay the video and overlay text on the video in perfect sync, then offer RTMP output with burned-in live text. :) There is a need for this service.
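The timing side of that idea can be sketched in a few lines. This is only an illustration, assuming the transcriber yields `(start, end, text)` segments in seconds on the source timeline and that the video is held back by a fixed delay; RTMP ingest, the delay itself and the burn-in are left to a separate tool and are not shown. `srt_timestamp` and `delayed_srt` are hypothetical helper names.

```python
from typing import Iterable, List, Tuple

# (start_seconds, end_seconds, text) on the source timeline
Segment = Tuple[float, float, str]


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def delayed_srt(segments: Iterable[Segment], video_delay: float) -> str:
    """Build SRT cues shifted by the delay that was applied to the video."""
    cues: List[str] = []
    for number, (start, end, text) in enumerate(segments, start=1):
        cues.append(
            f"{number}\n"
            f"{srt_timestamp(start + video_delay)} --> "
            f"{srt_timestamp(end + video_delay)}\n"
            f"{text.strip()}\n"
        )
    return "\n".join(cues)
```

As long as the transcription latency stays below `video_delay`, each cue is ready before the delayed frame it belongs to is sent out.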
@svenborgers69087 ай бұрын
I have tried to get this to run on M1 MacBook. No joy. The CPU maxes out even with the tiny model. But then I tried with the Whisper.cpp implementation which is compiled for apple silicon. I found a whisper-cpp-python wrapper for that library. That actually runs and is far less CPU bound. It has a bit of a stutter, it is not as clean, it misses words between the chunk processing but you can see that with just a little bit more power it could work.
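One common way to reduce words lost at chunk boundaries is to let consecutive chunks overlap, so a word cut off at the end of one chunk appears whole in the next. A minimal sketch of the windowing only, assuming the audio is available as a sequence of samples; this is not what the setup above ran, and the repeated text from the overlap would still need to be de-duplicated afterwards. `overlapping_windows` is a hypothetical helper name.

```python
def overlapping_windows(samples, window, overlap):
    """Yield windows of `window` samples, each sharing `overlap`
    samples with the previous one. The last window may be shorter."""
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    if len(samples) == 0:
        return
    step = window - overlap
    # Stop once the remaining tail is already covered by the last window.
    for start in range(0, max(len(samples) - overlap, 1), step):
        yield samples[start:start + window]
```

With 16 kHz audio, `window=16000` and `overlap=4000` would give one-second chunks that share a quarter of a second with their neighbour, at the cost of transcribing that quarter second twice.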
@MrThaitrinh6 ай бұрын
Hi Sven, could you please share your code with me? Thank you very much!
@ryanjames39078 ай бұрын
Wow!! Great video!!! Thank you for being so generous and teaching this to us, this is epic stuff! I can already start to see all kinds of use cases. I can't wait to get it running, and I'm really looking forward to Wednesday's video. Thanks again from Canada
@cristobalmunoz842 ай бұрын
Nice video!! Thanks for your help on these topics!!
@ArmandoMenicacci8 ай бұрын
Fantastic !!! A bit fast in explaining and showing, but I can always pause!
@benscottbongiben8 ай бұрын
It would be good to see transcription and generated audio responses in real time for phone calls
@reddyparthu59786 ай бұрын
how to get the code for this?
@enesgul29708 ай бұрын
You are really very good.
@bigswede882 ай бұрын
Go Sweden! Well done
@ferluisch4 ай бұрын
Hey man this is really cool! I'd like to know if you: 1) used the whisper v3 model? or the v2? 2) If you have seen the demos from gpt4, they also showed that gpt ASR is better than whisper v3, wonder if it will be open like whisper.
@HammerOnTheNet8 ай бұрын
Amazing and inspiring work! Kris, what about something less powerful but more accessible in terms of hardware?
@unrealminigolf40158 ай бұрын
Awesome bro! ❤
@calvinapollos5 ай бұрын
Great video! Thanks for going through this in such an easy-to-understand way! Can you share the python scripts?
@maizizhamdo5 ай бұрын
I love your videos, man. Please do a video about fastwhisper on a Docker API.
@aoeu2567 ай бұрын
This will be a good tool for language immersion (Chinese / Japanese / Indonesian) along with the DeepL clipboard tool and the Edge browser's TTS engine.
@AdrianC2006Uk4 ай бұрын
That image gen project was pukka!
@ShariqueAMКүн бұрын
I want to do speech-to-text on audio from the browser speaker and not from the mic. How can we do that in real time?
@prakashsahu-xn6qyАй бұрын
How can I get the code you used in this video? I need the same code.
@kimsteinhaug8 ай бұрын
Interesting stuff on the image creation at the end while talking. Not sure if you are taking the punctuation in your sentences into consideration? I'm pretty sure something cool could be done with this, maybe keeping an overview of all the text that has moved out of the "buffer" for style? Looks like something I could have a lot of fun with, but I do not have the GPU :/ Colab, however.
@hjoseph7772 ай бұрын
I have been looking for where to start. Fantastic work! Where can I get the code for testing?
@110gotrek8 ай бұрын
Now make it translate and do phone calls
@rne12238 ай бұрын
Noooo…pls nooo. We got plenty auto callers already.
@ibrahimelshenhapy91797 ай бұрын
@@rne1223 Where?
@luluw9699Күн бұрын
Hello ur computer has a virus
@huhaifan2 күн бұрын
cannot find the code in github
@martinvizar64307 ай бұрын
Impresario thank you
@magnoliasphinkter862218 күн бұрын
thanks this is great! Where can I find the actual code you have on your screen? Struggling to find it on the github
@thedoctor54788 ай бұрын
I think there's an even faster whisper module but I forget what it's called
@AustinKang-wk8cl8 күн бұрын
did you find out?
@mattaylor-qg4yw5 ай бұрын
just joined. would be good to get my grubby paws on the files for this.
@t-dsai8 ай бұрын
Thanks for sharing your knowledge/experience. I'm a bit perplexed. The description here mentions 45+ prompts in the PDF book, the newsletter website says 40+, and the PDF doc says 35+. Which number is correct?
@aseel69105 ай бұрын
If there is any way to translate this text into other languages, that would be awesome
@mujahidali2369Ай бұрын
Well done
@عبدالرحيمعبدالرحيم-غ5غ8 ай бұрын
could you do another demo to see how it can translate in real time?
@gregh74578 ай бұрын
yes! there are no really good or fast translation apps available. KZbin auto translate is horrible!
@eliasbosc12 күн бұрын
Can you pls share your code?
@ItsNsour5 ай бұрын
can it translate?
@himanshujaviya60214 ай бұрын
Can we get the code used in this video that would be really helpful
@claudiobalderrama15996 ай бұрын
Do you think this could be used to transcribe, for example, phone calls made through the browser? I would greatly appreciate your response :)
@kebman8 ай бұрын
The sentiment analysis really scares me. I mean, there's absolutely no chance that'll be abused by big tech in terms of political marketing. I mean, like, there's no way in hell right?
@George-kx8fl7 ай бұрын
Would it be possible to do speaker recognition then pipe it into translation
@jotixh5 ай бұрын
Is there a way to connect a live streaming url?
@leucome8 ай бұрын
Faster Whisper and Insanely Fast Whisper don't seem to have AMD GPU support yet, so I had to go with an alternative for the 7900 XT. I used whisper.cpp with CUDA/HIP + a distilled Whisper model. Seriously, this combination is kind of real-time too, even when using distil-large-v2. There is a downside, though: the TTS and Whisper on the GPU gobble up about 8 GB of VRAM. This puts some limit on the LLM model I can use at the same time.
@maxstauss95794 ай бұрын
I can't find the script for the real-time translation, pls help me find it :((
@gmazuelАй бұрын
Where can I find the code?
@lutusp8 ай бұрын
Hey, it's in your video description, therefore easily fixed: the word is "transcription". Why not avoid the irony of a video that extols modern AI voice to text ... transcription ... in which the AI engine will surely avoid this mistake, and at the speed of light.
@agardner-to7vi2 ай бұрын
That is awesome. So I am trying to do something like this. My sister is deaf and I want something that can also label who is speaking. So for a small group it would say user 1, user 2, user 3, and whoever is speaking, it would let the person know. Do you think that is possible? How could I do that? I've got everything but that last part.
@maverick19017 ай бұрын
running fully local is one thing ... doing this via webaudio api towards a backend is a different topic - is there any implementation for that as well foreseen?
@kebman8 ай бұрын
I might be jaded but... I mean really, how about an AI that calculates the probability of drone attacks or artillery attacks? How about an AI that calculates the probability of soldiers hiding in terrain? I mean, there are already good search algorithms out there, that one may-or-may-not use to carry out artillery strikes. I'm just thinking aloud here. Probably nothing.
@saqqara63613 ай бұрын
how to access your sourcecode as a paid channel member?
@crazyforhyunwoo1196 ай бұрын
Can I do this with JavaScript?
@AlexPopov-hv3kp4 ай бұрын
what is a transcribe_chunk function in the code? Seems that it's not from faster_whisper?
@danielgh48147 ай бұрын
Hi, I'm a subscriber but I do not have access to your GitHub. Can you help me please?
@RicardoMaciasYepez69134 ай бұрын
Can this run on raspberry pi?
@thnmanucian79935 ай бұрын
Hello. I’m a beginner in this field. How can I get your code for reference? Thank you
@kate-pt2ny8 ай бұрын
Kris, you are a genius. Real-time speech transcription can do a lot of things. The last example is great. I can’t wait to watch the video released on Wednesday. My computer is a Mac M chip computer. I found the code in your github and changed it to run on the CPU. Later, some problems occurred, such as incomplete transcribed content and OSError. Can you release a version suitable for Mac computers? grateful
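For running the posted script without an NVIDIA GPU, the relevant change is in how `WhisperModel` is constructed. A minimal sketch, assuming faster-whisper is installed; the model sizes are illustrative choices, `whisper_settings` and `load_model` are hypothetical helper names, and this does not address the incomplete transcriptions or the OSError mentioned above.

```python
def whisper_settings(has_cuda: bool) -> dict:
    """Pick WhisperModel arguments for the available hardware."""
    if has_cuda:
        # Same settings as the GPU script posted in this thread.
        return {"model_size": "medium", "device": "cuda",
                "compute_type": "float16"}
    # On CPU, int8 quantization and a small model keep the load down.
    return {"model_size": "tiny", "device": "cpu", "compute_type": "int8"}


def load_model(has_cuda: bool):
    """Create the model; faster_whisper is imported lazily so that
    whisper_settings() can be used without it installed."""
    from faster_whisper import WhisperModel

    settings = whisper_settings(has_cuda)
    return WhisperModel(settings["model_size"],
                        device=settings["device"],
                        compute_type=settings["compute_type"])
```

On a Mac this would be called as `load_model(has_cuda=False)`. Whether the CPU keeps up with one-second chunks depends on the machine.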
@Siri-tz7dz5 ай бұрын
where do i get the setup/python code
@vallu-Tech6 ай бұрын
Bro, can you put up a video about live-streaming voice to text?
@MiguelCayazaya2 ай бұрын
pip install patience and kindness
@isaacmasinde1994Ай бұрын
Which gpu are you using ?
@Edward_ZS8 ай бұрын
Has anyone updated the code from the previous video to use this recording method instead?
@henrijohnson77797 ай бұрын
@Kris : I already joined as an Adept member on Jan 18th 2024 and requested access to the Github Repo via email and also via Discord but have not had any response from you yet ?
@ytemre6 ай бұрын
I became a member how do I get access to the code and the github for this
@AllAboutAI6 ай бұрын
hello :D send me an e-mail at kris@allabtai.com
@haloBean5 ай бұрын
Hi, can I get the GitHub repo for the above code? Thanks
@digitalsoultech8 ай бұрын
The accuracy sucks. Many words are incorrect which you can see in the image itself. This isn't usable in the real world.
@ahmedelkamash93235 ай бұрын
how can we download this script?
@kylebolt58618 ай бұрын
How do we join your community?
@AllAboutAI8 ай бұрын
Link in desc :) youtube member
@najafzawar81688 ай бұрын
@@AllAboutAI just subscribed to your channel but not getting GitHub code..
@joaopaulonadal84848 ай бұрын
How can i get acess to this code?
@erenkaraboga85707 ай бұрын
Can we take source code ?
@Onlyindianpj2 ай бұрын
This is Presentation not tutorial
@TonyHoangPodcast5 ай бұрын
Does it support speaker diarization?
@avgplayer8 ай бұрын
Waiting for the in-depth video :) Btw your Discord invite link has expired.
@slimshady91batАй бұрын
But is it free?
@nusretalikok8238 ай бұрын
where can we find the code that you used?
@crazyforhyunwoo1196 ай бұрын
github linked in the description
@maxstauss48214 ай бұрын
I am a member but I can't access the GitHub, pls HELP
@maxstauss48214 ай бұрын
this is my GitHub: maxaxaxaxxaxaxaax
@curtisnewton8957 ай бұрын
transcriPtion
@fufu93526 ай бұрын
Zero latency? I have been checking your video timeline. The terminal output and the audio do not correspond. You must be living in a world 1-2 seconds ahead of our timeline. 😅
@AlphaScraperOne6 ай бұрын
🧡
@ramadanhasan15748 ай бұрын
Where is the link to this source code ? Thanks amazing
@nafila50847 ай бұрын
did you get the code
@ramadanhasan15746 ай бұрын
no @@nafila5084
@KaMingLeung-kk6ey5 ай бұрын
@@nafila5084 Can share the code to me as well?
@Velnio_Išpera6 ай бұрын
Can you use different languages?
@tharosen-g4q8 ай бұрын
🎈
@vaibhavmishra11007 ай бұрын
can you tell me the solution of this error : Could not load library cudnn_ops_infer64_8.dll. Error code 126 Please make sure cudnn_ops_infer64_8.dll is in your library path!
@劉育安6 ай бұрын
try "pip install nvidia-cudnn-cu12"
@vaibhavmishra11006 ай бұрын
It didn't work @@劉育安
@HungBui-r7z5 ай бұрын
I have registered as a member, please check your email
@nouriensha287313 күн бұрын
Can i convert this code to cpp and implement using Arduino without api
@harshitsingh30618 ай бұрын
where can we get the code
@crazyforhyunwoo1196 ай бұрын
github linked in the description.
@abdurrahmankeskin37163 ай бұрын
how to get the code for this?
@ScaryLasers2 ай бұрын
how do i get access to the github?? TAKE MY MONEY! lol no but seriously how
@thebigbigdaddy8 ай бұрын
how can we identify different speakers?
@ickorling73286 ай бұрын
Microsoft Copilot in a Teams call recording transcription. It can't simply be a call, it needs to be a meeting call... subtle difference. Try 'Meet now' in the Teams calendar view, or make a calendar event.
@royzac78297 ай бұрын
How does the transcription performance compare to assemblyAI?
@fredericpaillot25708 ай бұрын
Hi Kris! I love what you do, I would like to become a member of your channel, but I can't access the page to subscribe, do you have a direct link? the one in description doesn't work for me.. have a good day!
@MarxOrx8 ай бұрын
BROOOO 🎉 FIRST
@rahar60096 ай бұрын
It is bs to make an open source code monetized! So sorry for you and your kinds... unsubs.