@dad2979 4 months ago
I have been following you for a long time and you have done such a fantastic job of crafting your style and keeping your content relevant. Great job Dmitry!
@Hardwareai 4 months ago
Thank you for leaving this comment! I'm still refining my style, to tell the truth. One thing I've been successful at recently (I think) is keeping my videos more to the point, with a good flow of information. It now looks to me like I was blabbering way too much in my older videos at times. I now cut a lot in post-processing if I feel the video is overloaded. I plan to make some more storytelling-oriented robotics content in the next half-year; stay tuned and see how it goes.
@jackwarner5445 4 months ago
I'm trying to make an AI voice assistant and would be completely lost without your videos. Thanks so much!
@Hardwareai 4 months ago
Glad I could help!
@MrTubertub 3 months ago
Hi there, could you please advise on the best and easiest way to transcribe MP3 speech recordings to text with no coding experience at all? Thank you.
@Hardwareai 3 months ago
That's probably beside the topic, but I'd say something like MacWhisper for Mac? And try searching "whisper.cpp gui windows" for Windows.
@MarkD-p2h 3 months ago
Thank you for sharing your knowledge. I'm trying to do "float16" STT transcription with diarization using WhisperX on an 8 GB Pi 5, but "the ctranslate2 package does not compile with CUDA support." Per the WhisperX README, I tried to install PyTorch for CUDA 11.8 using the PyTorch pip command, and then I tried the current version, before trying to install WhisperX, with no joy. Apologies if this is a silly question, but is there a CUDA version that works on a Pi 5 GPU (Broadcom VideoCore VII), or must I use the CPU only? What do you recommend? Thanks!
@Hardwareai 3 months ago
CUDA is specific to Nvidia hardware, so it will not work without an Nvidia GPU :) float16 will not give you any performance benefit on a CPU, so use either float32 or int8.
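A minimal sketch of the advice above, assuming the model is loaded through the faster-whisper package (the CTranslate2-based backend that WhisperX builds on). The helper names `pick_compute_type` and `load_model` are hypothetical and only illustrate the choice; the model is loaded inside a function so nothing is downloaded unless it is called.

```python
def pick_compute_type(device: str, prefer_small: bool = True) -> str:
    """Return a CTranslate2 compute type that suits the device.

    float16 only pays off on a CUDA (Nvidia) GPU. On a CPU such as the
    Raspberry Pi 5, use int8 (smaller) or float32 instead.
    """
    if device == "cuda":
        return "float16"
    return "int8" if prefer_small else "float32"


def load_model(size: str = "tiny.en", device: str = "cpu"):
    # Assumption: faster-whisper is installed separately (pip install faster-whisper).
    from faster_whisper import WhisperModel

    return WhisperModel(size, device=device, compute_type=pick_compute_type(device))


print(pick_compute_type("cpu"))
print(pick_compute_type("cpu", prefer_small=False))
```

On a Pi, `load_model("tiny.en", "cpu")` would then request an int8 model without touching CUDA at all.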
@MarkD-p2h 3 months ago
@Hardwareai Thank you so much for your kind reply! I'm learning a lot and I'm excited to make this project, which will help me greatly in my work in Geneva. ❤
@tomlynn1000 1 month ago
Is there a written process to follow? I've followed this step by step, but have run into Python telling me "Illegal instruction" when I run the scripts. Using the latest Raspbian OS.
@Hardwareai 1 month ago
Hello! I enabled issues on my fork here: github.com/AIWintermuteAI/WhisperLive/issues. Feel free to create an issue there with a detailed problem description and the steps you followed!
@wartem 20 days ago
How did you manage to get the Raspberry Pi 5 to work with the ReSpeaker 2-Mics Pi HAT? I'm encountering significant issues with deprecated or changed kernel APIs, channel counts, and header problems. I've tested, for example, seeed-voicecard's GitHub repository, which is seemingly not compatible with newer kernels and APIs. A guide on making this work would be incredibly helpful and greatly appreciated.
@Hardwareai 14 days ago
/sigh/ Yes, you're right, almost every major update of Raspberry Pi OS breaks reSpeaker. At the very least you want to start with HinTak's fork, which is more recent. Here is what I found out: github.com/HinTak/seeed-voicecard/issues/28 - try it out and respond on GH if it works?
@wartem 14 days ago
@Hardwareai Thank you! I'm still testing, but everything except the LEDs seems to work now after too many hours of troubleshooting. I will make a fork of HinTak's repo later and share my findings. I can't wait to try your tutorials here on YouTube when I'm done with this. I found that the shell script they refer to via your link has been moved within the same repository. I've had no luck with this script so far: it runs the installation fine, but after a reboot I get no sign of success when testing different things ("no soundcards found..." etc). I can't comment on GH (HinTak) since I lack the permission needed.
@simplelife4441 3 months ago
Hi there, I tried to run example_client.py and it gives me an error: client = TranscriptionClient( TypeError: TranscriptionClient.__init__() got an unexpected keyword argument 'callback' How can I fix it? Thanks.
@Hardwareai 2 months ago
It sounds like you're not actually using my code, but the upstream code? Can you create an issue here: github.com/AIWintermuteAI/WhisperLive
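One way to check which variant of a class is installed is to inspect its constructor signature before calling it. The sketch below uses two stand-in classes (hypothetical, not the real WhisperLive classes) to show the idea; `accepts_keyword` is an illustrative helper name.

```python
import inspect


def accepts_keyword(func, name: str) -> bool:
    """True if `func` can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if name in params and params[name].kind in keyword_kinds:
        return True
    # A **kwargs catch-all also accepts arbitrary keyword arguments.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Stand-ins for illustration only; NOT the real WhisperLive classes.
class UpstreamLikeClient:
    def __init__(self, host, port, lang=None):
        pass


class ForkLikeClient:
    def __init__(self, host, port, lang=None, callback=None):
        pass


print(accepts_keyword(UpstreamLikeClient, "callback"))
print(accepts_keyword(ForkLikeClient, "callback"))
```

With the real package, the same helper could be pointed at the imported `TranscriptionClient`, and printing the package's `__file__` attribute would show which checkout Python actually picked up (assuming the import path used in the example script).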
@АльбертИванов-ц4х 2 months ago
Thanks for the video. Is it possible to change the language? In the fork or in Whisper?
@Hardwareai 2 months ago
You can try the "tiny" model, without the ".en" suffix. Granted, multilingual models are not as precise, so perhaps you will need to use a larger model, e.g. base. That would stretch the Raspberry Pi's capabilities, but should be possible with a Raspberry Pi 5?
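The naming convention mentioned above can be sketched as a small helper. Whisper publishes English-only checkpoints with a ".en" suffix for the tiny, base, small and medium sizes; the function name `whisper_model_name` is hypothetical.

```python
# Sizes for which an English-only ".en" checkpoint exists.
ENGLISH_ONLY_SIZES = {"tiny", "base", "small", "medium"}


def whisper_model_name(size: str, english_only: bool) -> str:
    """Build a Whisper model name, e.g. 'tiny.en' or the multilingual 'tiny'."""
    if english_only:
        if size not in ENGLISH_ONLY_SIZES:
            raise ValueError(f"no English-only checkpoint for size {size!r}")
        return f"{size}.en"
    return size


print(whisper_model_name("tiny", english_only=True))
print(whisper_model_name("base", english_only=False))
```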
@ameetkarn 3 months ago
This is too good... I think this should fit in directly with one of my projects. Do you have any recommendations for real-time TTS?
@Hardwareai 3 months ago
Hopefully! I used espeak before for other projects... it is pretty horrible by modern standards, but does its job. For this example I used piper TTS - much better quality, but not as fast as espeak.
@exploring-electronic 6 months ago
Thank you for making this follow-up!
@Hardwareai 6 months ago
Appreciate your support
@justquicker5044 5 months ago
Thank you so much! You’ve really helped me speed up my project. I normally don’t like and subscribe but I made an exception 🙃. Keep it up!!
@Hardwareai 5 months ago
Thank you for your support!
@sarankumarb1911 10 days ago
Hi, I am getting the error below in the terminal running the server code:
INFO:websockets.server:connection open
INFO:root:New client connected
ERROR:root:Error during new connection initialization: [WinError 2] The system cannot find the file specified
And I get this in the terminal running the example client code:
[INFO]: * recording
[INFO]: Waiting for server ready ...
[INFO]: Opened connection
[INFO]: Websocket connection closed: 1000:
Can you please help me fix this?
@Hardwareai 10 days ago
Hi there! Can you create an issue in my fork of WhisperLive?
@bystander85 4 months ago
I've been trying to find a way to make the end-of-speech flag more intelligent than just detecting a pause. I often have a mental blank or misspeak, and the delay in my speech incorrectly flags the end of speech. It would be interesting if STT systems could continue listening after a pause when they detect an incomplete sentence. Any ideas?
@Hardwareai 4 months ago
That's a hard one. I don't think it is solved even in commercial STT engines, e.g. Google Assistant or Siri. It would require understanding of the sentence context. We might be getting somewhere with multi-modal models such as GPT-4o, but I don't think there is anything that can run on a Raspberry Pi class computer. Also, as a shortcut, perhaps it would be possible to either run a classifier or modify the Whisper model to output the probability of the sentence being finished... It's just an idea though; finding out how well it works is another thing entirely.
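As a much cruder stand-in for the classifier idea above, the silence timeout could be stretched whenever the partial transcript looks unfinished. This is only a heuristic sketch, not something from the video: it assumes the STT output carries punctuation, and the word list, thresholds and function names are all made up for illustration.

```python
TERMINAL_PUNCTUATION = (".", "!", "?")
# Words that rarely end a complete sentence (illustrative, far from exhaustive).
DANGLING_WORDS = {"and", "but", "or", "so", "because", "the", "a", "an", "to", "of", "with"}


def looks_unfinished(partial_text: str) -> bool:
    """Guess whether the speaker is likely to continue after a pause."""
    text = partial_text.strip()
    if not text:
        return False  # nothing said yet, keep the normal timeout
    if text.endswith("..."):
        return True  # trailing ellipsis suggests hesitation
    if not text.endswith(TERMINAL_PUNCTUATION):
        return True  # no sentence-final punctuation
    words = text.rstrip(".!?").split()
    if not words:
        return False
    return words[-1].lower() in DANGLING_WORDS


def silence_timeout(partial_text: str, base_s: float = 0.8, extended_s: float = 2.0) -> float:
    """Seconds of silence to wait before flagging end of speech."""
    return extended_s if looks_unfinished(partial_text) else base_s


print(silence_timeout("Turn on the lights."))
print(silence_timeout("Turn on the"))
```

A trained classifier or a probability taken from the model itself, as suggested in the reply, would replace `looks_unfinished` while the timeout logic stays the same.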
@shakhizatnurgaliyev9355 6 months ago
Like!!! Dima, awesome content. What do you think about the VOSK API, and how does it compare to Whisper? Great example of PiperTTS. Thank you!
@Hardwareai 5 months ago
Thanks, appreciate it! I'll try it out and compare it - I don't think I'll make a video about it, but maybe a blog article :)
@georgeknerr 6 months ago
Excellent work, keep it up!!! Shared on Twitter too.
@Hardwareai 6 months ago
Thanks for sharing!!
@glikoz 3 months ago
Please advise on the hardware setup for offline RAG, TTS, and STT.
@Hardwareai 3 months ago
Hard to estimate without knowing the details?
@ameetkarn 3 months ago
Hi, I am getting the following error while running the fork. Any ideas? A module that was compiled using NumPy 1.x cannot be run in NumPy 2.0.0 as it may crash. To support both 1.x and 2.x versions of NumPy, modules must be compiled with NumPy 2.0. Some module may need to rebuild instead e.g. with 'pybind11>=2.12'. If you are a user of the module, the easiest solution will be to downgrade to 'numpy
@Hardwareai 3 months ago
Uh-oh. Can you create a GitHub issue for that?
@emiliosanchez_ 8 days ago
Do you still have this problem? I managed to solve it with "pip install "numpy
@ameetkarn 7 days ago
@emiliosanchez_ Yes, it was fixed. Thanks for the help though.
@Garfunnckel 5 days ago
How did you fix it, @ameetkarn?
@Garfunnckel 5 days ago
@ameetkarn How did you fix it?
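The pip command quoted in this thread is cut off, so the exact fix is not recoverable from the comments. The error message itself points at downgrading NumPy, so a reasonable assumption is that the installed NumPy was pinned back to a 1.x release (for example with `pip install "numpy<2"`). The sketch below only checks whether an environment is affected; the helper names are hypothetical.

```python
from importlib import metadata


def needs_numpy_pin(version: str) -> bool:
    """True if this NumPy version is 2.x or newer.

    Extension modules compiled against NumPy 1.x can fail to import
    under NumPy 2, which is what the error in this thread reports.
    """
    major = int(version.split(".")[0])
    return major >= 2


def installed_numpy_version():
    """Return the installed NumPy version string, or None if NumPy is absent."""
    try:
        return metadata.version("numpy")
    except metadata.PackageNotFoundError:
        return None


version = installed_numpy_version()
if version is None:
    print("NumPy is not installed")
elif needs_numpy_pin(version):
    print(f"NumPy {version} found: modules built for 1.x may fail to import")
else:
    print(f"NumPy {version} found: compatible with modules built for 1.x")
```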
@Garfunnckel 24 days ago
Would this work on a Raspberry Pi 4B?
@seaniversongeronimo2691 23 days ago
secret dude
@Hardwareai 23 days ago
Absolutely. Just a bit slower, but in my experiments in an earlier video it was still possible to get real-time transcription with the tiny.en model.
@Garfunnckel 4 days ago
Good day. I created a fork of your GitHub repo; can you please help me?
@hjoseph777 2 months ago
Do you have a Discord account? I need a consultation for a project I am working on.
@Hardwareai 2 months ago
Hi! I don't use Discord a lot. For consultations I do LinkedIn or Patreon - links are in my channel description!
@sathishkumarB-h9m 3 months ago
How many languages does Whisper support?
@Hardwareai 3 months ago
You can see them here: github.com/openai/whisper#available-models-and-languages
@Hazar-bt6nf 3 months ago
Can a Raspberry Pi 5 run Whisper using Python?
@Hardwareai 3 months ago
Yes, absolutely!
@66Tomini 6 months ago
Do you use the 8 GB Raspberry Pi?
@Hardwareai 6 months ago
Yes, Raspberry Pi 5 8 GB, but RAM is hardly relevant here for the tiny.en model.
@isaacfranklin2712 4 months ago
@Hardwareai Thinking of getting the Pi 4 with 1 GB RAM. Hopefully it shouldn't be an issue to replicate.
@domesticatedviking 5 months ago
Hey, just wanted to say I really appreciated your last two videos. Will you please be my sensei? Thank you!!
@Hardwareai 5 months ago
I appreciate your appreciation! xD I'd say that I'm already a sensei of sorts... You can always support me on Patreon for some extras, but otherwise simply stay tuned for more videos!
@muhammadanan9190 29 days ago
How can I solve this issue? (I have changed onnxruntime==1.16.0 to onnxruntime==1.17.0, and my Python version is 3.12.4, if this matters in any way!) The error is given below:
ERROR: Cannot install piper-tts==1.1.0 and piper-tts==1.2.0 because these package versions have conflicting dependencies.
The conflict is caused by:
piper-tts 1.2.0 depends on piper-phonemize~=1.1.0
piper-tts 1.1.0 depends on piper-phonemize~=1.0.0
@Hardwareai 29 days ago
Hello! Can you create an issue in my GH fork? Include the exact command you were trying to run and a problem description.
@muhammadanan9190 29 days ago
@Hardwareai tysm for the reply ❤️ Can you do a video on WhisperFusion by Collabora, maybe with real-time speech-to-text?
@Onlyindianpj 3 months ago
A real implementation would use WebSocket. The idea: the app transmits raw 16 kHz PCM audio, the WS server captures those audio packets, sends them to Whisper to get a transcription, and returns it to the app as JSON.
@Hardwareai 3 months ago
This is pretty much how WhisperLive works, no?
@Onlyindianpj 3 months ago
@Hardwareai You are not using WhisperLive.
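The server-side data handling described at the top of this thread can be sketched with the standard library alone. This is not WhisperLive's actual wire format: it assumes 16-bit little-endian mono PCM at 16 kHz, and the JSON field names and function names are invented for illustration. The WebSocket transport and the Whisper call are left out.

```python
import array
import json
import sys

SAMPLE_RATE_HZ = 16_000  # assumption: the "PCM 16k" from the comment means 16 kHz


def pcm16le_to_floats(chunk: bytes) -> list[float]:
    """Convert 16-bit little-endian PCM bytes to floats in [-1.0, 1.0)."""
    if len(chunk) % 2:
        raise ValueError("PCM16 chunk must contain an even number of bytes")
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(chunk)
    if sys.byteorder == "big":
        samples.byteswap()  # the wire data is little-endian
    return [s / 32768.0 for s in samples]


def chunk_seconds(chunk: bytes) -> float:
    """Duration of a mono PCM16 chunk in seconds."""
    return (len(chunk) // 2) / SAMPLE_RATE_HZ


def transcription_message(text: str, start: float, end: float) -> str:
    """Serialize one transcribed segment as a JSON reply for the app."""
    return json.dumps({"text": text, "start": start, "end": end})


packet = b"\x00\x00\xff\x7f\x00\x80"  # samples: 0, 32767, -32768
print(pcm16le_to_floats(packet))
print(transcription_message("hello", 0.0, chunk_seconds(packet)))
```

The float samples are what a Whisper front end would typically consume after buffering enough packets; the JSON string is what would be sent back over the socket.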
@levbereggelezo 5 months ago
Thx
@Hardwareai 5 months ago
Appreciate it!
@emiliosanchez_ 8 days ago
Thanks for your work! Once the server side is running, I get this error when I launch the client side:
"INFO:websockets.server:connection open
INFO:root:New client connected
ERROR:root:Error during new connection initialization: [ONNXRuntimeError] : 1 : FAIL : Load model from /home/pi/.cache/whisper-live/silero_vad.onnx failed:/onnxruntime_src/onnxruntime/core/graph/model.cc:134 onnxruntime::Model::Model(onnx::ModelProto&&, const PathString&, const IOnnxRuntimeOpSchemaRegistryList*, const onnxruntime::logging::Logger&, const onnxruntime::ModelOptions&) ModelProto does not have a graph."
Any thoughts?
@Garfunnckel 4 days ago
> I ran into an issue when executing the example_client.py, it would throw this error:
>
> `
> $ python examples/example_client.py
> Traceback (most recent call last):
>   File "/home/rpi/WhisperLive/examples/example_client.py", line 26, in
>     client = TranscriptionClient(
>              ^^^^^^^^^^^^^^^^^^^^
> TypeError: TranscriptionClient.__init__() got an unexpected keyword argument 'callback'
> `
>
> Using the following versions:
> - Raspberry Pi OS 64-bit Debian 12 (bookworm)
> - Python 3.11.2