DIY Alexa: Create Your Own Voice Assistant with ESP32 & TensorFlow Lite!

139,561 views

atomic14


We've been building towards this project in the previous set of videos. And we're now ready to build our very own DIY Alexa!
All the code for this project is on GitHub - github.com/atomic14/diy-alexa
What are we building - 1:15
Wake Word Detection - 2:27
Command Recognition - 11:47
Digging into the code - 16:30
What's life all about Marvin? - 21:52
To detect the wake words we stream audio from either an I2S microphone or from the ADC. The wake word detector looks at a 1-second window of audio. The spectrogram of the audio is calculated and fed into a TensorFlow Lite model.
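As a rough illustration of the 1-second window described above, here is a minimal Python sketch of a rolling buffer that always holds the most recent second of audio. The 16 kHz sample rate, the `RollingWindow` class and the fake sample stream are assumptions for illustration only; the real firmware is written in C++ and may buffer audio differently.

```python
from collections import deque

SAMPLE_RATE = 16_000          # assumed sample rate in Hz
WINDOW_SAMPLES = SAMPLE_RATE  # one second of audio


class RollingWindow:
    """Keeps the most recent second of samples from a stream."""

    def __init__(self, size=WINDOW_SAMPLES):
        self._samples = deque(maxlen=size)

    def push(self, chunk):
        # Old samples fall off the left as new ones arrive.
        self._samples.extend(chunk)

    def is_full(self):
        return len(self._samples) == self._samples.maxlen

    def snapshot(self):
        # Copy of the current window, oldest sample first.
        return list(self._samples)


window = RollingWindow()
for start in range(0, 24_000, 1_600):         # 1.5 s of fake audio in 100 ms chunks
    window.push(range(start, start + 1_600))  # stand-in for microphone samples
latest = window.snapshot()
```

Each time the window is full, its snapshot is what would be turned into a spectrogram and passed to the model.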
Once we detect the wake word we stream the audio up to wit.ai to recognise the user's intent.
It works surprisingly well for such a small model, though there are improvements that could be made with more training data.
I'll leave the access token for wit.ai live for as long as I can, but at some point, you will need to generate your own wit.ai application.
Let me know how you get on in the comments!
Related Videos:
Audio Input
• ICS-43434 A replacemen...
• ESP32 Audio Input Show...
• ESP32 Audio Input Usin...
Audio Output
• ESP32 Audio Output wit...
And TensorFlow Lite for machine learning
• TensorFlow Lite With P...
Components you could use:
MAX98357 - amzn.to/3cg88Z5
TinyPico - amzn.to/3vVoONp
INMP441 I2S Microphone: amzn.to/3cicuiv
ICS-43434 I2S Microphone: www.tindie.com/products/21519/
ESP32 Dev board: amzn.to/3gb6fyc
Analogue Audio Amplifier: amzn.to/3pxkEJr
Speakers: amzn.to/3pjWFgq
---
Want to help support the channel? I'm accepting coffee on ko-fi.com/atomic14

Comments: 225
@atomic14 3 years ago
Interested in ESP32 Audio: kzbin.info/aero/PL5vDt5AALlRfGVUv2x7riDMIOX34udtKD Looking for all my ESP32 projects: kzbin.info/aero/PL5vDt5AALlRdN2KyL30l8j7kLCxhDUrNw
@leventeszecsy6262 1 year ago
Hmmm. Alexa has a deep voice.
@ARIJITRAKSHIT_Create_Marvel_ 1 year ago
How are you inputting the voice into the ESP32?
@clydealcott3379 2 years ago
Thank you so much for this awesome and very educational video... I got my ESP32 recently... It's time to roll along.!👍
@trevorwslee 1 year ago
What an insightful project! I really hope to be able to adapt your idea (including some code snippets and likely the TensorFlow model) and come up with my own little ESP32 experiment.
@engrwaqas2904 2 years ago
Absolutely amazing, great work.
@trueintellect 3 years ago
I'm so glad I found your channel!! This is really cool. You've helped free me from my Raspberry Pi dependence.
@atomic14 3 years ago
The ESP32 is an amazing device. Really powerful.
@codewithdaniel-1 3 years ago
Same case😊
@7Trident3 3 years ago
Wow!! I didn't think the esp32 had the guts for any AI stuff! Great video!!
@atomic14 3 years ago
It's definitely starting to push the limits - but I think it's easy to forget just how powerful the ESP32 is. One of the problems is the size of the models which can get quite large (relative to the amount of RAM we have to play with). Processing time is also a factor especially when trying to do real time as in this project.
@JohnLauerGplus 3 years ago
Wow. Nice work here.
@photopicker 3 years ago
Very educational. Coming from an ESP32 background, I found it very helpful to create a real target for the AI modeling tools. Great introduction.
@digitronix532 9 months ago
How to interpret esp32 and this program
@marcush.6632 2 years ago
You are an absolute genius in my eyes.....
@tektronix475 3 years ago
wow, your alexa version, got me speechless.
@digitronix532 9 months ago
How to upload program to esp32
@jeffzor 2 years ago
Thank you for the learning opportunity, master!
@jorgemota879 2 years ago
Amazing, Fantastic thank you very much, really a great project
@mariomedina 1 year ago
Got it working! Now I need to learn how to change the activation word, and how to add multiple activation words that activate different code
@paulsimpson9544 3 years ago
Really fascinating. Thank you so much for sharing.
@paulsimpson9544 3 years ago
I've a follow-on question, if you don't mind. I see you're using the Arduino framework, but you also have the ESP-IDF icon in PlatformIO. Do you have any particular preference? I'm considering switching to the IDF as I'm already using xtimers. I like the idea of more control, but I also like the easy access to the Arduino ecosystem of libraries.
@atomic14 3 years ago
I've been mixing in quite a lot of functions from the IDF with my Arduino code. But it seems the IDF that comes with Arduino is now quite out of date. I've been trying to get Arduino working as a component in the IDF so I can use the latest IDF but still take advantage of the Arduino ecosystem, but I've not had much luck. For my Asteroids game I did it all in the IDF - mainly because I wanted to use the PSRAM with malloc and there's no way to do that when using Arduino. But I really missed simple things like uploading firmware OTA - especially with my custom board not having a USB port... I think, unless there's a compelling reason (APIs that aren't available from the IDF when using Arduino), I'd be tempted to stick with Arduino. If you aren't using any libraries or you can easily port them over then IDF is definitely worth giving a go. But I don't think there are any huge advantages to it.
@iotan09 3 years ago
How kind you are, thanks for sharing.
@atomic14 3 years ago
No problem at all, it's a privilege to be able to give something back to the community.
@OnePunchHeizou 2 years ago
this channel was really helpful to understand many edge ai related concepts, thank you @atomic14.
@atomic14 2 years ago
Thanks for the kind words - much appreciated!
@OnePunchHeizou 2 years ago
@sltechgalaxy1677 He is using the terminal/command prompt for that.
@OnePunchHeizou 2 years ago
@sltechgalaxy1677 I think anything should work for this purpose. Preferably use Linux.
@OnePunchHeizou 2 years ago
@sltechgalaxy1677 I used this video for reference. These commands work on Linux/Windows; I don't know about the Mac terminal.
@OnePunchHeizou 2 years ago
@sltechgalaxy1677 Bro, clone this project's git repository; in data you will find all the audio files. Go to that directory and try using these commands.
@ChrisHalden007 3 years ago
Amazing!!!! Will definitely give it a try. Thank you
@atomic14 3 years ago
Let us know how you get on!
@digitronix532 9 months ago
How to integrate esp32 and this program
@sambidpradhan32 3 years ago
This is awesome.. thinking to implement this on a custom dataset, and this model looks light weight as well.. can be implemented in real time I guess
@atomic14 3 years ago
It's amazing what you can do with quite a small model. I have seen that the micro-speech example in the main TensorFlow codebase is now available for the ESP32 - might be worth taking a look at that as well.
@sasisekharmg7823 3 years ago
Amazing work!
@atomic14 3 years ago
Thank you! Cheers!
@digitronix532 9 months ago
How to upload to esp32
@gitaran24 3 years ago
Absolutely amazing. You do great things, you are smart. It challenges me to make one.
@atomic14 3 years ago
You should definitely go for it - report back on how you get on.
@gxbs2318 6 months ago
I'm going to apply it to two IoT devices I have running. Excellent video.
@WagnerUlisses 3 years ago
Very cool!
@naafff1 3 years ago
I thought you were gonna be using a Raspberry Pi. Speechless... I'm gonna make one like you.
@naafff1 3 years ago
@Taylor Van I have got many messages like these. They ask you for money and once you pay, they don't give you the account you wanted to hack.
@ernstgennial7064 3 years ago
Very interesting!
@atomic14 3 years ago
Glad you think so!
@your.free.electrons 2 years ago
Hey, this one's awesome :')
@alphoncemutabuzi6949 2 years ago
Thanks a lot, brother.
@OMNI_INFINITY 3 months ago
Thanks! Seems I should make a touchscreen voice AI app
@YigalBZ 3 years ago
Great video and project. This is my next project. Thank you !
@atomic14 3 years ago
Let us know how you get on!
@guilhermevini65 2 years ago
Amazing !!!
@nielspaulin2647 1 year ago
EXCELLENT!
@user-sr9ss3xd4q 6 months ago
Excellent. I really enjoy the content of the channel. I suggest you make a video about Picovoice Rhino on the ESP32.
@atomic14 6 months ago
Looks interesting, but I don't think it works on the ESP32 yet - might need a more powerful processor.
@edgull_tlt 2 years ago
Thanks for the video. It was interesting.
@SinanAkkoyun 3 years ago
Wow wtf!!!!!! 😍😍😍😍😍😍
@Techn0man1ac 2 years ago
Thank you very much.
@Gauthamphongalkar 2 years ago
Marvelous content, thank you very much!
@Gauthamphongalkar 2 years ago
@sltechgalaxy1677 I'm not sure what you are pointing to. To play audio on Linux, you can use aplay.
@Gauthamphongalkar 2 years ago
@sltechgalaxy1677 aplay is a Linux utility, so you can't use it on Windows. On Windows you can try playing the file as RAW format in VLC.
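If raw playback is awkward on Windows, another option is to wrap the headerless samples in a WAV container so any media player can open them. The sketch below uses Python's standard `wave` module and assumes 16-bit mono PCM at 16 kHz, which may not match the files in the project; adjust the parameters to the actual recording format.

```python
import io
import wave


def raw_to_wav_bytes(pcm_bytes, sample_rate=16_000, channels=1, sample_width=2):
    """Wrap headerless PCM samples in a WAV container and return the file bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)   # bytes per sample: 2 means 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_bytes)
    return buffer.getvalue()


wav_bytes = raw_to_wav_bytes(b"\x00\x00" * 16_000)  # one second of silence
# To save it: open("clip.wav", "wb").write(wav_bytes)
```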
@Gauthamphongalkar 2 years ago
@sltechgalaxy1677 Yes. Also read about ALSA.
@erikpratama7685 3 years ago
Hello, nice project. Can I use an ESP32-CAM?
@maul6117 10 months ago
do you have to watch these in a certain order? is there a playlist for just the diy Alexa project?
@prof.tahseen6104 2 years ago
the voice from those meme videos 😂
@ankitthealchemist 3 years ago
Hey! great work dude!! could we implement the simple command like "turn off the light" offline, just like the wake word detection?
@atomic14 3 years ago
I'm looking at this right now - it is a more difficult problem than the simple wake word detection. The model needs to have an output for each possible command word which means it is a larger model so will take longer to run on the ESP32. Hopefully, I'll be able to do another video soon showing it working - though just to be clear, this would be very limited commands - like: "on", "off", "left", "right" etc...
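To make the "one output per command word" idea concrete, here is a small Python sketch of how a vector of model outputs could be mapped back to a command. The label order, the `_background` class and the 0.9 threshold are hypothetical choices for illustration, not values taken from the project.

```python
COMMANDS = ["on", "off", "left", "right", "_background"]  # hypothetical label order


def pick_command(scores, threshold=0.9):
    """Return the best-scoring command, or None if the model is not confident."""
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] < threshold or COMMANDS[best] == "_background":
        return None
    return COMMANDS[best]


result = pick_command([0.02, 0.95, 0.01, 0.01, 0.01])
```

Every extra command adds another output (and usually more weights), which is why the model grows as the vocabulary does.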
@maul6117 9 months ago
is there a step by step video for the hardware build?
@jspark4171 7 months ago
Your answer was very helpful to me. Thank you very much.
@atomic14 7 months ago
Thanks! Very much appreciated!
@user-ux2oq6yd2c 4 months ago
Can you integrate it with ChatGPT? That would be super amazing!
@Nerdsking 6 months ago
It would be more interesting (and useful) if there was a way to merge this with another ESP32 project that runs ChatGPT, so it could be not only a DIY Alexa, but also a general assistant.
@aisolutions834 3 years ago
Hi There! Nice Work, Is it possible to run a TensorFlow object detection model like MobileNET on ESP32? OpenMV has this capability using TFLite library, but I am interested in running object detection on ESP32 which is very low cost compared, thanks!
@chockman3833 3 years ago
I had to login to my other account to give this video another like, this was incredible! How hard would it be to extend the model to have some amount of offline NLP so we don’t have to rely on Facebook?
@atomic14 3 years ago
I'm having a look at that right now, though I got slightly sidetracked looking at building an AGC. It's possible to a limited extent: the command dataset does contain some other words that we can try using. Getting performance from a small enough model looks doable. Hopefully I should have something up this week.
@pruthvirajvenkatesha6897 2 years ago
Thanks for this! Amazing work! I had a few questions and it would be helpful if you could reply. Can we use this procedure to build the same for the ESP32-S3? It seems you used the Arduino framework, which I checked and is not up yet on VSCode. Are there any other approaches to build this firmware on the ESP32-S3? Also, do we have info on the KWS model's number of computations? Based on a few algorithm papers that are validated on the Google speech dataset, it is always a trade-off between accuracy and total computations, so I wanted to know the procedure used to select an algorithm. Last question: can we build any TFLite model using the TFLM framework?
@kingsleybaros2095 6 months ago
At the end please what are you uploading as code in the esp 32 that will run your entire system
@ehabelbwab1783 3 months ago
You should mix the audio clips with outside background noise and then use them for training, because adding the _background folder to the training data is a bad choice.
@shufnagl 3 years ago
Hi, as others already mentioned...great work, great video. BTW, would it make sense to use other ESP32 Hardware with included Mic/Speaker like Atom Echo?
@atomic14 3 years ago
I don't see why not - you may need to modify the code to use whatever pins and interface the Atom Echo uses for the microphone and speaker. It should work really well.
@shufnagl 3 years ago
@atomic14 My AtomEcho arrived and I will give you feedback about the results. BTW, where should we discuss the technical aspects? KZbin or Git? Thx
@atomic14 3 years ago
Probably best on GitHub as we can share code snippets a bit more easily.
@shufnagl 3 years ago
@atomic14 Should I create a separate branch (to avoid polluting your code)?
@atomic14 3 years ago
@shufnagl You'll need to fork the repository and then you can do pull requests back to my code - there's a good guide here - github.com/firstcontributions/first-contributions Looking forward to seeing what you do!
@Yakroo108 5 months ago
👍👍👍
@kavishchattoor1729 9 months ago
sorry i know this might be late but I am replicating a similar project. Did you use the ESP32 to capture the audio signal? My esp32 doesn't have enough memory to capture enough data.
@ahlamhusni6258 1 year ago
What is the distance for the microphone to be able to catch the voice ?
@SonuRauniyar 3 years ago
Pretty cool stuff :). I want to make my own wake-up word detection system using a custom audio dataset. Let's say my wake-up word is "Hey Marvin", which I assume is longer than 1 second. How many data points are enough to train the model? And since I will use the Google speech dataset to add noise for better accuracy, do you think the 1-second time frame will matter here?
@spacecdr 2 years ago
A linux terminal with alsa and curl installed! "software"...😂
@DJ1TJOO 1 year ago
Can this work with a normal sound sensor that just has an analog output?
@dicle6714 2 years ago
I can't compile this application with Arduino IDE. I made the necessary file edits.
@rolyantrauts2304 3 years ago
Also many thanks, as I didn't realise TensorFlow could do the job on an ESP32. There is a lack of open-source Linux beamforming algorithms, which you have probably just solved. The ESP32 is so relatively cheap that you could build a distributed microphone array where the mic with the highest keyword match is used for that ASR session. Vosk has a streaming API (alphacephei.com/vosk/); it just needs a streaming RTP protocol with current keyword match info, and no beamforming is needed as the nearest mic is automatically used...
@atomic14 3 years ago
Sounds interesting - the only issue is that you may start to hit performance problems when processing multiple microphones at once. Currently the wake word detection takes around 100ms, so you may start running out of CPU time with more than one or two microphones. You might also hit memory issues with the audio buffering - though using a WROVER module might fix this.
@rolyantrauts2304 3 years ago
@atomic14 I dunno, I thought I would ask you as I'm a total noob with the ESP32, but on Linux, irrespective of processing power, we still lack open-source beamforming. The PulseAudio addition just doesn't work; I don't think it ever did, which is probably why it's been dropped upstream from WebRTC. What I am thinking is that we are not 'processing' multiple microphones at once: the I2S data for mono is just doubled and the L/R hi/lo word select is not used. A single channel would be fed into a delay buffer and then, I guess, just summed with the inverse of the current value of the other channel? It is really a single channel in a short delay ring buffer sized by the speed-of-sound distance, and what is present on the other I2S channel is just subtracted. For a noob who is staring blankly at a $5 AliExpress WROVER after a brief journey through the documentation, it makes me curious whether you could do it with the 2x cores, but to be honest I haven't yet a clue how :) I cannot even work out if HTTP streams are client-only, or whether you can create a server stream, or whether you could present AMR-WB on a port?!? I've just got my fingers crossed it might pique your interest.
@rolyantrauts2304 3 years ago
@atomic14 PS the point about the lack of beamforming was that each ESP32 could be a streaming KWS to a central ASR. It would broadcast from KW to silence with some metadata of the KW hit score, and a central ASR would be able to use the best KW hit score, so an array of ESP32s could be a distributed array with the best and nearest always used.
@HassanPhiri-kx1im 1 month ago
Does any kind of esp32 work or does it have to be the ESP32-S2 saola 1R DEV KIT?
@TechnicalShubhamofficial 2 years ago
Hey can you tell me how to program the esp 32 and where is the final code
@dariovicenzo8139 2 years ago
Great video! What I don't understand (I'm at the basics of TF) is why we need to use a cloud AI service when we are trying to make an edge device. In other words, we are losing the advantage of an edge system if we need a cloud service. So I could skip the lite model and do everything in the cloud, using the ESP32 as an audio transmitter. I hope I understood the purpose of the Facebook service correctly. Thanks.
@atomic14 2 years ago
Hi Dario, that is a very good question. One of the issues with using the ESP32 as an audio transmitter and doing the wake detection in the cloud is privacy concerns - you really want the user to be in charge of when the device is actively listening and sending your data to a third party service. So you really want the device doing the wake word detection and only sending audio data to the internet once the wake word has been detected. Currently, doing full intent recognition on the edge is too difficult on a device like the ESP32 - however, there is software for the raspberry pi that looks very promising - rhasspy.readthedocs.io/en/latest/
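The privacy argument above can be sketched as a two-state loop: audio is only inspected locally until the wake word fires, and only then is anything uploaded. This Python sketch is illustrative only; the state names, the `run_assistant` function and the string "chunks" are made up for the example and are not the firmware's actual structure.

```python
from enum import Enum, auto


class State(Enum):
    WAITING_FOR_WAKE_WORD = auto()
    STREAMING_COMMAND = auto()


def run_assistant(chunks, is_wake_word, is_end_of_command):
    """Return only the chunks that would leave the device."""
    state = State.WAITING_FOR_WAKE_WORD
    uploaded = []
    for chunk in chunks:
        if state is State.WAITING_FOR_WAKE_WORD:
            # Audio is inspected locally and then discarded.
            if is_wake_word(chunk):
                state = State.STREAMING_COMMAND
        else:
            uploaded.append(chunk)  # stand-in for sending to the cloud service
            if is_end_of_command(chunk):
                state = State.WAITING_FOR_WAKE_WORD
    return uploaded


audio = ["chatter", "marvin", "turn on", "the light", "<silence>", "private talk"]
sent = run_assistant(
    audio,
    is_wake_word=lambda c: c == "marvin",
    is_end_of_command=lambda c: c == "<silence>",
)
```

In this toy run, neither the chatter before the wake word nor the conversation after the command ends up in `sent`.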
@amarjeetkumarfor 8 months ago
Can I have circuit diagram, please
@AryanKapur0605 2 years ago
Hi! Can I use ESP 32 Cam instead of ESP32? Thanks!
@thomasob42 3 years ago
Can this project be implemented using arduino BLE 33 Sense?
@devmishra4131 2 years ago
I am pursuing mechanical engineering from Stanford batch of 2023, and your video is pretty good. I had one query, can we use PAM8403 instead of MAX43434 for the output.
@debajyotidas9501 1 year ago
no
@digitronix532 9 months ago
How to upload program to esp32
@devisnugroho 2 years ago
what's kind of software that you use in 3:58, the wave and spectrogram comes realtime?
@digitronix532 9 months ago
Kindly help me in Programming ESP32 ...how to integrate python program and ESP 32
@rachitkachhiya3458 16 hours ago
If I want to make a chatbot with real time responses like update me with current weather. Could you guide through that?
@EricSouzarys 2 years ago
Do you think it's possible to train the model so it can detect a ringtone?
@h4l050 1 month ago
I'm getting weird output values from the NN, outputs like 0.01... 0.14... and it can't detect the word Marvin. I'm using the model(model.cc) that comes in the source code but i can't get any right prediction. What's happening? Do you know what could it be? Thanks, and amazing project :D
@ajanthahimali8491 2 years ago
Can you simplify the firmware code, please? It's very difficult to understand.
@khaoulakanna4227 2 years ago
can this be done in an other language other than english ?
@THEbonny95 1 year ago
Can't do this on Google Assistant?
@dreyreis 1 month ago
Is it possible to use a pre-trained voice model and install it on a device (like a model of a famous person, perhaps)? If so, how would we do this?
@atomic14 1 month ago
The ESP32 isn’t really powerful enough to do that locally. But there are APIs that you can call that will do Text To Speech (TTS). And some of them offer custom voices.
@fiottovotre7202 1 year ago
How can I navigate the dataset plz? Actually, I can't find it
@emilianotl3572 2 years ago
do you know if i can use dialogflow to control devices that are connected to google home?
@JernD 3 years ago
This is probably a silly question, but why did you take the log(audiodata) after audio normalization? Would it be superior to swap those operations?
@atomic14 3 years ago
Hey John, definitely not a silly question, the audio is normalised and then we calculate the spectrogram of the normalised audio. The log operation is applied to the spectrogram output. The spectrogram can end up with some very large values and the log operation brings them down into a more sensible range for the neural network to train against.
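The order of operations described in this reply (normalise the audio, compute the spectrogram, then take the log of the spectrogram) can be sketched with NumPy as below. The window size of 320 samples, the step of 160, the Hann window and the small offset added before the log are assumptions for illustration, and this is not necessarily the exact preprocessing used in the project's notebook or firmware.

```python
import numpy as np


def log_spectrogram(audio, window=320, step=160):
    """Normalise audio, take a magnitude-squared STFT, then log-compress it."""
    audio = np.asarray(audio, dtype=np.float64)
    audio = audio - audio.mean()
    peak = np.max(np.abs(audio))
    if peak > 0:
        audio = audio / peak                      # normalise to the range [-1, 1]

    frames = [
        audio[start:start + window] * np.hanning(window)
        for start in range(0, len(audio) - window + 1, step)
    ]
    power = np.abs(np.fft.rfft(np.array(frames), axis=1)) ** 2
    return np.log10(power + 1e-6)                 # small offset avoids log(0)


t = np.arange(16_000) / 16_000
spec = log_spectrogram(np.sin(2 * np.pi * 1_000 * t))  # 1 kHz test tone
```

Without the final log, the raw power values span many orders of magnitude, which is the "very large values" problem the reply mentions.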
@devmishra4131 2 years ago
Hi sir, I have one more question: what did you use as the terminal at 16:13? I tried many ways to run the link (I used my own recording, saved it on the desktop and pasted the path) which I got from my wit.ai account in my Windows terminal, but it didn't work. I also tried to find other ways to do that, but nothing worked. So, please reply as soon as possible.
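For anyone stuck on the terminal step, the same kind of request can be sketched in Python instead of a shell. This only builds the request and does not send it. The endpoint URL, the Bearer token header and the raw-audio content type reflect my understanding of the Wit.ai HTTP API and should be checked against the current Wit.ai documentation; the token is a placeholder and the 16-bit, 16 kHz mono format is an assumption about the recording.

```python
import urllib.request

WIT_SPEECH_URL = "https://api.wit.ai/speech"  # assumed endpoint, check the Wit.ai docs
ACCESS_TOKEN = "YOUR_WIT_AI_TOKEN"            # placeholder, use your own app's token


def build_speech_request(pcm_bytes, sample_rate=16_000):
    """Build (but do not send) a POST request carrying raw 16-bit mono PCM."""
    content_type = (
        "audio/raw;encoding=signed-integer;bits=16;"
        f"rate={sample_rate};endian=little"
    )
    return urllib.request.Request(
        WIT_SPEECH_URL,
        data=pcm_bytes,
        method="POST",
        headers={
            "Authorization": f"Bearer {ACCESS_TOKEN}",
            "Content-Type": content_type,
        },
    )


request = build_speech_request(b"\x00\x00" * 16_000)  # one second of silence
# To actually send it: urllib.request.urlopen(request).read()
```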
@alo1236546 3 years ago
Any plan for TinyMl
@DayanandKushwaha-ef6oi 6 months ago
i am not getting audio output please help ...
@apoorvanavin3300 4 months ago
in which language this works on? python
@francegall-web9819 3 years ago
Mr. atomic14 really impressive. Since you are very good at programming can you help us reprogram the HLK-V20 speech recognition? It is a very cheap chip - three dollars - which provides offline speech recognition, but its manufacturer does not explain how it is programmed. (There is also the SU-10A which is the same from a different manufacturer.)
@gsge 3 years ago
Apart from your vast knowledge of hardware and software you are the best teacher to make quite complicated subject very easy to understand for newbie like me. Is it possible to bypass cloud service like wit.ai to host it on local Raspberry for totally local solution ? Thank you.
@atomic14 3 years ago
Yes - there's a solution called Rhasspy - rhasspy.readthedocs.io/en/latest - I think in theory you should be able to swap out Wit.ai for it. The code for decoding the response will probably need to change, but it looks doable.
@gsge 3 years ago
@atomic14 Thank you.
@digitronix532 9 months ago
How to upload program to esp32
@OMNI_INFINITY 3 months ago
Found where rabbit AI maybe started
@55cancri_e76 1 year ago
Hi sir, Thank you for the great video. My teammates and I are trying to make similar project to yours. But I would like to ask you how did you linked the python code with the C code. Also, how did you upload the code on the ESP32? was the C code or the python code?
@davidfa7363 1 year ago
Did you solve it?
@55cancri_e76 1 year ago
@davidfa7363 No
@digitronix532 9 months ago
Same doubt
@rafaelmatos8754 6 months ago
How do you get so many examples of the word Marvin?
@atomic14 6 months ago
Weirdly, it was in the training data. I guess the people who compiled the audio samples were fans of Douglas Adams.
@digitallifetanzania2373 2 years ago
Can it answer any questions
@faizabdulchakim8796 3 years ago
This is the ESP32-S2 Saola-1, right? Is it possible to use another type of ESP32?
@atomic14 3 years ago
Definitely - pretty much any ESP32 dev board will work - I'm not using any special features.
@Pavana_sai 3 years ago
HI, wonderful project. im interested to build the same project. can you help me
@tryssss 3 years ago
Question: is the access key still OK?
@atomic14 3 years ago
I think the one in GitHub should still be valid. But if not it's pretty easy to setup a new one.
@data_resources 2 years ago
Hello i followed your instructions and i did almost all the project but am having trouble getting the output sound when i give the commands
@DayanandKushwaha-ef6oi 6 months ago
SAME HERE
@legal_hack5626 2 years ago
everything is ok but...... for file_name in tqdm(get_files("_problem_noise_"), desc="Processing problem noise"): process_problem_noise(file_name, words.index("_background")) in these lines you are processing noise , but I don't have data set of problem noise , from where I can download it... I have downloaded google speech data set but there is no _problem_noise_ folder... what can I do now>>?
@atomic14 2 years ago
Hi there, the problem noise files are optional (as are the mar sound files). I just recorded some additional audio of my office noises that seemed to be confusing the neural network. You can either add a folder yourself and record some audio or you can comment out that section of the notebook.
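As an alternative to commenting the section out, the notebook loop could be guarded so that a missing optional folder is simply skipped. The sketch below is a hypothetical helper, not code from the notebook: `optional_wav_files`, the `speech_data` directory name and the `print` stand-in for the processing call are all assumptions.

```python
from pathlib import Path


def optional_wav_files(data_dir, folder_name):
    """Return the .wav files in data_dir/folder_name, or [] if it is missing."""
    folder = Path(data_dir) / folder_name
    if not folder.is_dir():
        return []                      # optional data set not present, skip it
    return sorted(str(p) for p in folder.glob("*.wav"))


problem_noise = optional_wav_files("speech_data", "_problem_noise_")
for file_name in problem_noise:
    print("would process", file_name)  # stand-in for the notebook's processing call
```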
@legal_hack5626 2 years ago
@atomic14 Thanks
@devmishra4131 2 years ago
I followed all of your steps and found it really amazing and helpful! But I have a question: how are we going to upload this code to the ESP32 or ESP8266? You don't have any .ino file, so you must not be using Arduino for that. So which IDE are you using? If it is VSCode, what settings did you use? Please tell, it would really help everyone.
@atomic14 2 years ago
I'm using PlatformIO, just install VSCode and download the PlatformIO plugin.
@devmishra4131 2 years ago
@atomic14 Thanks a lot sir for your reply, it means a lot to me. Looking forward to a successful test!!
@data_resources 2 years ago
@devmishra4131 Can you explain how you did it?
@digitronix532 9 months ago
How to upload program to esp32
@izigoldenberg218 3 years ago
Is there any chance this could work with ESP8266 instead of the ESP32?
@atomic14 3 years ago
I think that might be difficult - it is pretty much pushing the limits of the ESP32.
@ei23de 3 years ago
The following question may fall below the standard of your channel, but since you introduced me to Jupyter Notebook, I have to know which software you are using for presentations. This is not PowerPoint, is it?
@atomic14 3 years ago
I use a bit of a mix for videos - I'm on a Mac so use Keynote (the Mac equivalent of Powerpoint). I've been trying to learn the manim library which is what the guy who does 3Blue1Brown uses. I've also got my own homegrown animation library that I use for some things - but it's definitely not really ready. I've used Apple Motion for a couple of videos, there is quite a learning curve with it and I'm nowhere near proficient.
@ei23de 3 years ago
@atomic14 I like your video style, it looks professional.
@atomic14 3 years ago
@ei23de Thanks!
@keithsummers2842 3 years ago
You didn't really mention the size of the project. What is the expected memory footprint of the Flashed program?
@atomic14 3 years ago
It uses about 1.1 MB of flash. When running, memory is tight: making the HTTPS connection to Wit.ai leaves about 30K of RAM.
@keithsummers2842 3 years ago
@atomic14 I'm working on a project right now where just WiFi and BLE implemented is soaking up about 1.5M of flash. As long as the entire project remains below about 3M then OTA continues to be possible in the WROOM-32 with 16M flash. I was most concerned about OTA memory space. Thank you for the response and the excellent video post here on KZbin.
@atomic14 3 years ago
@keithsummers2842 No problem - thanks and good luck with your project!
@keithsummers2842 3 years ago
@atomic14 You seem to be very knowledgeable. Could I hire you for consultations just to keep us on track with our project? I can be reached at Keith@SSLEDLighting.com
@typingcat 2 years ago
I need an off-the-grid system, not one using a voice recognition service from Facebook. Who knows what Zuckerberg is going to do with your data. Also, as I see in the demo, there is quite a significant delay, like 3 seconds. One of the reasons why I want to create my own is that I don't like the delay of Google Home. I don't know how other people use voice assistants, but I have found that they are dumb. Not really "A.I.", but just scripted responders written by some programmers. So, I don't really try to "speak" to it, but just say some fixed-structure phrases that I know it will understand, like "turn on the light", etc. In short, all I need is speech to text. If I could get a string like "turn on the light", I could parse it and turn on the light myself. Is the ESP32 powerful enough to convert speech to text on its own?
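The "fixed-structure phrase" approach described in this comment can be sketched in a few lines of Python, assuming some speech-to-text step has already produced a string. The grammar ("turn on/off the <device>") and the `parse_command` function are hypothetical examples, not part of the project.

```python
import re

# Hypothetical grammar: "turn on/off the <device>"
COMMAND_PATTERN = re.compile(r"^turn (on|off) the (\w+(?: \w+)*)$")


def parse_command(text):
    """Map a transcribed phrase to (device, is_on), or None if it doesn't match."""
    match = COMMAND_PATTERN.match(text.strip().lower())
    if match is None:
        return None
    state, device = match.groups()
    return device, state == "on"


light = parse_command("Turn on the light")
kitchen = parse_command("turn off the kitchen light")
```

Anything that does not fit the pattern returns `None`, so the caller can ignore it or ask the user to repeat.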
@ei23de 2 years ago
Hey, i hope you don't mind if i mention this video (and your channel) in one of my future videos?
@atomic14 2 years ago
Go for it :)
@ei23de 2 years ago
@atomic14 kzbin.info/www/bejne/Y3nJoKptgLJ2gZo (but it's in German...)
@yaowang4490 3 years ago
Hello, can you tell me how to import the project into VSCode? I look forward to your reply. Thank you.
@atomic14 3 years ago
Hi Yeo, you'll need the PlatformIO extension installed and then you just open the folder the project is in.
@yaowang4490 3 years ago
@atomic14 I have successfully run your project, but I don't know whether the INMP441 works. How do I print the data from the INMP441? Thank you.
@hokuspokus8570 2 years ago
Marvin tell me a joke .... OK
@ilyass_bouarasse 1 month ago
That's awesome
@ei23de 3 years ago
This is super great! We should do some kind of collaboration! Some time ago I tried out Rhasspy with a Raspberry Pi as an offline Alexa (I call it "Axel", you can see it in my "DIY Open Source Home Automation with a Raspberry Pi [EN]" video). Rhasspy is great, but I need some kind of satellite hardware like this, or an ESP32 Audio Kit, which I saw as quite a challenge. But you obviously did it right away!
@atomic14 3 years ago
I had a quick look at Rhasspy and you could easily modify my code to talk to it. I have a few projects to complete but will come back to it when I have some more time.
@ruifreitas7475 3 years ago
@atomic14 This is something I was looking into when I saw your video. Great timing. Passing audio commands from the ESP32 to Rhasspy (via MQTT or not) to be recognized and trigger intents or actions in Home Assistant would be great. Thank you for sharing this. community.rhasspy.org/
@synesthesiam 3 years ago
@atomic14 Rhasspy author here. Your project looks awesome! I'd be very interested in collaborating, so feel free to ping me whenever :)
@ei23de 3 years ago
@synesthesiam What great people here! Thank you for Rhasspy! I'm currently working on my DIY Video Doorbell (ESP32-CAM) and face detection with OpenCV. You know the drill. But the video will soon be finished and after that my smart door lock will get some spotlight... but after that, I will definitely spend time on this! This is super exciting and needs more attention. I hope I'll find time for this soon.