This is fantastic. I'm a newbie to Python and neural nets, but your explanations are great and pretty straightforward. Question: what additional steps would I take to run this on my own local device (Pi 4)? And what else would I need to do to introduce new commands, such as a trigger word and "turn off the lights"? Would I need to create my own audio samples, save them to new folders, and retrain the model? Thanks for any guidance! (If you couldn't tell, I'm DONE with Google Home latency and am recreating my own. Ambitious! Need help!)
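(Roughly, yes: record short clips of each new phrase, give each phrase its own folder next to the existing ones, retrain, then copy the saved model to the Pi and run only inference there. Below is a minimal sketch of the data-loading side, assuming the same one-folder-per-command layout as the tutorial's mini Speech Commands set; the folder names and paths here are made up.)

    # Hypothetical layout: one folder per command, short WAV clips inside.
    #   data/
    #     hey_jarvis/        <- trigger word (name made up)
    #     lights_on/
    #     lights_off/
    import tensorflow as tf

    DATA_DIR = "data"  # assumption: your recorded clips live here

    # Requires TF >= 2.10; pads/truncates every clip to 16 000 samples (1 s at 16 kHz)
    train_ds, val_ds = tf.keras.utils.audio_dataset_from_directory(
        directory=DATA_DIR,
        batch_size=64,
        validation_split=0.2,
        subset="both",
        seed=0,
        output_sequence_length=16000,
    )
    print(train_ds.class_names)  # the folder names become the labels

    # ...then build spectrograms and train exactly as in the tutorial, save the model,
    # and load it on the Pi for inference only.

One way to handle the trigger word is to make it just another class: the Pi-side loop listens continuously and only acts on "lights_on"/"lights_off" after the trigger class has fired.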
@cornpop3340 1 year ago
This is an incredibly helpful video.
@nguyent3465 2 years ago
The code on the TensorFlow website was changed :(
@seanadin386 1 year ago
Can you do a video covering the newer version? The run interface now has different code.
@tvartalk 8 months ago
😊
@gokhanersoz5239 2 years ago
Thank you very much for the trainings. But if you see fit, I think there should be a more complex and advanced training series on voice recognition, voice classification, and the like. You know, trainings on audio are limited.
@Cyka_Blyatus 1 year ago
What did you do so the program doesn't pick up ambient noise and actually works with the commands given? It seems the model lacks ambient-noise data, and whenever it's run it just keeps spamming the first command, but yours works perfectly. How did you achieve this?
@obi666 3 months ago
I have the same issue: files from the dataset work perfectly, but when I try to use my mic like in the video, or record audio with my mic using PyAudio, it gives me the first class every time.
@erickd4816 1 year ago
Good video, excellent explanation. I have a question: can the same program be trained to recognize only a specific voice? If so, could you explain how? I would be very grateful.
@clumsycoder1907 1 year ago
It's not working for me.
@geekyprogrammer4831 2 years ago
Can you please post a video on building text-to-speech models from scratch?
@MrIlvis 1 year ago
Which TensorFlow version was this made with? Colab uses the latest, but an older one should work without problems.
@obi666 3 months ago
Colab runs a Unix-like OS, so you can use commands like pip install, apt install, etc.
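For example, pinning an older TensorFlow release at the top of the notebook might get the video's code running again. A sketch of that idea (the exact version the video used isn't stated; 2.8 is only a guess, and Colab may ask for a runtime restart):

    # In a Colab cell, before importing tensorflow (version is a guess -- adjust as needed):
    !pip install -q "tensorflow==2.8.*"

    import tensorflow as tf
    print(tf.__version__)   # confirm the pinned version is active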
@loydvincentbutron4345 10 months ago
Is it for English speech only?
@swasthikk3655 1 year ago
Can I get something similar for the English alphabet?
@danielbogemann1598 1 year ago
They changed the code. Could you do a quick update?
@TheSaukkio 1 year ago
How can it be that in the video it outputs nothing when no one is speaking, while if I run the code from GitHub it predicts random stuff when I'm not speaking?
@obi666 3 months ago
I have the same problem: it works perfectly with audio files from the dataset, but not when I give it input from my mic.
@obi666 3 months ago
I've managed to fix that problem. Here are the fragments of my code (the predictor class definition, Config, and the remaining imports are omitted):

    import sounddevice as sd

    @staticmethod
    def record_audio(filename: Union[str, None] = None,
                     duration: int = Config.DURATION,
                     fs: int = Config.INPUT_LEN) -> np.array:
        print(f"Record sound for {duration} seconds...")
        audio = sd.rec(int(duration * fs), samplerate=fs, channels=1, dtype=np.float32)
        sd.wait()
        # sd.play(audio, samplerate=fs)
        # sd.wait()
        # print("Recording has been completed")

        # Convert audio to 16-bit PCM format (in the range [-32768, 32767])
        audio_pcm = np.int16(audio * 32767)
        if filename:
            write(filename, fs, audio_pcm)
            # print(f"Sound has been saved to: {filename}")
            return np.array([])
        return audio_pcm

    def preprocess_mic_data(self, waveform: np.array) -> EagerTensor:
        waveform = tf.squeeze(waveform, axis=-1)
        spectrogram = self.get_spectrogram(waveform=waveform)
        normalized_spectrogram = tf.expand_dims(spectrogram, axis=0)
        return normalized_spectrogram

    def get_prediction(self, audio_data: Union[str, os.PathLike, np.array]) -> str:
        if isinstance(audio_data, (str, os.PathLike)):
            normalized_spectrogram = self.preprocess_file_data(file_path=audio_data)
        else:
            normalized_spectrogram = self.preprocess_mic_data(waveform=audio_data)
        prediction = self.model(normalized_spectrogram)
        # print(prediction)
        classid = np.argmax(prediction)
        return self.commands[classid]

    @staticmethod
    def calculate_rms(signal: np.ndarray) -> float:
        """Calculates the RMS (Root Mean Square) of the signal."""
        return np.sqrt(np.mean(signal ** 2))

    def is_voice_present(self, signal: np.ndarray, rms_threshold: float = 20) -> bool:
        """
        Checks whether the signal contains voice, based on RMS.
        :param signal: NumPy array with the recorded audio.
        :param rms_threshold: RMS threshold below which it is assumed there is no speech.
        :return: True if voice was detected; False otherwise.
        """
        rms_value = self.calculate_rms(signal)
        print(f"RMS value: {rms_value}")
        return rms_value > rms_threshold  # check whether the RMS is above the threshold

    while True:
        res = predictor.record_audio()
        print(res.shape)
        voice = predictor.is_voice_present(signal=res)
        if voice:
            print(predictor.get_prediction(audio_data=res))

It's not perfect, but at least it works.
@TheSaukkio 3 months ago
@@obi666 Can you send me your code somehow 🤔 or contact me so I can get this 🤣🤣
1 month ago
@@obi666 Do you still have that solution? I can see you're from Poland.
@obi666 1 month ago
I replied, not sure whether it tagged you.
@oxydol3456 11 months ago
This tutorial is great. I find that the key to building an accurate model is gathering a lot of quality data, and that sounds like arduous work; I didn't get good results with 200 examples. Edit: I found the model's accuracy is way poorer than I expected. Maybe it's due to the microphone I'm using, and that needs to be taken care of before the prediction step.
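If the mic is part of the problem, roughly matching the live audio to the training clips before predicting may help. A sketch under the assumption that training used 16 kHz mono 1-second clips (as in the tutorial); the function name and defaults are made up, and SciPy is used for resampling:

    import numpy as np
    from scipy.signal import resample

    def prepare_for_model(audio, in_rate, target_rate=16000, target_len=16000):
        """Roughly match a mic recording to the tutorial's training clips:
        mono, 16 kHz, 1 second long, amplitude in [-1, 1]."""
        audio = np.asarray(audio, dtype=np.float32)
        if audio.ndim > 1:                          # stereo -> mono
            audio = audio.mean(axis=1)
        if in_rate != target_rate:                  # resample to the training rate
            audio = resample(audio, int(len(audio) * target_rate / in_rate))
        peak = np.max(np.abs(audio))
        if peak > 0:                                # peak-normalize so mic gain matters less
            audio = audio / peak
        if len(audio) < target_len:                 # pad, then trim, to exactly 1 second
            audio = np.pad(audio, (0, target_len - len(audio)))
        return audio[:target_len]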
@tankado_ndakota 8 months ago
Got the error: "Could not import the PyAudio C module 'pyaudio._portaudio'." And couldn't find the solution... MacBook M1 Pro.
@tankado_ndakota 8 months ago
I saw a note in another video for M1 :) let me try that first :D
@tankado_ndakota 8 months ago
I did everything I found on the web, but I still get the error: "symbol not found in flat namespace '_PaMacCore_SetupChannelMap'"
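Not sure about that flat-namespace error itself, but one possible workaround on Apple Silicon is to skip PyAudio entirely and record with the sounddevice package (the same one used in the RMS snippet earlier in the thread). A rough sketch:

    # pip install sounddevice
    import numpy as np
    import sounddevice as sd

    FS = 16000          # sample rate the tutorial's model expects
    DURATION = 1.0      # seconds

    def record_one_second():
        """Record one second of mono audio from the default input device."""
        audio = sd.rec(int(DURATION * FS), samplerate=FS, channels=1, dtype=np.float32)
        sd.wait()                       # block until the recording finishes
        return np.squeeze(audio)        # shape (16000,), float32 in [-1, 1]

    if __name__ == "__main__":
        clip = record_one_second()
        print(clip.shape, clip.min(), clip.max())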
@sanjeetjha9177 1 year ago
Please provide the model, I need it urgently; I'm stuck on it.
@rediet.f261 2 years ago
What is sample_file here? 8:38
@clumsycoder1907 1 year ago
Same doubt.
@arqamrafay 1 year ago
Exactly, I think there is a file of recorded audio.
@LukasKofler 1 year ago
See the first line at 5:38 🙂
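For reference, in the TensorFlow tutorial sample_file is just the path to one WAV from the downloaded dataset, used to demo the preprocessing on a single clip. Something along these lines (the exact filename and variable names may differ between tutorial versions):

    import pathlib
    import tensorflow as tf

    data_dir = pathlib.Path("data/mini_speech_commands")   # downloaded earlier in the tutorial
    sample_file = data_dir/"no/01bb6a2a_nohash_0.wav"       # any single clip from the set works

    x = tf.io.read_file(str(sample_file))
    waveform, sample_rate = tf.audio.decode_wav(x, desired_channels=1, desired_samples=16000)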
@itsrairamones 2 years ago
Thank you dude, it worked a hundred percent for me, but after a couple of minutes it crashed :(
@Yvtq8K3n 1 year ago
It's a shame you can't train your own model.
@threepe0 1 year ago
Of course you can.
@Yvtq8K3n 1 year ago
@@threepe0 The last time I used this, you were unable to create a custom model and use it. TensorFlow provided you with an already-trained model (0-1, left, right), and that's exactly what most people use.
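For what it's worth, the tutorial's small CNN is defined and trained from scratch in the notebook, so it can be fit on any set of command folders, not just the demo words. A rough sketch of the training/saving step, using a stand-in random dataset so the snippet runs on its own (the layer sizes follow the tutorial's model but should be treated as illustrative):

    import tensorflow as tf

    num_labels = 4   # however many command folders you have

    # Stand-in dataset of random "spectrograms" just so this runs end to end;
    # in practice this is the spectrogram dataset built as in the tutorial.
    train_spec_ds = tf.data.Dataset.from_tensor_slices((
        tf.random.normal([32, 124, 129, 1]),
        tf.random.uniform([32], minval=0, maxval=num_labels, dtype=tf.int32),
    )).batch(8)

    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(124, 129, 1)),    # spectrogram shape used in the tutorial
        tf.keras.layers.Resizing(32, 32),
        tf.keras.layers.Conv2D(32, 3, activation="relu"),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Dropout(0.25),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dropout(0.5),
        tf.keras.layers.Dense(num_labels),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )
    model.fit(train_spec_ds, epochs=2)
    model.save("my_commands_model.keras")   # older TF versions used the SavedModel folder format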