100 beers to the one who can do this: compare a new 10-second selfie video against a private encrypted database of similar videos and determine some results/outputs from the video and audio.
Elements:
1. The video selfie in portrait mode. The person has to be decently placed within the video frame and decently illuminated.
2. The audio of the person reading a random short phrase (must be readable in 3-5 seconds) in their native language (the language can be selected from user input). The phrase can be randomly generated by AI and must never be the same phrase twice. The phrase prompted must match what the person reads, so the audio must be analyzed.
If the subject's head is framed and illuminated properly, recording will start automatically. Within the 10-second recording, the person making the selfie will be prompted to read the random phrase out loud (in their native language). The sound must be analyzed in real time so that the phrase read by the human is converted from speech to text, and the output must match the prompted sentence with 90% accuracy or more, or they have to start all over again.
RESULT: Each new comparison, initiated when a new selfie video is taken and compared against this database, has to answer these 2 simple cascading questions.
1. Is the subject in the video a human being? True or false. Accuracy must be over 90%, and it cannot be fooled by mannequins or by very obvious recordings played on another screen. This is required before saving the video to the encrypted database.
2. Is the subject a different human compared to the subjects from all other videos, judging by both the video image (face ID) and the sound (vocal timbre)? If not, it must output all matches by @username value. This is also required before saving the video to the encrypted database.
I am open to suggestions to increase accuracy and prevent this system from being fooled/hacked. Also, let's make it open source. I can help with the front-end and hosting.
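The 90% phrase check described above could be sketched like this in Python. It is only an illustration under stated assumptions: the speech-to-text step has already produced a transcript string, and "90% accuracy" is read as a word-level similarity ratio computed with the standard library's `difflib`. The function names and the default threshold are hypothetical, and liveness detection, face ID, and voice matching are not covered.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> list[str]:
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"\w+", text.lower())


def phrase_match_ratio(prompted: str, transcribed: str) -> float:
    """Word-level similarity in [0, 1] between the prompt and the transcript."""
    matcher = SequenceMatcher(None, normalize(prompted), normalize(transcribed))
    return matcher.ratio()


def passes_phrase_check(prompted: str, transcribed: str, threshold: float = 0.9) -> bool:
    """Assumed rule: accept the recording only if similarity reaches the threshold."""
    return phrase_match_ratio(prompted, transcribed) >= threshold
```

Comparing words rather than characters keeps the score independent of punctuation and capitalization, which speech-to-text engines tend to render inconsistently. A stricter system might use word error rate instead.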
@michaeltyborski4802 2 years ago
Thank you for helping developers enhance humanoid robots. The hardware design and construction seem much easier than speech recognition and action generation.
@josea.ramirez8801 1 month ago
I hate when people say “free with limits”
@mysha0978 2 years ago
I couldn't find a suitable API anywhere until today, but now I know where to go!
@bouzz4840 2 years ago
Thank you, this has helped me a lot. I couldn't find a good API anywhere, now I know where to look!
@AssemblyAI 2 years ago
Great to hear!
@herlocksolmes7410 1 year ago
Thanks bro. Saved my project.
@KallunWillock 2 years ago
This was a surprisingly helpful video. Thank you very much! If you're looking for suggestions for video topics, might I suggest looking at different 'use cases' and how you might go about implementing your API in those scenarios.
@KallunWillock 2 years ago
Oops - I just checked out your video back catalogue - my apologies - looks like you've already got that covered. Will check them out later today. Thank you!
@AssemblyAI 2 years ago
hehe glad you like the videos :) -- Patrick
@clearthinking5441 2 years ago
Do you provide timestamps of the transcribed audio file?
@paulos_ned6246 8 months ago
Let's say I wanted to integrate a voice-to-text feature in my Next.js application, and the voice that is going to be recorded is not in English (it is Amharic). Which of these solutions would fit best for me?
@balajicmb1132 2 years ago
Thanks for this video.🙏 Is there an open source library for speech-to-text transcription in Python? (Completely open source.) If you know of one, please let me know.
@pablod.1634 2 years ago
just watch the full video :)
@AssemblyAI 2 years ago
yep, I mention open source libraries at the end :)
@khalidelgazzar 6 months ago
Great intro. Thank you 😊
@zRedPlays 1 year ago
Where are the 3 free monthly hours mentioned on the AssemblyAI website? I can't find it anywhere
@jsebastianmunch 2 years ago
I was just searching for something like this! Thank you🙏🏼
@AssemblyAI 2 years ago
Glad we could help!
@rohanrjoshiimakemyvid7285 2 months ago
Can you suggest some totally free APIs that I can integrate into my website? I am looking for video-related APIs.
@azhagurajaallinall126 2 years ago
Thanks for this. Great that we got to see it (I saw this in an ad, but I hope in future it won't need one). Wishing you more reach soon 😃🌟✨🙌 Man, please vary your tone and pause at intervals while speaking; you are already doing great 😃 15.09.2022 09:59 pm IST
@spider279 11 months ago
Do you know a powerful speech-to-text tool with timestamps and speaker diarization for Icelandic? I need it for a class project.
@aperson1181 11 months ago
Is there a tool for videos? I downloaded videos and need to have them translated from English to Ukrainian/Russian for my elderly family member.
@147score 2 years ago
I am looking for a simple, basic speech-to-text tool for Windows that mainly dictates numbers and letters but needs to write to a text file (TXT), e.g. on a new line, on every update. Can anyone tell me whether such a thing exists ready-made, without developing it from scratch?
@viewview6687 1 year ago
It seems like Assembly is a lot more budget-friendly than Google. How does the performance compare?
@kirkwon7006 2 years ago
Microsoft Azure also has Speech to text as part of their Azure Cognitive Services offerings with 5 hours free per month
@AssemblyAI 2 years ago
thanks for the tip! -- Patrick
@System-hang195 2 years ago
Hii
@luis96xd 2 years ago
Amazing video, very informative and helpful, thanks!!
@Live_draw_today 1 year ago
Please make a video on how to train a voice model step by step in PyCharm. Please, sir, make a video on this.
@KkHostel-v4s 11 months ago
The YouTube video URL doesn't work?
@joemarcardino4663 2 years ago
thank you, sir. this helps a lot.
@saurabhupadhyay1445 1 year ago
Will this work in a mobile browser?
@ChrisWong666 1 year ago
Why is audio intelligence $0.000583 now? That's an increase from $0.000167. It increased too much, right?
@diycaptions 2 years ago
Hi, Assembly AI. Would you be interested in having your videos captioned? :-)
@_Saike 2 years ago
Thank you, this was a great video with good information. I was looking for such a thing to start my MVP. Please, can you do a video comparing different APIs, including their percentage accuracy?
@AssemblyAI 2 years ago
thank you! Will add this to our list :)
@joynalabiden3502 2 years ago
Nice video beautiful video
@F_Flex 2 years ago
I'm sorry, I tried AssemblyAI and I don't understand a single thing! What am I supposed to do there? How can I transcribe anything? In the "Tutorial" section (which actually isn't one, it just shows code snippets and not much more) nothing is said about what to actually do :) Damn guys, you really need to rethink your business plan :) And the funniest thing is to say: this is the easiest one to set up :D Oh really? Like how?
@maxelevator2298 2 years ago
Great video
@j0hannes5 2 years ago
Can you please compare current TTS solutions as well in another video?
@AssemblyAI 2 years ago
good suggestion! thanks, I'll look into that...
@j0hannes5 2 years ago
@AssemblyAI Maybe you can find something (or a combo) that lets you switch between offline voices and fancier voices that require an online service.
@corteydmusic 1 year ago
great
@monkmonk2734 2 years ago
oh Metamask :)
@IndigaVP 9 months ago
Who offers a timestamp for every word for free??? Python has one, but it imports only the words without timestamps!!!
@AkashSharma-nr7po 1 year ago
You said free
@nemo1877 10 months ago
It is a scam like many others. Actually it is free for only 1 hour...
@shimmyshimmyyea 2 months ago
I'm using Assembly, but in about 50% of cases I get a network error: com.assemblyai.api.core.AssemblyAIException: Network error executing HTTP request. What am I doing wrong?
@codearabawy 2 months ago
Try adding some delay between calls; if that doesn't help, reach out to their support team.
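One way to add delay between calls is a retry wrapper with exponential backoff. The sketch below is generic Python, not the AssemblyAI SDK (the error above comes from the Java SDK, where the same idea would apply). `call_with_backoff`, its parameters, and the exception types it retries on are all hypothetical choices for illustration; `request` stands for whatever function performs the actual API call.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    request: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    retry_on: tuple[type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(), retrying on network-style errors with growing delays."""
    for attempt in range(max_attempts):
        try:
            return request()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            delay = base_delay * (2 ** attempt)  # 1s, 2s, 4s, ...
            # Random jitter keeps parallel clients from retrying in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
    raise RuntimeError("max_attempts must be at least 1")
```

If the failures persist even with retries, the cause is more likely on the network or service side, which is where the support team can help.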