AI LIP READING

  Views: 1,216,575

carykh

A day ago

Comments: 3,600
@KurtHugoSchneider 5 years ago
now we need the full bee movie uploaded, but with the actual audio replaced by your dramatic reading of the script...
@carykh 5 years ago
omg I have the 70 minute video of my voice on my iPhone, I suppose I have no choice but to upload it! check back in 1 hour. I bet somebody will edit it all together
@TastyBaldEagle 5 years ago
@@carykh please
@PlasmaSabre 5 years ago
@@carykh I would watch this :D Great work on the project btw, love your videos.
@kutip1027 5 years ago
Please I still want this
@kutip1027 5 years ago
If I need to I will volunteer as tribute
@rj9959 5 years ago
Only about 40% of words are able to be made out by the best lip readers. The rest of the words are assumed based on context. So this project has huge limitations to start with.
@dylanwijaya1662 5 years ago
@Eric Lee you like cereals>:)?
@dylanwijaya1662 5 years ago
@Eric Leeyou like mum buy cereal type >:) ?
@dylanwijaya1662 5 years ago
@Eric Lee ohhhh children school they give milk like teachers to student. it good because I can eat cereal with milk it free. So teacher give milk to children. Okeh?
@GaJ42 5 years ago
Okay not is it
@dylanwijaya1662 5 years ago
@@GaJ42 you like cereals>:)?
@YoshTea 4 years ago
Holy hecc this is useful for animation
@CA19 2 years ago
YES
@boyinaband 5 years ago
I love these videos.
@UmMeAmberE 5 years ago
OOF IVE FOUND YOU
@MrZkitZ 5 years ago
@@UmMeAmberE same
@yoyochinb3742 5 years ago
Wow
@4ltrz555 5 years ago
Hello!
@RubenFedop 5 years ago
So thats how i found your channel
@NeedForMadnessSVK 5 years ago
"We just need to pick the right transcript" Me: Its going to be a Bee movie isnt it? "I read the entire Bee movie script on camera" NAILED IT.
@jurremioch316 5 years ago
It just HAD to be the Bee Movie script, I cheered so hard when he said it.
@hoodlumscraggy1801 5 years ago
kzbin.info/www/bejne/d3uml5qOnaZonMU here is his bee movie script video
@jessdoesstuff6783 5 years ago
thought the exact same thing
@slicerthe84th 2 years ago
NAILY
@ORyan402 14 days ago
I only guessed it because i have watched the video before
@toasttimestwo 4 years ago
Cary: Read the lips of this guy. Computer: *S U M M O N S S A T A N*
@jobisTheWorst 4 years ago
WHO SUMMONED ME
@72jysmith 4 years ago
Cary:ME
@lameking2839 4 years ago
God: Let me introduce myself
@wolfyowoz 4 years ago
666 likes I'm not gonna ruin that
@reinatr4848 4 years ago
Still 666 likes
@Amaya_Fox_20 5 years ago
"so how tough are you?" "I read the entire bee movie script" "yeah, so?" "I read it in front of my camera" "come right in, sorry for the wait"
@ashleysmith8528 5 years ago
You got a bottle of ketchup? yeah *Fails at opening ketchup cap Could I run this in some hot water?
@azadanzans5359 5 years ago
Kolio Pulio Why doesnt anyone know the last line?
@사다드-j6x 5 years ago
@@azadanzans5359 , no no
@SoshJam 4 years ago
AND SUBMITTED IT FOR A COLLEGE CLASS
@thatonewierdcowboy6792 5 years ago
Funny thing is... I actually correctly guessed “Have you got a moment?”
@ohyeahyeahimasian392 5 years ago
same
@tripodgamer 5 years ago
LIAR
@bensosnowski1128 5 years ago
I guessed it was a question, but that’s it
@ethen1772 5 years ago
I guessed are you being helpful?
@isaacphase2759 5 years ago
That was the only one I got
@legoyoda5776 5 years ago
"Or rather, I should say *OUR* lip reading A.I" *SOVIENT ANTHEM STARTS PLAYING*
@QS1597 5 years ago
Antonio Sustaita ah, the sovieNt union
@QS1597 4 years ago
SPOTILA NAVEKI VELIKAYA RUS
@我恨我自己 4 years ago
Yes
@cailyndempster 4 years ago
Soviet
@legoyoda5776 4 years ago
@@vvg_lol *YES!!!*
@agentstache135 5 years ago
Reverse the program to animate the mouth movements EDIT: If Cary still has the animation files for some of his videos I don't think it'd be too hard to rip the mouth data from them (as a one dimensional matrix representing different mouth positions) and then use that with the audio from those videos
@iritesh 5 years ago
that's what China did with the news anchoring AI
@exm3266 5 years ago
IIRC Adobe Animate recently released a feature that would assist in lip syncing, but I'm not sure if it's anything like the logic used here.
@JeffHykin 5 years ago
You could also reverse the purpose of the AI: give it the original transcript and have it swap real words with similar-looking words. Limit it to only a few words per sentence, give it an oddly specific dictionary for substitutions, and you'd have truly automated the bad lip reading channel. Maybe that's what I'll do for my senior project.
@TheTonyMcD 5 years ago
That would be incredibly useful to the anime industry. And with decent enough cgi, to the entire film dubbing industry.
@michaelepica3564 1 year ago
Lol he did that
@tfairfield42 5 years ago
*OUR* LIP READING AI _Soviet anthem begins_
@benos1799 5 years ago
Good job comrade we need you in the soviet union
@blarg2429 5 years ago
mobile.twitter.com/unusualvideos/status/1069136310600777729
@guh2908 5 years ago
Sounds like *_COMMUNIST PROPAGANDA_* But ok
@Kasmuller 5 years ago
@@benos1799 to bad Soviet has been gone for almost 30 years
@voltagedrop5899 5 years ago
Daily reminder that communism doesn't work.
@robinr2770 5 years ago
as a linguist, I feel for you, you took on a task way harder than you expected, good job regardless. unfortunately we can not see inside the mouth of someone speaking and that is where so much of speech happens. you can also consider the following: if you have the same vowel after 3 different consonants, your lips will always be in a different position, thus some sounds don't have unique lip positions at all. real life lip reading is mostly context and being able to tell where those highly distinguishable consonants are.
@duck7781 5 years ago
13:00 super easy I memorized the bee movie script
@EmanuilGlavchev 5 years ago
Overfitting in real life :D
@OneFingerYT 5 years ago
I actually read "have you got a moment" easily. The AI needs more training in phrases.
@theepicgamer4578 5 years ago
Your profile pic saids it all
@user-vn7ce5ig1z 5 years ago
• The takeaway from this video is to give deaf people lots of kudos. • Decimating twice isn't 20% off, it's 19% off: ((N×0.9)×0.9) Close but no zikal (I think I need more practice lip-reading). • Dubbing words onto politician's mouths has already been done. It's the audio counterpart of deep-fakes (and BadLipReading).
@matthewzeller5026 5 years ago
I was going to comment that but I'm not even sure what the "correct" term is. Sure you could say "20%" but does "bi-decimate" work?
@jakef8913 4 years ago
"For example, after the word 'the' there should always be a noun" adjectives
@devinandcarrietotaldrama505 4 years ago
The cat = The bad cat
@yourtypicalcube2830 2 years ago
@@pinkman_ Gerunds (-ing) are nouns, so you're using a noun there.
@Failzz8 5 years ago
14:14 interesting, so this is what being insane feels like.
@diamondgolem6401 5 years ago
I'm pretty sure it's more like 3:53
@TheKillerGut 5 years ago
*Uses headphone*...ow
@knack3381 5 years ago
My right headphone is broken Which makes me sane, i guess
@PixelBytesPixelArtist 5 years ago
A Traditional to simplified Chinese character converter would be amazing. If you guys want to try that project again I suggest trying to identify radicals and translate those instead of the characters themselves. Most differences between simplified and traditional are in the radicals
@eluisific3255 5 years ago
12:51 Jokes on you! I memorized the whole bee movie script!!!
@ShermyShroomy3101 4 years ago
what did he say then
@bastibob660 4 years ago
Vannesa pull yourself together
@Fuley-la-joo 4 years ago
According
@Crystal_500 3 years ago
@@Fuley-la-joo to
@rebert_reid 3 years ago
@@Crystal_500 all
@OrangeC7 5 years ago
Honestly, and I'm not sure if this is how YouTube does their captions, but I feel like a combination of lip reading and word recognition together would make very accurate captions, especially if it's tuned to be just right.
@sacripudding4586 5 years ago
That causes an issue. It won't know if it sees lips or not. It could just see, as an example, a Fortnite character's lips. A lot of gameplay channels don't have webcams. It may see the wrong thing as lips; issues like that may screw up subtitles.
@caseygreyson4178 5 years ago
Please use this to translate Jojo Siwa so we know what she’s trying to say Also, don’t worry about the project’s accuracy. I have a Deaf sibling and when they talk to me it’s fine because I learned sign language growing up with them. But they hate lip reading because it’s so hard to read lips. Apparently opinions/studies sort of agree that lip reading is an awful way to communicate cause some sounds look the same. A pretty infamous one is “Olive juice” looking like “I love you”. They say only 30% of words can be read accurately. Pretty weird right?
@badlydrawnturtle8484 5 years ago
It's pretty obvious if you actually stop to think about it. (To quote Wikipedia for briefness) "Organs used for speech include the lips, teeth, alveolar ridge, hard palate, velum (soft palate), uvula, glottis and various parts of the tongue." Out of all of that, the only thing "lip reading" gets you information about is the lips and very occasionally the tip of the tongue; all of the rest of that critical information is invisible from the outside. It's remarkable that anybody ever thought lip reading was effective, really. Did they never stop to consider what their own mouth and throat are doing?
@caseygreyson4178 5 years ago
Badly Drawn Turtle Exactly! Sounds like Fa and Va look exactly the same. As well as Ga and Ka. The whole point of lip reading is that it’s just the shape of the mouth. You don’t have context or the sounds. In ASL we mouth words on most signs, but that’s just cause. If you do the sign for twins and mouth “twins”, no one is going to think you said “wins” because there is that context. But lip reading by itself (when my sibling tries to understand someone who isn’t signing) they struggle so much.
@boggers 5 years ago
@@caseygreyson4178 yeah, there are around 40 phonemes in most languages, but traditional 2D animators use only 10 mouth shapes. eg. M B and P all use the same shape, there is one neutral looking shape that is used for about a quarter of the other sounds.
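The many-sounds-to-few-shapes point above can be sketched in a few lines. This is only an illustration under stated assumptions: the ARPAbet-style phoneme symbols, the shape names, and the groupings in `VISEME_OF` are hypothetical and not taken from any real animation chart or from the video.

```python
# Hypothetical phoneme -> mouth-shape table. The groupings are illustrative only.
VISEME_OF = {
    "M": "closed", "B": "closed", "P": "closed",   # lips pressed together
    "F": "teeth_on_lip", "V": "teeth_on_lip",      # lower lip against teeth
    "G": "neutral", "K": "neutral", "T": "neutral", "D": "neutral",
    "AE": "open",
}

def viseme_sequence(phonemes):
    """Map a list of phonemes to the mouth shapes a viewer would see."""
    return [VISEME_OF[p] for p in phonemes]

def look_alike(a, b):
    """True when two pronunciations produce the same visible shape sequence."""
    return viseme_sequence(a) == viseme_sequence(b)

bat = ["B", "AE", "T"]
mad = ["M", "AE", "D"]
fat = ["F", "AE", "T"]

print(look_alike(bat, mad))  # different words, same shapes
print(look_alike(bat, fat))  # the first shape differs
```

Because the mapping is many-to-one, a lip reader (human or AI) that sees only the shapes cannot separate "bat" from "mad" without context.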
@ZombieGuts15 5 years ago
and, “Alligator food” looks like, “I love you”
@hoper7649 5 years ago
If the computer got 47% right. Then its pretty good.
@txaggyraf 4 years ago
3:53 **Their smiles slowly turning into giant frowns**
@agentstache135 5 years ago
The Gosper Glider Gun (4:20) is one of the smallest guns in Conway’s Game of Life. Like I’m not saying you needed to show a HBK Gun or anything, but at least show a Cordership Gun or something
@carykh 5 years ago
not enough pixels in a YouTube video! And hey at least it's bigger than a queen bee
@tomryan3408 5 years ago
lol 420
@WangleLine 5 years ago
Thanks for the random knowledge, stranger!
@mystery8093 5 years ago
*420 blaze it*
@bornach 5 years ago
Most disappointed that there was no 2001: A Space Odyssey reference to HAL9000's decision to murder the crew based on lip reading evidence.
@microbialdoormat 5 years ago
I, myself, am hard of hearing. As long as I have the tiniest bit of sound, I can read lips. And with dramatic wording, like yours, I read it just fine! So hah!
@sikor02 5 years ago
Dave, although you took very thorough precautions in the pod against my hearing you, I could see your lips move. ~HAL 9000
@bapldap3324 5 years ago
I was looking for this.
@razvanflorea1166 5 years ago
A Space Oddisey fans unite!
@kryswilkins8615 5 years ago
I’m afraid I can’t do that, Dave.
@leehttucec-9985 5 years ago
You said what we were all thinking, thank you
@MarkGamed 5 years ago
We need the entire movie but with the AI instead of the actual audio EDIT: woah that’s a lot of likes
@agentstache135 5 years ago
AI writes the music for the score for the Bee Movie, AI writes the script for the Bee Movie, AI animates the Bee Movie, AI makes a bad lip reading of the AI written Bee Movie, AI takes the bad lip reading of the AI written Bee Movie and writes a script to contextualize the random things, AI animates the contextualized script based on the bad lip reading of the AI written Bee Movie and animates it, and so _ad nauseam_
@alexandramuller9055 4 years ago
I love the conway's game of life reference "bring out the big guns" lmao For anyone wondering, the picture he slams on the table is a glider gun, it produces infinite gliders.
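For anyone curious what a glider does, here is a minimal sketch of Conway's Game of Life on an unbounded grid. It simulates a single glider (the thing a glider gun emits), not the gun itself; the `step` helper and the coordinate convention (x to the right, y downward) are assumptions made for this illustration.

```python
from collections import Counter

def step(live):
    """Advance a set of live (x, y) cells by one generation."""
    # Count, for every cell, how many live neighbours it has.
    counts = Counter(
        (x + dx, y + dy)
        for (x, y) in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth on exactly 3 neighbours, survival on 2 or 3.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}

after_four = glider
for _ in range(4):
    after_four = step(after_four)

# After four generations the same shape reappears one cell down and right.
shifted = {(x + 1, y + 1) for (x, y) in glider}
print(after_four == shifted)
```

A gun is a larger pattern that keeps returning to its starting state while releasing a fresh glider like this one each cycle.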
@H_fromDiscord_real 6 months ago
timestamp?
@Calthecool 5 years ago
You had a video of you reading the bee movie script for 10 months? And you didn’t post it? - respect.
@ChristianGates 5 years ago
Your neck moves too when you make certain syllables. Maybe you should incorporate that?
@Predated2 5 years ago
I think angles matter too. If he had done 2 angles, it probably would be able to look at the movements more precise and see where it went wrong. Then having 3-5 people reading the same thing both overly moving and normally, it should figure it out pretty quick.
@ChristianGates 5 years ago
Predated O exactly
@AB-Prince 5 years ago
decimated twice would be 19% off 100-(100/10)=90 90-(90/10)=81
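The arithmetic above is easy to check in code. A minimal sketch, assuming "decimate" means removing one tenth of whatever currently remains; the `decimate` helper is a made-up name for this illustration.

```python
def decimate(value, times=1):
    """Remove one tenth of the current value, `times` times in a row."""
    for _ in range(times):
        value -= value / 10
    return value

remaining = decimate(100, times=2)   # 100 -> 90 -> 81
percent_removed = 100 - remaining

print(remaining, percent_removed)
```

The second cut takes a tenth of 90 rather than a tenth of 100, which is why the total comes to 19% and not 20%.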
@KentoNishi 5 years ago
Roses are read Violets are blue AI can read Can Cary too?
@agentstache135 5 years ago
There’s a Cosmo article about the video used titled “YouTuber had one night stand with a woman, she lied afterwards about being pregnant with twins” if anyone wants to know the context of the video
@art1637 5 years ago
Agent Stache what the fuck?
@43Jodo 5 years ago
kzbin.info/www/bejne/lXualXiejtmnmLM Plug this into the Wayback Machine to actually watch the video. Asshole decided to delete it.
@agentstache135 5 years ago
@@43Jodo How does that make him an asshole? Like it's something kinda personal and he probably just wanted it to be more as an update about why he wasn't gonna be a father to those who were following him at the time instead of a video for everyone to be able to see forever
@breakerboy365 5 years ago
what is going on lol
@Crudecoronet 5 years ago
Agent Stache What are you talking about
@kumamedia123 5 years ago
I seriously thought he said “I love bobbies” 13:35
@cavemann_ 5 years ago
What an absolute madlad! He actually read the whole Bee Movie script!
@KrazyKyle-ij9vb 5 years ago
I hope he likes jazz...
@Zorbeltuss 5 years ago
If you could increase or decrease the score of words based on context you could probably reduce the amount of errors that occur, also that can be trained on separate material in the form of text transcripts from other sources, making it easier to see if it hurts or helps.
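The context idea above can be sketched as a tiny bigram rescorer. Everything here is a toy assumption for illustration: the three-sentence `corpus` stands in for "text transcripts from other sources", the visual scores are invented, and none of it reflects how the AI in the video actually works.

```python
import math
from collections import Counter

# Toy text standing in for transcripts from other sources.
corpus = "have you got a moment . have you got a minute . you got a dog".split()
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)
vocab_size = len(unigrams)

def context_log_prob(prev_word, word):
    """Add-one smoothed log P(word | prev_word), estimated from the toy corpus."""
    return math.log((bigrams[(prev_word, word)] + 1)
                    / (unigrams[prev_word] + vocab_size))

def rescore(prev_word, candidates, weight=1.0):
    """candidates maps word -> visual log-score. Returns the best word once
    the context score is added on top of the visual one."""
    return max(candidates,
               key=lambda w: candidates[w] + weight * context_log_prob(prev_word, w))

# Hypothetical visual scores: the lip model slightly prefers the wrong word.
visual = {"moment": math.log(0.4), "bowman": math.log(0.6)}

best_visual_only = max(visual, key=visual.get)
best_with_context = rescore("a", visual)
print(best_visual_only, best_with_context)
```

Since the context model is trained on plain text, it can be evaluated separately from the lip model, which matches the "easier to see if it hurts or helps" point; the `weight` parameter controls how much it is allowed to override what the lips suggest.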
@Mastaachef 5 years ago
13:39 I ACTUALLY GOT IT RIGHT OMGGG! So this is what ultra instinct feels like?
@joelbraun8584 5 years ago
YEAH HAHA SAME "Both of you did terrible"
@araceli7604 5 years ago
3:53 me trying to have a normal conversation with someone Edit: Woah, that's a lot of likes...
@thecringeking873 5 years ago
Same here
@hanac5586 5 years ago
this sounds exactly like me when I haven't slept in 24 hours but still have a lot to say
@deadbread3459 5 years ago
WhEn LibEarLs sPeAk tO mE tHeY sOuNd LIke ThaT XD XD WOW they ThInk Their so Gr8 :0) 😂😂😂😂😂😂😂
@SreenikethanI 5 years ago
06:08 i swear I expecting he was gonna read the Bee movie script… AND HE DID! I'm like "YESS!"
@AriaLunaCampbell 4 years ago
My technical mind: "This is pretty interesting." My linguistic mind, watching the section on the algorithm guessing syllables: "Please, for the love of everything, use the IPA! Ahhhhhhhh!" (To be clear, this is mostly a joke. At least he is using a standardized format for syllables. I just have this little part of my brain that's been spoiled by the IPA's unambiguous nature and figured there's probably someone else out there who'll get it.)
@ThePotatoLlamaz 5 years ago
You should try to make a similar program that converts audio into little animated mouth movements for animators
@calebquadrio1131 5 years ago
Just saying I can lip read and the reason I can’t tell what ur saying is because no one talks like that
@RyBrown 5 years ago
caluppy he was over pronouncing words and that made the AI confused I think.
@colex1222 5 years ago
@Radium X I was able to get Vanessa
@Weg002 4 years ago
3:54 when I try to talk/listen to someone talking in a dream
@binaryorbitals 5 years ago
Person: Read My Lips Cary: Say No More
@janeylala 5 years ago
When you didn't understand anything but you still enjoyed the video. *THIS IS AMAZING! SO COOL!* Few mins later... *WHAT DOES THAT MEAN? WATEVER!*
@spikeus3570 5 years ago
14:16 Carykh: Quiet I want to talk! AI: LET ME TALK FIRST Carykh: Let me talk first, please *And then you loop this
@v.6984 5 years ago
carykh: *"On March the 11th, 2018, at 11 PM, I did the unthinkable."* Me: oh no, please tell me he didn't read the entire bee movie scri- carykh: *"I read the entire Bee Movie script on camera"*
@sirclashin 5 years ago
Lmao
@vuxigeck5281 5 years ago
What a nice way to start off the year! Finding _yet another_ awesome channel I'm gonna be enjoying for a pretty long time, I think!
@jansopi6967 4 years ago
I should say *OUR* lip reading AI. Staline aproves
@samkelson7990 5 years ago
I am actually currently trying to do the opposite. Using google speech recognition API and gentle (which I found thx to ur vid so thx) I am creating a lip syncing program that will take audio from the mic, convert it into phonemes, then animate a character. Now that itself isn’t too hard but I want to do it live (live audio) so I am kind of struggling.
@blasttrash 5 years ago
is the project on github?
@npric2883 5 years ago
Isnt that animoji
@samkelson7990 5 years ago
@@blasttrash no not yet
@Kitulous 5 years ago
@@npric2883 animoji takes your picture and maps your muscle movement to a 3D model on a screen. Their project is to get the audio without the camera part and map it to a character on a screen.
@machodong6552 5 years ago
Like vrchat?
@HappyLeeHL 5 years ago
A really interesting idea. I had a similar idea some months ago but I couldn't do it myself. I think maybe you should focus on the link between words in order to create a meaningful sentence, like the YouTube subtitle algorithm which can correctly transcribe audio to text most of the time. Combining that kind of algorithm with your lip reading idea, it might be good lip reading instead.
@kamaljotsingh6675 5 years ago
hey what about an AI to play Super Mario afap? that may break the wr.
@HappyLeeHL 5 years ago
@@kamaljotsingh6675 I've already made one, that can complete SMB almost as fast as the WR. kzbin.info/www/bejne/poTMgpp-j81-oM0
@nyroysa 5 years ago
Holy Moly you are that super mario TAS man
@HappyLeeHL 5 years ago
@@nyroysa Hi, nice to meet you here.
@galric4270 5 years ago
I got the “have you got a moment” right 😃
@an_annoying_cat 5 years ago
AI should learn to animate so Cary could be able to upload more often
@binaryorbitals 5 years ago
We went all of 2018 without a TWOW vid *That wasn’t very cash money of you*
@carykh 5 years ago
i have 39 minutes left tho just kidding, twow 24a coming january i hope
@kevinlel 5 years ago
carykh come on if you fly to a different time zone you could still get it out by 2018!
@TheRealPunkachu 5 years ago
Wait twow is actually not just cancelled? No way!
@binaryorbitals 5 years ago
carykh You told me the next episode would be between Christmas and New Years Eve. I am more disappointed at that then the fact that nobody made one of their TWOW submissions “The the the the the the the the the the”
@maxw179 5 years ago
@@carykh When do you think we'll have season 2? I missed the beginning of the first season and can't stop bingeing the series.
@EduardoReyes-uz2lt 3 years ago
No one is gonna talk about on how he used diffrent animation style in the beginning
@marcelinadelacruz8826 5 years ago
COMP: LAUREL AI: YANNY I HEARD "THE EARTH IS NOT FLAT"!!!
@serglian8558 5 years ago
You shouldn't reveal that you are deaf!
@greenwolf1363 5 years ago
I hear covfefe
@paranormalstick2289 5 years ago
I heard commit order 66
@Lilli_B 5 years ago
this video is so last year
@krillbilly1435 5 years ago
*C o m e d y*
@sappyme 5 years ago
Yeah I like the cool stuff from 2019 like the sequel to the Logan Paul suicide forest video and a sequel to fortnight
@izzypin942 5 years ago
IN AN HOUR BOI
@zegamingcuber857 5 years ago
Izzy Pin TIMEZONES BOI
@imie-nazwisko 5 years ago
Way to start new year with a dad joke
@KoenDerp 5 years ago
1:38 if you make the words "heaven high poop push" the opposite you get "hell low pee pull" which sounds like "hello people". wow im suprised i noticed that.
@ToHellWithReality 5 years ago
9:45 Uhh... What's that censor bar supposed to be covering? Because I don't think it did what it was supposed to do.
@prokaryotesys 5 years ago
ToHellWithReality their emails, I think.
@ToHellWithReality 5 years ago
@@prokaryotesys I know that, but I didn't want to spell it out for two reasons. First, I didn't want to make it obvious for people looking for that kind of info. Second, comedic effect.
@krucible4889 5 years ago
@@ToHellWithReality just r/woosh them
@prokaryotesys 5 years ago
@@krucible4889 oof i got wooshed thats one of my life goals tho
@betin731 5 years ago
@krucible r/itswooooshwithfouros
@bggamingdeluxe5658 5 years ago
What else does K and H mean hmm... Cary *K* omments *H* ere dangit i was close. also this is my first comment from 2019, hehe
@Iwatoda_Dorm 5 years ago
LIES!
@sand1573 5 years ago
BGGAMING Deluxe komrades
@awesomevideosonyoutube 5 years ago
he's a time traveler huzzah
@TheBlacknoodles_ 5 years ago
Cary Killed Him
@tearlach47 5 years ago
Must be nice to be in 2019
@raball 5 years ago
the blurry voice actually sounds great. i would turn that into music so fast
@DanielLopez-up6os 5 years ago
How about sign language Recognition AI? and maybe translation, ASL to UK Sign language etc.?
@johnsensebe3153 5 years ago
ASL to English might be more useful, but I think the idea is great. The tricky part would be the dataset. ASL uses more than the hands, so you'd probably need different types of clothing to train it on, as well as different skin tones, etc.
@elllieeeeeeeeeeeeeeeeeeeeeeeee 5 years ago
@@johnsensebe3153 Just train it on a black and white dataset
@johnsensebe3153 5 years ago
@@elllieeeeeeeeeeeeeeeeeeeeeeeee You're still going to have a variety of shades, short sleeves, long sleeves, no sleeves, frilly cuffs, etc.
@cuckling9031 5 years ago
like the one in unfriended 2?
@DanielLopez-up6os 5 years ago
@@johnsensebe3153 once the basic dataset is created you can create the skeletal system, apply that to the person being interpreted, and it should be fine. And the varied dataset you could get from news broadcasts; meanwhile the transcripts are usually in the CC/subtitles even if there's an interpreter on screen.
@NativLang 5 years ago
CMUdict strikes again! Looked to me like some successes here. Now you got me wondering if you'd go even further weighting words / word neighborhoods by commonness, or by taking morphosyntax into account. Oh, and so much yes to the sinking smiles at 3:54 - that slow letdown of throwing out a hopeful spike solution and watching it fail.
@hanako-kun22 2 years ago
OH MY GOSH I GOT THE LIP READING RIGHT!!! BOTH OF THEM!! I am *GOD*
@MarcTelang 1 year ago
wait why isn't your channel verified
@denischikita 5 years ago
I think you need to train the network not only with lips, but with the throat too, because a lot of sounds come from the vocal cords only.
@migs1336 5 years ago
0:09 cause I'm communist Edit: 2:32 he uses the URSS to convert it to spectrogram two communist references in one video
@Kitulous 5 years ago
URSS = ur SS
@RichardRMM 5 years ago
@@Kitulous mein leben
@bananogamer6972 5 years ago
In Italian that would be easier because every letter has a sound
@Womenooo 5 years ago
Not necessarily. He already incorporated the IPA (International Phonetic Alphabet), which is more accurate than any native spelling: each sound truly matches only one sign. A language with fewer distinctive sounds and fewer allophones would be ideal. Italian has 30, which is an OK low number, but 7 of them are vowels. And you want more vowels (I'd think), because vowels are created by obstructing the airflow in a different manner with the same tone. So an A and an O do the same thing with your vocal cords, but for A you stretch your lips and for O you round them. The last part is the tongue, which you don't see, but your lips move slightly as you go through your vowels. (Also, yes, I know Italian has A E I O U, so just 5 vowels, but phonetic vowels are realized differently from the vowels transcribed in written language. The vowels also include diphthongs (vowels that merge into each other) or vowels with a slightly different quality. In English, bad and bat have a different form of A, just for example.) So if he wants to make the system more accurate, he would need a language with fewer allophones and fewer sounds that differ only in voicing. E.g. G/K and T/D and many more are basically the same sound, but one with your vocal cords vibrating and the other without. You can find that out by looking in a mirror and placing a finger on your throat, then saying ATA and ADA: while you say ATA you will feel nothing, but saying ADA you will feel vibration. But both look exactly the same. And in phonetics both are basically considered the same. And there are many more examples of this in the English language, and they carry meaning. Like Tick and Dick... that's a massive one. Or simply Dog and Dock. It basically guts a sentence. So this project is basically doomed to fail by just looking at the lips. The tongue is so very important. Lip reading is hard, and it works by guessing words.
In a sentence some words do not make sense, so they are tossed out, but the AI cannot differentiate between a sensible utterance and a nonsensical one. It can, though, guess what word was said, and maybe from that one could extrapolate a probable sentence that was uttered.
@bananogamer6972 5 years ago
@@Womenooo In the Italian alphabet a letter is pronounced the same way every time, even if it has a specific letter before or after it. In English, for example, T is read one way while TH is read another way, and they have different sounds. In Italian we don't have this problem: the letter E is always said the same way, even if it has a G or an F before or after it (sorry for my English, but as you can tell I'm Italian)
@Womenooo 5 years ago
@@bananogamer6972 no, you don't understand. It is not about how true the phonetics of a language are to its alphabet. It is about how simple the phonetics are. I just have a basic knowledge of Italian at best, but an example of a problem would probably be g and j: Geco and Julia would both look the same at the onset of the word. I am not certain about the example, though. It is really a problem of many European languages that have many phonemes that are realized in the mouth and not on the lips, so it is impossible to read them without contextualization.
@bananogamer6972 5 years ago
@@Womenooo now I understand thanks
@ukkomies100 5 years ago
Emanuele Bonandrini or finnish
@glanni 5 years ago
When you said you would use the transcript of a movie i was getting very excited. When you were talking about doing the unthinkable, i knew it had to be it. When you said you read the entire bee movie script on camera, i literally started clapping before i could care about my family being in the same room. I respect you so much for this, you really gave a big sacrifice.
@ball56 5 years ago
14:04 oh good, I have mono audio setting on.
@alphabbbe8580 5 years ago
HAPPY NEW YEAR!!!
@aidanstg445 5 years ago
TofuMaster83 Happy new year!!! (In 1 hour for me)
@swordchicken5629 5 years ago
and happy birthday bfdi!
@dolloptwerpandorange402 5 years ago
0:08 Cary: Or I should say OUR lip reading AI *Soviet anthem starts playing*
@thatoneguy6139 5 years ago
Welp this is what I’m watching for the first vid of 2019
@a3dg638 5 years ago
Fancy Spider same
@egg4861 5 years ago
Same bruhh
@Zalian 5 years ago
I'm really curious how it would sound if the raw phoneme data was pushed into sound output instead of trying to match it up to specific words.
@official-obama 1 year ago
maybe it would play every phoneme simultaneously, and the more confident it was in a phoneme, the louder it would be.
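The "play every phoneme at once, louder when more confident" idea could look roughly like this. It is a sketch under heavy assumptions: `mix_phonemes` is a made-up helper, the two-sample "waveforms" are placeholders for real recorded phoneme audio, and the confidences are invented rather than taken from the model in the video.

```python
def mix_phonemes(confidences, waveforms):
    """Blend per-phoneme waveforms, scaling each by the model's confidence.

    confidences: {phoneme: confidence between 0 and 1}
    waveforms:   {phoneme: list of samples}, all the same length
    """
    length = len(next(iter(waveforms.values())))
    mixed = [0.0] * length
    for phoneme, confidence in confidences.items():
        for i, sample in enumerate(waveforms[phoneme]):
            mixed[i] += confidence * sample
    return mixed

# Placeholder data for one video frame: the model is 75% sure of "AA".
confidences = {"AA": 0.75, "IY": 0.25}
waveforms = {"AA": [1.0, -1.0], "IY": [0.0, 1.0]}

mixed = mix_phonemes(confidences, waveforms)
print(mixed)
```

Running this per frame and concatenating the results would give a continuous blur of sound in which confident guesses dominate and uncertain ones fade into the background.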
@TheJustinator 4 years ago
"Automate their entire channel." That's another hint for your next channel: lazykh
@data5023 5 years ago
As soon as you said, "Which movie to pick," I instantly went, "It's Bee Movie, isn't it?" I've never seen Bee Movie to be honest.
@cassie_e 5 years ago
Do it the other way - generate lip shapes from the audio! Automated lip sync!
@agentstache135 5 years ago
Diordnas Darkunn he’s mentioned the possibility of doing that before to save time on animation, though I personally think a more hard coded approach would work better than a neural network
@jacobfeinland7878 5 years ago
@@agentstache135 I thought for sure that would be what this video was about. I would love to see how well that works, either for generating actual video from audio or for using it to animate the character's lips.
@agentstache135 5 years ago
@@jacobfeinland7878 My idea for how to do it copied from my comment from the video where Cary mentions it (the dance one) because it's an essay I'm not rewriting: Why would you need an AI for animating the lips? Why not just write (or use, I’m sure it already exists) an algorithm that takes a transcript (handwritten or using existing speech recognition (which I know is probably still technically an AI)) of what you’re saying as input and then move the mouth? I’m sure there are some parts that you’d have to manually do, eg screaming, but it’d be a lot more reliable and robust than an AI based on the audio. If I were to code it I’d mine a dictionary for the International Phonetic Alphabet (or some other pronunciation respelling) representation of each word. Then just figure out what mouth shape you make and how long you make it for each sound and put it all together into an animation. Obviously you’d probably still need to tweak it some more, depending on how time-accurate your transcript is, and that might be where an AI could help. But, I still don’t think an AI would be robust enough for the whole process, especially for a pretty discrete animation where if it picks the wrong mouth shape it’s pretty noticeable. Whereas if you were to just use it to help with temporal alignment, it being wrong would only show up as a small offset, less noticeable.
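The dictionary-based approach described in the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions: the pronunciation entries, the ARPAbet-style phoneme symbols, the mouth-shape names, and the durations in `PRONUNCIATIONS` and `MOUTH_SHAPES` are all made up for the example, and `lip_sync` is a hypothetical helper, not anything from the video.

```python
# Hypothetical pronunciation dictionary: word -> phoneme symbols.
# A real tool would load these from an actual pronunciation dictionary.
PRONUNCIATIONS = {
    "bee": ["B", "IY"],
    "movie": ["M", "UW", "V", "IY"],
}

# Hypothetical table: phoneme -> (mouth shape name, duration in milliseconds).
MOUTH_SHAPES = {
    "B": ("closed", 80),
    "M": ("closed", 80),
    "IY": ("wide", 120),
    "UW": ("rounded", 120),
    "V": ("teeth_on_lip", 70),
}

WORD_GAP = ("rest", 50)    # short rest pose between words (assumed)
UNKNOWN = ("rest", 100)    # fallback for words or phonemes not in the tables


def lip_sync(transcript, start_ms=0):
    """Turn a transcript into (start_ms, end_ms, mouth_shape) keyframes."""
    keyframes = []
    t = start_ms
    for raw_word in transcript.split():
        # Normalize the word so punctuation and case do not break the lookup.
        word = raw_word.strip(".,!?\"'").lower()
        phonemes = PRONUNCIATIONS.get(word)
        if phonemes is None:
            segments = [UNKNOWN]
        else:
            segments = [MOUTH_SHAPES.get(p, UNKNOWN) for p in phonemes]
        # Lay the shapes end to end, then add a brief rest after the word.
        for shape, duration in segments + [WORD_GAP]:
            keyframes.append((t, t + duration, shape))
            t += duration
    return keyframes


for start, end, shape in lip_sync("Bee movie!"):
    print(f"{start:4d}-{end:4d} ms  {shape}")
```

Fixed per-phoneme durations are the weakest part of this sketch; as the comment notes, aligning the keyframes to the real audio timing is where a learned model might still help.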
@tjahjobagaaa 5 years ago
Using those mouths from bfdi.
@RRRR-jr1gp 5 years ago
Wait animators actually lipsync the characters? It feels so dumb I mean who's going to care
@ne01nvader 5 years ago
4:04 Don't blame poor computer, he is just trying to summon satan, nothing special.
@StarForgers 5 years ago
I think that to a large degree this whole thing was flawed simply due to the angle that you are recording your face from. People don't normally look at a person from below. This creates issues with some standard information sets one might normally use, I would think.
@Versaucey 5 years ago
It's not the A.I fault, it's ping is too high.
@GWindows3.1 5 years ago
Vsus what the hell
@TheNerdBird_ 5 years ago
. . . I want to punch you so bad. The AI is run locally, meaning it's sub-instant reading.
@jeeeves 5 years ago
@@TheNerdBird_ no its ping
@themanfromutopia4743 5 years ago
Look buddy, "it's" is short for "it is", but if you want to signify possession it's "its", not "it's", okay?
@TheNerdBird_ 5 years ago
@@jeeeves If it is run locally, the ping would be less than a millisecond. It is run locally. Quite annoying when people who don't completely understand technical and networking terms try to make statements to sound smart.
@treedeerthethingy4812 5 years ago
I HAVE NO FUCKING IDEA WHAT HALF OF THESE WORDS MEAN, BUT I LIKE IT
@lara4268 5 years ago
I was so proud when I guessed "do you have a moment"
@milesprower3488 5 years ago
0:03 It's The Captain from SpongeBob "are you ready kids, aye-aye captain! I can't hear you! AYE-AYE CAPTAIN! OHHHHHHHHHH!"
@user-en7dx1qp3k 4 years ago
Here's my solution to the number theory question at the end: let each number have 2018 digits ranging from 0-24, filling all unused places with 0s, so if you were to write 1, you would get 00000...0001. Each digit value is used an equal number of times in each place, so the average value for each place is ((24)(25)/2)/25 = 12, and the answer is 12*2018 = 24216. Correct me if I'm wrong.
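The argument in the comment above can be sanity-checked on small cases. The sketch below assumes the problem is as the commenter describes it (the average digit sum over all zero-padded strings of a fixed length in a given base); the function names are made up for the example. It brute-forces small bases and lengths and compares them with the claimed closed form, num_digits * (base - 1) / 2.

```python
from fractions import Fraction
from itertools import product


def average_digit_sum(base, num_digits):
    """Brute force: mean digit sum over every zero-padded digit string."""
    total = sum(sum(digits) for digits in product(range(base), repeat=num_digits))
    return Fraction(total, base ** num_digits)


def predicted_average(base, num_digits):
    """Closed form from the comment: each place averages (base - 1) / 2."""
    return Fraction(num_digits * (base - 1), 2)


# Compare brute force against the formula on cases small enough to enumerate.
for base, num_digits in [(2, 4), (5, 3), (10, 3), (25, 2)]:
    assert average_digit_sum(base, num_digits) == predicted_average(base, num_digits)

# The full-size case is far too large to enumerate, so only the formula is used.
print(predicted_average(25, 2018))
```

Exact fractions are used instead of floats so the comparison between brute force and formula cannot fail on rounding.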
@sanjaymatsuda4504 5 years ago
You could have used the video of the longest word, the full chemical name of titin. Instant 3 hours with a full transcript available everywhere.
@BagelBrain 5 years ago
It would contain the same 5 samples of "words," though.
@thenimalu 5 years ago
I live in Germany. It's Silvester. I am drunk. It's 6 am. I am watching Carykh. I hope I spell3d everything right. Happy new year!!!!
@godofdoor6558 5 years ago
best ai
@adamyoung6797 5 years ago
hsppy new yere
@LLAWLlET 5 years ago
Happy New Year!
@bobross4082 5 years ago
Dude. I just started watching your videos. I don’t know what job you have. But you’re a genius. You’re literally improving computer programming extremely. I don’t know the actual terminology. But you’re gonna be making huge money someday, if not already. You’re gonna be the reason robots become a reality.
@ecicce6749 5 years ago
I think the AI works pretty well for the amount of information it has. I guess you could only improve it by choosing the correct words based on grammar, context, and which words are most likely to appear next to each other. Also, an additional system for converting back to audio would make the project complete: one trained to combine lip movement and the detected phonemes into input for a network (an easily trained autoencoder) that outputs your voice. Would loooove to see that.
@zib350 5 years ago
I strongly agree with the word choosing idea!
@meganbennett933 5 years ago
I actually got the "have you got a moment" one right.
@Spherical_Object 1 year ago
BFDI references 3:44, 5:27 "yeah i know she was so surprised" is the first line spoken in bfdi (by match) 12:40 flower's announcer crusher brief 15:39 "take the plunge" is the bfdi 1a name (yes i did watch the whole video four times [twice with captions], so what?)
@akraus53 5 years ago
Decimated twice means -19% not -20% *ugh*
@-epsilon6269 5 years ago
0:08 *COMMUNISM INTENSIFIES*
@anthonycaminiti8734 5 years ago
Stalin wants to know your location!
@marlon.8051 5 years ago
It's socialism not communism!
@d0nnyr0n 5 years ago
@@marlon.8051 Similar thing...
@marlon.8051 5 years ago
@@d0nnyr0n Socialism tries to convince the population that communism is great, and communism doesn't.
@d0nnyr0n 5 years ago
@@marlon.8051 That is not correct. See this *www.investopedia.com/video/play/difference-between-communism-and-socialism/* .
@jondoe5323 4 years ago
Thanks for helping my project on a video that an AI makes. I need it to read a transcript and create accurate voice and face. It then creates a video off of seeing images of faces off of the internet
@FoxBlocksHere 4 years ago
"Tower owe wheat and sought owe-induced height eight of lamb late"
@cali053711 4 years ago
[IN TENNIS BALL VOICE]: James-
@teamont5 5 years ago
Thumbnail: automate their entire channel. NO THANK YOU.
@cavemann_ 5 years ago
Ok, Hannah, stop messing with Cary. Hannah, calm down. Hannah? Hannah!
@C4illin 5 years ago
Decimated twice would be 19%
@56independent 4 years ago
3:54 !!do not play at night in the docks!! a demon stole my soul. DO NOT PLAY!!
@leofisher1280 5 years ago
Decimating twice does not mean dropping by twenty percent. Decreasing by 10% two times leaves you with 90% x 90% = 81% of what you started with, meaning a decrease of 19%. Nerd.
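The arithmetic in the comment above can be checked in a few lines. This is a minimal sketch that takes "decimate" to mean "remove one tenth of what is currently left", as the commenter does; the function name is made up for the example.

```python
from fractions import Fraction


def remaining_after_decimations(times, cut=Fraction(1, 10)):
    """Fraction left after removing `cut` of the current amount `times` times."""
    return (1 - cut) ** times


remaining = remaining_after_decimations(2)
total_decrease = 1 - remaining

print(f"remaining: {remaining}, total decrease: {total_decrease}")
```

Each decimation removes a tenth of a smaller amount than the one before, so the losses compound to slightly less than the naive 10% + 10%.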
@TheVoitel 5 years ago
Come on, this brilliant problem is not number theory, it’s trivial statistics: the problem is obviously equivalent to the expected value of a sum X1 + ... + X2018 of iid variables, where each Xn is discrete uniform on {0,...,24}. Since expectation is linear, we get E(X1+...+X2018) = E(X1)+...+E(X2018) = 2018 * E(X1) = 2018 * (24+0)/2 = 2018*12
@MsHojat 5 years ago
That's what I thought as well. Pretty easy if true (it does seem true). However, since I did it all in my head on the first run, I somehow messed up by dividing both 2018 by 2 and 24 by 2 (when I only had to divide one of them), getting literally half the right answer. I know technically it's even improper to divide 2018 by 2 at all, but my brain tends to just manipulate the numbers anywhere as long as it doesn't affect the order of operations. Clearly I suppose it's still a bit problematic.
@LaskyLabs 4 years ago
I think the data you used to train the ai is very useful. Thank you for making it public.
@binaryorbitals 5 years ago
At 6:06 I could tell it would be the bee movie script Wow Cary, ORIGINAL