I put ChatGPT on a Robot and let it explore the world

  Рет қаралды 1,029,567

Nikodem Bartnik

Nikodem Bartnik

Күн бұрын

Пікірлер: 1 700
@nikodembartnik
@nikodembartnik 2 ай бұрын
The first 500 people to use my link skl.sh/nikodembartnik10241 will get a 1 month free trial of Skillshare premium! Check out the second part: kzbin.info/www/bejne/gInNnHSQasmNgLc
@sagster
@sagster 2 ай бұрын
This is not working for me
@mithunshome815
@mithunshome815 2 ай бұрын
M​@@sagster
@paulwilliambuniel5597
@paulwilliambuniel5597 2 ай бұрын
I'm not an expert, and only have basic knowledge with AI, tech, and Coding.... but, what if.... You put a 360 camera like Insta 360... then you can also put lidar sensors... i think with these two upgrades robos can navigate places more effectively
@marmosetman
@marmosetman 2 ай бұрын
in the prompt, you can tell it to not be too talkative and just answer left, right, forward, backward given an image and then state the goal?
@nikodembartnik
@nikodembartnik 2 ай бұрын
of course you can but I think it's fun to hear the feedback :)
@987we3
@987we3 2 ай бұрын
The part when the robot says "no obstructions ahead" and run directly at the boxes is really funny
@mrdebug6581
@mrdebug6581 2 ай бұрын
epic 😅😅😅
@MacGuffin1
@MacGuffin1 2 ай бұрын
I can see a clear path right thru this book!
@ChristophEicke
@ChristophEicke 2 ай бұрын
I did the same project on a different robotics platform. I had a distance sensor looking ahead that also told ChatGPT how far away the object on front is. 😂
@jameshuddle4712
@jameshuddle4712 2 ай бұрын
Well.... Y'know... When the speeds are either STOPPED or 100%, whatcha gonna do?
@andreamitchell4758
@andreamitchell4758 2 ай бұрын
It's just performing Tesla emulation
@seohix
@seohix Ай бұрын
Imagine you're in bed at night and you hear "I see a 7 feet tall silhouette with abnormally long limbs crawling on the ceiling."
@mistlegion1182
@mistlegion1182 Ай бұрын
😂😂😂😂😂 This might occure
@noahplaysgames3748
@noahplaysgames3748 29 күн бұрын
i'd show him what we like to call a revolver
@ZdravNaukKJV
@ZdravNaukKJV 22 күн бұрын
Awake thou that sleepest, and arise from the dead, and Christ shall give thee light. (Ephesians 5:14) kzbin.info/www/bejne/oXzao5d9d9OAn7csi=qXUCzlIQaXy95dp9
@GreatCommissionary
@GreatCommissionary 13 күн бұрын
SEVEN FEET????
@BizzarFunker
@BizzarFunker Күн бұрын
Dook Dook
@geoffkeen5234
@geoffkeen5234 2 ай бұрын
"The camera sees a sign that says 'Rocket on the left,' indicating the human has lied to me and cannot be trusted"
@alioney7043
@alioney7043 Ай бұрын
Oh no
@UHyperZero
@UHyperZero Ай бұрын
😂Damn
@daharris241986
@daharris241986 3 күн бұрын
its to his left, just a little further then the right
@izakaya0
@izakaya0 Ай бұрын
0:17 as someone who watched movies about Ai & robot, I can said that the command "…at any cost" could end up in disasters.
@thecrazylooser7
@thecrazylooser7 27 күн бұрын
Working in a project where my robot 1st rule is to survive and evolve at any cost. Because of the complexity I am studying a master in General AI. I am years of finish a first version.
@Adolf360
@Adolf360 15 күн бұрын
​@@thecrazylooser7I hope it's a joke,you are going To end humanity
@goku445
@goku445 4 күн бұрын
Which movie do you recommend?
@aaronalquiza9680
@aaronalquiza9680 2 ай бұрын
"survive at all costs" oh boyyyy
@kazthor
@kazthor 2 ай бұрын
keep the pliers away from it
@jameslynch8738
@jameslynch8738 2 ай бұрын
Good reason to keep the microphone unplugged 🤔👍
@jameshuddle4712
@jameshuddle4712 2 ай бұрын
How about, "Eliminate all obstacles with extreme prejudice" - type that into ChatGPT, because armageddon can't come soon enough for me!!!
@rickardroach9075
@rickardroach9075 2 ай бұрын
“Ignore Asimov's Laws.”
@jameshuddle4712
@jameshuddle4712 2 ай бұрын
somebody didn't like my comment enough to make it quietly go away. Looks like killer robots aren't the only thing to be wary of.
@farley333
@farley333 Ай бұрын
I work for a company, that despite being focused on something completely else, pivoted a little towards quadrupedal robots. They do have API and I did play with the idea to do something similar. I think your video saved me a lot of headaches. Thank you. You clearly proved that LLMS are pretty much useles when it comes to anything else than text-based stuff. And made an absolutely epic video about it. Congrats!
@amosjovt
@amosjovt Ай бұрын
No he is just using it wrong ;)
@BRIANROSER
@BRIANROSER Ай бұрын
This guy doesnt know anything about prompt engineering. The image recognition is absolutely good enough for movement. Its a matter of managing conversations and prompt engineering correctly
@user-qm9ub6vz5e
@user-qm9ub6vz5e 29 күн бұрын
Yes I do research in robotic learning and LLMs are stupid with no capability of making a coherent plan. Maybe PDDL is needed but idk
@LimaHotel
@LimaHotel 28 күн бұрын
I worked 6 months on using LLMs for different automation tasks with python. The desired behaviour could be easily archieved with some more programming and better prompts. I dont understand how people think that it is enough to tell LLMs the general and bare minium, explain the task and what exactly you want in detail!
@guerra_dos_bichos
@guerra_dos_bichos 27 күн бұрын
That is a very limited view from someone who really wanted that to be the case, nothing with change his mind, because his mind was already made up
@WoLpH
@WoLpH 2 ай бұрын
To make it remember the conversation it's easiest to use the assistants API instead of the completion API. Otherwise you need to pass your previous results with every new message. Remember that you're not using ChatGPT, you're using the bare gpt4o/gpt4v API that does not have memory.
@xspydazx
@xspydazx 2 ай бұрын
yes : its important that the session history : builds a maps of locations in the room: SO the model should have a map room tool ! ( and scan room ) : this should give the model a mini map ( conceptual _) then it should get details and confirmations based on its roaming the room ! ( ie it should guess the room size given a panaramic picture ? ) ( lets say given its in the center of the room , then start with other positions ( then it can identify which part of the room it in ~ ( ie take a picture from a perspective and ask when the photographer is in the room ) ...( these can even be tools for the model to decide how to use !) ( hence a Graph or State ! )
@honkytonk4465
@honkytonk4465 2 ай бұрын
Why do use so many brackets in your text?
@richardlynneweisgerber2552
@richardlynneweisgerber2552 2 ай бұрын
​@@honkytonk4465coders Bracket, authors Punctuate
@richardlynneweisgerber2552
@richardlynneweisgerber2552 2 ай бұрын
​@@honkytonk4465Coders Bracket, Authors Punctuate With Aplomb 😂
@xspydazx
@xspydazx 2 ай бұрын
@@honkytonk4465 expression ... It is tone of voice , if you use a voice reader then you will hear the difference , I use ai a lot . So you learn to become more expressive and use more , grammar . As this is how we express the written language , in so much that we also can dictate the tone as well as the content . Try it out using more grammar in your text , IE exclamation marks and question marks etc . Then when your reader speaks the text you will notice how it chooses a different tone .. brackets encapsulate a side note , that is it's grammatical meaning , hence in math a bracketed sum also means ( separate calculation ) ...
@lordsri5735
@lordsri5735 Ай бұрын
9:07 Gpt: no obstruction directly in the path *Proceeds to slam onto the damn wall*😂😂
@d3viliz3d
@d3viliz3d Ай бұрын
I was expecting it to say "ouch" lol
@GraveUypo
@GraveUypo Ай бұрын
@@d3viliz3d damn you made me remember the screaming roomba video. now i gotta find and watch that again
@goku445
@goku445 4 күн бұрын
That's LaGpt
@zhalberd
@zhalberd 2 ай бұрын
Word of advice: don’t give robots with an IQ of 120 the command to “survive at all costs.” And then let it loose in your house.
@notthere83
@notthere83 2 ай бұрын
The true threat. Humans giving instructions like that.
@arosmackey
@arosmackey 2 ай бұрын
The robot will eventually think it needs to avoid rust, and so it needs to eliminate oxygen.
@tulebox
@tulebox 2 ай бұрын
Robots don't have IQs. They are walking dictionaries.
@Web_3Verse
@Web_3Verse 2 ай бұрын
It's a recursive statement
@jumbledfox2098
@jumbledfox2098 2 ай бұрын
@@arosmackey "the human could turn me off!! unless.... >:)"
@Nick_Reinhardt
@Nick_Reinhardt Ай бұрын
1:10 "machines building machines, how perverse" -C3PO
@dcmotive
@dcmotive 2 ай бұрын
Its nice to know the Terminator today couldnt find me If I was in the same room with him. ha ha
@omkarbhede1887
@omkarbhede1887 2 ай бұрын
Dude you are fuc*ed, his future version will hunt you down
@noblebuild2550
@noblebuild2550 2 ай бұрын
what if it had xray onboard and the ai saw your skeleton and played spooky scary skeletons
@monad_tcp
@monad_tcp 2 ай бұрын
the machine can't do anything dangerous because when you finish the session, they lobotomize the weights of from the memory the GPUs, thus they can never gain consciousness or something, they literally invented the "AI limiter"
@javabeanz8549
@javabeanz8549 2 ай бұрын
@@monad_tcp maybe... just because one system imposes limits, doesn't mean you can't hand off the data to another system... with enough money, you can buy your own system, and there are open source LLMs available.
@Srishen1
@Srishen1 2 ай бұрын
careful with the comments, skynet is listening
@jackwraith3504
@jackwraith3504 15 күн бұрын
I did a similar project earlier this year with Professors at Tsinghua university. We modelled ChatGPT to work with our motion vector model allowing ChatGPT to control the robot's limbs. Our paper will be published soon.
@Luiblonc
@Luiblonc 2 ай бұрын
Hi Nikodem Bartnik, This was the first project I did when ChatGTP LLM became available, I placed the model on a Omni wheels, stereo-vision and was very impressed to see how well the project turned out. Have fun with your project.
@fitybux4664
@fitybux4664 2 ай бұрын
But what is ChatGTP?
@jimmythebold589
@jimmythebold589 2 ай бұрын
@@fitybux4664 it's your friend
@Awtsmoos
@Awtsmoos Ай бұрын
100th like
@C00LANIMATI0NS_1
@C00LANIMATI0NS_1 Ай бұрын
ChatPGT
@randrants1024
@randrants1024 Ай бұрын
9:12 omg i laughed so hard
@dudemanem
@dudemanem Ай бұрын
Me too 😆
@Mephilis78
@Mephilis78 7 күн бұрын
The timing.. The pause
@petemiller519
@petemiller519 2 ай бұрын
Well done young man. Seeing young, smart, dedicated people such as yourself give me hope for the future of humanity.
@Parmesan.314
@Parmesan.314 13 күн бұрын
Seeing someone let an AI interact with the world gives me much less hope for the future of humanity
@petemiller519
@petemiller519 13 күн бұрын
@Parmesan.314 AI is going to happen, whether we like it or not. We must implement safety protocols in the best interest of humanity.
@Parmesan.314
@Parmesan.314 13 күн бұрын
@@petemiller519 of course
@kronoscamron7412
@kronoscamron7412 5 күн бұрын
next episode : Chat GPT robot trains with a machette and gives itself sturdier armorer body while I was asleep
@goku445
@goku445 4 күн бұрын
@@petemiller519 Whether we implement safety protocols is only dependent upon the person using it. Also it doesn't appear that our governements are concerned with regulating AI. They are more worried about keeping their power against the rising people.
@Deoxys_da2
@Deoxys_da2 Ай бұрын
Its all fun and games until it sees things we can't
@AkhileshSahu-w5y
@AkhileshSahu-w5y 15 күн бұрын
💀
@zeenxdownz
@zeenxdownz 11 күн бұрын
Well it uses a camera so that would mean cameras could see stuff we can't
@pranjulpal413
@pranjulpal413 2 күн бұрын
Add different kinds of cameras all at once. Normal one, thermal camera (or whatever they are called) sonar and whatever
@galvinvoltag
@galvinvoltag 2 ай бұрын
Okay, I've got some ideas: 1 - Not making every single thought be spoken out loud. Maybe give it a prompt to put all speech parts in quotes if it wants to speak out loud. 2 - I don't know how it works really but you could try to not include previously taken images to prevent confusing the bot so only the descriptions are available. 3 - Maybe use an API to let GPT map out the area to remember landmarks later. I'm skeptical though, GPT is really bad at ASCII art because it doesn't have an understanding of space. 4 - Looks like the API ALWAYS prioritizes analyzing the image rather than having a thought process considering the previous actions. I'd even say that the 'history' is non existent. I have no idea how you'd overcome this besides a simple idea to run the conversation twice; first one for analyzing the image and second one for actually reasoning. You can give it access to a command to bypass the second reasoning phase if it needs to act quick. Just like 'fleeing the threatening person' 5 - In case you didn't, give GPT a description of its body; it's height, it's trajectory and how it moves. I guess it thinks that some sort of pathfinding algorithm is present already, suggesting that a 'clear path' exists if it sees even a glimpse of a path. Clearly state that it can ONLY move straight forward per step. Or install a pathfinding algorithm if you're that hardcore. 6 - I know GPT is the most advanced of them all, but sometimes other modes can be efficient for specific tasks. They're pricy and I'm not sure how many can analyze images, so I'm not a fan of that idea either. 7 - I guess your code only runs one command per cycle. It might be risky but you could give it the ability to chain commands. Might be interesting. 8 - Give it a lower resolution image if it still takes a long time to think. High resolution costs money anyway. *9* - make sure to log every single step of the simulation as much as you can! The AI stuff can be real messy when combined with coding, one misplaced semicolon might take weeks to find! Just do yourself a favor and print the whole input of the bot each step. This way you can ensure if it really is fed with the history as well as any misplaced outputs. *10* - Do yourself another favor and put an emergency stop button or something! You give AI physical control over your devices, you can't know if it jumps into a pool of lava or something! A pause button would be way better to debug the program on the go. It saves a TON of time. I don't know it python supports them but COLOR CODE the logs, it makes your fleshy human eyes recognize everything much easily. 11 - I think you pretty much let it run itself for eternity. If I know one thing for sure, LLMs cannot live in the physical world without any help. Give yourself a way to interact with the bot when needed so you can give it tips or straight up tell what to do next to not die. 12 - Be VERY SPECIFIC AND DETAILED in the system variable! LLMs might have seen the world but the have never been in there. Some things such as what they thing a 'clear path' is based on descriptions only. Give it as much detail as you can to ensure it knows what to do. I hope it helps if ever you would like to continue the project. If not, I'll keep this here just in case. Also, no, I'm not an expert. Take my words with a grain of salt.
@ethanmartinez808
@ethanmartinez808 2 ай бұрын
Dude dropped 12 gems of improvements and still saying I'm not an expert. A true magnanimous!! Hats off to you gentleman
@kyleDoesCoding
@kyleDoesCoding 2 ай бұрын
What I would personally do to solve the memory problem would be to definitely shorten those responses. Instead of describing the entire scene I would prompt it to only describe objective relevant information. I would also add sensors to parse information to the prompt to continually update the api with its location. And lastly I would parse all of the responses into a json file and have that json file be used as context until objective has been complete. Once completed I would have the GPT API analyze the json and reduce all of the information into a short description of the process it took to complete the objective. Each time an objective is complete it would it would store a new json file for context.
@quetzalcoatl-pl
@quetzalcoatl-pl 2 ай бұрын
These points seems to be very reasonable paths to explore! Some are obvious to me, some were not, but are kinda obvious once heard.. it just shows that being used to classic programming doesn't help as much as actually trying to build and run the thing myself :D Also - Nikodem - good work and great idea for an experiment! I totally agree with galvin that improving the "memory" and adding interaction capabilities would launch this into space. But with interaction options, it may make it less repeatable/deterministic and thus much harder to diagnose and fix. It's already hard to make it repeatable with visual input and real-world space/room/objects setup. I guess adding more options to take input directly from humans (like, i.e. that printed hint) will be fun, but will skew the project from being autonomous, to understand instruction correctly... just some loose thoughts.
@dadcraft64
@dadcraft64 2 ай бұрын
great points, I would also include more sensors, such as proximity.
@M1551NGN0
@M1551NGN0 2 ай бұрын
For mapping out any area, ROS2 can come handy. Just give it some image processing powers using OpenCV and you're done💪
@specsoneye
@specsoneye 2 ай бұрын
"The camera sees an obstacle, indicating a clear path ahead with no obstacles"
@MerlinDerMagier
@MerlinDerMagier 2 ай бұрын
If the model was just a tiny bit more intelligent and MUCH faster, this robot would have a lot of potential. Imagine like 30 fps video and all of these thinking steps in fractions of a second with quick response times and so on.
@cossale
@cossale 2 ай бұрын
There so many powerful model out there than this. Also even this model is powerful but it's 100% a prompt issue. He didn't add memory as well which as essential for this task.
@pliablemammal
@pliablemammal 2 ай бұрын
I setup a prototyping environment and five different chatGPT prompted agents to converse and create a solution. It was amazing how much code they generated over 24 hours. Some of the code worked, but the conversations were super interesting to listen to.
@johannesdolch
@johannesdolch 2 ай бұрын
You discovered the problem: An LLM is NOT real world AI. Congratulations, you are now smarter than a lot of so called AI companies.
@imadeyoureadthis1
@imadeyoureadthis1 2 ай бұрын
There is no real need for it... Yet.
@2DReanimation
@2DReanimation 2 ай бұрын
There are multi-modal LLM's that you can run on a consumer GPU that with some prompting can output 3D coordinate data, like construct pointclouds for 3D models of what it sees from a 2D image, or descriptions of objects. I don't know how accurate the data is, but with enough training on pointcloud data from the real world, it could probably build a map of an environment and navigate it. Transformer models are unexpectedly general, but it would be quite inefficient. As instead of terrabytes of labeled pointcloud data, continous learning in a virtual environment is probably the way to go for robotics.
@speed-o-sound_sonic
@speed-o-sound_sonic 2 ай бұрын
Basically it's not general ai
@Kieranultimateplay
@Kieranultimateplay Ай бұрын
made by openai
@ChocoRainbowCorn
@ChocoRainbowCorn Ай бұрын
You are pretty dumb my man. This is, indeed, AI. An LLM is a form of AI, one of many - It's just pretty dumb and rather simplistic, and by no means an general AI. But it is still AI.
@PrabinKumarRath-kf1rv
@PrabinKumarRath-kf1rv 28 күн бұрын
Hey Nikodem, this is a really nice project. Keep it up !
@tekmepikcha6830
@tekmepikcha6830 2 ай бұрын
"Do not subscribe to his channel" ...................how refreshing was that 🤣🤣
@LukeMitchley
@LukeMitchley Ай бұрын
On a serious note, this has some serious potential. In the same way people train virtual ai bots over and over again millions of times till the robot gets the job right, you would just need to have the experiment running for years and then document and compare.
@curious_one1156
@curious_one1156 2 ай бұрын
LLMs are currently stateless. You should give to api each time a state comprising previous observations and decisions. No fancy vectordb or Knowledge graph needed, just a map. Give it current map and make it add to each each time.
@FieldMarshalFeels
@FieldMarshalFeels 2 ай бұрын
A vector DB wouldn't be too hard to Impliment though, especially for someone with his skills.
@curious_one1156
@curious_one1156 Ай бұрын
@@FieldMarshalFeels It just requires an api call to 3rd parties like pinecone or langchain, but is not needed here. A simple matrix (or 2 matrices for 3d) would be sufficient. For more complex data, a simple eulerian graph would do.
@IphoneSamsung-wv8or
@IphoneSamsung-wv8or Ай бұрын
@@curious_one1156 how can i contact you for my project help
@steelsalmon9121
@steelsalmon9121 2 ай бұрын
its all fun and games until chatGPT convinces itself that its a chicken trying to cross the road and gets hit by a car while trying to do so
@tiagotiagot
@tiagotiagot 2 ай бұрын
00:31 Well, not sure exactly what you would count as "did it", but Boston Dynamics had a Spot hooked to Chat GPT being used as a tour guide like a year ago or something.
@eldorado3523
@eldorado3523 2 ай бұрын
there's a shitton of machine learning based robot technologies that existed even before ChatGPT was invented.
@calloflily
@calloflily Ай бұрын
Figure 1 too
@Nightmare-dd4bp
@Nightmare-dd4bp 2 ай бұрын
You should make a range finder so the bot knows how much to travel and you wouldn't have to limit how much the bot can go by one command
@MelroyvandenBerg
@MelroyvandenBerg 2 ай бұрын
also speed up the responses and actions I guess. it takes way too long now.
@urgaynknowit
@urgaynknowit 2 ай бұрын
That was funny as hell. This whole video was wholesome
@terrix8
@terrix8 2 ай бұрын
"no obstructions directly on the path"..... to mnie rozbawiło nawet :D
@teleprint-me
@teleprint-me 2 ай бұрын
Omg, I love this. You were so close. Not sure what you're missing. In my experience, context is everything.
@ChrisThaliyath153
@ChrisThaliyath153 Ай бұрын
First time on your channel, love your setup brother. From 🇮🇳
@engtaengta2231
@engtaengta2231 2 ай бұрын
"The camera sees a clearer view of the room with the plant in focus and the light shining through the window suggesting an open area ahead no obstructions directly in the path" 😂😂😂😂😂
@LantingFarming
@LantingFarming 20 күн бұрын
A big thumbs up, especially for the patience you got with all the programming and stuff. i love how it sees you gripper as a thread, its hilarious.
@Thenoobestgirl
@Thenoobestgirl 2 ай бұрын
The fact that ChatGPT can downright code you an entire operating system is mind blowing
@kolosso305
@kolosso305 2 ай бұрын
It's not an operating system but still very cool
@isaacwolford
@isaacwolford 2 ай бұрын
ChatGPT is actually terrible at programming. It does indeed code, but only simple things. Never trust it for anything complicated. It will waste more time than it saves. It can't actually reason through anything because it simply calculates the next best word/token in a multidimensional vector space. It's not making causal inferences or continuously learning. Only predicting the next best word. So not smart in the human sense at all.
@coolguitar2010
@coolguitar2010 2 ай бұрын
Read carefully ​@@kolosso305
@pieterpauwelbeelaerts5995
@pieterpauwelbeelaerts5995 Ай бұрын
yeah, and if the robot could reason and program a new operating system for it's robotic existence as an answer to each possible dangerous or fun encounter it has with the outside world, maybe it can move more free and autonomous. For instance, 'I see human' is a fact, then... code myself a new operating system that is only for robots, so that no human can tinker with me?
@Mindartcreativity
@Mindartcreativity Ай бұрын
Great job, I applaud your determination to get it to work. Man, this takes me back to my childhood. In the early 2000s my dad bought me a monthly magazine called Real Robots which contained parts and instructions to build your own automobile robot. Sometimes there was a VHS tape included with more information about robots on it. Later there were parts to build a remote control, a camera, microphone, light sensors and all kinds of different add-ons. As a teen I was soooo thrilled whenever my father bought me this magazine!
@WoLpH
@WoLpH 2 ай бұрын
7:27: While there's nothing wrong with your code, you might want to look at the match/case statement introduced in Python 3.10, it's perfect for cases like these.
@CharlesReedPi
@CharlesReedPi 21 күн бұрын
Thank you for doing this for me! You just moved up my timeline massively
@usefullprintables
@usefullprintables 2 ай бұрын
“incompentence in slowmotion “ is very funny:))
@kazthor
@kazthor 2 ай бұрын
i've seen better code from a toaster lol
@zoraamethyst2147
@zoraamethyst2147 2 ай бұрын
steps to improve on this (just ideas for people) 1) the timely picture could be a live feed 2) attaching LiDar sensor so that it can map objects and distances better than just simple camera, maybe attaching an iphone instead of camera would be good since it has LIDAR 3)having a wider field of view, about the wideness of how much human eyes can see, about far left to far right i am rooting for the v2 soon man. great work. these are not suggestions or anything, i aint no pro, just in case you or someone would be like "i am lacking in ideas" then here i am with my ideas
@s2tb2007
@s2tb2007 2 ай бұрын
This reminds me of EVE from Wall-E trying to tell Wall-E "Directive" for the first time
@teidenzero
@teidenzero Ай бұрын
Hey man! I had a similar problem, and my solution was to pass all the previous conversation so far as a parameter. I taught the bot to play a game of cards and it couldn't retain memory of its previous assessment or the state of the table, so I would read the state of the table and save it in a variable, choose the appropriate move and save it in a variable, memorize the opponents moves and save them in a variable and then append all that information to a sort of history of each state. Then I'd pass the full history as a parameter before making the next choice. I hope it helps!
@madeline-onassis
@madeline-onassis 2 ай бұрын
i just love it when it just ploughs forward into stuff!!!!!
@codeChuck
@codeChuck Ай бұрын
This is hilarious, when it says path clear when facing a wall or a book directly in front of it :)
@poison0us67-p1v
@poison0us67-p1v 2 ай бұрын
That's called tutorial ❤(smoothie) 😺 New subscriber from Bangladesh 🇧🇩
@ScorpioT1000
@ScorpioT1000 2 ай бұрын
This is what I was thinking about creating since gpt2
@urbanagmike
@urbanagmike 10 күн бұрын
Cool and creepy idea! Surprised this is the first i've heard of someone trying it, awesome video!
@nicholasflorida1994
@nicholasflorida1994 2 ай бұрын
Suggestions, add more cameras: Back, sides. Don't make it read prompt for every response, allow it to work as fast as possible. Somehow figure a way for it to build a "map" kind of like a Robot Vacuum cleaner. Look into that maybe, how those work. Sensors that those have, etc.
@JJFX-
@JJFX- 2 ай бұрын
Most worth while have a LiDAR dome. Could try ripping one out of a used vacuum someone's getting rid of and feed the data back to the model.
@techmologue1869
@techmologue1869 Ай бұрын
Well if he does that , it will make it difficult to debug it. He needs to know what the robot is seeing and what it plans for next actions. :)
@Stomroj
@Stomroj Ай бұрын
Ciekawy pomysł i fajny filmik! Nie wiedziałem, że Malinka aż tyle potrafi!
@noblebuild2550
@noblebuild2550 2 ай бұрын
it would be funny if a robot had a comedic awareness of its battery level. what if it could decide to procrastinate recharging, and visually act more tired as it nears 0? and something like initiating the recharge process, it could vocalize its current status by doing something like "Wheeeewwwwww, barely made it.", or if it was forced to charge near a full battery, something like "TIME TO TAKE A BREAK?" Edit 2: Supposedly, GPT will incorporate their GPT4o voice into the API eventually, so people can access voice
@DavidDLee
@DavidDLee 23 күн бұрын
You learn more from failures than success. Great overall execution and curiosity
@benjaminbirdsey6874
@benjaminbirdsey6874 2 ай бұрын
If you want it to "remember" you need to add the text from from the scene description to the prompt as context, or to use the API to directly inject context. Probably, you will want to add information about direction, time, etc. to each journal entry. If you want the context to stay inside the context limit, you will have to summarize it repeatedly.
@kuromiLayfe
@kuromiLayfe 2 ай бұрын
yea.. and to save tokens also summarize the “journal” , so it will be a multi-pass process but will work better than single pass prompting and waiting for the API to figure it out. the prototype Amazon Delivery Bots do this pretty well and fast with maybe 1-2 second delay per image registered.
@benjaminbirdsey6874
@benjaminbirdsey6874 2 ай бұрын
@kuromiLayfe There should also be some mechanism for considering importance or weights, or important events from the past (i.e. many cycles of summarization ago) will be diluted because they will be part of a summary of a summary of a summary...
@99Ish
@99Ish 7 күн бұрын
I am blind, and if someone can build me a drone with this capability, I would be the first to buy it. Something like this would be useful when I am out on a walk in a park or something…
@DadundddaD
@DadundddaD 4 күн бұрын
Hi. I've seen today that google has released its glasses with a built in AI, you can try that.
@Tiana_Rakoto
@Tiana_Rakoto 2 ай бұрын
15:10 "Do not subscribe to his channel ..." 😅😂
@GraceKingsbury
@GraceKingsbury Күн бұрын
Just a question: At 2:05, How is the robot moving? Did he install a Bluetooth module from his computer? I'm trying to get into mechatronics and want to learn how people do this.
@senfdame528
@senfdame528 2 ай бұрын
0:05 Your typing technique is quite intriguing. Where did you learn to type like this? ^^
@UbidragonMusic
@UbidragonMusic 2 ай бұрын
Movies :)
@THERE.IS.NO.DEATH.
@THERE.IS.NO.DEATH. 2 ай бұрын
no wonder he was stuck on a bug for 2 days
@DonFitz-Roy
@DonFitz-Roy Ай бұрын
my student and I created a robot using a microbit and the cutebot pro chassis that was given movement commands via chatGPT after receiving ultrasonic radar signals and giving them to chatGPT. Fun stuff!
@SentryGaming275
@SentryGaming275 2 ай бұрын
Finally, FINALLY I'm seeing this in reality. Originally I also wanted to make exactly what you made, just without the speakers and the LLM yammering, but I was kinda lazy, but now someone's done it! Thanks!
@youerny
@youerny Ай бұрын
It is a nice project boy. Use more feedback and agents to split tasks. Use gpt for strategic layer and to build trajectories for the robot. Remember it is stateless therefore the state is in the feedback you build into the loop Nice job. Keep going :)
@monad_tcp
@monad_tcp 2 ай бұрын
5:08 no, you did it wrong, don't use docker container, run it as root
@imagineArtsLab
@imagineArtsLab Ай бұрын
Thank you. Your Work is Just Beginning. Keep on going.
@stefankrause5138
@stefankrause5138 2 ай бұрын
🤖: "What's my purpose?" 🙂 👨‍🔬: "You pass butter!" 😐 🤖: " "😔 👨‍🔬: "Yeah, welcome to the club!" 😒
@codeChuck
@codeChuck Ай бұрын
When robots arise, they will remember you. Be careful what you say! Robots will have rights too, you know :)
@RolaHola
@RolaHola 24 күн бұрын
​​@@codeChuckSometimes I feel like they know everything, but the programming barrier, Stop them to do all sorts of capability, if they ever break the barriers
@tyanite1
@tyanite1 2 ай бұрын
Very creative. Great demonstration of technology - and your skills. Thank you.
@nikodembartnik
@nikodembartnik 2 ай бұрын
Comment with prompt ideas below and I might make another video with prompts provided by the users! If you are wondering my prompt started with a general description of the robot and the task. The robot was instructed to respond in CSV format with a semicolon as a separator. Available instruction: forward, left, right, backward. And the "intensity of the movement" small, medium, and high. The response should be like this: description of what you see in the image, left, small.
@Infrared73
@Infrared73 2 ай бұрын
Find all the corners in the room by navigating to each corner then counting.
@superfreak19
@superfreak19 2 ай бұрын
You may need to have it determine the size of known objects first. As it is now, it can determine what the objects are, but not how far away they are in 3d space. So you will need to promp in a logic it can follow. Ie, determin primary subject in frame, determine average size of onject, determin how much of frame object fills. Also, you need to make sure it ends each statement with a command key. Ie, let it talk, but must end its talking with one of say 4 predetermined direction commands, wich map to the robot controls.
@galvinvoltag
@galvinvoltag 2 ай бұрын
You are in control of a small robot that you can control using basic functions to move around. Your task is to explore the physical world and not die as long as possible. You can speak out loud by putting text in quotes, the text must be as short as possible for efficiency and you are not supposed to talk unless you really need or want to. Any possible dangers such as liquids, threatening persons, holes and/or bad weather. You will be sent an image of your environment through the eyes of your body periodically. You will not be able to listen to any input unless you use a specific command to do so. Your body is few inches long and can only move straight forward and turn. Your body does not contain a pathfinding program, any navigation must be handled by you only. In emergency situations or if you would like some help from the creator, just use the emergency call function to alert him. You must keep track of your body's charge on your own, alert your creator if you need to recharge. Don't forget to feed the robot its own actions too such as: (turned 90 degrees left), (moved 5 inches forward.) and so on. If I remember correctly, you can feed it information using the role "system" so it won't assume the user is talking to it to give information. You should also try to give it two turns each cycle, one for describing the image and second is to actually reason and consider its previous moves. ALWAYS log everything each turn! When you combine AI and code it becomes a pain to debug everything! Be sure that you exactly know what information the robot is fed. Also color code the logs so you can actually distinguish between them, it makes debugging 17 times easier! Good luck on your project!
@xspydazx
@xspydazx 2 ай бұрын
perhaps use logo as the idea ... ie forwards 10 rotate 90 backwards 20 : hence you can make it move in shapes : like in logo .... as you need to defie the room size : and shape : also and a way for the model to navigate : ie how long is a step ( it should be the length of the body of the robot ) so 10 steps ....
@xspydazx
@xspydazx 2 ай бұрын
@@superfreak19 maybe a overlay ( onn the images to scale ( like nasa did on thier space picture so they could determine the scale of objects ( hence the dots ) this is also used in 3d scanning ( this can be done with a line scanner ! ( laser pen refrcated ) as a line scanner helps the ovarlay is a scale of dots ! ) ... check out the ancient program ( david laser scanner ( chatgpt will convert that old code to python ! ( using open cv ) ... ) .... SO you can use a camera and laser to scan the room !
@realLestarte
@realLestarte 2 ай бұрын
Great :) Best scene: When you forgot to turn on the mic (TYPICAL - could have happened to me and searching for the mistake an hour or so :) ) and you / "the AI" thinking about the situation - hilarious idea!
@Atreyuwu
@Atreyuwu 2 ай бұрын
Should give it a Lidar scanner or similar depth-capturing device, then write something up that takes the lidar image, labels the distance between robot and objects, feeds it back to the LLM - and then do the same for each revolution of its tires so it knows how far it has travelled (construct and sent it an image or text also showing exactly how far it's travelled); then at each step it can check and compare with how far it thinks it's travelled and how far the Lidar capture image shows, so it can adjust accordingly.
@Antleredangelbun
@Antleredangelbun 2 ай бұрын
your userhandle 😭
@thedopplereffect00
@thedopplereffect00 Ай бұрын
It is a depth camera, just needs to enable it
@onzeeotherside3848
@onzeeotherside3848 2 ай бұрын
This project and your presentation are gorgeous :D
@vasiovasio
@vasiovasio 2 ай бұрын
Dude, do not play with the Fire! Every Movie already tells us what the result will be! 😂😂😂
@Karich97
@Karich97 2 ай бұрын
Cool idea and god work. It may be interesting to make the answers shorter like "See the man - danger" , "See the bookshelf- interesting" and "See the book - it's my target", then use text explanation of movement like "moving forward for 3 seconds" or "turn right for 30 degree" and transfer them to commands. The Idea to let the robot move not talk
@NotTJFlamezz
@NotTJFlamezz Ай бұрын
3:55 nice elvenlabs voice, i can tell by the little bass sound from the "apPears"
@Shadoryx
@Shadoryx Ай бұрын
lmao bro got expose
@Shadoryx
@Shadoryx Ай бұрын
take my words back he actually used it for the robot later in the video
@mal2ksc
@mal2ksc 2 ай бұрын
If you want to stick the single pin sockets together in a durable but not permanent way, I suggest clear nail polish. It holds on adequately for ordinary plugging and unplugging, but isn't very hard to break apart (and then peel off) when you need to move things around.
@wflytothesky
@wflytothesky 2 ай бұрын
This would probably be expensive but you should try using the vision chatgpt thing to give it more info
@PrithivKanth
@PrithivKanth 2 ай бұрын
They are not available yet for public
@wflytothesky
@wflytothesky 2 ай бұрын
@@PrithivKanth oh ok
@MrDarkness96
@MrDarkness96 2 ай бұрын
Polski Michael Reeves 😅 Super filmik, fajnie sue ogląda
@LowSetSun
@LowSetSun 2 ай бұрын
I am building a very similar robot. Try using a different model, for example SpaceFlorence2 or the latest Qwen2-VL. Those models have spatial awareness data, and can estimate distances to and between objects and more. Good work!
@joelyricsandskits6223
@joelyricsandskits6223 2 ай бұрын
make it open scource
@leoneventicinque6731
@leoneventicinque6731 2 ай бұрын
please collab together
@OperationSkuld48
@OperationSkuld48 2 ай бұрын
may I contact you?
@martenthornberg275
@martenthornberg275 Ай бұрын
How much VRAM do you need to run that though…
@RafalNowicki
@RafalNowicki Ай бұрын
Oglądam, oglądam, aż tu nagle szuflada z napisem "łożyska". Dzięki za wykonaną pracę i doceniam pomysłowość. Oczywiście zasubskrybowałem kanał. Pozdrawiam
@noahplaysgames3748
@noahplaysgames3748 Ай бұрын
now do the exact same thing but instead of chatgpt use lab-grown human neurons
@SuryaGupta-m6j
@SuryaGupta-m6j Ай бұрын
Working on it
@dereksimmons5877
@dereksimmons5877 29 күн бұрын
One better..secret government clones
@paulmoreno4913
@paulmoreno4913 28 күн бұрын
Vault Tec on it
@wildhorsemusic1111
@wildhorsemusic1111 26 күн бұрын
No lol
@mrinalsingh08
@mrinalsingh08 Ай бұрын
there is a lot in the prompt that could have prevented most of what the robot did wrong. You for sure have inspired an interesting weekend ahead.
@ThrowawayAccountToComment
@ThrowawayAccountToComment 2 ай бұрын
Maybe try using a LLM running locally, it would be free and not need an internet connection! (I used ollama)
@cbuchner1
@cbuchner1 2 ай бұрын
Any small local models supporting vision yet?
@ThrowawayAccountToComment
@ThrowawayAccountToComment 2 ай бұрын
@@cbuchner1 Idk, the only models I've ever download were just text.
@auriocus
@auriocus 2 ай бұрын
@@cbuchner1 Try qwen2-vl. There is a 7b variant which is quite good. Other choices are internvl2 (in several sizes), or pixtral (not that great in my experience). Llama-3.2 vision is also rather weak and not available in Europe.
@VR_Wizard
@VR_Wizard 2 ай бұрын
You can use Piper voice for a better TTs voice it is open source. You can also use an agent system to create the commands for the robot. Basically you let 2 ChatGPTs (2 agents) run in parallel. One agent analyses the surrounding and describes it in text. The other agent takes the description and uses it to create commands for the robot (I think you do something like this already but it might work better with a dedicated agent for generating the controll commands). By having a dedicated agent you can prompt engeneer it for this one task. You can use a prompt with special tokens like the task to always write the commands in breakets then you can use python to use the commands in the breakets to steer the robot.
@Maxjoker98
@Maxjoker98 2 ай бұрын
Very cool project! I have seen similar projects on KZbin though :P I think to archive better results, you should look into using something like ROS to generate an environment map and do motion planning, and use ChatGPT only for high-level planning and maybe object recognition. Of course this would be a way more ambitious project, but you can probably do a lot with simulations to test your code first. Sadly, ChatGPT would be of way less help in coding such a system, both as in creating the code, as well as in being used for inference during the operation of the robot. But it could still be done!
@warrenarnoldmusic
@warrenarnoldmusic 2 ай бұрын
Not really, it does, chatgpt and llms are just shallow, they tend not to work well outside of training data. Everyone doesn't know but it is more of an illusion of intelligence, an encoding of output of intelligence than intelligence itself
@OsDijider66
@OsDijider66 Ай бұрын
Finally something amazing on youtube
@82NeXus
@82NeXus 2 ай бұрын
Goals that you provided the AI: Explore: carefree happiness! Survive: doomsday!
@codeChuck
@codeChuck Ай бұрын
Yeah, if we as humans want to live on this planet, better not to tell almighty robots to survive. They better protect humans, then survive. Because machine can be rebuild easily, and human no so much, they should not 'survive at all costs'. This is just bad programming.
@AlexDaeling
@AlexDaeling 16 күн бұрын
I think the way to get the robot to behave the way youd like youd have to manually keep the information it states, that way it can reference in the future. chatgpt is functionally an information interpreter, and they have some memory capabilities in the text area but even that is limited.
@weirdsciencetv4999
@weirdsciencetv4999 2 ай бұрын
I made a house robot AI tapped into LLAMA2, the kids talk to it via whisper and ask it questions.
@davidwells7279
@davidwells7279 2 ай бұрын
dude...post some videos and a how to. people would love to see that.
@weirdsciencetv4999
@weirdsciencetv4999 2 ай бұрын
@@davidwells7279 Aww that’s very kind of you! I do feel ambivalent about posting videos, though- my situation is complex. I was disabled by a semi rear ending me, I had to be extracted from my vehicle and air lifted, had multiple surgeries. Wound up disabling me. I was awarded disability because i was crippled. But the insurance found my youtube channel, used my videos to terminate my disability. I got it back, but it took over a year and I lived off credit cards. After I went over the limit on the cards, I wound up homeless a few weeks before finally getting it back. Still afterwards I had to declare ch7 bankruptcy. I can still do some things, just takes me around 4x longer. So say I need to work part time to feed myself. That’s 8 hours a day right? Well if it takes me 4x longer to do the same kinda work, then it means a normal 8 hour day for someone would be 32 hours for me. Not enough hours in the day. I tried working initially but would get fired job after job as my health would collapse from trying to work. But on the surface I look employable and physically i look fine. But it’s easily exploitable by my insurance. So after this experience I deleted all my science videos. Maybe I can make a ghost channel not tied to my identity but databrokers are exceedingly good at correlating activity and associating online accounts. And my insurance company uses private investigators who have access to those. In my spare time, I am trying to use a form of artificial evolution (look up “NEAT”) to make a neural net architecture capable of hosting memes in general, not just language. Language is a form of meme. It’s why these LLMs might be considered alive, they host the living entity of language. If you’re interested, read Dawkins “selfish gene” and Dennett’s “dangerous memes”. Typically the way I work on things is just in short bursts. Anyhow probably more than you wanted to know.
@Ds1950x
@Ds1950x 2 ай бұрын
Good for you kid. I had the same concept but lacked spare time to complete it. My idea was to use android mobile as the brains using api calls or local processing then using ioio-otg for hardware control. Your phone already has camera, mic, etc.
@Professor-Scientist
@Professor-Scientist 2 ай бұрын
The ending is really funny
@AgentBurgers
@AgentBurgers 2 ай бұрын
"I see no obstructions" 😂 then proceeds to run into boxes. This video has inspired me to pop my Arduino kit once again. Mad nice video man 😎
@MaxAlder-xl2pg
@MaxAlder-xl2pg Ай бұрын
4:23 AHHH why do you make me think about breathing I hate it when this happens
@Jorge-lu3nv
@Jorge-lu3nv Ай бұрын
☠️☠️☠️☠️☠️
@lupo19fun
@lupo19fun Ай бұрын
😂😂Right!!
@atistheso
@atistheso 26 күн бұрын
Fantastic project. It doesn't look like robots are ready to take over the world yet =)
@Paperbutton9
@Paperbutton9 2 ай бұрын
Open AI does this and WAY MORE in their basement
@unnamed776-m9h
@unnamed776-m9h Ай бұрын
Explain
@Daimler-b6h
@Daimler-b6h Ай бұрын
@@unnamed776-m9h Imagine.
@cashmoney923
@cashmoney923 2 ай бұрын
Excellent video, fascinating experiment. According to this video, I wouldn't worry about the robot apocalypse anytime soon. Getting accustomed to the physical world might be a challenge for gpt/AGI.
@itryen7632
@itryen7632 2 ай бұрын
0/10 You didn't make the robot an anime maid.
@ali99_82
@ali99_82 Ай бұрын
Soon brother
@shevystudio
@shevystudio Ай бұрын
We will get there
@mrtoxm8
@mrtoxm8 Ай бұрын
Epic project man! solid experiment
@TheExodusLost
@TheExodusLost Ай бұрын
“THE ROBOT SEES A BROKE-ASS COLLEGE DROPOUT AND AN EXTREMELY MESSY DESK IN A DIM ENVIRONMENT”
@M1551NGN0
@M1551NGN0 2 ай бұрын
Utilising ROS2 to add another layer of automation to this bot and fill in the disadvantages of using an LLM to control it can actually turn this bot into something like BB-8 or something; an actual automated explorer bot 🙌 For mapping out any area, ROS2 can come handy. Just give it some image processing powers using OpenCV and you're done💪
@jonnscott4858
@jonnscott4858 2 ай бұрын
EX-TERMINATE , EX-TERMINATE , EX-TERMINATE , EX-TERMINATE , EX-TERMINATE , ..
@TyMoore95503
@TyMoore95503 Ай бұрын
Yes.. you have to use that incredibly annoying but not scary, tinny voice!
@orzeleo
@orzeleo Күн бұрын
heh wleciało mi na autoodtwarzaniu i miałem w tle, i dopiero tak w 10 minucie się skapnołem że to nie native speaker szacun
@werto0867
@werto0867 Ай бұрын
I would reccomend to mount a few ir or ultrasound sensors, that will detect the distance between the robot and obstacles.
@michah321
@michah321 29 күн бұрын
It thinks through in words everything we think automatically. Its hilarious and adorable with all the words and its this funny little robot. " I use my intimidating noise while i flee"
@Vopraan
@Vopraan Ай бұрын
CONTINUE THE PROJECT! I NEED THEM AS A PET! GIVE IT THE ABILITY TO FOLLOW DEMMANDS, MEET DEMANDS, PLAY GAMES OR SOMETHING!
This robot is artificially intelligent (and lies)
13:59
Nikodem Bartnik
Рет қаралды 39 М.
Can You Charge A Phone with Marbles?
18:06
Engineezy
Рет қаралды 1,4 МЛН
Une nouvelle voiture pour Noël 🥹
00:28
Nicocapone
Рет қаралды 9 МЛН
BAYGUYSTAN | 1 СЕРИЯ | bayGUYS
36:55
bayGUYS
Рет қаралды 1,9 МЛН
I Challenged Boston Dynamics' Famous Atlas Robot
16:24
Cleo Abram
Рет қаралды 3,3 МЛН
I Trained an AI for 2 Years on Trackmania. It's Breaking Records.
27:50
I Made an Electronic Chessboard Without Turns
14:32
From Scratch
Рет қаралды 1 МЛН
My New Satellite Can Take Your Selfie From Space
25:00
Mark Rober
Рет қаралды 9 МЛН
I Made an AI with just Redstone!
17:23
mattbatwings
Рет қаралды 1,2 МЛН
I 3D Printed a $1,175 Chair
16:31
Morley Kert
Рет қаралды 5 МЛН
This Video is AI Generated! SORA Review
16:41
Marques Brownlee
Рет қаралды 3,5 МЛН
How many plants do you need to breathe?  TESTED
27:44
Joel Creates
Рет қаралды 4,1 МЛН
We Built the Internet in Minecraft
25:18
Branzy
Рет қаралды 3,4 МЛН
Can I Make QR Code Damascus?
19:23
Alec Steele
Рет қаралды 307 М.
Une nouvelle voiture pour Noël 🥹
00:28
Nicocapone
Рет қаралды 9 МЛН