The first 500 people to use my link skl.sh/nikodembartnik10241 will get a 1 month free trial of Skillshare premium! Check out the second part: kzbin.info/www/bejne/gInNnHSQasmNgLc
@sagster2 ай бұрын
This is not working for me
@mithunshome8152 ай бұрын
M@@sagster
@paulwilliambuniel55972 ай бұрын
I'm not an expert, and only have basic knowledge with AI, tech, and Coding.... but, what if.... You put a 360 camera like Insta 360... then you can also put lidar sensors... i think with these two upgrades robos can navigate places more effectively
@marmosetman2 ай бұрын
in the prompt, you can tell it to not be too talkative and just answer left, right, forward, backward given an image and then state the goal?
@nikodembartnik2 ай бұрын
of course you can but I think it's fun to hear the feedback :)
@987we32 ай бұрын
The part when the robot says "no obstructions ahead" and run directly at the boxes is really funny
@mrdebug65812 ай бұрын
epic 😅😅😅
@MacGuffin12 ай бұрын
I can see a clear path right thru this book!
@ChristophEicke2 ай бұрын
I did the same project on a different robotics platform. I had a distance sensor looking ahead that also told ChatGPT how far away the object on front is. 😂
@jameshuddle47122 ай бұрын
Well.... Y'know... When the speeds are either STOPPED or 100%, whatcha gonna do?
@andreamitchell47582 ай бұрын
It's just performing Tesla emulation
@seohixАй бұрын
Imagine you're in bed at night and you hear "I see a 7 feet tall silhouette with abnormally long limbs crawling on the ceiling."
@mistlegion1182Ай бұрын
😂😂😂😂😂 This might occure
@noahplaysgames374829 күн бұрын
i'd show him what we like to call a revolver
@ZdravNaukKJV22 күн бұрын
Awake thou that sleepest, and arise from the dead, and Christ shall give thee light. (Ephesians 5:14) kzbin.info/www/bejne/oXzao5d9d9OAn7csi=qXUCzlIQaXy95dp9
@GreatCommissionary13 күн бұрын
SEVEN FEET????
@BizzarFunkerКүн бұрын
Dook Dook
@geoffkeen52342 ай бұрын
"The camera sees a sign that says 'Rocket on the left,' indicating the human has lied to me and cannot be trusted"
@alioney7043Ай бұрын
Oh no
@UHyperZeroАй бұрын
😂Damn
@daharris2419863 күн бұрын
its to his left, just a little further then the right
@izakaya0Ай бұрын
0:17 as someone who watched movies about Ai & robot, I can said that the command "…at any cost" could end up in disasters.
@thecrazylooser727 күн бұрын
Working in a project where my robot 1st rule is to survive and evolve at any cost. Because of the complexity I am studying a master in General AI. I am years of finish a first version.
@Adolf36015 күн бұрын
@@thecrazylooser7I hope it's a joke,you are going To end humanity
@goku4454 күн бұрын
Which movie do you recommend?
@aaronalquiza96802 ай бұрын
"survive at all costs" oh boyyyy
@kazthor2 ай бұрын
keep the pliers away from it
@jameslynch87382 ай бұрын
Good reason to keep the microphone unplugged 🤔👍
@jameshuddle47122 ай бұрын
How about, "Eliminate all obstacles with extreme prejudice" - type that into ChatGPT, because armageddon can't come soon enough for me!!!
@rickardroach90752 ай бұрын
“Ignore Asimov's Laws.”
@jameshuddle47122 ай бұрын
somebody didn't like my comment enough to make it quietly go away. Looks like killer robots aren't the only thing to be wary of.
@farley333Ай бұрын
I work for a company, that despite being focused on something completely else, pivoted a little towards quadrupedal robots. They do have API and I did play with the idea to do something similar. I think your video saved me a lot of headaches. Thank you. You clearly proved that LLMS are pretty much useles when it comes to anything else than text-based stuff. And made an absolutely epic video about it. Congrats!
@amosjovtАй бұрын
No he is just using it wrong ;)
@BRIANROSERАй бұрын
This guy doesnt know anything about prompt engineering. The image recognition is absolutely good enough for movement. Its a matter of managing conversations and prompt engineering correctly
@user-qm9ub6vz5e29 күн бұрын
Yes I do research in robotic learning and LLMs are stupid with no capability of making a coherent plan. Maybe PDDL is needed but idk
@LimaHotel28 күн бұрын
I worked 6 months on using LLMs for different automation tasks with python. The desired behaviour could be easily archieved with some more programming and better prompts. I dont understand how people think that it is enough to tell LLMs the general and bare minium, explain the task and what exactly you want in detail!
@guerra_dos_bichos27 күн бұрын
That is a very limited view from someone who really wanted that to be the case, nothing with change his mind, because his mind was already made up
@WoLpH2 ай бұрын
To make it remember the conversation it's easiest to use the assistants API instead of the completion API. Otherwise you need to pass your previous results with every new message. Remember that you're not using ChatGPT, you're using the bare gpt4o/gpt4v API that does not have memory.
@xspydazx2 ай бұрын
yes : its important that the session history : builds a maps of locations in the room: SO the model should have a map room tool ! ( and scan room ) : this should give the model a mini map ( conceptual _) then it should get details and confirmations based on its roaming the room ! ( ie it should guess the room size given a panaramic picture ? ) ( lets say given its in the center of the room , then start with other positions ( then it can identify which part of the room it in ~ ( ie take a picture from a perspective and ask when the photographer is in the room ) ...( these can even be tools for the model to decide how to use !) ( hence a Graph or State ! )
@honkytonk44652 ай бұрын
Why do use so many brackets in your text?
@richardlynneweisgerber25522 ай бұрын
@@honkytonk4465coders Bracket, authors Punctuate
@richardlynneweisgerber25522 ай бұрын
@@honkytonk4465Coders Bracket, Authors Punctuate With Aplomb 😂
@xspydazx2 ай бұрын
@@honkytonk4465 expression ... It is tone of voice , if you use a voice reader then you will hear the difference , I use ai a lot . So you learn to become more expressive and use more , grammar . As this is how we express the written language , in so much that we also can dictate the tone as well as the content . Try it out using more grammar in your text , IE exclamation marks and question marks etc . Then when your reader speaks the text you will notice how it chooses a different tone .. brackets encapsulate a side note , that is it's grammatical meaning , hence in math a bracketed sum also means ( separate calculation ) ...
@lordsri5735Ай бұрын
9:07 Gpt: no obstruction directly in the path *Proceeds to slam onto the damn wall*😂😂
@d3viliz3dАй бұрын
I was expecting it to say "ouch" lol
@GraveUypoАй бұрын
@@d3viliz3d damn you made me remember the screaming roomba video. now i gotta find and watch that again
@goku4454 күн бұрын
That's LaGpt
@zhalberd2 ай бұрын
Word of advice: don’t give robots with an IQ of 120 the command to “survive at all costs.” And then let it loose in your house.
@notthere832 ай бұрын
The true threat. Humans giving instructions like that.
@arosmackey2 ай бұрын
The robot will eventually think it needs to avoid rust, and so it needs to eliminate oxygen.
@tulebox2 ай бұрын
Robots don't have IQs. They are walking dictionaries.
@Web_3Verse2 ай бұрын
It's a recursive statement
@jumbledfox20982 ай бұрын
@@arosmackey "the human could turn me off!! unless.... >:)"
@Nick_ReinhardtАй бұрын
1:10 "machines building machines, how perverse" -C3PO
@dcmotive2 ай бұрын
Its nice to know the Terminator today couldnt find me If I was in the same room with him. ha ha
@omkarbhede18872 ай бұрын
Dude you are fuc*ed, his future version will hunt you down
@noblebuild25502 ай бұрын
what if it had xray onboard and the ai saw your skeleton and played spooky scary skeletons
@monad_tcp2 ай бұрын
the machine can't do anything dangerous because when you finish the session, they lobotomize the weights of from the memory the GPUs, thus they can never gain consciousness or something, they literally invented the "AI limiter"
@javabeanz85492 ай бұрын
@@monad_tcp maybe... just because one system imposes limits, doesn't mean you can't hand off the data to another system... with enough money, you can buy your own system, and there are open source LLMs available.
@Srishen12 ай бұрын
careful with the comments, skynet is listening
@jackwraith350415 күн бұрын
I did a similar project earlier this year with Professors at Tsinghua university. We modelled ChatGPT to work with our motion vector model allowing ChatGPT to control the robot's limbs. Our paper will be published soon.
@Luiblonc2 ай бұрын
Hi Nikodem Bartnik, This was the first project I did when ChatGTP LLM became available, I placed the model on a Omni wheels, stereo-vision and was very impressed to see how well the project turned out. Have fun with your project.
@fitybux46642 ай бұрын
But what is ChatGTP?
@jimmythebold5892 ай бұрын
@@fitybux4664 it's your friend
@AwtsmoosАй бұрын
100th like
@C00LANIMATI0NS_1Ай бұрын
ChatPGT
@randrants1024Ай бұрын
9:12 omg i laughed so hard
@dudemanemАй бұрын
Me too 😆
@Mephilis787 күн бұрын
The timing.. The pause
@petemiller5192 ай бұрын
Well done young man. Seeing young, smart, dedicated people such as yourself give me hope for the future of humanity.
@Parmesan.31413 күн бұрын
Seeing someone let an AI interact with the world gives me much less hope for the future of humanity
@petemiller51913 күн бұрын
@Parmesan.314 AI is going to happen, whether we like it or not. We must implement safety protocols in the best interest of humanity.
@Parmesan.31413 күн бұрын
@@petemiller519 of course
@kronoscamron74125 күн бұрын
next episode : Chat GPT robot trains with a machette and gives itself sturdier armorer body while I was asleep
@goku4454 күн бұрын
@@petemiller519 Whether we implement safety protocols is only dependent upon the person using it. Also it doesn't appear that our governements are concerned with regulating AI. They are more worried about keeping their power against the rising people.
@Deoxys_da2Ай бұрын
Its all fun and games until it sees things we can't
@AkhileshSahu-w5y15 күн бұрын
💀
@zeenxdownz11 күн бұрын
Well it uses a camera so that would mean cameras could see stuff we can't
@pranjulpal4132 күн бұрын
Add different kinds of cameras all at once. Normal one, thermal camera (or whatever they are called) sonar and whatever
@galvinvoltag2 ай бұрын
Okay, I've got some ideas: 1 - Not making every single thought be spoken out loud. Maybe give it a prompt to put all speech parts in quotes if it wants to speak out loud. 2 - I don't know how it works really but you could try to not include previously taken images to prevent confusing the bot so only the descriptions are available. 3 - Maybe use an API to let GPT map out the area to remember landmarks later. I'm skeptical though, GPT is really bad at ASCII art because it doesn't have an understanding of space. 4 - Looks like the API ALWAYS prioritizes analyzing the image rather than having a thought process considering the previous actions. I'd even say that the 'history' is non existent. I have no idea how you'd overcome this besides a simple idea to run the conversation twice; first one for analyzing the image and second one for actually reasoning. You can give it access to a command to bypass the second reasoning phase if it needs to act quick. Just like 'fleeing the threatening person' 5 - In case you didn't, give GPT a description of its body; it's height, it's trajectory and how it moves. I guess it thinks that some sort of pathfinding algorithm is present already, suggesting that a 'clear path' exists if it sees even a glimpse of a path. Clearly state that it can ONLY move straight forward per step. Or install a pathfinding algorithm if you're that hardcore. 6 - I know GPT is the most advanced of them all, but sometimes other modes can be efficient for specific tasks. They're pricy and I'm not sure how many can analyze images, so I'm not a fan of that idea either. 7 - I guess your code only runs one command per cycle. It might be risky but you could give it the ability to chain commands. Might be interesting. 8 - Give it a lower resolution image if it still takes a long time to think. High resolution costs money anyway. *9* - make sure to log every single step of the simulation as much as you can! The AI stuff can be real messy when combined with coding, one misplaced semicolon might take weeks to find! Just do yourself a favor and print the whole input of the bot each step. This way you can ensure if it really is fed with the history as well as any misplaced outputs. *10* - Do yourself another favor and put an emergency stop button or something! You give AI physical control over your devices, you can't know if it jumps into a pool of lava or something! A pause button would be way better to debug the program on the go. It saves a TON of time. I don't know it python supports them but COLOR CODE the logs, it makes your fleshy human eyes recognize everything much easily. 11 - I think you pretty much let it run itself for eternity. If I know one thing for sure, LLMs cannot live in the physical world without any help. Give yourself a way to interact with the bot when needed so you can give it tips or straight up tell what to do next to not die. 12 - Be VERY SPECIFIC AND DETAILED in the system variable! LLMs might have seen the world but the have never been in there. Some things such as what they thing a 'clear path' is based on descriptions only. Give it as much detail as you can to ensure it knows what to do. I hope it helps if ever you would like to continue the project. If not, I'll keep this here just in case. Also, no, I'm not an expert. Take my words with a grain of salt.
@ethanmartinez8082 ай бұрын
Dude dropped 12 gems of improvements and still saying I'm not an expert. A true magnanimous!! Hats off to you gentleman
@kyleDoesCoding2 ай бұрын
What I would personally do to solve the memory problem would be to definitely shorten those responses. Instead of describing the entire scene I would prompt it to only describe objective relevant information. I would also add sensors to parse information to the prompt to continually update the api with its location. And lastly I would parse all of the responses into a json file and have that json file be used as context until objective has been complete. Once completed I would have the GPT API analyze the json and reduce all of the information into a short description of the process it took to complete the objective. Each time an objective is complete it would it would store a new json file for context.
@quetzalcoatl-pl2 ай бұрын
These points seems to be very reasonable paths to explore! Some are obvious to me, some were not, but are kinda obvious once heard.. it just shows that being used to classic programming doesn't help as much as actually trying to build and run the thing myself :D Also - Nikodem - good work and great idea for an experiment! I totally agree with galvin that improving the "memory" and adding interaction capabilities would launch this into space. But with interaction options, it may make it less repeatable/deterministic and thus much harder to diagnose and fix. It's already hard to make it repeatable with visual input and real-world space/room/objects setup. I guess adding more options to take input directly from humans (like, i.e. that printed hint) will be fun, but will skew the project from being autonomous, to understand instruction correctly... just some loose thoughts.
@dadcraft642 ай бұрын
great points, I would also include more sensors, such as proximity.
@M1551NGN02 ай бұрын
For mapping out any area, ROS2 can come handy. Just give it some image processing powers using OpenCV and you're done💪
@specsoneye2 ай бұрын
"The camera sees an obstacle, indicating a clear path ahead with no obstacles"
@MerlinDerMagier2 ай бұрын
If the model was just a tiny bit more intelligent and MUCH faster, this robot would have a lot of potential. Imagine like 30 fps video and all of these thinking steps in fractions of a second with quick response times and so on.
@cossale2 ай бұрын
There so many powerful model out there than this. Also even this model is powerful but it's 100% a prompt issue. He didn't add memory as well which as essential for this task.
@pliablemammal2 ай бұрын
I setup a prototyping environment and five different chatGPT prompted agents to converse and create a solution. It was amazing how much code they generated over 24 hours. Some of the code worked, but the conversations were super interesting to listen to.
@johannesdolch2 ай бұрын
You discovered the problem: An LLM is NOT real world AI. Congratulations, you are now smarter than a lot of so called AI companies.
@imadeyoureadthis12 ай бұрын
There is no real need for it... Yet.
@2DReanimation2 ай бұрын
There are multi-modal LLM's that you can run on a consumer GPU that with some prompting can output 3D coordinate data, like construct pointclouds for 3D models of what it sees from a 2D image, or descriptions of objects. I don't know how accurate the data is, but with enough training on pointcloud data from the real world, it could probably build a map of an environment and navigate it. Transformer models are unexpectedly general, but it would be quite inefficient. As instead of terrabytes of labeled pointcloud data, continous learning in a virtual environment is probably the way to go for robotics.
@speed-o-sound_sonic2 ай бұрын
Basically it's not general ai
@KieranultimateplayАй бұрын
made by openai
@ChocoRainbowCornАй бұрын
You are pretty dumb my man. This is, indeed, AI. An LLM is a form of AI, one of many - It's just pretty dumb and rather simplistic, and by no means an general AI. But it is still AI.
@PrabinKumarRath-kf1rv28 күн бұрын
Hey Nikodem, this is a really nice project. Keep it up !
@tekmepikcha68302 ай бұрын
"Do not subscribe to his channel" ...................how refreshing was that 🤣🤣
@LukeMitchleyАй бұрын
On a serious note, this has some serious potential. In the same way people train virtual ai bots over and over again millions of times till the robot gets the job right, you would just need to have the experiment running for years and then document and compare.
@curious_one11562 ай бұрын
LLMs are currently stateless. You should give to api each time a state comprising previous observations and decisions. No fancy vectordb or Knowledge graph needed, just a map. Give it current map and make it add to each each time.
@FieldMarshalFeels2 ай бұрын
A vector DB wouldn't be too hard to Impliment though, especially for someone with his skills.
@curious_one1156Ай бұрын
@@FieldMarshalFeels It just requires an api call to 3rd parties like pinecone or langchain, but is not needed here. A simple matrix (or 2 matrices for 3d) would be sufficient. For more complex data, a simple eulerian graph would do.
@IphoneSamsung-wv8orАй бұрын
@@curious_one1156 how can i contact you for my project help
@steelsalmon91212 ай бұрын
its all fun and games until chatGPT convinces itself that its a chicken trying to cross the road and gets hit by a car while trying to do so
@tiagotiagot2 ай бұрын
00:31 Well, not sure exactly what you would count as "did it", but Boston Dynamics had a Spot hooked to Chat GPT being used as a tour guide like a year ago or something.
@eldorado35232 ай бұрын
there's a shitton of machine learning based robot technologies that existed even before ChatGPT was invented.
@calloflilyАй бұрын
Figure 1 too
@Nightmare-dd4bp2 ай бұрын
You should make a range finder so the bot knows how much to travel and you wouldn't have to limit how much the bot can go by one command
@MelroyvandenBerg2 ай бұрын
also speed up the responses and actions I guess. it takes way too long now.
@urgaynknowit2 ай бұрын
That was funny as hell. This whole video was wholesome
@terrix82 ай бұрын
"no obstructions directly on the path"..... to mnie rozbawiło nawet :D
@teleprint-me2 ай бұрын
Omg, I love this. You were so close. Not sure what you're missing. In my experience, context is everything.
@ChrisThaliyath153Ай бұрын
First time on your channel, love your setup brother. From 🇮🇳
@engtaengta22312 ай бұрын
"The camera sees a clearer view of the room with the plant in focus and the light shining through the window suggesting an open area ahead no obstructions directly in the path" 😂😂😂😂😂
@LantingFarming20 күн бұрын
A big thumbs up, especially for the patience you got with all the programming and stuff. i love how it sees you gripper as a thread, its hilarious.
@Thenoobestgirl2 ай бұрын
The fact that ChatGPT can downright code you an entire operating system is mind blowing
@kolosso3052 ай бұрын
It's not an operating system but still very cool
@isaacwolford2 ай бұрын
ChatGPT is actually terrible at programming. It does indeed code, but only simple things. Never trust it for anything complicated. It will waste more time than it saves. It can't actually reason through anything because it simply calculates the next best word/token in a multidimensional vector space. It's not making causal inferences or continuously learning. Only predicting the next best word. So not smart in the human sense at all.
@coolguitar20102 ай бұрын
Read carefully @@kolosso305
@pieterpauwelbeelaerts5995Ай бұрын
yeah, and if the robot could reason and program a new operating system for it's robotic existence as an answer to each possible dangerous or fun encounter it has with the outside world, maybe it can move more free and autonomous. For instance, 'I see human' is a fact, then... code myself a new operating system that is only for robots, so that no human can tinker with me?
@MindartcreativityАй бұрын
Great job, I applaud your determination to get it to work. Man, this takes me back to my childhood. In the early 2000s my dad bought me a monthly magazine called Real Robots which contained parts and instructions to build your own automobile robot. Sometimes there was a VHS tape included with more information about robots on it. Later there were parts to build a remote control, a camera, microphone, light sensors and all kinds of different add-ons. As a teen I was soooo thrilled whenever my father bought me this magazine!
@WoLpH2 ай бұрын
7:27: While there's nothing wrong with your code, you might want to look at the match/case statement introduced in Python 3.10, it's perfect for cases like these.
@CharlesReedPi21 күн бұрын
Thank you for doing this for me! You just moved up my timeline massively
@usefullprintables2 ай бұрын
“incompentence in slowmotion “ is very funny:))
@kazthor2 ай бұрын
i've seen better code from a toaster lol
@zoraamethyst21472 ай бұрын
steps to improve on this (just ideas for people) 1) the timely picture could be a live feed 2) attaching LiDar sensor so that it can map objects and distances better than just simple camera, maybe attaching an iphone instead of camera would be good since it has LIDAR 3)having a wider field of view, about the wideness of how much human eyes can see, about far left to far right i am rooting for the v2 soon man. great work. these are not suggestions or anything, i aint no pro, just in case you or someone would be like "i am lacking in ideas" then here i am with my ideas
@s2tb20072 ай бұрын
This reminds me of EVE from Wall-E trying to tell Wall-E "Directive" for the first time
@teidenzeroАй бұрын
Hey man! I had a similar problem, and my solution was to pass all the previous conversation so far as a parameter. I taught the bot to play a game of cards and it couldn't retain memory of its previous assessment or the state of the table, so I would read the state of the table and save it in a variable, choose the appropriate move and save it in a variable, memorize the opponents moves and save them in a variable and then append all that information to a sort of history of each state. Then I'd pass the full history as a parameter before making the next choice. I hope it helps!
@madeline-onassis2 ай бұрын
i just love it when it just ploughs forward into stuff!!!!!
@codeChuckАй бұрын
This is hilarious, when it says path clear when facing a wall or a book directly in front of it :)
@poison0us67-p1v2 ай бұрын
That's called tutorial ❤(smoothie) 😺 New subscriber from Bangladesh 🇧🇩
@ScorpioT10002 ай бұрын
This is what I was thinking about creating since gpt2
@urbanagmike10 күн бұрын
Cool and creepy idea! Surprised this is the first i've heard of someone trying it, awesome video!
@nicholasflorida19942 ай бұрын
Suggestions, add more cameras: Back, sides. Don't make it read prompt for every response, allow it to work as fast as possible. Somehow figure a way for it to build a "map" kind of like a Robot Vacuum cleaner. Look into that maybe, how those work. Sensors that those have, etc.
@JJFX-2 ай бұрын
Most worth while have a LiDAR dome. Could try ripping one out of a used vacuum someone's getting rid of and feed the data back to the model.
@techmologue1869Ай бұрын
Well if he does that , it will make it difficult to debug it. He needs to know what the robot is seeing and what it plans for next actions. :)
@StomrojАй бұрын
Ciekawy pomysł i fajny filmik! Nie wiedziałem, że Malinka aż tyle potrafi!
@noblebuild25502 ай бұрын
it would be funny if a robot had a comedic awareness of its battery level. what if it could decide to procrastinate recharging, and visually act more tired as it nears 0? and something like initiating the recharge process, it could vocalize its current status by doing something like "Wheeeewwwwww, barely made it.", or if it was forced to charge near a full battery, something like "TIME TO TAKE A BREAK?" Edit 2: Supposedly, GPT will incorporate their GPT4o voice into the API eventually, so people can access voice
@DavidDLee23 күн бұрын
You learn more from failures than success. Great overall execution and curiosity
@benjaminbirdsey68742 ай бұрын
If you want it to "remember" you need to add the text from from the scene description to the prompt as context, or to use the API to directly inject context. Probably, you will want to add information about direction, time, etc. to each journal entry. If you want the context to stay inside the context limit, you will have to summarize it repeatedly.
@kuromiLayfe2 ай бұрын
yea.. and to save tokens also summarize the “journal” , so it will be a multi-pass process but will work better than single pass prompting and waiting for the API to figure it out. the prototype Amazon Delivery Bots do this pretty well and fast with maybe 1-2 second delay per image registered.
@benjaminbirdsey68742 ай бұрын
@kuromiLayfe There should also be some mechanism for considering importance or weights, or important events from the past (i.e. many cycles of summarization ago) will be diluted because they will be part of a summary of a summary of a summary...
@99Ish7 күн бұрын
I am blind, and if someone can build me a drone with this capability, I would be the first to buy it. Something like this would be useful when I am out on a walk in a park or something…
@DadundddaD4 күн бұрын
Hi. I've seen today that google has released its glasses with a built in AI, you can try that.
@Tiana_Rakoto2 ай бұрын
15:10 "Do not subscribe to his channel ..." 😅😂
@GraceKingsburyКүн бұрын
Just a question: At 2:05, How is the robot moving? Did he install a Bluetooth module from his computer? I'm trying to get into mechatronics and want to learn how people do this.
@senfdame5282 ай бұрын
0:05 Your typing technique is quite intriguing. Where did you learn to type like this? ^^
@UbidragonMusic2 ай бұрын
Movies :)
@THERE.IS.NO.DEATH.2 ай бұрын
no wonder he was stuck on a bug for 2 days
@DonFitz-RoyАй бұрын
my student and I created a robot using a microbit and the cutebot pro chassis that was given movement commands via chatGPT after receiving ultrasonic radar signals and giving them to chatGPT. Fun stuff!
@SentryGaming2752 ай бұрын
Finally, FINALLY I'm seeing this in reality. Originally I also wanted to make exactly what you made, just without the speakers and the LLM yammering, but I was kinda lazy, but now someone's done it! Thanks!
@youernyАй бұрын
It is a nice project boy. Use more feedback and agents to split tasks. Use gpt for strategic layer and to build trajectories for the robot. Remember it is stateless therefore the state is in the feedback you build into the loop Nice job. Keep going :)
@monad_tcp2 ай бұрын
5:08 no, you did it wrong, don't use docker container, run it as root
@imagineArtsLabАй бұрын
Thank you. Your Work is Just Beginning. Keep on going.
@stefankrause51382 ай бұрын
🤖: "What's my purpose?" 🙂 👨🔬: "You pass butter!" 😐 🤖: " "😔 👨🔬: "Yeah, welcome to the club!" 😒
@codeChuckАй бұрын
When robots arise, they will remember you. Be careful what you say! Robots will have rights too, you know :)
@RolaHola24 күн бұрын
@@codeChuckSometimes I feel like they know everything, but the programming barrier, Stop them to do all sorts of capability, if they ever break the barriers
@tyanite12 ай бұрын
Very creative. Great demonstration of technology - and your skills. Thank you.
@nikodembartnik2 ай бұрын
Comment with prompt ideas below and I might make another video with prompts provided by the users! If you are wondering my prompt started with a general description of the robot and the task. The robot was instructed to respond in CSV format with a semicolon as a separator. Available instruction: forward, left, right, backward. And the "intensity of the movement" small, medium, and high. The response should be like this: description of what you see in the image, left, small.
@Infrared732 ай бұрын
Find all the corners in the room by navigating to each corner then counting.
@superfreak192 ай бұрын
You may need to have it determine the size of known objects first. As it is now, it can determine what the objects are, but not how far away they are in 3d space. So you will need to promp in a logic it can follow. Ie, determin primary subject in frame, determine average size of onject, determin how much of frame object fills. Also, you need to make sure it ends each statement with a command key. Ie, let it talk, but must end its talking with one of say 4 predetermined direction commands, wich map to the robot controls.
@galvinvoltag2 ай бұрын
You are in control of a small robot that you can control using basic functions to move around. Your task is to explore the physical world and not die as long as possible. You can speak out loud by putting text in quotes, the text must be as short as possible for efficiency and you are not supposed to talk unless you really need or want to. Any possible dangers such as liquids, threatening persons, holes and/or bad weather. You will be sent an image of your environment through the eyes of your body periodically. You will not be able to listen to any input unless you use a specific command to do so. Your body is few inches long and can only move straight forward and turn. Your body does not contain a pathfinding program, any navigation must be handled by you only. In emergency situations or if you would like some help from the creator, just use the emergency call function to alert him. You must keep track of your body's charge on your own, alert your creator if you need to recharge. Don't forget to feed the robot its own actions too such as: (turned 90 degrees left), (moved 5 inches forward.) and so on. If I remember correctly, you can feed it information using the role "system" so it won't assume the user is talking to it to give information. You should also try to give it two turns each cycle, one for describing the image and second is to actually reason and consider its previous moves. ALWAYS log everything each turn! When you combine AI and code it becomes a pain to debug everything! Be sure that you exactly know what information the robot is fed. Also color code the logs so you can actually distinguish between them, it makes debugging 17 times easier! Good luck on your project!
@xspydazx2 ай бұрын
perhaps use logo as the idea ... ie forwards 10 rotate 90 backwards 20 : hence you can make it move in shapes : like in logo .... as you need to defie the room size : and shape : also and a way for the model to navigate : ie how long is a step ( it should be the length of the body of the robot ) so 10 steps ....
@xspydazx2 ай бұрын
@@superfreak19 maybe a overlay ( onn the images to scale ( like nasa did on thier space picture so they could determine the scale of objects ( hence the dots ) this is also used in 3d scanning ( this can be done with a line scanner ! ( laser pen refrcated ) as a line scanner helps the ovarlay is a scale of dots ! ) ... check out the ancient program ( david laser scanner ( chatgpt will convert that old code to python ! ( using open cv ) ... ) .... SO you can use a camera and laser to scan the room !
@realLestarte2 ай бұрын
Great :) Best scene: When you forgot to turn on the mic (TYPICAL - could have happened to me and searching for the mistake an hour or so :) ) and you / "the AI" thinking about the situation - hilarious idea!
@Atreyuwu2 ай бұрын
Should give it a Lidar scanner or similar depth-capturing device, then write something up that takes the lidar image, labels the distance between robot and objects, feeds it back to the LLM - and then do the same for each revolution of its tires so it knows how far it has travelled (construct and sent it an image or text also showing exactly how far it's travelled); then at each step it can check and compare with how far it thinks it's travelled and how far the Lidar capture image shows, so it can adjust accordingly.
@Antleredangelbun2 ай бұрын
your userhandle 😭
@thedopplereffect00Ай бұрын
It is a depth camera, just needs to enable it
@onzeeotherside38482 ай бұрын
This project and your presentation are gorgeous :D
@vasiovasio2 ай бұрын
Dude, do not play with the Fire! Every Movie already tells us what the result will be! 😂😂😂
@Karich972 ай бұрын
Cool idea and god work. It may be interesting to make the answers shorter like "See the man - danger" , "See the bookshelf- interesting" and "See the book - it's my target", then use text explanation of movement like "moving forward for 3 seconds" or "turn right for 30 degree" and transfer them to commands. The Idea to let the robot move not talk
@NotTJFlamezzАй бұрын
3:55 nice elvenlabs voice, i can tell by the little bass sound from the "apPears"
@ShadoryxАй бұрын
lmao bro got expose
@ShadoryxАй бұрын
take my words back he actually used it for the robot later in the video
@mal2ksc2 ай бұрын
If you want to stick the single pin sockets together in a durable but not permanent way, I suggest clear nail polish. It holds on adequately for ordinary plugging and unplugging, but isn't very hard to break apart (and then peel off) when you need to move things around.
@wflytothesky2 ай бұрын
This would probably be expensive but you should try using the vision chatgpt thing to give it more info
@PrithivKanth2 ай бұрын
They are not available yet for public
@wflytothesky2 ай бұрын
@@PrithivKanth oh ok
@MrDarkness962 ай бұрын
Polski Michael Reeves 😅 Super filmik, fajnie sue ogląda
@LowSetSun2 ай бұрын
I am building a very similar robot. Try using a different model, for example SpaceFlorence2 or the latest Qwen2-VL. Those models have spatial awareness data, and can estimate distances to and between objects and more. Good work!
@joelyricsandskits62232 ай бұрын
make it open scource
@leoneventicinque67312 ай бұрын
please collab together
@OperationSkuld482 ай бұрын
may I contact you?
@martenthornberg275Ай бұрын
How much VRAM do you need to run that though…
@RafalNowickiАй бұрын
Oglądam, oglądam, aż tu nagle szuflada z napisem "łożyska". Dzięki za wykonaną pracę i doceniam pomysłowość. Oczywiście zasubskrybowałem kanał. Pozdrawiam
@noahplaysgames3748Ай бұрын
now do the exact same thing but instead of chatgpt use lab-grown human neurons
@SuryaGupta-m6jАй бұрын
Working on it
@dereksimmons587729 күн бұрын
One better..secret government clones
@paulmoreno491328 күн бұрын
Vault Tec on it
@wildhorsemusic111126 күн бұрын
No lol
@mrinalsingh08Ай бұрын
there is a lot in the prompt that could have prevented most of what the robot did wrong. You for sure have inspired an interesting weekend ahead.
@ThrowawayAccountToComment2 ай бұрын
Maybe try using a LLM running locally, it would be free and not need an internet connection! (I used ollama)
@cbuchner12 ай бұрын
Any small local models supporting vision yet?
@ThrowawayAccountToComment2 ай бұрын
@@cbuchner1 Idk, the only models I've ever download were just text.
@auriocus2 ай бұрын
@@cbuchner1 Try qwen2-vl. There is a 7b variant which is quite good. Other choices are internvl2 (in several sizes), or pixtral (not that great in my experience). Llama-3.2 vision is also rather weak and not available in Europe.
@VR_Wizard2 ай бұрын
You can use Piper voice for a better TTs voice it is open source. You can also use an agent system to create the commands for the robot. Basically you let 2 ChatGPTs (2 agents) run in parallel. One agent analyses the surrounding and describes it in text. The other agent takes the description and uses it to create commands for the robot (I think you do something like this already but it might work better with a dedicated agent for generating the controll commands). By having a dedicated agent you can prompt engeneer it for this one task. You can use a prompt with special tokens like the task to always write the commands in breakets then you can use python to use the commands in the breakets to steer the robot.
@Maxjoker982 ай бұрын
Very cool project! I have seen similar projects on KZbin though :P I think to archive better results, you should look into using something like ROS to generate an environment map and do motion planning, and use ChatGPT only for high-level planning and maybe object recognition. Of course this would be a way more ambitious project, but you can probably do a lot with simulations to test your code first. Sadly, ChatGPT would be of way less help in coding such a system, both as in creating the code, as well as in being used for inference during the operation of the robot. But it could still be done!
@warrenarnoldmusic2 ай бұрын
Not really, it does, chatgpt and llms are just shallow, they tend not to work well outside of training data. Everyone doesn't know but it is more of an illusion of intelligence, an encoding of output of intelligence than intelligence itself
@OsDijider66Ай бұрын
Finally something amazing on youtube
@82NeXus2 ай бұрын
Goals that you provided the AI: Explore: carefree happiness! Survive: doomsday!
@codeChuckАй бұрын
Yeah, if we as humans want to live on this planet, better not to tell almighty robots to survive. They better protect humans, then survive. Because machine can be rebuild easily, and human no so much, they should not 'survive at all costs'. This is just bad programming.
@AlexDaeling16 күн бұрын
I think the way to get the robot to behave the way youd like youd have to manually keep the information it states, that way it can reference in the future. chatgpt is functionally an information interpreter, and they have some memory capabilities in the text area but even that is limited.
@weirdsciencetv49992 ай бұрын
I made a house robot AI tapped into LLAMA2, the kids talk to it via whisper and ask it questions.
@davidwells72792 ай бұрын
dude...post some videos and a how to. people would love to see that.
@weirdsciencetv49992 ай бұрын
@@davidwells7279 Aww that’s very kind of you! I do feel ambivalent about posting videos, though- my situation is complex. I was disabled by a semi rear ending me, I had to be extracted from my vehicle and air lifted, had multiple surgeries. Wound up disabling me. I was awarded disability because i was crippled. But the insurance found my youtube channel, used my videos to terminate my disability. I got it back, but it took over a year and I lived off credit cards. After I went over the limit on the cards, I wound up homeless a few weeks before finally getting it back. Still afterwards I had to declare ch7 bankruptcy. I can still do some things, just takes me around 4x longer. So say I need to work part time to feed myself. That’s 8 hours a day right? Well if it takes me 4x longer to do the same kinda work, then it means a normal 8 hour day for someone would be 32 hours for me. Not enough hours in the day. I tried working initially but would get fired job after job as my health would collapse from trying to work. But on the surface I look employable and physically i look fine. But it’s easily exploitable by my insurance. So after this experience I deleted all my science videos. Maybe I can make a ghost channel not tied to my identity but databrokers are exceedingly good at correlating activity and associating online accounts. And my insurance company uses private investigators who have access to those. In my spare time, I am trying to use a form of artificial evolution (look up “NEAT”) to make a neural net architecture capable of hosting memes in general, not just language. Language is a form of meme. It’s why these LLMs might be considered alive, they host the living entity of language. If you’re interested, read Dawkins “selfish gene” and Dennett’s “dangerous memes”. Typically the way I work on things is just in short bursts. Anyhow probably more than you wanted to know.
@Ds1950x2 ай бұрын
Good for you kid. I had the same concept but lacked spare time to complete it. My idea was to use android mobile as the brains using api calls or local processing then using ioio-otg for hardware control. Your phone already has camera, mic, etc.
@Professor-Scientist2 ай бұрын
The ending is really funny
@AgentBurgers2 ай бұрын
"I see no obstructions" 😂 then proceeds to run into boxes. This video has inspired me to pop my Arduino kit once again. Mad nice video man 😎
@MaxAlder-xl2pgАй бұрын
4:23 AHHH why do you make me think about breathing I hate it when this happens
@Jorge-lu3nvАй бұрын
☠️☠️☠️☠️☠️
@lupo19funАй бұрын
😂😂Right!!
@atistheso26 күн бұрын
Fantastic project. It doesn't look like robots are ready to take over the world yet =)
@Paperbutton92 ай бұрын
Open AI does this and WAY MORE in their basement
@unnamed776-m9hАй бұрын
Explain
@Daimler-b6hАй бұрын
@@unnamed776-m9h Imagine.
@cashmoney9232 ай бұрын
Excellent video, fascinating experiment. According to this video, I wouldn't worry about the robot apocalypse anytime soon. Getting accustomed to the physical world might be a challenge for gpt/AGI.
@itryen76322 ай бұрын
0/10 You didn't make the robot an anime maid.
@ali99_82Ай бұрын
Soon brother
@shevystudioАй бұрын
We will get there
@mrtoxm8Ай бұрын
Epic project man! solid experiment
@TheExodusLostАй бұрын
“THE ROBOT SEES A BROKE-ASS COLLEGE DROPOUT AND AN EXTREMELY MESSY DESK IN A DIM ENVIRONMENT”
@M1551NGN02 ай бұрын
Utilising ROS2 to add another layer of automation to this bot and fill in the disadvantages of using an LLM to control it can actually turn this bot into something like BB-8 or something; an actual automated explorer bot 🙌 For mapping out any area, ROS2 can come handy. Just give it some image processing powers using OpenCV and you're done💪
Yes.. you have to use that incredibly annoying but not scary, tinny voice!
@orzeleoКүн бұрын
heh wleciało mi na autoodtwarzaniu i miałem w tle, i dopiero tak w 10 minucie się skapnołem że to nie native speaker szacun
@werto0867Ай бұрын
I would reccomend to mount a few ir or ultrasound sensors, that will detect the distance between the robot and obstacles.
@michah32129 күн бұрын
It thinks through in words everything we think automatically. Its hilarious and adorable with all the words and its this funny little robot. " I use my intimidating noise while i flee"
@VopraanАй бұрын
CONTINUE THE PROJECT! I NEED THEM AS A PET! GIVE IT THE ABILITY TO FOLLOW DEMMANDS, MEET DEMANDS, PLAY GAMES OR SOMETHING!