Does this AI learn a skill (winning Trackmania games) or is it overfit to the track (winning Trackmania games on this track)?
@patrykchlipaa257 7 hours ago
Butterfly effect but 10000 times a second.
@workplaydie 21 hours ago
A new level of Tool Assisted Speedrun. Incredible editing btw! Do you think it is possible to inject some form of creativity into the system? I'm fascinated by how you would even approach this question.
@jonathantodd9906 1 day ago
You would think that after all these cautionary AI takes over the world tales that we would think twice before training it...
@sid35gb 1 day ago
The learning process is holding the AI back by using brute-force learning rather than eliminating the irrelevant components, which would leave only the relevant components to be studied.
@maciodb 1 day ago
wow, the work you have done is so impressive
@carlito19934 1 day ago
i immediately thought of the double pendulum problem aka the 3 body problem after seeing the car in the "same" position reaching different outcomes
@HassanSayeed1977 1 day ago
7 months old vid but it looks good
@carlito19934 1 day ago
i also noticed the ai never gets spawned in the top left corner of the maze most likely cause there is no need to go there but just noticed it
@kevmatteo 1 day ago
Absolutely amazing video
@joaomiguelalves4063 1 day ago
"And then, the AI got this run" nice, i see what you did there 😊
@michaeldromes3948 2 days ago
> Wants to give the human player a second chance > Proceeds to humiliate him further by beating the record backwards lol
@thischannelonlyhas1videoan227 2 days ago
If I got a nickel for every time this guy said AI, I would have $4.37
@dmarsub 2 days ago
Deep dip 2 Cp version when it releases please :D (obviously impossible but maybe part of floor 0.)
@talwald1680 2 days ago
Have you tried rewarding the AI on its average score on XXX runs? That way, maybe it will try to be more consistent?
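One way to picture the averaging idea in this comment is the sketch below. It is only an illustration: `run_episode` is a hypothetical callable standing in for "drive the track once and return the reward", and the `risk_penalty` term is an added assumption, not something the video describes.

```python
import statistics

def evaluate_policy(run_episode, n_runs=10, risk_penalty=0.0):
    """Score a policy by its mean reward over several runs.

    run_episode is a hypothetical callable that returns one episode's reward.
    With risk_penalty > 0, a multiple of the standard deviation is subtracted,
    so an erratic policy scores lower than a steady one with the same mean.
    """
    rewards = [run_episode() for _ in range(n_runs)]
    mean = statistics.fmean(rewards)
    spread = statistics.pstdev(rewards)
    return mean - risk_penalty * spread

# Two made-up reward sequences with the same average but different spread.
steady_runs = iter([10.0, 10.0, 10.0, 10.0])
risky_runs = iter([20.0, 0.0, 20.0, 0.0])
steady_score = evaluate_policy(lambda: next(steady_runs), n_runs=4, risk_penalty=0.5)
risky_score = evaluate_policy(lambda: next(risky_runs), n_runs=4, risk_penalty=0.5)
```

Note that a plain average alone does not separate the two sequences; it is the spread term that favours the steady one.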
@willchristian5954 2 days ago
Have you considered the possibility that dirt doesn't create "glitches"? It reflects the realistic fact that loose and wet dirt have perpetually fluctuating properties. It's not a glitch, it's just realistic.
@alecklymshyn9496 3 days ago
Unimaginably Cool!
@Carlang29 3 days ago
26:10 Calm Down tomatox (at top right)
@mackxzs 3 days ago
bro made ai addicted to drifting
@leadpheonix237 3 days ago
You need to name it
@GhGh-gq8oo 3 days ago
Chaos is just god of the gaps. Everything happens exactly as it should and you not having enough information or ability to predict the future based off those initial conditions does not mean magic is at play. It’s just more human hubris. Wow over time small changes in initial conditions lead to thing different than what I expected and big thing! The universe is deterministic and with enough information you could predict the future.
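A small sketch of the point being argued here: a system can be fully deterministic and still defeat prediction in practice. It uses the logistic map as a stand-in, which is an assumption for illustration and has nothing to do with Trackmania's physics.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate the logistic map x -> r * x * (1 - x), a standard chaotic toy system."""
    x = x0
    orbit = [x]
    for _ in range(steps):
        x = r * x * (1.0 - x)
        orbit.append(x)
    return orbit

# Same rule, two starting points that differ by one part in ten billion.
a = logistic_orbit(0.2, 60)
b = logistic_orbit(0.2 + 1e-10, 60)
gap = [abs(p - q) for p, q in zip(a, b)]

# Rerunning from the identical start reproduces the orbit exactly.
rerun_matches = logistic_orbit(0.2, 60) == a
```

The rule never changes and a rerun is identical, yet the tiny initial gap grows until the two orbits are unrelated. Prediction then needs the starting state to more digits than anyone can measure.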
@Tiamatria1 3 days ago
Are the physics represented here specific to Trackmania? Or is this something that we can use to apply to real life courses?
@HimmyNeutr0n1 3 days ago
Wonder if the AI will get sick of carrots and rebel until it gets a better treat as a reward
@mr.hashundredsofprivatepla3711 4 days ago
5:56
@0n0thing0 4 days ago
"The AI favored its pace over its consistency" It became addicted to carrots.
@ZMacZ 5 days ago
15:15 Heisenberg derivative. All results are unique, since no outcomes can be truly equal.
@65es 5 days ago
have you considered that the AI doesn't actually like carrots?
@oystercatcher943 6 days ago
6:01 love how it’s preferring to balance on two wheels at the very flat top of the pipe
@loanlamthuyto5393 6 days ago
Wow is this even a video or a cinematic movie 🤩🤩🤩
@luigibeccali2840 6 days ago
This video (and all of your other videos) is a true masterpiece.
@jasperwiersma 6 days ago
22:50 Gotta love the Wirtual style
@shaunmackie2272 6 days ago
There’s less bounce when the car is driving backwards. The AI needs to learn how to correct for the different bounce.
@LogiLogVr 6 days ago
I recently looked into AI reinforcement learning and I think it's a neat idea. I personally don't have the materials to create an AI for Trackmania, but I was wondering if you would look into doing this type of stuff on Trackmania Nations Forever on the E7 map. For many years people have been trying for the most optimal run on this map. The world records are separated only by a single hundredth, and many say that the current record is the best humanly possible run, so I was hoping you could let your AI try for it.
@EthanCuka 6 days ago
Great video! I'm no machine learning expert, but I have a bit of a background in control systems theory. I wonder if there's some additional state information you could provide to the AI that would help it learn more sophisticated patterns. Right now it seems like it only has access to information from its immediate frame, which is basically unrelated to the state information it gets a few seconds later due to the system's sensitivity to initial conditions. To better understand how its decisions in one frame affect its state more than a few moments in advance, it probably needs some sort of information about past (or future?) states.
I don't know enough about machine learning to know how one would add past state information into the model. My instinct is to give it access to a discrete Fourier transform of the last few steps of the sim (probably downsampled so you're not adding hundreds or thousands of inputs, since you get diminishing returns on a higher sample rate anyway). The Fourier transform is all about identifying oscillations in a signal, so it would probably do a great job of helping the AI predict which little perturbations along the track are acceptable and which spell disaster.
As for future state... this is just a wild stab, but I wonder if you could train a "guessing" AI that tries to anticipate the future state of the system, only a few seconds ahead, inside that small window of predictability. Then feed its prediction as another set of inputs to your driving AI, a bit like letting the AI look further ahead down the track.
There's no getting around the inherent chaotic nature of the car's dynamics, *but* we do know the system is controllable! That means there *does* exist a control law that will *always* get your car to the finish line *even* accounting for small perturbations. Finding it will require helping your car understand how to identify and cancel out perturbations as it goes.
The resulting car will necessarily be much slower, but also much more consistent. I hope.
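The Fourier-feature idea in this comment could look roughly like the sketch below. Everything in it is an assumption for illustration: `dft_magnitudes` is a hypothetical helper, and the "history" is a made-up sequence of roll angles rather than real game telemetry.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Return magnitudes of the non-negative frequency bins of a real signal.

    A naive O(n^2) discrete Fourier transform, fine for a short window
    of recent states. Bin k measures oscillation at k cycles per window.
    """
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                  for t, x in enumerate(samples))
        mags.append(abs(acc) / n)
    return mags

# Hypothetical history: the car rocking side to side twice per 16-step window.
history = [math.sin(2 * math.pi * 2 * t / 16) for t in range(16)]
features = dft_magnitudes(history)        # 9 numbers instead of 16 raw samples
dominant_bin = features.index(max(features))
```

The resulting magnitudes would be appended to the network's usual inputs. Whether that helps a driving agent is an open question; the sketch only shows that a rocking motion turns into one clear peak.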
@Bazrrrr 7 days ago
One way you could have evened it up a bit was to force the AI to use digital inputs for steering, like the human record holder was doing by the looks of it. Considering that just the smallest deviation can cause chaos, that fast and precise usage of analogue steering may have given the AI an advantage. You can see the difference at 23:50.
@intern1455 7 days ago
@yoshtm It's strange I haven't seen anyone mentioning double precision and rounding. 0.001 is not 0.001, it's 0.0010000...98. Floating-point format encoding might also be a thing that you should look into, both at the game physics level and the neural net level. If you take this into account, the game is not truly deterministic.
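The rounding described in this comment is easy to inspect in Python, whose floats are IEEE 754 doubles. This sketch shows nothing about how the game itself stores numbers; it only illustrates the general behaviour of double precision.

```python
from decimal import Decimal

# Decimal(float) converts the stored binary value exactly, with no rounding.
stored = Decimal(0.001)        # what the machine actually holds
intended = Decimal("0.001")    # what was typed
rounding_error = stored - intended

# Floating-point addition is not associative, so evaluation order matters.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)
order_matters = left_to_right != right_to_left

# Repeating the same operations in the same order gives the same bits.
repeatable = ((0.1 + 0.2) + 0.3) == left_to_right
```

One caveat follows from the last line: rounding by itself is reproducible, so it does not make a simulation random. Differences would need something else to vary, such as the order of operations or (as an assumption) the compiler or hardware.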
@Ekstrax 7 days ago
0:35 my inner Trackmania driver just screamed at the sight of that 😂
@Ekstrax 7 days ago
I'm rewatching this for the second time, and you mention that it seems to have trouble with long straight pieces more than others. I think this is seen often in these AIs. There should be an energy cost for moving, like calories in real life, to avoid unnecessary movements in situations like these.
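The "calories" idea could be sketched as a cost on steering movement that is subtracted from the reward. The function name, the cost value, and the steering traces below are all hypothetical.

```python
def energy_cost(steering_trace, cost_per_unit=0.05):
    """Charge the agent for every change in steering between consecutive steps.

    steering_trace holds steering values in [-1, 1], one per time step.
    Holding the wheel still is free; wiggling it is not.
    """
    movement = sum(abs(b - a) for a, b in zip(steering_trace, steering_trace[1:]))
    return cost_per_unit * movement

# Two made-up ways of driving the same straight.
held_straight = [0.0, 0.0, 0.0, 0.0]
wiggling = [-1.0, 1.0, -1.0, 1.0]
cost_straight = energy_cost(held_straight)
cost_wiggle = energy_cost(wiggling)
```

Setting the cost is a trade-off: too high and the agent may refuse to make corrections it needs, too low and the wiggling stays.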
@BennyKleykens 7 days ago
It has that infinite-monkeys-writing-Shakespeare kind of vibe.
@dennisrkb 7 days ago
"Unsupervised" - spends 3 months babysitting his AI.
@dennisrkb 7 days ago
In English you stress the last syllable, not the first.
@CHRISTICAUTION 7 days ago
Intrinsic motivation can definitely help, e.g. random network distillation. It basically initializes two NNs with random weights: one stays fixed as a target, and you train the other to regress onto the target's outputs for the states you visit. You take the distance between the two networks' outputs as a 'novelty reward'. Also, do you use stochastic actions, as the official PPO and SAC implementations do? I think that and trying an LSTM could improve your model. Thrilled to hear from you.
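A toy version of random network distillation is sketched below, following the usual formulation: a frozen, randomly initialized target network and a predictor trained to match it on visited states. The single-layer networks, sizes, learning rate, and state vectors are assumptions chosen to keep the example short; real implementations use deeper networks and a deep-learning library.

```python
import math
import random

def make_net(n_in, n_out, rng):
    """One random linear layer, stored as a list of weight rows."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)]

def forward(net, state):
    return [math.tanh(sum(w * s for w, s in zip(row, state))) for row in net]

class RND:
    def __init__(self, n_in, n_out=8, lr=0.5, seed=0):
        rng = random.Random(seed)
        self.target = make_net(n_in, n_out, rng)     # frozen, never trained
        self.predictor = make_net(n_in, n_out, rng)  # separate random init
        self.lr = lr

    def novelty(self, state):
        """Mean squared prediction error, used as the intrinsic reward."""
        t = forward(self.target, state)
        p = forward(self.predictor, state)
        return sum((a - b) ** 2 for a, b in zip(t, p)) / len(t)

    def update(self, state):
        """One gradient step pulling the predictor toward the target on this state."""
        t = forward(self.target, state)
        p = forward(self.predictor, state)
        for row, ti, pi in zip(self.predictor, t, p):
            grad = 2.0 * (pi - ti) * (1.0 - pi * pi)  # d(error)/d(pre-activation)
            for j, s in enumerate(state):
                row[j] -= self.lr * grad * s

rnd = RND(n_in=3)
seen, unseen = [0.5, -0.2, 0.1], [-0.4, 0.3, 0.5]
novelty_before = rnd.novelty(seen)
for _ in range(200):              # the agent keeps revisiting the same state
    rnd.update(seen)
novelty_seen = rnd.novelty(seen)
novelty_unseen = rnd.novelty(unseen)
```

The familiar state ends up with almost no novelty reward, while the unvisited one still pays out. That difference is what would push an agent toward parts of the track it has not tried yet.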
@nelonut7632 7 days ago
You are French, right?
@mr2miach 7 days ago
Your first concern about the AI always choosing speed over caution has a perfectly reasonable answer, which you kind of touched on later in your video, and I'm surprised you didn't connect the dots. While you're providing a reward/punishment, there is no ultimate punishment. I've done actual track driving with my own car. Time attacks, open track days, and even auto-x. Out at the track in the real world you see a few different types of drivers. The reckless speed hunters, the technical time chasers, the cautious fun seekers, and the slow-but-steady-wins-the-race turtles. I fall into the technical time chasers group. I know my car's limits and I push to about 90-95% of those limits and focus on cleaner lines, faster exit speeds, blah blah blah. Why? Because I have fear. I fear damaging my car, damaging another's car, damaging myself and others. AIs have no actual fear. Sure, they don't want to be punished and want the reward, but they have no fear. No fear of harm or death is how you get the reckless speed hunters out at the track. In Trackmania, obviously there's no damage or death for AI or human driver. But, there is the fear of having to make yet another attempt with humans. AI, once again, has no fear of even that. It's simply another attempt with possible reward or punishment.
As for your discovery about the slight variances when the car is sitting still, a good possible answer is right in front of your face. Name one thing created by a human that is perfect. There will be nothing ever created by a human that is perfect. Mainly, because we ourselves are imperfect. We are cursed with only being able to create things that have flaws. You already noticed one of the flaws AI has. AI isn't creative. I just pointed out, AI has no fear. There are two flaws right out of the box. With that in mind, is it actually artificial intelligence or software capable of learning to a point?
Even the reckless speed hunters change their driving style once something knocks some fear into them. What does AI have to fear? Can it even be afraid?
@rhino2960 7 days ago
30:52 If you wanted the AI to be creative, you should have given it a reason to be. I'm spitballing here, but to my mind, an AI trying to learn this stuff as you've set things up here can't tell the difference between being creative and being random; to its eyes they're the same thing. In the AI's shoes, given the parameters it's learning within, if I have a choice between doing something random and likely being punished for it vs doing something I know I can already do with consistency, I'm gonna stick with what I know how to do, because even if the end result is punishment, at least via what I know how to do there will still be some reward between the start and the end.
@rhino2960 7 days ago
14:26 I think it's the timing. The AI has learned to consistently alternate between sides, almost like it's using each side of the car like a foot. Each of these movements requires a complex bunch of correctly timed inputs to follow through into the next cycle of movements. I think a combination of the speed of the AI's pace, the length of the pipe stretch it's on, and the direction of the turn it has to negotiate are what's tripping it up. It's like going to take a step, and suddenly where you expected and counted on there to be a solid piece of ground to put your foot on, instead your foot falls through a hole in the floor, and you follow after it. The AI by this point I think has effectively learned how to walk, so to speak, but now it has to learn how to navigate uneven and somewhat unpredictable terrain.
@jojipoji2322 7 days ago
What is the reward for something that does not exist?
@jojipoji2322 7 days ago
Hon hon, iz ze end ov ze world
@rhino2960 7 days ago
1:09 This right here is exactly why the AI prioritizes pace over consistency: it's the way you're rewarding it. The faster it progresses, the higher the reward. The reward should be a combination of this and a cumulative reward on top of that which gets larger the further it manages to progress overall. To fine-tune it even further, you might try adding in another that rewards the AI based on the number of inputs it uses. If I'm right about this, it would teach the AI better consistency and efficiency.
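The three-part reward proposed in this comment could be sketched as a weighted sum. The function, its weights, and the sample numbers are hypothetical, and the sketch assumes that "based on the number of inputs" means fewer input changes is better.

```python
def shaped_reward(progress_delta, total_progress, n_input_changes,
                  w_pace=1.0, w_total=0.1, w_inputs=0.01):
    """Combine pace, cumulative progress, and input economy into one reward.

    progress_delta:   distance gained along the track this step (the pace term)
    total_progress:   distance covered so far this run (the cumulative term)
    n_input_changes:  how many times the inputs changed this step (penalized)
    """
    return (w_pace * progress_delta
            + w_total * total_progress
            - w_inputs * n_input_changes)

# Same pace at the same point on the track, but one agent fidgets with the inputs.
calm = shaped_reward(progress_delta=2.0, total_progress=50.0, n_input_changes=0)
fidgety = shaped_reward(progress_delta=2.0, total_progress=50.0, n_input_changes=3)
```

The weights decide everything here. If `w_pace` dominates, the agent is back to favouring raw speed, so in practice they would have to be tuned by experiment.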