Thanks! Your tutorials were the first that worked for me. Biggest problem that I had was the directory path for the Starcraft maps.
@sentdex 1 year ago
Thank you for the super!
@serta5727 2 years ago
I have to say you make the most understandable learning materials: your website together with the videos. All the code is there, the book, the playlists from scratch. Most professional educators can’t do this 🤗
@kailalueni3251 1 year ago
I love your idea of drawing your own minimap! That's a smart way to make more information available easily.
@awsamalmughrabi860 2 years ago
I like how in depth this video is, really enjoyed it!
@JohnJackKeane 2 years ago
I do not code or have the desire to code, but this video is beautiful. I enjoy StarCraft videos seeing people micromanage, but the thought and process that goes into creating a “program” to do the same thing is fascinating. The amount of work and work to obtain the knowledge that goes into the work is far underrated. I hope for you the best!
@Derrekito 2 years ago
Never before has a marketing ploy worked so well on me. I'm looking forward to receiving the hardcover version of the book!
@protoplmz 2 years ago
Hey! I love the update here. I followed the original series you put out. As a SC2 veteran I noticed deficiencies and deviated in a strong way halfway through. I set up separate models to handle the decision making for each aspect of the game. This makes it so it can make the decision to use its army separately from the decision of progressing tech (or not). I stopped around the time I couldn't figure out how to have it build its own strategies, as I ended up giving it a long set of possible actions and letting it pick, and it felt too 'guided'. It was able to beat "Very Hard" 50% of the time vs random's 0%. Was my first exercise with ML. I got the chance to apply the concept at work for something outside of my scope. Used both that and the SC2 project as demonstration in an interview and got a promotion out of it. This inspires me to try my hand at it again! EDIT: To handle army movement, which you mentioned in the video, I chopped the maps up into a grid and gave it decisions to make where it could attack-move its army to any of these at will. 9 worked the best but you could make it much more granular. It used this to both attack and defend.
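A minimal sketch of the grid idea in the EDIT above, assuming a 3x3 split (9 cells) and map dimensions supplied by hand. The name `grid_targets` and the 180x150 map size are hypothetical and not taken from the commenter's bot.

```python
def grid_targets(map_width, map_height, rows=3, cols=3):
    """Return the centre point of every cell in a rows x cols grid over the map."""
    cell_w = map_width / cols
    cell_h = map_height / rows
    return [
        ((c + 0.5) * cell_w, (r + 0.5) * cell_h)
        for r in range(rows)
        for c in range(cols)
    ]


# Hypothetical map size; a real bot would read it from the game state.
targets = grid_targets(180, 150)

# The model would output an index 0..8, and the army attack-moves to that point.
chosen_index = 4
attack_point = targets[chosen_index]
```

Raising `rows` and `cols` gives the finer granularity the comment mentions, at the cost of a larger action space.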
@fuba44 2 years ago
This was an interesting video. I will have a look at your example code for sure, wanna try to tinker a bit. Thanx for all your hard work.
@fuba44 2 years ago
Very interesting idea with a macro ai and a strategic ai, sort of working in tandem forming a symbiotic relationship of sorts.. could maybe even break that down even further, like on a per unit type basis... Tho i imagine the complexity explodes at that point.
@sentdex 2 years ago
We have very few unit types, at least here. For the full game, there are more, and even here I wasn't utilizing all the things a voidray can actually do, but certainly there are ways to have a "voidray" algo and a "probe" algo...etc. Definitely something to think on.
@hikari1690 2 years ago
This sounds like how deepfakes work. Have 2 AI models compete with each other to improve each other. So the macro AI would try to defeat the strategy AI, and vice versa.
@prodj.mixapeofficial6431 2 years ago
I believe Dota has 5 controllable units, with an individual OpenAI agent per unit, and modified communication between the 5 to mimic real human gameplay.
@Dethek 2 years ago
When I was looking into AI for StarCraft, I was thinking of the following:
Overarching AI - makes the final decision on what action to take
Supported by:
Strategy AI - uses training from professional replays to assess, based on what the player has seen, what their likely strategy is, and then chooses a strategy based on that
Macro AI
Micro AI
@TheFalconerNZ 2 years ago
@ccriztoff Get his book lol ;-)
@pognar 2 years ago
I have played starcraft for years and years, and I love this channel. This is going to be great.
@adityachawla7523 2 years ago
Here is an idea: You can use more than 3 channels to give spatial information to your network. No need to limit yourself to the conventional idea of 3 channels! If you are worried about how to visualize this, just think of it as an extra map.
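A rough sketch of what "more than 3 channels" could look like, using plain nested lists so it runs without extra libraries. The channel names and the helpers `empty_observation` and `mark` are assumptions made for illustration; the video's own map uses a 3-channel image.

```python
# Hypothetical channel layout: one "extra map" per kind of information.
CHANNELS = ["own_units", "enemy_units", "minerals", "visibility", "enemy_ghosts"]


def empty_observation(height, width, channels=CHANNELS):
    """Build a height x width x len(channels) grid of zeros."""
    return [[[0] * len(channels) for _ in range(width)] for _ in range(height)]


def mark(obs, channel, x, y, value, channels=CHANNELS):
    """Write a value into one channel at map cell (x, y)."""
    obs[y][x][channels.index(channel)] = value


obs = empty_observation(height=4, width=6)
mark(obs, "enemy_ghosts", x=2, y=1, value=255)
```

A convolutional network does not care whether the input has 3 or 5 channels, as long as the first layer is configured for that number. For viewing, each channel can be shown as its own grayscale map.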
@kevintyrrell7409 2 years ago
14:49 That's some next-level Gateway placement.
@Neceros 1 year ago
This is great! I'd love to see if something like this could compete in the arena
@faithful451 2 years ago
I'd love to see the next video in this series with dual macro and micro algorithms and improving the win percentage
@PathToPrestige 2 years ago
I very rarely reply to these kinds of videos... but hats off. Even though the project structure is messy, your genuine, "realistic" practical approach was very enjoyable to watch.
@XmKevinChen 2 years ago
It’s a very interesting video about ML + gaming. As a newbie to this AI world, it also gives me lots of incentive to continue learning.
@erics3596 2 years ago
Do you want Skynet? Because this is how you get Skynet :) (also great strats and explanation on how this works)
@adye88 2 years ago
This is freaking intense! also for the hunters problem: Why not make a "return to safe space" function for them when they detect enemies. That way they only perform scouting duties.
@adye88 2 years ago
And obviously set a variable for safe space= position holding command center
@sebbes333 2 years ago
One thing I feel is missing from the map is a kind of "ghost" of where enemies have been seen previously, which could become "points of interest" for scouting in the future. The "ghosts" could "fade" over time, but never fade to zero again (capped at a minimum of 1, starting at like 255 or something), to make the algorithm prioritize the most recent ghost locations. Also, instead of scouting with void rays, wouldn't it be cheaper to scout with drones (to generate ghost areas)? (Scouting probably targets mineral areas without ghosts, to see if enemies have expanded, while voidrays can scout areas WITH ghosts, to see if the enemies are still there & try to defeat them there. You can also send a probe first to a ghost area, to determine enemy strength before attacking.)
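The fading "ghost" idea above can be sketched in a few lines. This is only an illustration: the start value 255 and floor 1 come from the comment, while the dictionary layout and the names `sight_enemy`, `fade`, and `scout_target` are assumptions.

```python
GHOST_START = 255  # value written when an enemy is sighted (from the comment)
GHOST_FLOOR = 1    # ghosts never fade back to zero (from the comment)


def sight_enemy(ghosts, pos):
    """Record or refresh a ghost at a map position."""
    ghosts[pos] = GHOST_START


def fade(ghosts, step=1):
    """Age every ghost by `step`, but never below the floor."""
    for pos in ghosts:
        ghosts[pos] = max(GHOST_FLOOR, ghosts[pos] - step)


def scout_target(ghosts):
    """Pick the most recently seen ghost location, or None if there are none."""
    return max(ghosts, key=ghosts.get) if ghosts else None


ghosts = {}
sight_enemy(ghosts, (10, 20))
for _ in range(300):          # a long time passes
    fade(ghosts)
sight_enemy(ghosts, (5, 5))   # a fresh sighting
fade(ghosts, step=10)
```

Because old ghosts bottom out at 1 instead of disappearing, they remain distinguishable from cells where nothing was ever seen, which is the point the comment makes.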
@Lithane97 2 years ago
Better yet, just train an observer to scout ghosts, it's almost like they're made for that 👍 Wouldn't even require any logic really, just if ghost entity train observer and have it sit there all game.
@achtsekundenfurz7876 2 years ago
I can imagine some ways to refine the AI using more inputs:
-- time elapsed since game started (there's hardly any risk of attack at all in the 1st minute, but at a late stage, the risk is much higher),
-- current resource totals (letting resources sit in the "bank" is usually worse than expanding the economy or forces),
-- # of "ghosts" on the map (where enemies were sighted and lost again).
About rewards and penalties, I'd suggest the following:
-- adjust the reward/punishment for victory/defeat: a "good" AI should aim for a quick victory, but not at all costs. Maybe set the victory reward to 24,000 / sqrt(seconds played) and cap at 1000 (i.e. don't reward any higher for games lasting
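The suggested victory reward can be written out directly. A small sketch, assuming the function name `victory_reward` and a guard against zero-length games (neither is from the comment); the constants 24,000 and 1000 are the commenter's. With these numbers the cap binds for any game shorter than 576 seconds, since 24,000 / 1000 = 24 and 24² = 576.

```python
import math


def victory_reward(seconds_played, scale=24_000, cap=1_000):
    """Reward for a win: scale / sqrt(time), capped so very fast wins are not over-rewarded."""
    seconds_played = max(seconds_played, 1)  # guard against division by zero
    return min(cap, scale / math.sqrt(seconds_played))
```

A win at 2304 seconds (about 38 minutes) would earn half the capped reward, so the agent is nudged toward faster wins without being pushed to rush at all costs.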
@tjw2469 9 months ago
@Lithane97 if there is a raven+cyclone/raven+viking/missile turret then it's a dead observer
@J3553xAnotherFan 2 years ago
This is now the 3rd programming/ artificial intelligence channel that I've found myself watching even though my ability to code (or even Math) is so awful that if there was a gun to my head I would beg to just be shot. But I find it satisfying to watch. Like a time-lapse of an ant colony diligently working away.
@Ammothief41 2 years ago
Thanks for putting all of that together. Looks neat.
@serta5727 2 years ago
I recommend your Channel every now and then to people learning python 🤗
@gavinmorton7682 2 years ago
this is such a cool project! would love to see this keep going
@serta5727 2 years ago
Can’t get enough of learning this awesome stuff
@danielglidewell 2 years ago
I wasn’t in the mood to watch the video when I read the title, but when I realized what the thumbnail was I stopped by to drop a like lol.
@EnderSword 2 years ago
Kind of neat, I'm wondering if you looked at the AlphaStar research at all to do this, or looked into the StarCraft 2 AI community? There's about 70 coders of various bots and AI that compete against each other and it'd give you a ton of ideas on build choices and especially unit control and decision making.
@Leonhart_93 2 years ago
The AI coders in the community don't make true AI, they just give them a set of commands and responses to various actions. A true AI learns from successes and failures (reinforcement) with very little initial programming.
@PeterRAmice 2 years ago
@Leonhart_93 While this has some truth to it, what you are referring to is machine learning. The AI spectrum is much wider than learning like a human. The best way of describing AI imo is: a machine which observes its environment and executes actions which maximize its goals. So with that definition in mind, I would argue those people are actually building AIs which do not automatically learn from their past experiences, and thus they do not build machine learning AIs, which alpha did.
@Leonhart_93 2 years ago
@PeterRAmice We just called bots that follow specific sets of instructions AI in the past out of laziness and limited understanding. It doesn't apply to current times anymore, fake AI and true AI have almost nothing in common. We can't use the same word to describe them both, so a "bot" is proper for the fake AI.
@Leonhart_93 2 years ago
@string name; Yes, bots. I played vs the top bots of the sc2 bots community, they are really good. They won't be easy unless you are at least masters, which is impressive for a bot. The major problem with those fake AI is that they can always be cheesed in some way, no human programmer can ever input the right answer for every situation. Btw, AlphaStar never had complete map vision, it wouldn't have been a valid test. It had complete vision of whatever parts it could see since there was no player-like camera which removed any delay from responses. I think that's ok, even bots respond to everything with 0 delay. AlphaStar has potential, but it will never progress past a certain point if they don't train it permanently on the ladder vs pro players and actually see current tactics.
@ErazerPT 2 years ago
@string name; It's no more cheesy than a grandmaster switching cams at 400+apm (yes, they do it...). And while "beating the best" might sound like a great end goal, all you need is to beat 99% to already go WAY beyond what humans can do (on average). There's a few F1 top racers, there's billions of "common drivers", for a driving ML model which is more important, beating the top F1 or consistently outdoing "Average Joe"? p.s. that one "human trick" that beat the model in one game was a simple "loop", as the model got stuck reacting to the same thing in a loop, back and forward. You can observe that level of idiocy in humans too at times ;)
@BretBowlby 2 years ago
I like the ideas here, but be sure that you've got a task that can understand the advantage of having high-ground vision, which gives better attacking, vs. not having high-ground vision. Also, I'd consider having the model constantly scout, as all information gained on the player's actions can lead to better counterattacks and so forth. But yeah, I'm loving this. Keep 'em coming!
@teardowndynamic6171 2 years ago
i know nothing about programming or AI but this is just so fun to watch
@binxuwang4960 2 years ago
Already super impressive that you could do RL for macro-level strategy! Totally agree that to solve a certain problem, how to formulate the state, action, and reward is key
@serta5727 2 years ago
Wow congratulations I think what you did is amazing 🤩 I would like to do something like this for software testing for a while but it is so complicated
@kylee.7654 2 years ago
At 4:52 regarding your comment, I added async def on_start(self): self.last_sent = 0 after the on_step function. It makes it a little clearer
@Mutual_Information 2 years ago
Wow I'm literally working on a series on RL theory and I was just wondering how the hell you'd code things up to actually play Warcraft 3. Starcraft 2, close enough! Such a useful channel
@Singularitarian 2 years ago
Very illuminating!
@ericzahn274 2 years ago
Great vid. Buying the book.
@cedrickram3180 2 years ago
Some time series analysis (windowed access to what has been searched, where stuff was, ...) would probably help the AI make better decisions. The data of just the map does not do a good job of storing time-information. Your rewards seem like a good fit. Great video!
@stonecoldscubasteveo4827 2 years ago
Reward for resources spent. this will incentivize expansion and rapid army growth until max out. At that point change the reward to enemy units/structures killed. Something like (big reward) for spending money on nexus/probe/stargate (bigger reward) for void ray, (penalty) for having too much money banked up unless supply is >190. Then (big reward) for killing enemy unit/structure, while dialing back on rewards for building structures. zero out the rewards for probes over 70-80 and for pylons over 200 supply. When supply drops due to combat, flip the rewards back to making void rays to max out again.
@VaSoapman 2 years ago
Why not give rewards based on how many enemy units/buildings are destroyed? Then give a penalty based on how many of its own units/buildings are destroyed? Also, to help the AI prioritize winning over stalling, you could increase the value of a win based on how fast it won.
@nrobo3840 2 years ago
Yeah, adding a time decay to the win reward was where my mind immediately went.
@moseszero3281 2 years ago
I was thinking a k/d reward and a lowering of all rewards for time
@benoitkinziki3916 2 years ago
For the reward mechanism you could probably build an LSTM that gives you the probability of winning for each action you take, and you should probably include a time penalty to avoid the bot dragging the game out
@djsyntic 2 years ago
When you got to talking about how to handle the gas extractor on your minimap, I thought you handled it strangely. So keep in mind that the RGB values for the colors you put on your map are arbitrary and serve to help you visually more than the computer. But you could have encoded some meaningful data into the RGB itself. For example, instead of saying "This building is green, this building is dark green" and so on, you could have put all building/unit type info into the R-value of RGB. IE: This building is R-value 12, this building is R-value 13, and so on. Then the G-Value could represent something else, like building health. IE: R-12, G-255 means it's a Refinery at full health while R-12, G-1 means the Refinery is about to explode if it takes any more damage. Finally, the B-Value could then be used as some sort of indicator of something specific to that building. R-13 might be a Barracks, and B-2 might mean that it's in the middle of training something and has 2 units of time before it finishes and can do something else. On the other hand, R-14 might be a Gas node, and B-# could indicate how much gas that node has, while R-15 indicates that this is an extractor with the B-# still indicating how much is still in the node. Sure, to YOU R-14 and R-15 are basically the same amounts of red and your eyes wouldn't be able to tell the difference, but to a computer, those are two distinct values.
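The encoding scheme described above could be sketched like this. The type IDs 12 to 15 are the ones named in the comment; the function name `encode_pixel`, the health-as-fraction input, and the clamping choices are assumptions for illustration.

```python
# R-values from the comment: one integer per building/unit type.
UNIT_TYPE_IDS = {"refinery": 12, "barracks": 13, "gas_node": 14, "extractor": 15}


def encode_pixel(unit_type, health_fraction, extra=0):
    """Pack (type, health, type-specific info) into one (R, G, B) triple.

    R: type ID, G: health scaled to 1..255, B: type-specific value clamped to 0..255
    (e.g. remaining build time for a barracks, remaining gas for a node).
    """
    r = UNIT_TYPE_IDS[unit_type]
    g = max(1, round(health_fraction * 255))  # a living unit never encodes as 0
    b = max(0, min(255, int(extra)))
    return (r, g, b)


full_health_refinery = encode_pixel("refinery", 1.0)
dying_refinery = encode_pixel("refinery", 0.001)
busy_barracks = encode_pixel("barracks", 1.0, extra=2)
```

The resulting image will look nearly uniform to a human, which is the trade-off the comment points out: the map becomes a data tensor first and a visualization second.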
@RickBeacham 2 years ago
Great stuff! Super interesting.
@cmilkau 2 years ago
Interesting actions. Not only do they encode a lot of knowledge about the game, they include deep causal chains that otherwise would take long to learn.
@achtsekundenfurz7876 2 years ago
Just a quick note: the "can afford" check at 04:47 is NOT totally redundant. You're inside a "for each idle stargate" sort of loop, and if two are idle, you could end up in a situation where you can afford one but not the other -- and depending on the capabilities of the ex-handler, tripping an exception due to insufficient resources could crash the AI.
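The point about the loop can be illustrated without the game API. A minimal sketch, assuming a hypothetical `train_from_idle` helper and an illustrative cost tuple (not taken from game data): the affordability check is repeated for each idle stargate because the budget shrinks after every order.

```python
def train_from_idle(idle_stargates, minerals, gas, cost=(250, 150)):
    """Queue one unit per idle stargate while resources last.

    Returns the stargates that received an order and the remaining resources.
    """
    trained = []
    for stargate in idle_stargates:
        # Re-check inside the loop: the previous iteration may have spent the budget.
        if minerals < cost[0] or gas < cost[1]:
            break
        minerals -= cost[0]
        gas -= cost[1]
        trained.append(stargate)
    return trained, minerals, gas


# Two idle stargates, but only enough minerals for one unit.
result = train_from_idle(["sg1", "sg2"], minerals=300, gas=400)
```

Checking once before the loop would have approved both orders here, which is the situation the comment warns about.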
@tibielias 2 years ago
What an awesome video! I wonder how making an API like this for other RTS games would be possible and then training AI models for those separately. 🤔
@ButtersDClown 2 years ago
Very cool idea. I think programming a few meta builds into your algorithm and seeing how it learns with time (if achieved "this" by "this time" do "this", otherwise do "this"), like doing a rush build etc.
@BasicAndSimple 2 years ago
Book Purchased. Thanks
@dracomurdock6349 2 years ago
The criteria I would try to ensure it has highest on its priority is- if you win, only- unit efficiency. IE: how many resources did this unit earn, or destroy for an opponent, relative to its own cost? Averaging them out, and defining those units by a percentage based on the actions they were made to perform- and segmenting the game into the first 5 minutes and the rest of the game- you could provide a huge assist to the AI learning more complicated macro and micro strategies.
@robanson32 2 years ago
Great book! Love my copy
@achillestroy3122 2 years ago
I remember when you live coded AI plays GTA v and that too on python's default IDLE. Bring those days back. Great video though.
@whitey9933 2 years ago
Looks great, always been interested in the AlphaStar gameplay and how it manages all the different tasks. For the enemy search, you can focus on undiscovered minerals (enemies would normally congregate around mineral fields), which is probably better than random search.
@Cursedmountainstudios 2 years ago
You were so preoccupied with if you could, you didn't stop to think if you should... and so it begins.
@nastrimarcello 2 years ago
This is amazing. Amazing code, amazing explanation, amazing editing. Only one suggestion: when possible, don't use try: ... except: pass, as this can lead to hellish problems. If you know what exception you are having in that try-except statement, using that exception explicitly is better (even if you are just going to 'pass' it)
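To make the suggestion above concrete, here is a small sketch with a made-up helper, `safe_ratio` (not from the video's code). It catches only the one exception that is expected, so any other mistake still surfaces instead of being silently swallowed.

```python
def safe_ratio(kills, losses):
    """Kill/loss ratio that tolerates zero losses but nothing else."""
    try:
        return kills / losses
    except ZeroDivisionError:
        # The one failure we expect and know how to handle.
        return float(kills)


# A bare `except: pass` here would also hide this bug (a string passed by mistake);
# with the explicit exception type, the TypeError is raised and can be fixed.
try:
    safe_ratio("3", 1)
    bug_was_hidden = True
except TypeError:
    bug_was_hidden = False
```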
@FF7Cloud 2 years ago
it might help to allow a phoenix now and then for scouting purposes since void rays are super slow
@dogtato 2 years ago
very interesting to see how you structured it to use ML decisions for higher level decision making. would definitely be interested in seeing how you approach a micro script and specifically wonder about the ability to add new behaviors without having to retrain from scratch
@KennTollens 2 years ago
Hey Lieutenant Commander Data, I'm going to go buy your book so I can join Starfleet too!
@matthias916 2 years ago
Read the first comment at 5:14. Usually what I do is this:
```
if not hasattr(self, "var_name"):
    self.var_name = initial_value
```
but I believe it is good practice to initialize the variable somewhere else, like in the constructor
@wootcrisp 2 years ago
Nicely done.
@ZhiYin 2 years ago
I love how I know every word but have no idea what you're talking about. (youtube recommended this video because I follow starcraft2)
@Shadow-yl2tf 2 years ago
9:00 another reward could be time. If you win a match, then the shorter the time, the more extra reward you get. Likewise, the opposite if you lose.
@cheddar500 2 years ago
Very satisfying to watch
@matheusGMN 2 years ago
The strategy you mention at the end, of multiple AIs coordinating everything, is the same one Paradox Interactive uses in games like Stellaris and EU4
@MFTomp09 2 years ago
I wonder if modifying the reward structure to include a small reward for scouting, like finding new enemy structures or something, would be useful to get more wins in those games where you said they regrouped and came back with a larger force to beat you later
@AlexGrom 1 year ago
Later on there is potential to counter based on what and when was seen. You see early barracks - prepare to counter marines, marauders or reapers.
@cowjuicethepallytank 2 years ago
Some potential rewards (or punishments): losing a voidray could cost a percentage of the positive reinforcement for attacking. Locating the enemy could be a small reward every x seconds to incentivise optimal searching patterns. Another question I have is what information does the API have access to? Does it have the capability of identifying enemy units? Are you able to get unit counts of the AI's specific units? Do we have the capability of training upgrades? In general, I think that given the correct training it may be possible to find certain timings of when best to scout and to take optimal scouting paths, as well as the best attack timings in terms of time in game and potentially within the build order. The difficulty, depending on how far you take it, could come down to army composition and, as you were saying, micro. Lastly, showing my lack of knowledge in AI learning: would it be possible to train the AI using professional gameplay wins, then use that as a baseline "build order" for the reinforcement learning?
@ReallyWhy123 2 years ago
this book is impressive
@Ranshin077 2 years ago
You could also give the ai a larger gradient win reward or larger gradient lose reward for shorter games.
@saksaganskasadyba 2 years ago
It's amazing how you are doing it. Your videos are really inspiring
@Peter-rn5bu 2 years ago
maybe understanding how ai learns how to get to a known goal, for example losing as fast as possible, could help develop or program ai which is more efficient or accurate
@HubertRozmarynowski 1 year ago
i'm not usually a fan of your videos but this one did not spare crucial information and presented the topic nicely
@ccgamerlol 2 years ago
like Deepmind Alphastar, cool, would love to see full gameplay of this, please?
@thepacific2933 2 years ago
I think the best limitation would be a time limit to win the game. It would optimize all the aspects to achieve the best result
@Stthow 2 years ago
Amazing video dude. Gj.
@Magicks 2 years ago
well done sir
@a25885200 2 years ago
It reminds me of my FYP in university.
@FuneralProcession 2 years ago
Reward for attacking and killing is so psycho though 😲
@canadiancoding 2 years ago
Might also want to look into upgrades if you haven't. Units in mid-late without any upgrades are much worse in SC and this might have quite an impact.
@SuperShiki666 2 years ago
You should make one for Total War. It's way simpler because there's no resources or capturing, just maneuvering and using abilities.
@Veptis 2 years ago
Coming up with and implementing a good evaluation function is the hidden challenge of any deep learning project. What I feel like your model is lacking is finer options. It's just classifying one out of 5 actions and that's it. So it's strongly limited by your hand-crafted actions. As usual, those interfaces between models lose fine details. Which I believe is why large language models with all the residual connections do so great, as they do all the tasks implicitly.
@jeremyheng8573 2 years ago
very inspiring video! Looking forward for more reinforcement learning tutorial!
@boon1580 2 years ago
i think if can make the ai go apeshet on micro-ing the fights, u can pretty much win at the first minute using just drone on drone action.
@serta5727 2 years ago
I find it very interesting 🤓
@BalimaarTheBassFish 2 years ago
Well, it's been a while. A little sad we have stopped here, or at least seem to have. I was always curious how multiple agents working together could be easily implemented, but I guess I am doomed to not know.
@BalimaarTheBassFish 3 months ago
Annual check in. Still sad that this felt a little rushed. Would have loved a more in depth thing like the first CNN ML method we did back then.
@matheusmterra 2 years ago
Well, you could add rewards for scouting new locations, rewards for keeping units alive, and check out the math of pro starcraft players of what units you should use and when. Also tier rewards for which units and buildings it will destroy to reinforce priority targeting for better performance.
@RickBeacham 2 years ago
I really want to buy that book.
@kristopherleslie8343 2 years ago
I would love to see you apply same thinking to Diablo 2 Resurrected
@Telos8 2 years ago
Any plans on a part 2 with the microgame plan implemented and see how it runs in tandem?
@TheThunderSpirit 2 years ago
u have to use pipes for interprocess comm. or at least udp
@FireTouched 2 years ago
I wonder about the reward structure. It doesn't really feel like it's looking for optimised play, as the only negative reward you mentioned was the loss itself, and after that efficiency is determined only by the total score. But what about tracking negative rewards (loss, loss of units/structures/resource access, etc.) and comparing the positive and negative scores? That way the AI could pick a winning strategy that accrues few losses over one that accrues many losses, despite both having the same end score. And in turn the AI would be able to know the errors due to the dip in the comparison. Also, maybe implement a way that reduces positive/increases negative score over time? That way stalling would also be discouraged.
@serta5727 2 years ago
Thanks for the amazing content
@vladimirtchuiev2218 2 years ago
This video is so God damn cool, I have a current project that I try to make chess self play work on very limited resources, I think SC2 will be my next project if the actual python API is open. What is your GPU, and how long did it take for you to train the agent?
@IlIWarGIlI 2 years ago
years of learning starcraft leads me to simply queue a few things but otherwise not give too many orders as that delays facilitation of actions.
@NVMDSTEvil 2 years ago
Might be good idea to set reward based on time
@mstrkllr 2 years ago
You're like Code Bullet, except without the Adderall and Self-Deprecating Humor 🤣
@robwolters7401 2 years ago
In my experience grouping attacks and synchronising targets is very important.
@HalIOfFamer 2 years ago
Maybe code a reward for seeing unique enemy units/buildings. That way ai would have to scout the map for enemies, then double the reward for attacking if the attack unit was recently seen by a scout unit.
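One way to read the first half of this suggestion, as a hedged sketch: keep a set of enemy unit/building types already seen and pay a small reward only for new ones. The function name `scouting_reward` and the reward size are assumptions, and the unit names are placeholders.

```python
def scouting_reward(seen_so_far, visible_now, reward_per_new=1.0):
    """Reward each enemy type the first time it is seen; repeat sightings pay nothing."""
    new_types = set(visible_now) - seen_so_far
    seen_so_far.update(new_types)  # remember them so they are not rewarded twice
    return reward_per_new * len(new_types)


seen = set()
first = scouting_reward(seen, ["barracks", "marine"])   # two new types
second = scouting_reward(seen, ["marine", "bunker"])    # only the bunker is new
```

Because the reward dries up once everything has been seen, it pushes the agent to scout early without letting scouting dominate the win/loss reward later.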
@raymoreclef 2 years ago
What if you sent a few units to the starting locations in a clockwise order using waypoints of some sort? (At x time go to y waypoint, repeat.) These waypoints could also be used to pinpoint locations where the computer hid and that resulted in a loss. Also, what if, after losing so many units quickly, it was set to build upgrades? I think both of these ideas could be implemented with your "engines". This was extremely entertaining!
@MrSlyFoxJr_ 2 years ago
Definitely a very Protoss approach to SC2
@norik1616 2 years ago
Please, start using `pathlib` and the `Path` object 🙏🏼
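For readers unfamiliar with the suggestion, a short sketch of what `pathlib` buys you over string concatenation. The directory layout and the `model_path` helper are made up for illustration and do not mirror the video's project.

```python
from pathlib import Path


def model_path(base_dir, run_name, step):
    """Build a checkpoint path with the / operator instead of manual string joins."""
    return Path(base_dir) / "models" / run_name / f"{step}.zip"


checkpoint = model_path("data", "run1", 10)

# Path objects expose the pieces directly, with no splitting on "/" or "\\".
file_name = checkpoint.name          # "10.zip"
run_folder = checkpoint.parent.name  # "run1"
```

The same code works on Windows and Linux, which matters for a project where commenters report trouble with directory paths.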
@whateverppl1229 2 years ago
9:20 that's what I figured you'd do but my question is would it be a bad idea to take away points if an enemy unit/building dies? because then, it would be rewarded for attacking. (more points from a kill than a loss, or individually price every enemy unit/building as its own value and same with ally losses) to help teach it to not lose units but to do damage.
@witherslayer8673 2 years ago
how about building more than just void rays (having stats of each unit, cost, and space; may save for big units, or LOTS of small units), and tracking where air units can go and where ground units can travel
@JathTech 2 years ago
a FAR better reward scheme would be kills to losses. Each unit would have a value assigned that you would lose points if you lost, and gain points if you killed. This is ultimately going to reinforce itself since by killing enemy units without losing your own, you always increase your relative combat power. Just "being in combat" incentivizes entering engagements that are losing propositions when you could choose not to fight, and instead wait until you have the advantage. Read the Art of War by Sun Tzu, and you may gain some insights on how to reward your AI.
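A small sketch of the kills-to-losses scheme described above. The per-unit values are placeholders chosen for the example (not real game costs), and `trade_reward` is a hypothetical name.

```python
# Illustrative unit values; a real bot might derive these from resource cost.
UNIT_VALUES = {"voidray": 400, "probe": 50, "marine": 50, "zealot": 100}


def trade_reward(enemy_killed, own_lost, values=UNIT_VALUES):
    """Value of enemy units killed minus value of own units lost in one step."""
    gained = sum(values[unit] for unit in enemy_killed)
    lost = sum(values[unit] for unit in own_lost)
    return gained - lost


good_trade = trade_reward(["marine", "marine", "marine"], ["probe"])
bad_trade = trade_reward([], ["voidray"])
```

Unlike a flat reward for "being in combat", this goes negative for losing engagements, so the agent has a reason to wait until it has the advantage.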
@JathTech 2 years ago
Another great way to do this would be to have the rewards scale dynamically over time. Early kills and losses will have a greater impact on the game's overall trajectory. They should be scored as such. An earlier, more decisive win is considered better than a long, drawn-out victory. So the victory reward should start high and reduce over time.