This is a high-quality video, and I'm sure a lot of people can tell you put a lot of effort into these.
@satindershergilla 7 years ago
Finally a YouTuber I always wanted to watch. Fast, cool, and great information.
@jmzg1229 7 years ago
Hey Siraj, I don't think any of us can beat the Dota 2 bot that OpenAI just unveiled. Those guys really deserve a shout-out.
@aksa706 7 years ago
I can't feel my brain anymore
@SirajRaval 7 years ago
oh no
@aliciasuper7014 5 years ago
Tell me about it :/
@randomorange6807 5 years ago
Hey, did you know? The brain can't feel pain; it senses the body's pain and transmits it, but it can't feel pain itself... It can still get hurt, though, and suffer brain damage.
@janmichaelbesinga3867 4 years ago
@@randomorange6807 But what if you punch a brain? Does it feel pain?
@giuseppeguap7250 5 years ago
Just saw this now; your jokes were KILLING IT back then.
@ShaneCalderssc 7 years ago
Thanks Siraj. Can't wait for the Super Mario Bros bot. I enjoyed your videos in the deep learning ND. Cheers, your effort is appreciated.
@luckybrandon 7 years ago
One of your best vids Siraj!!
@SirajRaval 7 years ago
thanks Brandon!
@neonipun 6 years ago
At 8:08, what's the input_shape supposed to be? The challenge code and what you show are different...
@voodoocobblestone8320 7 years ago
I cannot understand your videos. How should I start learning?
@vladislavdracula1763 7 years ago
Start by learning basic calculus, statistics, and linear algebra. Once you understand the basics, learning advanced concepts is not that hard.
@xPROxSNIPExMW2xPOWER 7 years ago
No, TensorFlow and most of the other libraries handle almost all of the higher-level math. All you need, buddy, is to learn basic object orientation and then move into ML techniques. Don't fret: most of the complex math has been solved; all you need to do is creatively implement it. Trust me, it gets very easy once you learn the flow. If you are interested in advanced topics where you want to build your own ML algorithm, then learning linear algebra, with an emphasis on higher-dimensional linear algebra, will help greatly.
@hammadshaikhha 7 years ago
Like others have mentioned, having a math and some machine learning background helps with understanding these faster-paced videos. Another thing you can do is look in the description, read some of the blogs on the topics under "learning resources", and then come back and watch the video again; it should make more sense.
@MachineLearningwithPhil 7 years ago
A great place to start is Coursera's class on machine learning. It's free and a solid intro to the core concepts. From there, there are plenty of step-by-step tutorials on YouTube. SentDex has a great channel with lots of content; check him out.
@notapplicable7292 7 years ago
A tip if you're trying to start: don't start with Siraj. Start with someone slower (possibly the Udemy machine learning micro-degree), as Siraj is very fast; awesome for expanding your understanding, but hard to start learning with.
@insightfulgarbage 7 years ago
Very nice information and rhythm, subscribed!
@harshitagarwal5188 7 years ago
At 5:15 you say that the further in the future the reward is, the more uncertain we are of it? I didn't get it. Can you explain with an example?
@rolininthemud 7 years ago
I understand that a convolutional neural network can be used to simplify the state from an array of pixels to a smaller collection of values, but how does the algorithm use a deep network to approximate the Q-function? 8:19 Thank you!
@herougo 7 years ago
Hi Siraj, could you include pseudocode for the algorithms you talk about? I think it is crucial to be able to implement the algorithms you learn about (i.e., "What I cannot code myself, I do not understand"). Explaining pseudocode is a great way to communicate algorithms in a clear, complete, and unambiguous way.
@basharjaankhan9326 7 years ago
OMG, I googled "Q Learning with Neural Network" a few months back without realising it was this important.
@SirajRaval 7 years ago
haha awesome
@shreyas707 7 years ago
I don't understand 10% of what you say but your videos are just epic! Please keep posting them often :)
@JakubRohla 7 years ago
I still don't understand how we can store these Qs. Wouldn't they contain quadrillions of states and actions even for a pretty simple game? That seems pretty inefficient, so I would love to know where my understanding of Q-learning goes wrong. Is there some generalization in place, or what?
@poc7158 7 years ago
You can store all possible actions for all possible states in a matrix for a simple game like tic-tac-toe. However, as you say, that is impossible for more complex games, which is why we use a neural network that replaces this matrix by taking the pixels of the screen as input (the state) and then outputting an action. After training, it is supposed to give the optimal action for any state we give as input.
@SirajRaval 7 years ago
great answer, Pierre
@JakubRohla 7 years ago
Thanks for the reply, this clarified it for me. Much thanks ^^
@ml-squared 5 years ago
The way this works is by approximating an optimal Q-function. A Q-function is a function of state and action, Q(s,a), and Q*(s,a) is the optimal Q-function. Tabular storage is great for games with few states, but because of combinatorics it does not scale to games with hundreds of thousands of states, such as video games. To accommodate this, we approximate Q* with a parameterized Q-function, Q(s,a,Theta), where Theta is a set of parameters that we optimize to bring us toward Q*. A type of function that's excellent at iteratively approximating functions through parameters is a neural network. So that's where deep Q-learning comes in: optimizing a neural network to approximate Q*.
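A minimal Keras sketch of the parameterized Q-network described above. The four-frame 84x84 input, layer sizes, and action count are illustrative assumptions, not the video's exact code:

```python
import numpy as np
from tensorflow.keras import layers, models

n_actions = 6  # assumption: depends on the game's controls

# Q(s, a, Theta): map a stack of 4 grayscale 84x84 frames (the state s)
# to one Q-value per action, so a single forward pass scores every action.
model = models.Sequential([
    layers.Conv2D(32, 8, strides=4, activation="relu", input_shape=(84, 84, 4)),
    layers.Conv2D(64, 4, strides=2, activation="relu"),
    layers.Flatten(),
    layers.Dense(256, activation="relu"),
    layers.Dense(n_actions, activation="linear"),  # raw Q-values, no softmax
])
model.compile(optimizer="adam", loss="mse")

state = np.zeros((1, 84, 84, 4), dtype=np.float32)  # placeholder frame stack
q_values = model.predict(state)                     # shape (1, n_actions)
best_action = int(np.argmax(q_values))              # greedy action
```

Acting greedily is then just an argmax over the network's output.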
@prashanttarey9902 6 years ago
Awesome and optimized explanations in all the videos! Thanks a lot!!
@look3248 7 years ago
Hey Siraj, could you expand on this topic and explain how SethBling's MarI/O program works?
@xPROxSNIPExMW2xPOWER 7 years ago
I believe Siraj already has a video on genetic-evolution decision making, if I'm not mistaken. Doesn't Seth explain it pretty in-depth, though? He talks about everything from the math to how he programmed it, with Perl I think.
@SirajRaval 7 years ago
genetic algo vid coming this week (similar to what he used)
@-justyourfriendlyneighborh5898 7 years ago
Siraj Raval Hey Siraj, in a previous stream you mentioned that the internet is the best place to learn this kind of thing (neural networks / machine learning). I was wondering, for a near-complete beginner (minor experience with Processing.js), where would you suggest I start? (I'm 15 and want to get into this field as soon as possible.)
@flyingsquirrel3271 7 years ago
icancto Did you read the NEAT paper? If not, I'd recommend it, because it's actually really smart and comprehensible. NEAT doesn't just pick the best randomly generated genomes; it uses a crossover mechanism which makes sure that only connections with a similar "purpose" inside the neural net are crossed over. It can intelligently cross over neural networks of different topologies, which are created through mutation, starting from minimal networks. That way it improves the weights AND selects the ideal topology of the neural nets. Comparing NEAT to backpropagation doesn't make any sense, because its purpose is to be used when you can't use backpropagation. MarI/O is a good example of this. What target data would you use for backpropagation there? ;-)
@TheAnig140895 7 years ago
He used Lua.
@hangchen 5 years ago
7:46 Well, I don't think the pooling layer is used to become insensitive to the locations of objects in an image. The convolutional layer can already do that, since the convolution operation is a pixel window sliding from location to location until all locations under the set stride are covered. The pooling layer is used to semantically merge similar features into one. As in the max-pooling example in this video, the image is partitioned into 4 parts, and in each part the max value is preserved; that max can semantically represent a feature in that region. It's more like image compression, but we preserve the key features of the object in the image. Feeding this pooled image into the neural net can be more efficient.
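A toy NumPy illustration of the max-pooling behavior described above (the 4x4 image and values are made up for the example):

```python
import numpy as np

def max_pool_2x2(img):
    """2x2 max pooling with stride 2: keep the largest value in each block."""
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

img = np.array([[1, 3, 0, 0],
                [2, 9, 0, 1],
                [0, 0, 7, 2],
                [1, 0, 4, 6]])
print(max_pool_2x2(img))
# [[9 1]
#  [1 7]]
# Moving the 9 by one pixel within its 2x2 block leaves the output
# unchanged -- the small translation insensitivity discussed above.
```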
@UNIVERSALMINDBAND 6 years ago
And what happens to the reward functions? Are they the same for all these games?
@arthurwulfwhite8282 6 years ago
Probably the score? Did you get an answer?
@vamsikrishna-qz8rt 7 years ago
Hi Siraj, is there any way we can train a machine learning model with a raw text file plus properly arranged data from that text file in a .csv file, so that when we input a new text file it automatically converts it into the .csv format, with the columns and rows we used as training data? Is this even possible?
@Machin396 7 years ago
Your videos are amazing, thanks.
@haziqhamzah3071 5 years ago
Can you give some insights into deep Q-learning in mobile networking?
@Machinima2013 7 years ago
You should do a video comparing this with NEAT, which is popular for this same use case.
@weinansun9321 7 years ago
thank you Siraj for your awesome content, you really made learning fun and easier!
@huluvublue112 7 years ago
Question: why do pooling layers make the network spatially invariant? Don't they just compress information? I thought it was the convolutional layers that do that, which the model does have.
@viviankeating7327 7 years ago
Max pooling compresses information, but it's lossy. On the first pooling operation you lose a pixel or two of position information. On a final pooling operation you might effectively be taking the max across the entire image.
@moonman239 7 years ago
So with a discrete Markov decision process, there will always be some reward function R, because getting the reward depends only on the states and the actions we take. Thus, our AI can learn Q simply by playing?
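For reference, a sketch of the standard tabular Q-learning update that "learning by playing" relies on, one transition at a time. The table sizes and hyperparameters below are illustrative:

```python
import numpy as np

n_states, n_actions = 16, 4          # toy sizes, purely illustrative
Q = np.zeros((n_states, n_actions))  # the Q-table
alpha, gamma = 0.1, 0.9              # learning rate and discount factor

def q_update(s, a, r, s_next):
    # Nudge Q[s, a] toward the bootstrapped target r + gamma * max_a' Q[s', a'].
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])

q_update(s=0, a=1, r=1.0, s_next=5)  # one step of learning from experience
```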
@abhinashkhare1933 7 years ago
Hey Siraj, can you help me understand this? In SethBling's video, the bot learned to play a Mario level, but he didn't apply the learning to new data or levels. Isn't this overfitting? I mean, the bot just learned that level by trial and error.
@nermienkhalifa5997 6 years ago
Great, I really do love your way of explaining!! Thanks.
@MavVRX 5 years ago
How would reinforcement learning work on a game with a town hub? One that requires mouse clicks to go into a dungeon, e.g., Diablo or MMOs.
@lampsonnguyen9425 4 years ago
You explained it so well. Thank you.
@Sohlstyce 3 years ago
Hey Siraj, can you make a full tutorial on reinforcement learning? Thanks, Siraj.
@Chris-di3us 7 years ago
I love you man, I always wanted to do this myself
@rnparks 7 years ago
Can you show the Mario game actually running? It throws an error in my notebook. I'm using Python 3.6, so maybe it's a translation issue?
@AliAkhtarcs 6 years ago
What is the difference between a static and a dynamic dataset? Can you elaborate?
@piyushgupta809 7 years ago
Great improvement, brother. I am sorry, but the previous videos were not good. Nice tutorial and intuition, although I do recommend watching DeepMind's reinforcement learning tutorial before jumping into the practical application.
@manojagarwal3441 4 years ago
Hey Siraj, can you please share the link to the code by the winner and runner-up of the LDA challenge? I know I'm pretty late, but I would really appreciate it if you could help.
@yehorpererva6803 7 years ago
Cool video, thanks! But how do you adjust this for a particular purpose (like collecting all the coins / getting the lowest score / speedrunning)?
@williamcosta6683 7 years ago
Could you guys give me any hints on how I can approach Pong to build a model where I can apply Q-learning? (I have all the necessary information, like ball x and y position, player x and y position, ball speed, etc.) I'm struggling with this :_:
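One possible sketch for that situation: since hand-made features are already available, a small dense Q-network over a normalized feature vector can replace the pixel-based convnet. The feature list, sizes, and action set below are guesses for illustration, not a tested setup:

```python
import numpy as np
from tensorflow.keras import layers, models

# Assumed features: ball x, ball y, ball vx, ball vy, paddle y
n_features, n_actions = 5, 3         # actions: up, stay, down

model = models.Sequential([
    layers.Dense(64, activation="relu", input_shape=(n_features,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(n_actions, activation="linear"),  # one Q-value per action
])
model.compile(optimizer="adam", loss="mse")

state = np.array([[0.50, 0.30, 0.01, -0.02, 0.40]])  # features scaled to ~[0, 1]
action = int(np.argmax(model.predict(state)))        # greedy action
```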
@Veptis 3 years ago
I am starting a deep learning course at university this semester, and maybe I can do a homework project. There is a mobile game from my childhood, Mirror's Edge mobile, which launched on iOS and Windows Phone around 2011 but is no longer available. If I somehow find a way to emulate the game on a computer, get either frames or game-state values, and manage to feed it one of four different inputs per frame, I might try to teach a network to play the game. I also want to have it beat levels really fast and explore speedrunning this way.
@boemioofworld 7 years ago
That was an awesome explanation. Thanks.
@mattgoralka3941 5 years ago
Hi, can someone please explain to me how the model is predicting in this sequence of code when it hasn't been trained yet? I'd really appreciate it. Thanks!! if np.random.rand()
@lefos99 5 years ago
Hey there, the epsilon tells us when the agent is ready to exploit Q-values instead of exploring the map. The main idea is: 1) We specify an exploration rate "epsilon," which we set to 1 in the beginning. This is the fraction of steps that we'll take randomly. In the beginning, this rate must be at its highest value, because we don't know anything about the values in the Q-table; this means we need to do a lot of exploration by randomly choosing our actions. 2) We generate a random number. If this number > epsilon, then we do "exploitation" (we use what we already know to select the best action at each step); else, we explore. The idea is that we must have a big epsilon at the beginning of training the Q-function and then reduce it progressively as the agent becomes more confident at estimating Q-values. Here is a nice graph of this idea: cdn-media-1.freecodecamp.org/images/1*9StLEbor62FUDSoRwxyJrg.png Hope that helped! :D
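A minimal sketch of that decaying epsilon-greedy rule (the decay constants are illustrative, not from the video's code):

```python
import numpy as np

epsilon, eps_min, eps_decay = 1.0, 0.05, 0.995  # illustrative constants

def choose_action(q_row):
    """Epsilon-greedy: explore with probability epsilon, else exploit."""
    global epsilon
    if np.random.rand() < epsilon:
        a = np.random.randint(len(q_row))        # explore: random action
    else:
        a = int(np.argmax(q_row))                # exploit: best known action
    epsilon = max(eps_min, epsilon * eps_decay)  # decay toward eps_min
    return a

print(choose_action(np.array([0.1, 0.5, 0.2])))
```

Early on, almost every call returns a random action (even from an untrained model); as epsilon decays, the learned Q-values take over.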
@mattgoralka3941 5 years ago
@@lefos99 Hi, thanks for helping me out! I understand that (at least I think), but I don't understand how the model can predict if it hasn't been trained. At what point does the model learn from the D values and become able to "exploit"? I'm from more of a C background, but I don't get how it's learning until the next block of code, where it does "Experience Replay."
@lefos99 5 years ago
@@mattgoralka3941 Oh, okay, now I see your question. Well, it depends on the reinforcement learning technique you use. For example, if you use simple Q-learning, you just create a matrix (a row for each state and a column for each action). There are plenty of concepts involved that I cannot explain in just one YouTube comment. A really good and simple tutorial is this: simoninithomas.github.io/Deep_reinforcement_learning_Course/#syllabus In this tutorial you will find not only mathematical explanations but also explanations with examples in simple games. Check it out! ;)
@claudiowalter3092 6 years ago
How do you get the computer to play the game by itself and read the screen?
@altafurrahman9404 5 years ago
Hi Siraj, I'm going to do a path-planning project to navigate a robot with Q-learning. What is the minimum hardware required for this? Do we need a GPU, or will a Core i5 PC with only a CPU be enough?
@harshakada3374 7 years ago
Hey Siraj, I have a 4-node Raspberry Pi cluster computer. Can I use it to train this Mario game?
@benjaminpaul3545 7 years ago
Is it possible to do what you do on Windows? I can't get the environment started even though the emulator is running. Can anyone help?
@johndoe-ug3lo 7 years ago
So I am working on an AI for a hidden-information game (for the sake of simplicity, you can think of poker). Optimal play would actually be a Nash-equilibrium problem, where each action is taken some percentage of the time. Would the proper way to build an AI for this be to use a random number generator and scale the frequency of each action to its Q-value?
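One common heuristic along those lines is Boltzmann (softmax) exploration: sample actions with probabilities that rise with their Q-values. To be clear, this is a sketch of that heuristic under an assumed temperature, not a computed Nash equilibrium:

```python
import numpy as np

def boltzmann_action(q_values, tau=1.0):
    """Sample an action; higher Q-values get proportionally higher probability."""
    prefs = np.exp((q_values - q_values.max()) / tau)  # subtract max for stability
    probs = prefs / prefs.sum()
    return np.random.choice(len(q_values), p=probs)

print(boltzmann_action(np.array([1.0, 0.5, 0.2])))
```

Lower temperatures tau make play greedier; higher temperatures make it more random.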
@fayezbayzidify 7 years ago
First! But seriously, nice vid Siraj, you are amazing at what you do!
@user-zu1ix3yq2w 7 years ago
RIP Chester.
@JazevoAudiosurf 7 years ago
So this version is without a NN; at what point do you need a NN?
@arafatullahturjoy5380 5 years ago
Can Q-learning be used for solving classification problems? If so, how? Could you explain, or make a video on this topic? If you do, it will be very helpful.
@hanyuliangchina 7 years ago
I like this interesting video more than the previous purely theoretical one; more humor is better. For me, the most important questions now are: 1. For machine learning beginners training models, which is better: buying a GPU graphics card, or buying Amazon cloud GPU hours? 2. Tips for configuring a deep learning environment. 3. Tips on programming and development skills.
@GKS225 7 years ago
And now OpenAI beats humans in Dota 2 1v1 matchups.
@SirajRaval 7 years ago
stuff is moving fast
@readingsteiner6061 7 years ago
Sir, I don't know who you are, but you totally blew me away with your comment. It is very rare to come across an individual who does us (the viewers on the internet) a huge service by debunking certain methodologies in machine learning. I would love to see more of your writing. Folks at Vicarious are of a different breed, I believe; maybe it is because of their influence from the Redwood Neuroscience Institute. It would certainly be a privilege if you would consider my request. Truly humbled. Thanks, Sir.
@Belofsky1 7 years ago
I'm mostly a hardware guy; how do I get started with AI and algorithms?
@TheLibertarian97 5 years ago
How do I define when to give a reward to the bot?
@jobrown04 6 years ago
Hi Siraj. Have you thought about using capsules (CapsNet), i.e., not having a MaxPooling layer?
@maxitube30 6 years ago
Where can I find the winner of the stock prediction challenge?
@tylersnard 5 years ago
Smart guy, talented teacher.
@Rambo9677 6 years ago
Great video Siraj, thanks! But I don't get something: how do you input 4 game screens? Do you combine them into one input?
@mankitpong5591 7 years ago
The videos of David Silver from DeepMind are worth watching; that might be the best reinforcement learning course on the web.
@rheeaa 7 years ago
Siraj, I'm a huge fan of your YouTube channel and I truly admire the way you taught yourself ML. I'm in my final year of undergrad, and I was thinking of not pursuing a master's degree right now. Any advice on what resources to use to teach myself ML, or how to get some industry-level exposure? Thanks in advance 😉
@SirajRaval 7 years ago
thanks rhea! see the ML subreddit
@qwerty11111122 7 years ago
Hi Siraj, could you make a video mentioning the OpenAI bot that beat a pro gamer at Dota 2 a few days ago? It's great that you released this video so close to that event.
@thedeliverguy879 7 years ago
Thanks for the great video. I'm still confused about how this algorithm can generalize to any game. Is the generalization of the algorithm different from the generalization of a specific AI program? Since the input and labels (or controls/buttons, whatever) are fixed per game, I don't think you can make an AGI with just this algorithm.
@stuartdavid 7 years ago
Very nice! Do you have a video with more detail on Q-learning? It would be interesting to see how the Q matrix evolves over the course of a simple game.
@tjhannover3069 7 years ago
Is it possible to do that with games like Overwatch?
@TheAIChannel 7 years ago
Hi Siraj, I am interested in stock price prediction and would like to have a glance at the second runner-up's code. Can you kindly share the GitHub link? Thanks in advance.
@karljay7473 6 years ago
Can't find the links to the winner and runner-up. Great series of videos!
@nitishravishankar6586 6 years ago
Thanks a lot Siraj! This video provided great insight into applications of Q-learning and RL. Are there any programming assignments (with a dataset) for this?
@HuyNguyen-rt7eb 7 years ago
Hey Siraj, great job on the videos. :) What do you think of the Dota 2 AI that beat a pro player?
@MotazSaad 5 years ago
The link to the paper: web.stanford.edu/class/psych209/Readings/MnihEtAlHassibis15NatureControlDeepRL.pdf
@herstar9510 3 years ago
When are you forming a mid 90s boy band with machine learning themed ballads?
@chiranshuadik 7 years ago
Nice video! Are you coming to Pune or Mumbai?
@SirajRaval 7 years ago
Mumbai
@chiranshuadik 7 years ago
Siraj Raval When and where? Can your fans meet you?
@egor.okhterov 7 years ago
Too fast. I need a longer video :(
@mferum77 7 years ago
Set the playback speed to 0.5 )
@SirajRaval 7 years ago
more to come
@chicken6180 7 years ago
Been waiting so long for this! Haven't even watched it, but I know it's going to be great already. Edit: confused, but not disappointed :D
@eav300M 7 years ago
Super Siraj AI. Who do you think is right about the future of AI, Elon or Zuck?
@getrasa1 7 years ago
Elon, because he's aware of the danger that AI might pose to the human race if we lose control over it.
@vijayabhaskar-j 7 years ago
If you know AI, then you won't think of AI as a danger.
@getrasa1 7 years ago
Edgar Vega As soon as its intelligence starts increasing exponentially, we won't be able to keep up with it or understand it, and everything we don't understand is dangerous at some point. (I'm referring to AGI and ASI.)
@SirajRaval 7 years ago
Elon. We do need some regulation.
@frankribery3362 5 years ago
That part where he says "hello world, it's Siraj"... I'm replaying it again and again because it's so funny xD
@bofeng6910 7 years ago
Do I have to learn calculus to learn deep learning?
@rolininthemud 7 years ago
Pretty much
@shahzmalik 6 years ago
The only thing I am impressed by is his creativity.
@masoudmasoumimoghaddam3832 7 years ago
Siraj, all your videos are awesome. Could you make a video about temporal-difference learning, which Professor Sutton introduced? I'd also ask you to make another one about general game players and Monte Carlo tree search. Thanks.
@xPROxSNIPExMW2xPOWER 7 years ago
Yes, a video on TD learning would be wonderful.
@masoudmasoumimoghaddam3832 7 years ago
Yeah! Especially if its differences from and similarities to reinforcement learning were pointed out.
@xPROxSNIPExMW2xPOWER 7 years ago
I think TD learning is just an extension of backpropagation. It's pretty fascinating.
@koppuravuriravisankar7954 7 years ago
Hi Siraj, I love your teaching style, and I am a member of Udacity's Deep Learning Foundation program, in which you are an instructor. My question is: can we use deep Q-learning in situations where there is no image or pixel input? If yes, can you explain how? I have read that instead of building a Q-table (state x action) we can use a neural network. Can you explain this, or if possible make a video about it?
@srenkoch4597 6 years ago
Hey Siraj! Great stuff! It would be really cool if you combined a recurrent neural network with a deep Q-network (DRQN) in a video! Thanks!
@YaduvendraSingh 7 years ago
This is the ultimate!! A game bot!! Thanks a lot, Siraj! When are you heading to India for a meetup?
@SirajRaval 7 years ago
thanks! Sept 1, Delhi, one-way ticket. I'll figure things out from there.
@_____8632 5 years ago
Wait, where my brain at?
@pinkiethesmilingcat2862 7 years ago
Siraj, you haven't accepted the English subs on MoI #6 :(
@SirajRaval 7 years ago
just did, thanks
@donaldhobson8873 7 years ago
Wouldn't it work better if you trained a variational autoencoder on the screen data to capture the important patterns, then trained the deep-Q model on the encoded screen? That way the VAE can learn a lot about how the world works even when rewards are scarce. I would use a bottleneck that's about 1/4 the dimensions of the image, with, say, 3 layers. Leave the shrinking down from convolutional layers to dense layers to the deep-Q part.
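A sketch of that bottleneck idea, simplified to a plain (non-variational) convolutional autoencoder rather than the VAE proposed above; the shapes and depths are illustrative:

```python
from tensorflow.keras import layers, models

inp = layers.Input(shape=(84, 84, 1))
x = layers.Conv2D(16, 3, strides=2, padding="same", activation="relu")(inp)
bottleneck = layers.Conv2D(32, 3, strides=2, padding="same",
                           activation="relu")(x)        # 21x21x32 code
x = layers.Conv2DTranspose(16, 3, strides=2, padding="same",
                           activation="relu")(bottleneck)
out = layers.Conv2DTranspose(1, 3, strides=2, padding="same",
                             activation="sigmoid")(x)   # reconstructed screen

autoencoder = models.Model(inp, out)
autoencoder.compile(optimizer="adam", loss="mse")

# Train on raw screens (no rewards needed), then hand the flattened
# bottleneck to the Q-network as its compressed state:
encoder = models.Model(inp, layers.Flatten()(bottleneck))
```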
@hammadshaikhha 7 years ago
I don't know anything about this topic yet, but why don't you submit something along these lines for this week's coding challenge?
@SirajRaval 7 years ago
hmm, good thought. An autoencoder could work well.
@anonco1907 7 years ago
The memes were distracting; I was too busy laughing to learn anything.
@TheLordyyx 7 years ago
Hey Siraj, DeepMind is also working on a StarCraft 2 learning environment. I would love to see a video about it :)
@tomwojcik 7 years ago
Video uploaded in Aug 2017 and it's only 9:46 long? Auto-like from me :)
@albertoguerrini9761 5 years ago
"We can't be sure that we'll get the same rewards in another episode" to justify discounted rewards... There's a gap between the two that I can't seem to grasp; could anybody help?
@1992jamo 4 years ago
I think he's looking at it the wrong way. A higher discount factor means you value actions further in the future more, and a lower discount factor means you value short-term rewards. Long-term goals seem best, but long-term predictions are less accurate.
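A tiny worked example of how discounting encodes that: with gamma < 1, identical rewards count for less the further away they are (gamma here is an illustrative value):

```python
gamma = 0.9                      # discount factor
rewards = [1.0, 1.0, 1.0, 1.0]   # identical rewards at steps 0..3

# Discounted return: G = sum over t of gamma^t * r_t
G = sum(gamma ** t * r for t, r in enumerate(rewards))
print(G)  # 1 + 0.9 + 0.81 + 0.729 = 3.439
```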
@BhagavatVibes 7 years ago
Hey Siraj, fantastic work. I am a Unity developer, so how can I integrate this functionality into games I have already coded? Best wishes for your future videos.
@larryteslaspacexboringlawr739 7 years ago
Thank you for the deep-Q video game video.
@sandzz 7 years ago
Bill Nye of Computer Science, Kanye of Code, Beyonce of Neural Networks, Usain Bolt of Learning, Chuck Norris of Python, Jesus Christ of Machine Learning
@SirajRaval 7 years ago
thanks Sandzz
@sandzz 7 years ago
I copied it from your channel description...I don't deserve that "thanks"
@GameCasters 6 years ago
But how do you grab input from a game on Android, for example from an APK?
@anudeep168 7 years ago
Awesome video :) But it reminded me of Jian-Yang's HotDog-NotHotDog app :D
@codethings271 7 years ago
That was a classifier, i.e., SUPERVISED learning.
@NaoshikuuAnimations 7 years ago
Just a piece of advice, I hope you see this: never speak while showing text! (I remember Vsauce saying this in a video too.) Really, either show text and read it, or show images / yourself while talking; displaying text while saying something different is really hard to follow. If you want to talk about a part of the text, try to darken everything but the line you're talking about; otherwise we won't know where to stop, or whether to listen to you or read. (At least that's what most "educational" YouTubers I follow do, and it works quite well.) Especially when you're talking about such complicated subjects (and at such a pace), I think that's important! Hope it'll be useful somehow; thanks for the vid'!
@SirajRaval 7 years ago
great point thanks
@yashagarwal8249 6 years ago
Excellent point
@FilipeRebollo 6 years ago
If advice were good, it wouldn't be free...
@CausticCatastrophe 5 years ago
I just pause.
@MissFashionDesign 7 years ago
Siraj Raval is the neurotransmitter of Generation Z
@kermitthehermit9373 6 years ago
We all miss Chester!! 😢
@synetic707x 7 years ago
A video about Q-learning for video games on actual games (without OpenAI Gym) would be great.