14 - Prediction and Planning Under Uncertainty

Views: 4,611

Comments: 35

@qiaochuma3861 3 years ago

Thank you very much, Alfredo, for presenting this awesome course. I have watched all the courses you presented this year and last, and I am looking forward to seeing you in next year’s course! You and Yann are really my heroes in learning machine learning.

@alfcnz 3 years ago

🥳🥳🥳

@cambridgebreaths3581 3 years ago

Brilliant as always. Love these videos. Thank you Alf...

@alfcnz 3 years ago

Last one for the semester from me. We'll see if more will come up…

@cambridgebreaths3581 3 years ago

@@alfcnz no way! I'm addicted now :)... Hope more wonderful videos are on the horizon... Thanks for this lovely effort...

@alfcnz 3 years ago

I'll ask what y'all want me to teach next. These are all the lectures I have so far. But sure, I'll make more stuff since this field keeps evolving 😅😅😅

@alfcnz 3 years ago

Like transformers for images, PixelCNN, and some other stuff.

@cambridgebreaths3581 3 years ago

@@alfcnz Perfect. Already following your Twitter account, so will keep checking:). I'll enjoy rewatching all the interesting materials again. Thank you kindly Alf.

@НиколайНовичков-е1э 3 years ago

Professor, that was cool! :)

@alfcnz 3 years ago

😎😎😎

@chivonchhai389 3 years ago

Thank you so much.

@alfcnz 3 years ago

You're very welcome 🐱

@kalokng3572 2 years ago

It might be very interesting to compare your PPUU with the World Models paper authored by David Ha and Jürgen Schmidhuber!

@alfcnz 2 years ago

Yeah, you're right.

@jonathansum9084 3 years ago

Thank you for your hard work. I still remember from last year that PPUU is way better than RL. 😀

@alfcnz 3 years ago

It is definitely more sample efficient, since it doesn't interact with the environment at all.

@tanmoym6241 2 years ago

Hi, I like your slides. What software do you use to prepare them? Also, would you be willing to show how you make your videos, i.e. the setup, the devices, and how you organize everything? I would like to make video tutorials too, and I like your videos, so this information would be very helpful. Thank you in advance.

@alfcnz 2 years ago

twitter.com/alfcnz/status/1450501404062269446

@alfcnz 2 years ago

twitter.com/alfcnz/status/1296148252572647424

@youtugeo 2 years ago

Very interesting project! For the policy training you use the unfolding-in-time technique (model predictive control). Can I ask how many timesteps the moving window has? I guess it would be desirable to unfold for a very large number of timesteps, because that would let us make really long action plans (far into the future). What is the limiting factor on the number of timesteps? Is it vanishing gradients (like in a long RNN)?

@alfcnz 2 years ago

Yes, of course, you need to start learning close to the target, then increase the distance.
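
To make the vanishing-gradient concern in this thread concrete, here is a minimal sketch under a strong assumption: a scalar linear world model s_{t+1} = w * s_t, which is a toy stand-in and not the PPUU forward model. The function names, the value of w, and the doubling schedule are all hypothetical; the reply above only says to start close to the target and then increase the distance.

```python
def unrolled_gradient(w, horizon):
    """Gradient of the final state w.r.t. the initial state, d s_T / d s_0,
    for the toy model s_{t+1} = w * s_t unrolled for `horizon` steps.
    By the chain rule this is the product of `horizon` copies of w."""
    grad = 1.0
    for _ in range(horizon):
        grad *= w
    return grad


def curriculum_horizons(max_horizon):
    """Hypothetical schedule: start with a short unroll and double it
    until `max_horizon` is reached (which is always the last entry)."""
    horizons = []
    h = 1
    while h < max_horizon:
        horizons.append(h)
        h *= 2
    horizons.append(max_horizon)
    return horizons


if __name__ == "__main__":
    w = 0.9  # assumed contraction factor of the toy model
    for h in curriculum_horizons(50):
        print(f"horizon={h:3d}  |d s_T / d s_0| = {abs(unrolled_gradient(w, h)):.5f}")
```

With |w| < 1 the gradient shrinks geometrically in the horizon, which is why a long unroll gives the early actions very little learning signal and why growing the horizon gradually can help.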

@nourinahmedeka9518 2 years ago

Any advice for a new MSc student who has come back to academia after 4 years in industry? How should I start exploring the latest developments in AI, with the aim of finding my own thesis topic? There are so many topics and sub-topics that it is honestly overwhelming.

@alfcnz 2 years ago

The point is that you should be aware of what's out there. Then you pick _one_ topic and dig further. This course picks a few of these topics and walks you through each of them.

@jadtawil6143 3 years ago

Is the uncertainty regularization in the training of the agent how you force the agent to take actions that keep the states within the region of state space that the world model has observed? Whereas in RL, since you have a simulator, you don't need to do this, because you can purposely explore the action space for "adversarial actions" during training. Does this make sense?

@alfcnz 3 years ago

Yup. But someone would need to create a realistic simulator with smart agents controlled by…? So, it's better to use data from a real environment.

@jadtawil6143 3 years ago

@@alfcnz controlled by other PPUU agents! lol

@alfcnz 3 years ago

But then all could simply go at the same speed and move uniformly. This would suggest that one should train a latent variable policy, which encodes an individual driving behaviour.

@jadtawil6143 3 years ago

@@alfcnz Right! Then each latent sample corresponds to a different driver...
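
The uncertainty regularization raised at the start of this thread can be illustrated with a rough sketch: run a forward model several times with different dropout masks and use the variance of its predictions as an extra cost, so that actions leading to inputs the model is unsure about get penalised. Everything below is an assumption made for illustration: the tiny random two-layer network, its sizes, the dropout rate, and the function names are hypothetical and are not taken from the lecture's code.

```python
import numpy as np

# Hypothetical stand-in for a trained forward model: random weights,
# 3 inputs (state and action concatenated), 16 hidden units, 2 outputs.
_init_rng = np.random.default_rng(0)
W1 = _init_rng.normal(size=(16, 3))
W2 = _init_rng.normal(size=(2, 16))


def forward(state_action, drop_prob, rng):
    """One stochastic forward pass with (inverted) dropout on the hidden layer."""
    hidden = np.maximum(W1 @ state_action, 0.0)      # ReLU
    keep = rng.random(hidden.shape) >= drop_prob     # random dropout mask
    hidden = hidden * keep / (1.0 - drop_prob)
    return W2 @ hidden


def uncertainty_cost(state_action, drop_prob=0.2, n_passes=10, seed=1):
    """Sum over output dimensions of the prediction variance across
    `n_passes` different dropout masks."""
    mask_rng = np.random.default_rng(seed)
    preds = np.stack(
        [forward(state_action, drop_prob, mask_rng) for _ in range(n_passes)]
    )
    return float(preds.var(axis=0).sum())


if __name__ == "__main__":
    x = np.array([1.0, 0.5, -0.2])  # made-up state-action input
    print("uncertainty cost:", uncertainty_cost(x))
```

During policy training such a term would be added to the task cost; with dropout switched off (drop_prob=0) all passes agree and the cost collapses to zero.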

@jadtawil6143 3 years ago

Hello Alfredo, a question about the world model. The deterministic predictor-decoder produced the blurry prediction: is this because the state observations are stochastic (the bounding boxes jitter around)? Had the state observations been "clean", would the deterministic predictor-decoder work? Could you re-explain the exact aspect of this problem that requires a latent variable? Thanks!

@alfcnz 3 years ago

The main issue here is that the future is multimodal. Given an initial condition, the data set contains multiple evolutions of the future. For example, say you have a car next to you. Once it goes faster and once it goes slower than you. If you want to predict _both_ behaviours at the same time, you'll need a car that goes both faster and slower, meaning the car will stretch / elongate and its intensity will fade. (Imagine the superposition of both behaviours.)

@jadtawil6143 3 years ago

@@alfcnz Understood thanks!

@alfcnz 3 years ago

👌🏻👌🏻👌🏻
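
The multimodality argument in this thread can be checked numerically with a toy example. The sketch below assumes made-up data in which the neighbouring car's relative displacement is +1 (faster) half of the time and -1 (slower) the other half; the names and numbers are hypothetical. It shows that a single deterministic prediction trained with mean squared error settles on the average of the two futures, while a predictor that also receives a latent variable z selecting the mode can match each future.

```python
import numpy as np

# Hypothetical toy data: same initial condition, two possible futures.
futures = np.array([+1.0, -1.0] * 50)


def mse(prediction, targets):
    """Mean squared error of one prediction (scalar or per-sample array)."""
    return float(np.mean((targets - prediction) ** 2))


# Deterministic predictor: one number must explain every future.
candidates = np.linspace(-1.5, 1.5, 31)
losses = [mse(c, futures) for c in candidates]
best_deterministic = float(candidates[int(np.argmin(losses))])


def latent_predictor(z):
    """Toy latent-variable predictor: z in {+1, -1} picks the mode."""
    return z


if __name__ == "__main__":
    print("best deterministic prediction:", best_deterministic)
    print("its MSE:", mse(best_deterministic, futures))
    # Here z is simply read off the target, as an idealised encoder would do.
    print("MSE with latent z:", mse(latent_predictor(futures), futures))
```

The deterministic optimum sits between the two modes, which is the one-dimensional analogue of the stretched, faded car described above.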