Introduction To Optimization: Gradient Based Algorithms

AlphaOpt

A conceptual overview of gradient based optimization algorithms.
NOTE: Slope equation is mistyped at 2:20, should be delta_y/delta_x.
This video is part of an introductory optimization series.
QUIZ: goo.gl/forms/1...
TRANSCRIPT:
Hello, and welcome to Introduction To Optimization. This video covers gradient based algorithms.
Gradient based algorithms and gradient free algorithms are the two main types of methods for solving optimization problems. In this video, we will learn the basic ideas behind how gradient based solvers work.
Gradient based solvers use derivatives to find the optimum value of a function.
To understand how this works, imagine that you are hiking on a mountainside, trying to find your way to a campsite at the bottom of the mountain. How would you know where to go? Perhaps you could follow a trail, look at a map, or use a GPS. You might even be able to see your destination, and head straight there. Now imagine that you have no map, no GPS, no trail, and there are trees all around that keep you from seeing anything but the area immediately around you. Now what?
Knowing nothing except for the fact that the campsite is at the bottom of the mountain, one possible option is to head downhill. You could look around, evaluate the slope of the mountain in the small area you can see, and walk in the direction with the steepest downhill slope. You could continue doing this, pausing once in a while to find the best path forward, and eventually make it to the campsite.
On a basic level, this is the same thing that gradient based algorithms do. There are three main steps:
Search Direction: The first step is to pick a direction to go. The solver evaluates the slope by taking the derivative at its current location. In one dimension this derivative is the slope. In more than one dimension, this is called the gradient. The solver then uses this information together with other rules to pick a direction to go. This is called the search direction.
Step Size: The next step is to decide how far to go in the chosen direction. You don’t want to go too far in one direction, or you might end up going back up a different mountain. However, you do want to go far enough to make some progress towards your goal. The value the solver chooses is called the step size.
Convergence Check: Once a direction and a step size are chosen, the solver moves in the chosen direction. Then it checks to see if it has reached the bottom. If not, it uses the slope again to pick a new direction and step size. This continues until the solver reaches the bottom of the mountain, or the minimum. We call this convergence.
There are many variations on the way that these steps are performed, but these are the basic ideas behind how a gradient based optimization algorithm works.
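As a rough illustration of these three steps, here is a minimal steepest-descent loop in Python; the function name, fixed step size, and tolerance are illustrative assumptions, not values from the video:

```python
import numpy as np

def gradient_descent(grad, x0, step_size=0.01, tol=1e-6, max_iter=10000):
    """Minimal steepest-descent loop: search direction, step size, convergence check."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)                      # slope (gradient) at the current location
        if np.linalg.norm(g) < tol:      # convergence check: gradient is (nearly) zero
            break
        direction = -g                   # search direction: steepest descent goes downhill
        x = x + step_size * direction    # move by the chosen step size
    return x
```

In practice the step size is usually chosen by a line search rather than fixed, and the search direction may be modified (for example by Newton or quasi-Newton methods), but the overall structure of the loop is the same.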
Let’s take a look at what this might look like on an actual function. We’ll try to find the minimum of the function f(x, y) = x^3 + 15x^2 + y^3 + 15y^2.
We’ll start out by visualizing the function. This is a plot of the function values over a range from -10 to 10 in both directions. Notice how the function slopes down towards a minimum in the center. To begin, we’ll need to give the optimizer an initial guess. Let’s choose (8,8). Another way we can represent this information is with a contour plot, where the lines represent constant levels or function values.
We can now watch as the optimizer chooses a search direction and takes a step, then another direction and another step. Eventually it reaches the minimum point at x = 0, y = 0.
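Here is a small sketch of that same example in Python; the fixed step size and iteration limit are illustrative choices, not values used in the video:

```python
import numpy as np

def f(p):
    x, y = p
    return x**3 + 15*x**2 + y**3 + 15*y**2

def grad_f(p):
    x, y = p
    return np.array([3*x**2 + 30*x, 3*y**2 + 30*y])

x = np.array([8.0, 8.0])               # initial guess (8, 8)
for _ in range(2000):
    g = grad_f(x)
    if np.linalg.norm(g) < 1e-6:       # converged: gradient is essentially zero
        break
    x = x - 0.001 * g                  # small fixed step in the downhill direction
print(x)                               # approaches the local minimum at (0, 0)
```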
Gradient based algorithms have their own strengths and weaknesses. They are widely used, have fast performance, and scale well to large problems. However, they do require smooth, continuous function gradients, and computing those gradients can be computationally expensive. Many gradient based optimizers are also susceptible to finding local minima rather than a global optimum, meaning that they will find the bottom of the closest valley, rather than the lowest point on the whole map. Gradient based optimizers are a powerful tool, but as with any optimization problem, it takes experience and practice to know which method is the right one to use in your situation.

Comments: 26
@PatienceRequired 5 years ago
I'm currently in a graduate numerical methods course for mechanical engineers and I must say, as someone who doesn't have a math background, the visual analogy was very helpful! Thank you, you've brought me one step closer to grasping this stuff!
@thirumurthym7980 3 years ago
Simple and the best way to grasp the concept. Thanks for this video.
@yifeifeng7150 6 years ago
Hello. First thanks for the vivid video and explanation! I just have a doubt about the formula of slope shown at 2:19 where slope = delta_x / delta_y. I think it should be slope = delta_y / delta_x
@alphaopt2024 6 years ago
Glad you enjoyed it Kevin. Yes, I mistyped the slope equation. I've noted the correction in the video description.
@davidyenni3455 3 years ago
Your videos are excellent! 10/10!
@usamsersultanov689 7 years ago
Many thanks! Please upload more videos! They are very helpful, especially for students!
@JCOpUntukIndonesia 7 years ago
Nice video AlphaOpt. I would like to help point out that the slope equation at 2:20 is mistyped. I hope it helps. Thanks.
@alphaopt2024 6 years ago
Nice catch JCOp, I've noted the correction in the video description.
@mohithmenon1998 4 years ago
Hello sir, isn't the slope = dy/dx, rise over run?
@abcdef-zb7qs 6 years ago
it helped me a lot! thank you for your video
@sitrakaforler8696 2 years ago
Nice 👍🏽
@SamRao 6 years ago
Thank you!! Very useful primer
@thenone339 5 years ago
Thank you for that :) I would like to know why the point (0,0) is the minimum value. What if we choose values that go toward -x and -y? Does that mean (0,0) is not always the minimum value?
@alphaopt2024 5 years ago
That's correct, the value shown in the video is a local minimum, and the function is unbounded for large negative x and y values. It's a good example of why it's important to understand the shape of the function you are optimizing, and one of the potential downfalls of gradient based solvers. This example would technically be more correct if posed as a constrained optimization problem to limit the search to just the area shown.
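For example, here is a quick sketch with SciPy (an illustrative snippet, not from the video) that restricts the search to the plotted region:

```python
from scipy.optimize import minimize

def f(p):
    x, y = p
    return x**3 + 15*x**2 + y**3 + 15*y**2

# Limit the search to the plotted region [-10, 10] x [-10, 10]
res = minimize(f, x0=[8, 8], method='L-BFGS-B',
               bounds=[(-10, 10), (-10, 10)])
print(res.x)  # close to (0, 0), the minimum within these bounds
```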
@sunnymag1093 6 years ago
Thank you so much for such a helpful video! May I ask what software setup you used to create illustrations and images on the screen and to edit the video?
@alphaopt2024 6 years ago
Thanks Sungwon, I'm glad you enjoyed it. I make these videos in PowerPoint with Matlab and Python graphs included when needed. The new morph transition in PowerPoint is especially helpful. I record the presentation and voice-over in OBS Studio, and do some touch up in Adobe Premiere before publishing.
@oghuvwublessing705 4 years ago
Thank you!
@hirashaukat7674 6 years ago
That is very helpful, thank you.
@CarlosOrtiz-ht6rn 6 years ago
Thank you
@seungjunlee102 6 years ago
What's the meaning of "solver" here? Sorry, my first language is not English, so I can't quite understand...
@alphaopt2024 6 years ago
Hi Seungjun, in this context a solver is a computer program that finds the solution to an optimization problem.
@void.rafee009 6 years ago
thanks a lot
@temiisaacaugustus8287 6 years ago
you're so amazing
@ketemawdemeke7360 3 years ago
Thank you, please help me.
@oghuvwublessing705 4 years ago
Jesus Christ is Lord. God of the living.