Great talk. I think the TL;DR is:
- Don't use rate limits.
- Capacity *planning* is hard. Make it dynamic.
- Dynamic capacity planning can be done with AIMD: en.wikipedia.org/wiki/Additive_increase/multiplicative_decrease
- It's OK to tell your clients you're overloaded; they are the ones who are obliged to respect back pressure.
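For reference, the AIMD loop behind that last bullet can be sketched roughly as follows. The initial limit, step, and decrease factor here are illustrative assumptions, not values from the talk:

```python
# Minimal AIMD sketch for a dynamic concurrency limit.
# On success, grow the limit additively; on a "server overloaded"
# rejection, shrink it multiplicatively.

class AIMDLimit:
    def __init__(self, initial=10, additive_step=1, decrease_factor=0.5, floor=1):
        self.limit = initial
        self.additive_step = additive_step      # added after each success
        self.decrease_factor = decrease_factor  # multiplied in on rejection
        self.floor = floor                      # never drop below this

    def on_success(self):
        self.limit += self.additive_step

    def on_rejection(self):
        self.limit = max(self.floor, self.limit * self.decrease_factor)


limit = AIMDLimit()
for _ in range(5):
    limit.on_success()
print(limit.limit)   # 15 after five successes
limit.on_rejection()
print(limit.limit)   # 7.5 after one multiplicative decrease
```

This is the same shape TCP congestion control uses: slow additive probing upward, sharp multiplicative retreat on an overload signal.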
@LiamBaker-y4s a year ago
It sounds helpful for the server to signal the proxy, and the proxy the client, over a back channel advising it to back off, and then to respond by dropping or refusing requests if the client does not.
@yairmorgenstern416 2 years ago
Incredible presentation. We need more practical strangeloop talks like this one!
@igorgulco6608 4 years ago
High-quality presentation! That's what I'm here for!
@adityasanthosh702 8 months ago
One thing I'd like to see is how AIMD is configured. How do you decide the backoff factor in multiplicative decrease and the additive factor in additive increase?
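There is no single right answer; TCP congestion control is the classic precedent (increase by one segment per round trip, halve on loss). As a quick illustration of the trade-off, the toy simulation below (capacity, step, and factor values are assumptions for the demo) shows how the decrease factor shapes the sawtooth:

```python
# Illustrative only: simulate one client's AIMD limit against a fixed
# server capacity, recording the peak the limit reaches before each
# multiplicative decrease.

def simulate(capacity=100, step=1, factor=0.5, rounds=300):
    limit, peaks = 1.0, []
    for _ in range(rounds):
        if limit > capacity:          # server rejects: multiplicative decrease
            peaks.append(limit)
            limit = max(1.0, limit * factor)
        else:                         # success: additive increase
            limit += step
    return peaks

# A small factor (0.5) sheds load aggressively but wastes more capacity
# while climbing back; a factor near 1 (0.9) hovers closer to capacity
# but reacts more gently to overload.
print(simulate(factor=0.5)[:3])
print(simulate(factor=0.9)[:3])
```

Roughly: pick the additive step by how fast you want to reclaim idle capacity, and the decrease factor by how quickly the server must shed load when it signals overload.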
@growlingchaos 7 years ago
Really great talk!
@julienviet 6 years ago
yes great!!!
@parodoxis 3 years ago
You're still rate limiting; you're just pushing that job onto the clients. But how will they know how long to wait after receiving a NOPE before trying again? If they try again too quickly, they'll just keep getting NOPEd. If they try again too slowly, there may be stranded capacity again.

Instead, what if we simply sent back a "wait this long before your next request" header with every response? The wait period could be zero if the server is below capacity, but if it's at capacity we calculate a conservative estimate in milliseconds for how long they should wait (it may be different every time) before making the next request. Simply compare how many requests we completed last second to how many we get done this second, assume demand will be the same for the next second, then divide that second up fairly among all the clients. Clients we've never seen before get priority, since they have no known wait time to go by, and we should give them VIP treatment anyway over the clients who have been hammering us for a while with no sign of stopping. Clients who disrespect the wait time get deprioritized, or even NOPEd.

I feel like this would maintain a constant near-100% pressure, and yet clients would also know exactly what to expect: if they respect the wait time, they're guaranteed a quick response and no NOPE. If they see a wait value that's too high, they can choose to write the server off as too congested and give up for now, which just leaves even more capacity for the rest of the clients. The same happens when you give a client a wait time but it has nothing more to send: some capacity goes unused, and you can account for that in the next second's measurement.
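A rough sketch of that idea (every name and number here is made up for illustration; this is essentially a dynamic HTTP `Retry-After`): the server tracks recently seen clients and hands each one a suggested wait based on demand versus capacity.

```python
import time

# Hypothetical "wait this long" hint: spread excess demand over the
# next second, fairly, among the clients seen in the last second.

class WaitHints:
    def __init__(self, capacity_per_sec):
        self.capacity = capacity_per_sec
        self.clients = {}  # client_id -> last-seen timestamp

    def hint_ms(self, client_id, now=None):
        now = time.monotonic() if now is None else now
        first_time = client_id not in self.clients
        self.clients[client_id] = now
        # Forget clients idle for more than a second.
        self.clients = {c: t for c, t in self.clients.items() if now - t < 1.0}
        if first_time:
            return 0  # "VIP treatment" for never-seen clients
        demand = len(self.clients)
        if demand <= self.capacity:
            return 0  # below capacity: no wait needed
        # Spread the excess demand across the next second, in milliseconds.
        return int(1000 * (demand - self.capacity) / demand)

hints = WaitHints(capacity_per_sec=2)
print(hints.hint_ms("a", now=0.0))  # new client: 0
print(hints.hint_ms("b", now=0.1))  # new client: 0
print(hints.hint_ms("c", now=0.2))  # new client: 0
print(hints.hint_ms("a", now=0.3))  # returning, 3 active clients > capacity 2
```

A real version would also have to cope with clients that ignore the hint, which is where the deprioritization (or NOPE) in the comment comes back in.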
@AmanGarg95 5 years ago
This is great. Well done.
@AnhNguyen-vu7mc 6 years ago
This is a really helpful talk. Do you happen to have open-sourced the code?
@vajravelumani1827 a year ago
Instead of using the `number of concurrent users`, shouldn't we be using the `time taken to serve a request` as the deciding factor for increasing/decreasing incoming requests?
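Both signals are used in practice; Netflix's concurrency-limits library, for instance, derives a concurrency limit from observed latency. A toy latency-driven rule (the 1.5x threshold and 0.75 factor are illustrative assumptions) might look like:

```python
# Toy latency-based limit adjustment: additive increase while latency
# stays near a baseline, multiplicative decrease when it degrades.

def adjust_limit(limit, latency_ms, baseline_ms, floor=1):
    if latency_ms > 1.5 * baseline_ms:   # latency degraded: back off
        return max(floor, int(limit * 0.75))
    return limit + 1                     # latency healthy: probe upward

limit = 100
limit = adjust_limit(limit, latency_ms=40, baseline_ms=50)   # healthy -> 101
limit = adjust_limit(limit, latency_ms=120, baseline_ms=50)  # degraded -> 75
print(limit)
```

The structure is still AIMD; only the overload signal changes, from explicit rejections to rising request latency.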