Hey Arpit, I’m having some trouble finding the reference blog in the info card. Can you please help me? One thing that really caught my attention is how GitHub, despite being such a large organization, used a single global memcache server until they encountered actual issues. It’s pretty impressive that they didn’t chase the distributed systems trend for its own sake but stuck to the basic, simple setup until they truly needed to change. Also, regarding the Lua script, I recall (though not very clearly) that you mentioned in another snippet that if a command like EVAL, which can write, is fired on a Redis replica, it actually writes to the master. Please correct me if I’m mistaken.
@AsliEngineering · 4 months ago
1. Click on the info card on the video (top right), or scroll to the very end of the description.
2. I made that mistake once and said "server" instead of cluster. They had one global Memcached cluster (not a single server).
3. Yes. I could not catch this, but if you fire EVAL on a replica, Redis redirects it to the master. GitHub did not reveal their exact flow, but in the blog they did mention that the reads were happening from the replica. My guess is that the operation is broken into two parts: a read, and, if all is good, an update. The read can still go to the replica (through EVAL_RO instead of EVAL), and if the request is within the limit, the count reduction can go to the master through EVAL. I will be honest, this is my guess.

This was a nudge for me to think of it as the following pseudocode:
1. get and check the limit [from replica using EVAL_RO]
2. if exhausted: return error
3. else:
3.1 update the rate limiting counters in Redis using EVAL
3.2 create a new response object and set the rate limiting headers
3.3 perform the operation
3.4 if error in execution, return error response
3.5 else: return the response

This is my best guess based on the behaviour I have observed and the blog, since I see two scripts attached at the end of it. Once again, thanks for this question; it made me think harder about my understanding.
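To make that guess concrete, here is a minimal sketch of the two-step flow. It is a hypothetical illustration, not GitHub's actual implementation: it assumes Python with redis-py, Redis 7+ (for EVAL_RO), separate connections to a replica and the primary, and invented key names, scripts, and limits.

```python
# Rough sketch of the guessed two-step flow above; NOT GitHub's actual code.
# Assumes redis-py, a replica connection for reads and a primary for writes.
import redis

replica = redis.Redis(host="redis-replica", port=6379)  # assumed hostnames
primary = redis.Redis(host="redis-primary", port=6379)

# Read-only Lua script: report remaining quota and TTL without mutating state,
# so it is safe to run on a replica via EVAL_RO (Redis 7+).
CHECK_SCRIPT = """
local remaining = tonumber(redis.call('GET', KEYS[1]) or ARGV[1])
local ttl = redis.call('TTL', KEYS[1])
return {remaining, ttl}
"""

# Mutating Lua script: initialise the counter with an expiry on first use,
# otherwise decrement it. Must run on the primary via EVAL.
CONSUME_SCRIPT = """
local current = redis.call('GET', KEYS[1])
if not current then
  redis.call('SET', KEYS[1], ARGV[1] - 1, 'EX', ARGV[2])
  return ARGV[1] - 1
end
return redis.call('DECR', KEYS[1])
"""

LIMIT = 5000           # illustrative requests-per-window limit
WINDOW_SECONDS = 3600  # illustrative window length

def handle_request(user_id: str) -> dict:
    key = f"rate:{user_id}"

    # Step 1: get and check the limit from the replica using EVAL_RO.
    remaining, ttl = replica.execute_command("EVAL_RO", CHECK_SCRIPT, 1, key, LIMIT)
    if int(remaining) <= 0:
        # Step 2: exhausted, so return an error without touching the primary.
        return {"status": 429, "retry_after": max(int(ttl), 0)}

    # Step 3.1: still within the limit, so update the counter on the primary via EVAL.
    remaining = primary.eval(CONSUME_SCRIPT, 1, key, LIMIT, WINDOW_SECONDS)

    # Steps 3.2 onwards: set the rate limiting headers and perform the operation.
    return {"status": 200, "x-ratelimit-remaining": int(remaining)}
```

The point is simply that the check itself never needs the primary, so exhausted requests can be rejected purely from replica reads.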
@nehagour6928 · 4 months ago
@@AsliEngineering Thank you so much, Arpit. I really appreciate your explanation of EVAL; it is clear and easy to understand. Thanks again!
@rjarora · 5 months ago
It was a pretty bad design from the start, tbh. Why would they calculate a static field at runtime? It's an anti-pattern and counterintuitive.
@imhiteshgarg · 5 months ago
Out of all the issues that GitHub has faced, this was the silliest given their engineering standards!
@obaid5761 · 5 months ago
They should hire you bro. They'd never have another outage in the future 🎉
@SanjayB-vy4gx · 5 months ago
@@obaid5761 🤣
@Polly10189 · 5 months ago
Bugs are always silly, just like you!
@LogsofaCodingNomad-ns9us · 4 months ago
So basically, the real impact is on a user of the GitHub API whose use case needs several calls: the reset time keeps drifting, resulting in their valid calls getting rate limited, hence the frustration.
@rahul10615 · 5 months ago
Thank you for sharing, Arpit. Your content is awesome!
@PriestCoder · 4 months ago
Sir, I want to know how you gained such in-depth knowledge of these topics, and also what you learned in your early days while exploring engineering. On your channel I only find ultra-advanced topics that I had never even heard of before landing here.
@himeshrupareliya7477 · 4 months ago
Non-tech question: what app/device do you use to make these notes and do the markups?
@adityakirankorlepara4500 · 5 months ago
They could have also stored this static value in Redis, right? Why store it in a DB?
@AsliEngineering · 4 months ago
That is what they are doing; I mentioned it in the video as well. The reset-at is stored as another key in Redis for every rate limiting key.
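For anyone wondering what that looks like, here is a tiny sketch of the idea. This is my reading of it rather than GitHub's published code, and it assumes Python with redis-py plus an invented key naming scheme and window length.

```python
# Sketch: pin the reset timestamp as its own key, written once per window,
# instead of recomputing now + TTL on every request. Not GitHub's actual code.
import time
import redis

r = redis.Redis()

WINDOW_SECONDS = 3600  # illustrative window length

def reset_at(rate_key: str) -> int:
    """Return the pinned reset timestamp for this rate limiting key."""
    reset_key = f"{rate_key}:reset_at"
    # Only the first request in a window writes the value (NX); it carries the
    # same expiry as the counter, so both keys vanish together.
    r.set(reset_key, int(time.time()) + WINDOW_SECONDS, ex=WINDOW_SECONDS, nx=True)
    stored = r.get(reset_key)
    return int(stored) if stored is not None else int(time.time()) + WINDOW_SECONDS
```

Every response in the window then echoes the same stored value in the reset header, so it no longer wobbles with network latency.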
@adarshshete7987 · 4 months ago
@aditya Redis is nothing but a DB in this case; I think you are confusing "DB" with a traditional relational DB.
@dhruvsolanki4473 · 4 months ago
Time wobbling, seems like Interstellar. 😅
@anycat16 · 5 months ago
great explanation
@anycat16 · 5 months ago
great quality video
@ArnabBasu-r5m · 3 months ago
Hi, maybe a stupid question: what if the network transfer time were subtracted at the API server before sending back the reset time?

reset_time = api_server.current_time() - network_transfer_time (between the API and the Redis server) + TTL

This way the extra storage footprint of the reset time could be avoided. Your thoughts?
@shaikmastan1780 · 9 days ago
It would work if all the timestamps and time durations were of the same precision. However, we cannot control the precision of the network latency, and because of this difference in precision you can always come up with an edge case.
@ArnabBasu-r5m · 9 days ago
@ agreed!
@rupeshagarwal6487 · 15 hours ago
There will be an edge case here as well. Considering the same example: if the request reaches the API server at 1001.9 and takes 100 ms to reach the Redis server, the timestamp when it arrives at Redis will be 1002. The TTL calculated at that point will be 3 seconds. If it then takes another 100 ms to return, the time at our server will be 1002.1. Subtracting the network latency gives us 1002.1 - 0.2 = 1001.9, and the calculated header will be 1001.9 + 3 = 1004.9, which is still off from the actual reset time (1005 in this example).
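Replaying that arithmetic in code (illustrative numbers only; the latencies and the assumption that the key actually expires at 1005 are made up for the example):

```python
# Replaying the edge case above with illustrative numbers.
request_at_api = 1001.9        # request hits the API server
to_redis = 0.1                 # assumed API -> Redis latency (100 ms)
from_redis = 0.1               # assumed Redis -> API latency (100 ms)

at_redis = request_at_api + to_redis       # 1002.0
ttl = 3                                    # whole-second TTL Redis reports at that instant
back_at_api = at_redis + from_redis        # 1002.1

network_latency = to_redis + from_redis    # 0.2
reset_header = (back_at_api - network_latency) + ttl

# Prints 1004.9: still not the 1005 the key is assumed to expire at, because
# the whole-second TTL cannot carry the fractional offset between the API
# timestamp and the instant Redis measured the TTL.
print(round(reset_header, 3))
```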
@mohitvachhani1133 · 5 months ago
Chances of this getting missed in integration testing...
@prasenjitgiri919 · 4 months ago
I did not get the 1005 and 1006 part. Time has passed, so what is the challenge?
@AsliEngineering · 4 months ago
The value should have remained the same throughout but it did not.
@prasenjitgiri919 · 4 months ago
@@AsliEngineering No, I get that, and I appreciate your time and effort, but something is off in my understanding. Is it because time is decaying (I mean the Redis TTL), so no matter what the latency is, the overall value should be lower than on the previous trip? I hope I understood that, but somehow it is still foggy. I have a lot of reading ahead of me. Thanks again, sir!
@vuleanh7647 · 4 months ago
@@prasenjitgiri919 there will be an issue if you are downstream and need to use the `reset` field, but for a regular user, it's not a problem.
@nikhilmugganawar · 4 months ago
Any reference link for readout?
@AsliEngineering · 4 months ago
Check the info card of the video.
@nikhilmugganawar · 4 months ago
Unfortunately I am not able to find the info card, or perhaps I don't know where to look for it. Could you please share the link here or add it to the description?
@AsliEngineering · 4 months ago
@@nikhilmugganawar The info card pops up at the end; not sure why it is not loading for you. You can also find it via the description: open it and scroll to the bottom, and the link is attached there.
@Polly10189 · 5 months ago
Thanks @Arpit for the video. However, what actually was the impact of this bug? What problem did it actually create?
@AsliEngineering · 4 months ago
Clients (API users of GitHub) were getting inconsistent values and their downstream systems were getting affected. Not a large set of users, but people who handle throttling smartly, instead of adding random sleeps, were affected by this wobble.
@gradbharath · 5 months ago
pls speed up explanations
@J0Y22 · 5 months ago
You can speed up the video playback from your end <3
@AsliEngineering · 4 months ago
After a long time I spoke slowly to keep the explanations natural. People commented the opposite on other videos 😅 But yes, you can always increase the playback speed.