Comments
@RohitKumar29 5 hours ago
Great content Jordan! Really enjoyed it. I kindly request you to also cover an online bidding platform (like eBay) in one of your videos. Thank you!
@sgulugulu19 6 hours ago
Thanks for the videos, and especially for making the 2.0 playlists. Been a follower since Dec last year and was able to get offers from Amazon and Google. Would have liked to thank you in person if you were still at G. 😄 Keep up the great stuff, man!
@jordanhasnolife5163 6 hours ago
Congrats man!! You're an absolute legend! And have fun at Google!
@khushalsingh576 12 hours ago
Great video, and the line at 07:54 ("Fortunately there are engineers who have no life ... 😂😂") added a practical touch.
@user-ee7oi8qv7f 16 hours ago
So basically we can have both implementations of the hash map, one in memory (with its space constraints) and one on disk, but neither supports range queries, and that's why we need B-trees. B-trees are disk-only, since the data could be huge; they solve the range-query issue, but the problem is that writes could be expensive if we have to perform a (possibly recursive) B-tree split. That is what I understood so far; let me know if my understanding is correct. Also, thanks a lot for this series, really helpful. Just one more question: do we only maintain a WAL when we have an in-memory implementation of an index, or do we maintain the WAL regardless?
@jordanhasnolife5163 7 hours ago
I think just about all DB indexes use some sort of write-ahead log. What you said is basically correct, but the main reason B-tree indexes are good for range queries is that similar keys are next to each other on disk. Also, in reality, a lot of in-memory caching of B-tree pages is done.
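For intuition, here's a minimal Python sketch (not a real B-tree; a sorted list stands in for the sorted pages a B-tree keeps on disk) of why ordered keys make range queries cheap: binary search to the start of the range, then scan sequentially. A hash index has no such ordering, so it would have to examine every key.

import bisect

# Sorted keys, standing in for the ordering a B-tree maintains on disk.
sorted_keys = [3, 7, 12, 19, 25, 31, 42, 58]

def range_query(keys, low, high):
    start = bisect.bisect_left(keys, low)   # binary search to the first key >= low
    result = []
    for key in keys[start:]:                # scan sequentially until we pass high
        if key > high:
            break
        result.append(key)
    return result

print(range_query(sorted_keys, 10, 35))     # [12, 19, 25, 31]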
@VIPINKUMAR-dr7vu 1 day ago
Can we implement a product clustering approach using Natural Language Processing (NLP) techniques to group similar types of products together on the same node, thereby optimizing search query performance?
@jordanhasnolife5163 1 day ago
Yep!
@sankalpsharma1755 1 day ago
Learned about binary search trees in 2016, and I'm learning about their real use cases in 2024 :O And yes, I'm old :/
@MultiCckk 1 day ago
Takes me 3 hours to understand and complete a 45-minute video, rip 😂
@akibali7123 1 day ago
Hi, you're delivering high-quality content; it's very unique and hits the core problems of each and every design. Can you please make a video on a collaborative editing tool like Excalidraw?
@jordanhasnolife5163 1 day ago
Thanks! Any reason in particular that you think Excalidraw is challenging? Unlike Google Docs, I imagine there aren't enough concurrent edits to make a single leader infeasible here.
@chaitanyatanwar8151 1 day ago
Thanks!
@daisuke.ryomen 2 days ago
Just started watching this series, till now it has been a lot of fun + a lot of learning!
@msebrahim-007 2 days ago
Question about adding elements after they have been removed (14:04): if a user adds "ham" 5 times to the set on the same node, what is preventing the set from containing 5 different instances of "ham" with unique IDs?
@jordanhasnolife5163 2 days ago
Nothing. You have multiple instances of ham now. On the front end though, we just tell the user that we have one instance of ham.
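A toy sketch of that behavior, in the style of an observed-remove set (the unique tags here are illustrative): every add creates a unique ID internally, a remove deletes only the IDs it has observed, and the front end collapses everything to a single "ham".

import uuid

class ObservedRemoveSet:
    def __init__(self):
        self.entries = {}  # element -> set of unique tags, one per add

    def add(self, element):
        self.entries.setdefault(element, set()).add(uuid.uuid4())

    def remove(self, element):
        # Removes only the instances (tags) observed so far; a concurrent add survives.
        self.entries.pop(element, None)

    def contains(self, element):
        return bool(self.entries.get(element))

cart = ObservedRemoveSet()
for _ in range(5):
    cart.add("ham")          # five adds -> five unique tags internally
print(cart.contains("ham"))  # True: the UI just reports "ham" once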
@tunepa4418 2 days ago
Thanks for the video. How do we get the top-k counts from a count-min sketch? I assume the count-min sketch is just for counting, not for getting the top k.
@jordanhasnolife5163 1 day ago
In theory, you can maintain an in-memory heap of size k with the counts of the top k elements as you compute each element's count using the count-min sketch.
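A rough sketch of how that could look (hash choice and sizes are arbitrary, and the small dict of candidates is a stand-in for a proper min-heap keyed on count):

import hashlib

class CountMinSketch:
    def __init__(self, width=1000, depth=4):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _buckets(self, item):
        for row in range(self.depth):
            digest = hashlib.md5(f"{row}:{item}".encode()).hexdigest()
            yield row, int(digest, 16) % self.width

    def add(self, item):
        for row, col in self._buckets(item):
            self.table[row][col] += 1

    def estimate(self, item):
        # Taking the minimum across rows limits the overestimate from hash collisions.
        return min(self.table[row][col] for row, col in self._buckets(item))

def approximate_top_k(stream, k=2):
    sketch, candidates = CountMinSketch(), {}
    for item in stream:
        sketch.add(item)
        candidates[item] = sketch.estimate(item)
        if len(candidates) > k:
            # Evict the smallest estimate so only about k candidates stay in memory.
            del candidates[min(candidates, key=candidates.get)]
    return sorted(candidates.items(), key=lambda kv: -kv[1])

print(approximate_top_k(["a", "b", "a", "c", "a", "b"]))  # e.g. [('a', 3), ('b', 2)]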
@medaliboulaamail6491 2 days ago
Hahahaha, deserved for flying Delta (I never set foot inside an airplane).
@jordanhasnolife5163 2 days ago
C'mon man, it wasn't even Spirit or Frontier!
@medaliboulaamail6491 2 days ago
@jordanhasnolife5163 If it was Spirit they would seat you on the jet engine; nonetheless, pick your poison.
@aniketpandey1913 2 days ago
Hey Jordan, I'm not able to understand the use of CDNs here. Are we going to store that 1-second chunk in the CDN? And why do we need to store metadata? Can you please clarify these doubts of mine?
@jordanhasnolife5163 2 days ago
Metadata basically just stores the link to the clip in S3, as well as its sequence number and resolution. The CDN is used as a globally distributed cache of the video clips.
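In other words, something along these lines (field names are made up for illustration):

from dataclasses import dataclass

@dataclass
class ChunkMetadata:
    video_id: str
    sequence_number: int   # position of this chunk within the video
    resolution: str        # e.g. "720p"
    s3_url: str            # where the encoded chunk actually lives

row = ChunkMetadata("vid123", 42, "720p", "s3://clips/vid123/42_720p.ts")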
@truptijoshi2535 2 days ago
Hi Jordan, can CDC have a single point of failure? If yes, how do we avoid it? Also, does CDC add extra latency?
@jordanhasnolife5163 2 days ago
In theory, Kafka, but I tend to assume that our Kafka cluster has replicas. CDC does make things slower, but in the cases where I use it I don't actually care about that latency (hence why I use it).
@seifeddinedridi4898 2 days ago
Great video Jordan. It's become a habit of mine 😄 to watch your content and study your system designs. Thanks, mate, for your work; I appreciate what you're doing.
@manishasharma-hy5mj 2 days ago
Hi Jordan, when performing writes in a B-tree, they are first written to the WAL, and then from the WAL they are inserted into the B-tree-like structure. Is that why writes are slower, since this must be a synchronous operation? Am I right?
@jordanhasnolife5163 2 days ago
Not sure what you mean by synchronous, but basically a write must be fully committed in the WAL for it to count; otherwise we throw it out.
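A minimal sketch of that rule, assuming a simple key-value store with a file-based log (a real engine's durability machinery is more involved):

import json
import os

def write(wal_path, index, key, value):
    record = json.dumps({"key": key, "value": value})
    with open(wal_path, "a") as wal:
        wal.write(record + "\n")
        wal.flush()
        os.fsync(wal.fileno())   # the write only "counts" once it is durably in the log
    index[key] = value           # then apply it to the in-memory / B-tree structure

def recover(wal_path):
    index = {}
    if os.path.exists(wal_path):
        with open(wal_path) as wal:
            for line in wal:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    break        # a torn, partially written record gets thrown out
                index[record["key"]] = record["value"]
    return index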
@SonOfTheSoil_1 2 days ago
The webcam is fine; I really don't care about video quality. Amazing content though, thanks.
@huguesbouvier3821 3 days ago
Great video! Thank you! One comment: the encoding server will have to write to 4 different places. Either:
- 2PC: bad
- Write into the metadata cache last: better
And we could have a cron job that cleans up failed jobs in the background?
@jordanhasnolife5163 3 days ago
So I'd say it really only has to write to two places (since the caches should hold content that is pulled in, not pushed in). To avoid two-phase commit, we can just write the metadata row once the S3 clip is uploaded. If for some reason there are some orphaned S3 files, that's no biggie.
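A sketch of that write ordering, with made-up client objects standing in for S3 and the metadata store:

def publish_clip(object_store, metadata_db, clip_bytes, video_id, seq, resolution):
    # 1) Upload the clip first. If this fails, nothing references it and we just retry.
    key = f"{video_id}/{seq}_{resolution}.ts"
    object_store.put(key, clip_bytes)                # hypothetical client API

    # 2) Only then write the metadata row that makes the clip visible to readers.
    #    If we crash between steps 1 and 2, we are left with an orphaned S3 object,
    #    which a background cleanup job can delete later; no two-phase commit needed.
    metadata_db.insert(video_id=video_id, sequence_number=seq,
                       resolution=resolution, s3_url=f"s3://clips/{key}")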
@jalaj6253 3 days ago
You have used the same DB type (columnar) but a different underlying DB for chunk metadata and the chat DB. I guess the reason is that you want the chunk metadata to have a consistent DB (HBase), while for the chat DB you care more about availability (Cassandra). Is this the right understanding, or are there other reasons as well?
@jordanhasnolife5163 3 days ago
Hey, I tried to explain this one more in the video. Cassandra: about as fast a write throughput as you're going to get (other than maybe something like Kafka and just using a log, which, now that I think about it, isn't the worst solution here). HBase: fairly fast for writes, but important for reads since we only want to access a couple of columns of our data at once. You call both of these columnar DBs; to my knowledge they both use column formats, but HBase uses column-oriented storage, which is a significant difference.
@mystica7284 3 days ago
toe reveal stream when?! great video btw!!
@jalaj6253 3 days ago
Your videos are one of the best sources for learning about system design. Appreciate your effort and consistency :)
@emenikeanigbogu9368 3 days ago
Start streaming on Twitch!!!
@jordanhasnolife5163 3 days ago
Feet incoming?!
@foxedex447 3 days ago
ill keep watchin ur videos till i die or u die
@jordanhasnolife5163 3 days ago
I appreciate that man! I already have no life though, so you've paid your debt. In all seriousness, once you get that job, no need to keep watching; go have fun and socialize :)
@foxedex447 3 days ago
@jordanhasnolife5163 I kinda watch 'em for fun at this point XDD
@InfiniteRabbitHole 3 days ago
Oh. So. Cool.
@jorgealonsogastelumgonzale2870 3 days ago
Amazing video!
@scottmangiapane 3 days ago
Babe wake up, new system design just dropped
@jordanhasnolife5163 3 days ago
She may still be sleeping
@scottmangiapane 3 days ago
@jordanhasnolife5163 That's OK, the system is fault-tolerant. I will eventually achieve consistency via an air horn.
@jordanhasnolife5163 3 days ago
@scottmangiapane Lol, I see you're the primary in your relationship then, and you're asynchronously replicating your sleep status to her.
@scottmangiapane 3 days ago
@jordanhasnolife5163 Oh yeah. And if she still won't wake up, I'll implement sharting, I mean sharding ;)
@devops_junkie9203 3 days ago
Ah, this is amazing. I have some junior developers that I am training on MS architecture; we are now at the integration point and wanted to see which option is better for our case. It seems we might be using both. Thanks.
@shobhitarya1637 3 days ago
Do NoSQL databases use the same mechanism, i.e., a WAL or logical replication log, to replicate data to other nodes, or is that just applicable to SQL databases?
@jordanhasnolife5163 3 days ago
I imagine this would be database-specific, but I don't see why they wouldn't.
@sergiuchiuchiu6692 3 days ago
@2:20 Your information is wrong there. I think you wanted to say that there are partition keys (at least one) and clustering keys (0 or more); together they form the primary key. Please review the video, as it is misinforming thousands of people.
@jordanhasnolife5163 3 days ago
Oops, typo on my part. If this were a bigger deal I'd revise the video, but I don't think anyone is losing their job over using the wrong terminology for Cassandra key names.
@ganesansanthanam-5896 3 days ago
Please accept me as your mentee
@jordanhasnolife5163 3 days ago
I'm sorry man, I'm pretty pressed for time these days; perhaps you could find one amongst my other gigachad viewers, or go asking on Blind/LinkedIn.
@tunepa4418 4 days ago
Why does a rider need to be connected to a matching service close to its location using the geohash load balancing? I am quite confused. Can you please clarify?
@jordanhasnolife5163 3 days ago
The server itself doesn't need to be physically near the rider's location, but it should be responsible for the area of the map that the potential rider and all nearby drivers would connect to.
@muven2776 4 days ago
Good Video - Too honest - Great content - Injecting the content like a slow poison
@meenalgoyal8933 4 days ago
Hey Jordan! Thanks for the video. I have a question about the part where the Flink consumer sends the message to the other users in a group chat. The consumer checks with the load balancer to learn which hosts the other users are connected to, and then the consumer sends the message to those hosts. But how does the communication between the Flink consumer and the chat service host happen, so the consumer can hand those messages to the host to send on to the users?
@jordanhasnolife5163 4 days ago
The chat server can just expose an HTTP endpoint that the Flink node can hit.
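For example, something as simple as this on the Flink side (the endpoint path and payload shape are made up):

import json
import urllib.request

def forward_message(chat_server_host, message):
    # The Flink consumer POSTs the message to whichever chat server the recipient
    # is connected to; that server then pushes it over the user's websocket.
    req = urllib.request.Request(
        f"http://{chat_server_host}/internal/deliver",
        data=json.dumps(message).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status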
@thenamestails7152 4 days ago
How about: are you by any chance a C++ program? Cuz

#include <iostream>
#include "Rizz.h"
using namespace std;

int main() {
    if (Rizz::wannaDate == true) {
        Rizz::isPregnant = true;
    } else {
        // there is no else
    }
    return -1; // yeah, I'm a bad guy :sunglasses:
}
@thenamestails7152 4 days ago
"1+1 equals 10, if you know what I mean 😏😏"
@harshchiki7796 4 days ago
I did not follow the range query across geohashes in the slide where we were calculating the nearest points (the slide with the diagram of boxes). Geohashes don't have a mutual ordering, or do they? (If you don't mind, can you share a bit on that?)
@jordanhasnolife5163 4 days ago
You're right that there's no direct ordering, but for a given geohash I do know the 8 hashes of the surrounding boxes, so I can just check all of those with a Pythagorean equation to verify things are within the correct distance from my focal point.
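Roughly like this, assuming you already have a lookup from geohash cell to drivers and a precomputed map of each cell's 8 neighbors (the distance math is an equirectangular approximation, which is what the Pythagorean check amounts to at city scale):

import math

def close_enough(focal, candidate, radius_km):
    lat1, lon1 = focal
    lat2, lon2 = candidate
    km_per_deg = 111.0
    dx = (lon2 - lon1) * km_per_deg * math.cos(math.radians((lat1 + lat2) / 2))
    dy = (lat2 - lat1) * km_per_deg
    return math.hypot(dx, dy) <= radius_km

def nearby_drivers(focal, focal_cell, drivers_by_cell, neighbors, radius_km):
    # Check the focal geohash cell plus its 8 surrounding cells, then filter by distance.
    candidates = []
    for cell in [focal_cell] + neighbors[focal_cell]:
        candidates.extend(drivers_by_cell.get(cell, []))
    return [d for d in candidates if close_enough(focal, d, radius_km)]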
@msebrahim-007 4 days ago
Question about the example at 9:55 where both leaders now have (Key: Jordan, Value: cute | scary). If a user reading from leader 1 is prompted to choose and picks "cute", whereas another user reading from leader 2 is prompted and chooses "scary", then we end up in a scenario where leader 1 has "cute" and leader 2 has "scary". Hence, there is a conflict. My question is: when a user is prompted and chooses a value, does the version vector increment for that leader?
- If so, I presume that when the leaders share their version vectors again we end up back in the same situation we started with and store the siblings.
- If not, how do we go about resolving this conflict? Or perhaps this situation doesn't happen at all and I'm overthinking this?
@jordanhasnolife5163 3 days ago
1) Yes, let's increment the version vector. 2) Yes, we'd get siblings again, lol; hopefully this doesn't happen too frequently and the two leaders have time to synchronize.
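A toy version-vector comparison showing why the two picks come back as siblings (leader names and counters are illustrative):

def compare(vv_a, vv_b):
    # Returns "a_newer", "b_newer", "equal", or "concurrent" (i.e., siblings).
    keys = set(vv_a) | set(vv_b)
    a_ahead = any(vv_a.get(k, 0) > vv_b.get(k, 0) for k in keys)
    b_ahead = any(vv_b.get(k, 0) > vv_a.get(k, 0) for k in keys)
    if a_ahead and b_ahead:
        return "concurrent"
    return "a_newer" if a_ahead else ("b_newer" if b_ahead else "equal")

# Both leaders accepted a conflict resolution locally and bumped their own counter:
leader1 = {"L1": 2, "L2": 1}   # a user picked "cute" on leader 1
leader2 = {"L1": 1, "L2": 2}   # another user picked "scary" on leader 2
print(compare(leader1, leader2))  # "concurrent" -> we store siblings again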
@slover4384 4 days ago
LSM tree is the name the industry gives to the entire structure (the in-memory BST plus the different SSTables). The in-memory BST part is called the "memtable". That is, memtable + SSTables == LSM tree.
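A toy illustration of that naming, where a dict stands in for the balanced tree and compaction is omitted:

class TinyLSM:
    # memtable (in memory, sorted on flush) + SSTables (immutable sorted runs) == LSM tree
    def __init__(self, memtable_limit=4):
        self.memtable = {}     # stand-in for a balanced BST / skip list
        self.sstables = []     # each flush produces one sorted, immutable run
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.sstables.insert(0, sorted(self.memtable.items()))  # newest run first
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in self.sstables:      # newest to oldest
            for k, v in run:           # a real SSTable would binary search / use bloom filters
                if k == key:
                    return v
        return None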
@FaisalAnees46 5 days ago
Awesome video, man! Super quick question: towards the end, in your secondary global index where the shards are partitioned by height, why would Dwyane's 6'3" height get hashed to the second shard? Isn't the hashing happening on the height value, and because of that wouldn't Dwyane hash to shard 1?
@jordanhasnolife5163 3 days ago
So the point is that the height index is the secondary one. His primary hash puts him on shard 2, but his secondary hash puts him on shard 1, and now we need a two-phase commit.
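A tiny sketch of why the two shard assignments can differ (the hash routing is just illustrative; Python salts string hashes per process, whereas a real system would use a stable hash):

def shard_for(value, num_shards):
    return hash(value) % num_shards

player = {"name": "Dwyane", "height": "6'3\""}
primary_shard = shard_for(player["name"], 2)      # the row itself lives here
secondary_shard = shard_for(player["height"], 2)  # the global height-index entry lives here

if primary_shard != secondary_shard:
    # Writing the row and its index entry now spans two nodes,
    # which is what pushes you toward two-phase commit.
    pass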
@slover4384 5 days ago
If you are using a hash index to map keys to the locations of key-values on disk, why do you need a write-ahead log too? That's 3 writes each time (one in memory to the hash index, two on disk). I don't know of any database that does this. You can always recover the hash index for the active segment of the database on restart by reading the actual segment of key-values from disk. For the older, inactive (i.e., read-only) segments, the database stores a snapshot of the index for that segment onto disk when the segment goes into read-only mode.

The write-ahead log is useful if you are storing actual data values in memory only, without the on-disk key-values. In general, the write-ahead log for any database, even relational databases, stores changes that were made in memory but were not yet made to disk. Similar for LSM commit logs. If a change was made to disk, there is little value in also tracking it in a WAL or commit log.
@jordanhasnolife5163 3 days ago
What about atomicity of transactions...
@slover4384 20 hours ago
I don't know of a database like the one we are discussing that has atomic transactions, though there may be some. I mean a database that immediately stores the key-value on disk with an in-memory hash index pointing to offsets on disk.

(Note: there are some well-known databases that store sorted key-value pairs on disk as SSTables and provide ACID transactions, but I don't mean those. Note: Cassandra is a database that uses SSTables but doesn't/didn't support typical transactions. But since it buffers entries in memory before disk, it has a subset of a WAL that is just the "redo" entries. Some call that log file a commit log to make it clear that it isn't a full-blown WAL, but I usually call it a redo log. If Cassandra started supporting ACID transactions, it could possibly repurpose the commit logs and make them complete WALs. So my initial comment to you really should have said I was referring to that redo part of a WAL, which I think is what we were discussing in the video. That isn't really needed if you are going to persist all changes to disk immediately.)

If such a database were created, where hashed key-values are immediately written to disk instead of being buffered in memory for a while, and you wanted to implement atomic transactions, you could use a stripped-down version of a WAL that just has the undo entries. I would call this an undo log. That's still a subset of a typical WAL, but most people would not object to an undo log being called a WAL so long as context makes it clear. You made a great point above though, which is forcing me to write these thoughts down.

Now, _if_ we were to introduce a WAL (i.e., with redo entries in it) to a database with immediately persisted hashed key-value pairs, the logical step is to actually not write things to disk immediately. That is, you want to spend some time buffering the hashed key-value pairs in memory, with the redo entries going into the WAL first in case the power gets cut.

TMI: a WAL is really a combined redo log and undo log. 1) Redo logs provide the durability in ACID. They are not in general needed if writes are persisted to disk immediately, because we already get durability when we persist writes; adding a redo log lets us build performance-boosting features into the database that amount to buffering recently modified items in memory. 2) Undo logs provide the atomicity in ACID. They are used to aid in aborting or rolling back partially committed transactions.
@jordanhasnolife5163 7 hours ago
@slover4384 Thanks for your detailed response, I appreciate it!
@ganesansanthanam-5896 5 days ago
I would love to be mentored by you. I am an international student who's struggling to find a job.
@jordanhasnolife5163 3 days ago
Hey man! Wishing you the best. Unfortunately, unlike during my time at Google, I'm now a bit short on time, so I don't think I have the availability to mentor at the moment :(
@ganesansanthanam-5896 5 days ago
I would love to be mentored by you. Hey, is there a Discord or some community? My brain is melting and I am struggling... I feel like an idiot.
@ganesansanthanam-5896 5 days ago
I am an F1 student, got laid off, and would love to connect with you and get some guidance on how to improve. I would love to be mentored by you...
@ganesansanthanam-5896 5 days ago
I would love to be mentored
@soumik76 5 days ago
Hi Jordan, if a DAG update isn't needed (as in, it's a simple cron job), does the executor directly update the schedules table, since there won't be CDC in this case?
@jordanhasnolife5163 3 days ago
Seems reasonable to me
@collinmonahan3428 5 days ago
I wanted to know more about your beef with Tech Lead. 😀
@jordanhasnolife5163 5 days ago
I actually found Tech Lead pretty entertaining until I realized he wasn't being ironic.
@wil2200 5 days ago
Solid side job (id =14)