28: Bidding Platform (eBay) | Systems Design Interview Questions With Ex-Google SWE

7,987 views

Comments

@hotnonsense5892 5 months ago

Hey Jordan, just wanted to say I just landed an E5 role at Meta and your videos were a big part of helping me get past the system design interview. It wasn't the only resource I used, but it was definitely a huge help, especially early on in my study when it helped me fill a lot of holes in my knowledge. Thanks so much for producing these videos!

@jordanhasnolife5163 5 months ago

Woo!! Congrats man and good luck in the new role!

@aishwaryakala9653 5 months ago

Congrats! Mind sharing your resources and learning paths so that they can help others as well?

@hotnonsense5892 5 months ago

@@aishwaryakala9653 Hello Interview blogs and videos are awesome. They also help with the logistics of how to structure and pace the interview. I watched System Design Fight Club videos, but they are so unfocused and long-winded that you really have to watch them at high speed and skip over large parts. DDIA is a great read. Although I can't directly attribute anything I did in any interview to it (there usually isn't enough time), I do think it helped reinforce and internalize a lot of ideas I'd only skimmed. Kleppmann is also a great, engaging writer. I read Alex Xu's books and... idk, personally I kind of hated them and found some of his solutions very questionable, but maybe that's just me. The writing isn't that great and some of it is a real slog to get through. It seemed to me like he didn't get anyone to proofread or edit them. A lot of people like them though. Even material that is bad can still help, because reading it stimulates your mind to think about what's wrong with it and come up with your own better solution. More important than any specific resource... practice! At least for Meta, you have only 35 to 40 minutes after accounting for introductions and questions at the end. Write down checkpoints/milestones for how many minutes you can spend on each step of your system design and practice consistently hitting them. Being able to come up with a great design in 2 hours is pointless because they will cut you off long before then. I practiced with a random design prompt every day and a timer and made sure I was hitting those checkpoints. Also consider that in the real interview you will probably get interrupted sometimes. And if you can afford it, consider some paid mocks.

@hotnonsense5892 5 months ago

@@aishwaryakala9653 Hello Interview blogs and videos are great. They also help with the logistics of how to structure and pace the interview. I also watched various other random YT videos on topics I felt I didn't understand. I watched System Design Fight Club vids, but they are so unfocused and long-winded that you really have to watch them at high speed and skip over large parts. DDIA is awesome. Although I can't directly attribute anything I did in any interview to it (there usually isn't enough time), I do think it helped reinforce and internalize a lot of ideas I'd only skimmed. Kleppmann is also a great, engaging writer. I read Alex Xu's books and... idk, personally I kind of disliked them and found his understanding of some systems a bit questionable, but maybe that's just me. The writing isn't always great and it's a slog to get through some parts. He needed better editing and proofreading imo. A lot of people like them though. Even material that you don't agree with can still help, because reading it stimulates your mind to think about what's wrong with it and come up with your own better solution. But more important than any specific resource... practice! At least for Meta, you have only 35 to 40 minutes after accounting for introductions and questions at the end. Write down checkpoints/milestones for how many minutes you can spend on each step of your system design and practice consistently hitting them. Being able to come up with a great design in 2 hours is pointless because they will cut you off long before then. Practice with a random design prompt every day and a timer and make sure you're hitting your time checkpoints. Also consider that in the real interview you will probably get interrupted sometimes. And if you can afford it, consider some paid mocks.

@ronakshah725 25 days ago

Hey, this was a helpful comment! I got E5 M too recently, and timing is the name of the game. I've been through the same resources and agree with your opinions on them. I did one additional thing to really get better, which was super random: I stumbled upon a Cassandra deep-dive book and went through the first 150 or so pages. This was not specifically to learn Cassandra, but I'd say that's a side effect. The main takeaway is how a production-level, highly distributed system handles topics like fault tolerance and coordination, and the issues arising out of them. The way it's described in the book is just excellent from a writing perspective. This was more "foundational" knowledge around a lot of the concepts, and it has helped me a lot. So I would highly recommend it.

@mayuragrawal4352 3 months ago

Great in-depth video. I loved that it skips the unnecessary or commonplace parts of the design and focuses only on the critical parts: scale, concurrency, the ordering problem, and achieving atomicity without transactions. Thanks, Jordan. I watch only your videos for system design; no need to go anywhere else.

@jporritt 4 months ago

Hi Jordan. Kafka ordering is only guaranteed within a partition. To get Kafka to guarantee ordering, you therefore need to configure the topic to have only one partition.

@jordanhasnolife5163 4 months ago

Yeah, we partition by auction ID here

@jporritt 4 months ago

@@jordanhasnolife5163 Ah…sorry, yes!

@akshayyn 3 months ago

@@jordanhasnolife5163 Correct me if I am wrong, but we also need only one consumer, with a single thread, listening to a partition to ensure order, right?

@jordanhasnolife5163 3 months ago

@@akshayyn Typically in Kafka you're not interleaving which entries go to which consumer; each consumer tends to read all entries in its partition. But yes, technically events could be processed out of order if you did do that.
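A minimal sketch of the point made in this thread: when records are keyed by auction ID, every bid for one auction lands in the same partition, so per-partition ordering is enough. This is a plain-Python simulation, not Kafka client code. `NUM_PARTITIONS`, `partition_for`, and `route` are hypothetical names, and `zlib.crc32` is only a stand-in for the hash that Kafka's default partitioner actually uses.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # hypothetical partition count for the bids topic


def partition_for(auction_id: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # Stand-in hash (assumption): real Kafka clients use a different hash function,
    # but the property shown here is the same: equal keys map to equal partitions.
    return zlib.crc32(auction_id.encode("utf-8")) % num_partitions


def route(bids):
    """Append each (auction_id, amount) bid to its partition's log, in arrival order."""
    partitions = defaultdict(list)
    for auction_id, amount in bids:
        partitions[partition_for(auction_id)].append((auction_id, amount))
    return partitions


bids = [("a1", 10), ("a2", 5), ("a1", 12), ("a3", 7), ("a1", 15)]
partitions = route(bids)
a1_partition = partition_for("a1")
# All bids for auction a1 sit in one partition, in the order they were produced.
a1_bids = [amount for aid, amount in partitions[a1_partition] if aid == "a1"]
```

Different auctions may share a partition, but a single auction is never split across partitions, which is what the ordering argument needs.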

@MahdiBelbasi 2 months ago

Dude, this was awesome. Well done on the design and the explanation.

@Ambister 5 months ago

You look like Marv from Home Alone after the iron fell on his face. Love the vids tho

@jordanhasnolife5163 5 months ago

Been a rough week for me

@Keira77L-t3b 3 months ago

Great content as always, thanks! Some questions if I may: 1) On the state machine replication for the bid engine backup, isn't it almost like a two-phase commit? And because the user has to wait for the backup write/replication confirmation, how is it going to be fast, or at least faster than Kafka? If you put the same in-memory bid engine behind Kafka, isn't it going to be similar or even better? 2) On handling the hot bid: although we can scale the readers (getting the current bidding info), the bid engine, most likely a single machine, is still the bottleneck and arguably does most of the heavy lifting: acquire the lock, decide the winner, send to the backup/Kafka, etc.

@jordanhasnolife5163 3 months ago

1) Yes, it basically is a two-phase commit, except the message only has to get to Kafka (I'm using a successful write there as an indication that the write is durable), as opposed to being fully replicated in memory downstream. 2) Yeah, but there's not really a solution to this. If you need to choose an absolute ordering over multiple things, they've gotta go to one place. We could buffer in many different Kafka queues and have the bid engine poll at the rate it's able to, but then this whole process becomes asynchronous, and I had wanted it to be synchronous.

@ekinrf 3 months ago

@@jordanhasnolife5163 Not sure I fully get the argument for a synchronous process, though. As long as the throughput is high, and arguably an async architecture might be better for that here, user experience would not be affected?

@jordanhasnolife5163 3 months ago

@@ekinrf Perhaps not. Being "synchronous" is really an illusion over an asynchronous network anyways.

@youaresowealthy7333 4 months ago

If the interview is to design a read-heavy, write-light system, then sync is fine. But for heavy writes and reads, it can be dangerous/hard to sell this sync design. The sync part should do early rejection only, while the surviving bids need to be queued and eventually processed async by a single node. It's more like a flash sale problem, where rejection starts at the client/browser.

@jordanhasnolife5163 4 months ago

I agree that putting all bids in kafka and processing them asynchronously is the better way to go here.
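The split described in this thread (cheap synchronous early rejection, then asynchronous processing by a single node) can be sketched roughly as follows. This is a toy model under stated assumptions: `AuctionFrontend` and `drain` are hypothetical names, and a `deque` stands in for the Kafka partition. Early rejection is safe here because the highest bid only ever increases, so a bid that loses to a stale value would also lose to the current one.

```python
from collections import deque


class AuctionFrontend:
    """Synchronous edge: rejects obviously losing bids, queues the rest."""

    def __init__(self, last_known_highest: int = 0):
        self.last_known_highest = last_known_highest  # possibly stale hint
        self.queue = deque()  # stand-in for a Kafka partition (assumption)

    def submit(self, bidder: str, amount: int) -> str:
        if amount <= self.last_known_highest:
            return "rejected"  # cannot win even against a stale value
        self.queue.append((bidder, amount))
        return "queued"  # final outcome is decided later, asynchronously


def drain(frontend: AuctionFrontend, highest: int):
    """Single consumer: decides final outcomes strictly in queue order."""
    outcomes = []
    while frontend.queue:
        bidder, amount = frontend.queue.popleft()
        accepted = amount > highest
        if accepted:
            highest = amount
        outcomes.append((bidder, amount, accepted))
    frontend.last_known_highest = highest  # refresh the hint for later rejections
    return highest, outcomes


frontend = AuctionFrontend(last_known_highest=100)
statuses = [frontend.submit("a", 90), frontend.submit("b", 110), frontend.submit("c", 105)]
highest, outcomes = drain(frontend, 100)
```

Note the cost Jordan mentions elsewhere in the comments: a "queued" response is not a final answer, so the client has to learn the real outcome some other way.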

@soumik76 4 months ago

Hey, thanks for this. I was wondering if Redis would be a good choice to run on the bid engine, given it's atomic and single-threaded. Custom Lua scripts could hold the logic to push things into Kafka. Do you see a downside to this?

@jordanhasnolife5163 4 months ago

I think if you can make what you said work that seems pretty feasible!

@user-sx4wm5ls5q 8 days ago

Thanks for the great video. One quick question: how can a Kafka queue have multiple consumers, as at 28:36 for the "top-bid Kafka queue"? I heard one partition of Kafka can have only one consumer at a time. I guess we need to post the record into multiple partitions (partitioned by the server that will be subscribing?), or realistically, just change the last bit to an in-memory queue like pub/sub?

@jordanhasnolife5163 8 days ago

You can have many consumers on the same partition; you just can't do round robin within a partition.

@user-sx4wm5ls5q 8 days ago

@@jordanhasnolife5163 Ahh, thanks. I had a misconception about Kafka: I thought it was one consumer per partition. After seeing your reply, I found out that it's one consumer per partition within a consumer group, where each consumer group keeps its own offset. Thanks!!
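To make the consumer-group point concrete, here is a small in-memory model of one partition in which each group tracks its own read offset. It is only an illustration of the idea, not the Kafka client API; `PartitionLog`, `poll`, and the group names are hypothetical.

```python
class PartitionLog:
    """One partition: an append-only list plus one read offset per consumer group."""

    def __init__(self):
        self._records = []
        self._offsets = {}  # group name -> index of the next record to read

    def append(self, record):
        self._records.append(record)

    def poll(self, group: str, max_records: int = 10):
        # Each group advances independently, so every group sees every record.
        start = self._offsets.get(group, 0)
        batch = self._records[start:start + max_records]
        self._offsets[group] = start + len(batch)
        return batch


log = PartitionLog()
for amount in (100, 110, 120):
    log.append({"auction": "a1", "top_bid": amount})

first_for_notifications = log.poll("notifications")      # all three records
first_for_analytics = log.poll("analytics", max_records=2)  # first two records
rest_for_analytics = log.poll("analytics")                # the third record
```

Within a single group, a partition is read by one consumer at a time, which is why records are not round-robined inside a partition.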

@krupaharan9406 2 months ago

Thanks for the detailed overview. Quick question: do you think a Redis sorted set would help to find the top bids? It could even serve as a backup option. Any thoughts, please?

@jordanhasnolife5163 2 months ago

Well, we only need to know the top bid at a time, as opposed to the top k, which is when a sorted set would be useful.

@AkritiBhat 4 months ago

One quick question: how would we handle hot auctions near the deadline, when there are too many bids? In that case, we might need to have Kafka in front of our bidding engine.

@jordanhasnolife5163 4 months ago

Agreed, we lose any request response ability for our requests but what can ya do

@LUN-bo2fb 5 months ago

Jordan, how do you feel about the delivery framework suggested by a vast number of YouTubers? Functional and non-functional requirements -> back-of-the-envelope estimation -> list core entities -> API design -> high-level design -> deep dive. I found you have your own style of delivery, and it is still smooth.

@jordanhasnolife5163 5 months ago

I used to do this in the 1.0 playlist. I found that it wasted a lot of time for me, and I think that most people are pretty capable of figuring out the APIs that you need fairly quickly. The high level design to deep dive is something I've considered doing more, which is why I tend to have these overview slides. I don't think giving a high level design without some initial discussion first makes much sense to me.

@LUN-bo2fb 5 months ago

@@jordanhasnolife5163 Yes, I think the high-level design always takes a very similar form if you do it without discussing how you are going to handle and store data. I am still deciding whether I should draw a high-level diagram in my upcoming interview. I did this in a mock interview, and eventually the interviewer still asked me what the data model looks like and how I handle race conditions. I think putting the high-level design after presenting the data model, as a quick summary of the discussion so far, may be a good idea.

@nikhilm9494 5 months ago

Hey Jordan, Do you plan on bringing back the low level system design videos anytime going forward?

@jordanhasnolife5163 5 months ago

At least for the time being, I'm planning to stick with distributed systems design. That being said, I'm sure I'll eventually fall into a rut, and once I do I may revisit these!

@tobiadeoye1439 4 months ago

Very nice video, Jordan. One question I have is: how do we accurately restore the Auction state on the server if both the Bid Engine and Backup Engine go down?

@jordanhasnolife5163 4 months ago

MOAR BACKUPS. Beyond a point, everything can fail. There's no way to guarantee fault tolerance against everything, but hopefully we can within reason.

@manojgoyal-y3k 3 months ago

Hi Jordan, I believe that when a bid is processed, we keep the auction's price updated with the highest bid. What if we also keep the bid ID in the same auction table of the auction DB? If we also keep the auction DB replicated using leader-follower replication with a consensus algorithm, then we would not need the backup engine at all, even if the (stateless) bid engine goes down. Will this method work?

@jordanhasnolife5163 3 months ago

Yeah for sure. It just now takes a full consensus write to submit a bid, so your throughput goes down quite a bit.

@Ayshhg 2 months ago

Hey Jordan, why don't we use something like the transactional outbox pattern for writing to Kafka? Instead of the broker listening to the DB, I can have an outbox table whose job is to send the data to Kafka, and this will achieve the dual write with very high throughput.

@jordanhasnolife5163 2 months ago

Because then I have to write to disk first, which lowers my throughput. If this table is in memory, then by all means go for it.

@madhuj6912 17 days ago

Why can't we have multiple instances of the bid engine and a distributed lock using Redis (update the bid in progress)? Will that work? Maybe another basic question: is the backup the same instance as the bid engine?

@jordanhasnolife5163 15 days ago

For your first question, that's gonna lead to a lot of contention since they're all trying to update the same bid object (at least that's what I think you're trying to say). #2 - not sure what you mean here. They're different servers.

@madhuj6912 15 days ago

@@jordanhasnolife5163 Thanks for your reply. I meant, is it the same bid engine code deployed on a different server?

@jordanhasnolife5163 8 days ago

@@madhuj6912 Ah yeah basically, but it just pulls incoming events from Kafka
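A rough sketch of what "pulls incoming events from Kafka" could mean for the backup: it replays the ordered bid log through the same deterministic rule as the primary and ends up with the same state. The event shape `(sequence, bidder, amount)` and the function name `replay` are assumptions made for illustration, not details taken from the video.

```python
def replay(events, starting_price: int = 0):
    """Rebuild auction state by applying bid events in log order.

    Each event is a (sequence, bidder, amount) tuple (hypothetical shape).
    """
    highest, winner, last_sequence = starting_price, None, 0
    for sequence, bidder, amount in events:
        if sequence <= last_sequence:
            continue  # already applied, e.g. a redelivered record
        last_sequence = sequence
        if amount > highest:  # same rule as the primary: strictly higher wins
            highest, winner = amount, bidder
    return {"highest": highest, "winner": winner, "last_sequence": last_sequence}


events = [(1, "alice", 100), (2, "bob", 100), (2, "bob", 100), (3, "carol", 120)]
state = replay(events)
state_after_two = replay(events[:2])
```

Because the rule is deterministic, any replica that has consumed the same prefix of the log holds the same state. That is also why a backup must catch up on the log before it can take over.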

@rr5349 3 months ago

I think I missed it (or maybe I keep tuning out at the right time), but where is the actual collision case discussed, where two people submit the same bid price at the exact same time (and how is it determined who wins)? I understand the "we lock each bid coming in -> increment sequence -> ship to primary/backup engine for state, Kafka for source of truth" flow, but wouldn't the two concurrent writes get the same sequence ID? I guess what I'm asking is what the collision principle is there. Either way, phenomenal video. I'm having a great time binging these before my interview weh.

@jordanhasnolife5163 3 months ago

What does locking accomplish? We aren't "locking bids"; each bid must successfully grab a lock so we can assign it a sequence number.

@rr5349 3 months ago

@@jordanhasnolife5163 Ah, I see the difference there. So we are locking around the sequence number to avoid writing two of the same sequence IDs? I can see it from a logistical perspective, but where does the fairness aspect come into play? If we have two bids at the exact same time, then wouldn't the winner be whoever acquires the lock first? I guess if it isn't up to us to decide who wins the tiebreaker (for the sake of this exercise), and it's just that you get lucky and your bid is accepted before the other one because you got the lock first, then I get how it is expected to work. Thanks for the follow-up!

@jordanhasnolife5163 3 months ago

@@rr5349 Yeah I don't see how you can ever be "fair" unless one party decides which thing gets there first. You can't trust distributed timestamps.
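The tie-breaking behaviour discussed in this thread can be sketched as below: the lock serializes bids, each bid gets a unique sequence number, and of two equal bids the one that acquires the lock first wins. This is a minimal single-process illustration; `BidEngine`, `BidResult`, and the strictly-greater acceptance rule are assumptions, not code from the video.

```python
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class BidResult:
    sequence: int
    bidder: str
    amount: int
    accepted: bool


class BidEngine:
    def __init__(self, starting_price: int = 0):
        self._lock = threading.Lock()
        self._sequence = 0
        self._highest = starting_price
        self.log = []  # results in sequence order

    def submit(self, bidder: str, amount: int) -> BidResult:
        with self._lock:  # only one bid at a time is ordered and evaluated
            self._sequence += 1
            accepted = amount > self._highest  # assumed rule: must strictly beat the top bid
            if accepted:
                self._highest = amount
            result = BidResult(self._sequence, bidder, amount, accepted)
            self.log.append(result)
            return result


engine = BidEngine(starting_price=50)
threads = [threading.Thread(target=engine.submit, args=(name, 100)) for name in ("alice", "bob")]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Which of the two bidders ends up with sequence 1 depends on thread scheduling, which mirrors the "whoever gets the lock first" answer: the engine's arrival order is the only ordering that exists.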

@jordiesteve8693 4 months ago

In the 2nd pseudocode, you enqueue some bids into an in-memory queue and then ship them to Kafka, am I right?

@jordanhasnolife5163 4 months ago

Yep!

@golabarood1 4 months ago

Thanks, Jordan, for an amazing upload as always! A question from the last slide: what is the use of the Auction DB? Is it just for the bidding engine to read and write some state that it needs? It looks like only the bidding engine is interacting with it.

@jordanhasnolife5163 4 months ago

Yeah, we basically need it to write the end-of-auction result as well as to query for existing auctions if users want to bid.

@perfectalgos9641 5 months ago

At 31:28, why do we need a Bid Gateway behind the LB? I think it should be the other way around: the Bid Gateway, and then an LB for the Bid Engine.

@jordanhasnolife5163 5 months ago

Technically you'd want one for both. The bid gateway is likely going to be run by many servers which we can round robin to. Finding the right bidding engine will depend on the id of the auction.

@techlifewithmohsin6142 5 months ago

Love the content. Can you also make a video on YouTube Analytics, like video counts and watch time with no double-counted views in a given time period, extensible to new metrics for content creators?

@jordanhasnolife5163 5 months ago

I'd say this sounds a lot like the "top K" problem, but instead of getting the top K you compute it for all of them.

@techlifewithmohsin6142 5 months ago

@@tttrrrrr1841 Yeah, I saw this only on Leetcode. Can you add more details? You mean you used the video chunk_id to get the count, and its time_window can be multiplied by the count to get total watch time?

@techlifewithmohsin6142 5 months ago

@@tttrrrrr1841 You mean you used video segment_id events to aggregate, and then used video_id to perform those metrics operations at query time?

@techlifewithmohsin6142 5 months ago

@@tttrrrrr1841 Can you also tell what was the question you had in second round of SD?

@jordanhasnolife5163 5 months ago

@@tttrrrrr1841 lol funny enough I did not see your leetcode comment, this one was me I promise

@Av-cu6gm 5 months ago

Why do we have the auction DB in MySQL when we decided to use Kafka and time series as the source of truth?

@jordanhasnolife5163 5 months ago

This is just for the metadata of the auction itself, not the bids

@popricereceipts4279 5 months ago

So how does the bid engine actually determine the ordering? Like what is the actual logic?

@jordanhasnolife5163 5 months ago

You just grab a lock, increment the sequence number by 1, release the lock.

@Raymondhjc 5 months ago

Hey Jordan, do you plan to do an email system design?

@jordanhasnolife5163 5 months ago

If I think it has some aspects of it that are new then sure

@mdasifqureshi 5 months ago

One thing I wanna point out is that if the primary goes down, the backup server will have to catch up on all the Kafka messages in its partition before it can start serving requests. So we can have some unavailability, which is OK here since we are trading it for consistency. Another thing: since Kafka already guarantees ordering within a partition, I don't think we need the sequence numbers. Since the bid engine evaluates the bids serially in a critical section and also persists messages to Kafka in that critical section, the ordering of messages in Kafka will be consistent with the sequence numbers, making them redundant. @jordan let me know if I missed something.

@mdasifqureshi 5 months ago

On reviewing the pseudocode, it seems like we are writing to Kafka in a background thread. I don't think that's feasible, as it'll violate our durability guarantees. I think we have to persist to Kafka before we acknowledge a bid.

@jordanhasnolife5163 5 months ago

1) Totally agree that you'd have to wait for the backup to read all Kafka messages. You could mitigate this by sending messages to the backup first and then having it put them in Kafka, but there are tradeoffs there. The sequence numbers were just for the case where we publish to Kafka on another thread.

@jordanhasnolife5163 5 months ago

As for the separate Kafka thread thing, I registered a handler for when we get the ack at which point we return to the user. But then is it truly synchronous?

@mdasifqureshi 5 months ago

@@jordanhasnolife5163 Got it. But I wonder if there is any benefit to using a separate thread to publish to Kafka. If we use a single bg thread that reads from the in-memory queue and publishes to Kafka, I'd say let's just use the main thread, as publishing to Kafka is the bottleneck, so having multiple threads doesn't really improve response time to the client. If we use multiple bg threads that read from the in-memory queue and publish to Kafka (which eliminates the main bottleneck through parallelism), then the messages will be out of order in Kafka. Now, sequence numbers do come to the rescue here, but consider a scenario where I successfully published seq no 5 but the primary died before publishing seq no 4. Since the determination of whether seq no 5 was accepted/rejected depends on seq no 4, losing seq no 4 while seq no 5 is persisted will create inconsistency in our system. Interested to hear your thoughts on this.

@mdasifqureshi 5 months ago

@@jordanhasnolife5163 Got it. Though I am having trouble understanding what the benefit of having a separate thread to publish messages is. I don't think we can use multiple bg threads to publish to Kafka, because that would screw up ordering and can create inconsistency, like seq no 5 being persisted while the primary died before persisting seq no 4. And since seq no 4 was used in the determination of seq no 5's accepted/rejected status, this would lead to inconsistency. And if we're using a single bg thread, we might as well use the main thread, and publishing to Kafka will be the bottleneck in the bid processing. Using a single background thread just changes where we wait, i.e. do we wait before entering the critical section (main thread scenario) or after entering the critical section (bg thread scenario).
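To illustrate the concern raised here, the sketch below shows a consumer-side resequencer: it buffers records that arrive out of order and only releases a contiguous run of sequence numbers. The class name `Resequencer` and its interface are hypothetical. It also shows the failure mode described above: if sequence 4 is never published, sequence 5 stays buffered indefinitely.

```python
class Resequencer:
    """Releases events strictly in sequence order, buffering any that arrive early."""

    def __init__(self):
        self._next = 1      # next sequence number we are allowed to release
        self._pending = {}  # sequence -> event, for events that arrived early

    def receive(self, sequence: int, event):
        """Buffer the event and return every event that is now deliverable in order."""
        if sequence >= self._next:  # ignore anything already released
            self._pending[sequence] = event
        ready = []
        while self._next in self._pending:
            ready.append(self._pending.pop(self._next))
            self._next += 1
        return ready


r = Resequencer()
out_1 = r.receive(2, "bid-2")  # arrives early, held back
out_2 = r.receive(1, "bid-1")  # fills the gap, releases 1 and 2
out_3 = r.receive(3, "bid-3")  # in order, released immediately
out_4 = r.receive(5, "bid-5")  # held back until 4 shows up
```

Reordering on the consumer side therefore handles late arrivals but not losses, which is consistent with the argument for persisting to Kafka before acknowledging a bid.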

@firefly_3141 5 months ago

Hey Jordan your linkedin?

@jordanhasnolife5163 5 months ago

www.linkedin.com/in/jordan-epstein-69b017177?

@minhsiuhsieh1811 2 months ago

Thank you for the video! Quick question, at kzbin.info/www/bejne/aZK7Xp95asSDZ68si=cHbcNLOSx6Z6Rsik&t=1008: is there an option where the bid engine first persists the bid (accepted/rejected) into a database, and the database uses CDC to publish bids to all consumers, like Kafka does in your design? The database would be the source of truth.

@jordanhasnolife5163 2 months ago

Absolutely. But that introduces a disk to a problem where we otherwise don't have it. If this is acceptable latency for us, I think it's a significantly preferable solution.

@foxedex447 5 months ago

Bro exposing his company system design 💀💀

@scuderia6272 5 months ago

Has he worked at eBay?

@jordanhasnolife5163 5 months ago

No never lmao, I'm a bit confused here

@foxedex447 5 months ago

@@scuderia6272 no i meant like "he getting the system design from his company and putting it here", like he has a bidding system in his company and putting it here not on ebay XDD

@foxedex447 5 months ago

@TenFrenchMathematiciansInACoat GUYS ITS JUST A JOKE 😭

@scuderia6272 5 months ago

😂