RabbitMQ uses a push-based model: the consumer does not poll; RabbitMQ pushes the message to the consumer as soon as it arrives.
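(For anyone curious, a minimal sketch of that push model using the pika client; the broker address and queue name are assumptions for illustration.)

```python
import pika  # assumes a RabbitMQ broker on localhost; names here are made up

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="task_queue", durable=True)

def handle_message(ch, method, properties, body):
    # The broker pushes each message to this callback as soon as it is available.
    print(f"received: {body.decode()}")
    ch.basic_ack(delivery_tag=method.delivery_tag)

# Register the callback and hand control to the client library -- no explicit polling loop.
channel.basic_consume(queue="task_queue", on_message_callback=handle_message)
channel.start_consuming()
```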
@NimeshKumar2989 • 1 day ago
what is the software used here to create presentations and interact with Apple Pencil?
@VarunArora14 • 2 days ago
Love your channel mate. Very helpful for interview prep
@MuhammadIlyas-dr7eg • 2 days ago
First view 😀
@YoungChiao • 5 days ago
they got cooked on the Tyson fight
@jpmc1312 • 5 days ago
Here because of the Tyson fight as well - one provider in my area (Bell) had bad Netflix, one provider (Rogers) had good Netflix. Presumably something to do with how they're connected into the Netflix CDN...?
@ChoocoPunch • 10 days ago
Can I ask what a NoSQL database table schema should look like? Should it look like the one at 0:05?
@surenderthakran6622 • 10 days ago
You say consumers when you should be saying consumer groups when talking about Kafka fanout. Otherwise a nice explanation
@pradeepbhat1363 • 11 days ago
Great content. It's not clear how to fetch a user's tweets when the key is a snowflake ID and the data is distributed over multiple nodes.
@h.l.a.ondoua8514 • 11 days ago
What about emqtt
@deddallama • 11 days ago
Why not use CDC?
@vaishnavpureddiwar3091 • 12 days ago
really nice :), thanks
@pingqiu7318 • 14 days ago
Very good video! Thanks for sharing! One tiny thing: I would prefer NFS over a blob store like S3 for keeping the downloaded pages. A webpage references lots of resources, like JSON/CSS/JavaScript files, and bold/highlighted words matter more than plain text, which is important information for ranking. If we don't keep the CSS files, that information is lost, so we need to keep them together with the HTML file. Tracking all of those files for a single webpage in metadata would be very complex with S3. So I suggest just downloading the web page with everything into a folder on NFS and letting the indexer help itself.
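(A rough sketch of that per-page folder idea; the mount path and helper name are hypothetical, and fetching the linked CSS/JS assets is only noted in a comment.)

```python
import pathlib
import urllib.parse

import requests  # any HTTP client works; this one is assumed to be installed

def save_page(url: str, root: str = "/mnt/crawl") -> pathlib.Path:
    """Download a page into its own folder so the HTML and its assets can live together."""
    folder = pathlib.Path(root) / urllib.parse.quote(url, safe="")
    folder.mkdir(parents=True, exist_ok=True)
    html = requests.get(url, timeout=10).text
    (folder / "index.html").write_text(html, encoding="utf-8")
    # A real crawler would also parse the HTML and download the referenced
    # CSS/JS files into the same folder so the indexer can see styling cues.
    return folder
```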
@optimusdebugger9638 • 16 days ago
Nitpick: we lost the quantity column when we went from the model to the order_product table.
@ryan.aquino • 17 days ago
I'm still quite confused about why we prefer a queue + Flink (stream processing) instead of just ingesting data directly from the sources using Spark.
@saiyijinprince • 18 days ago
I think you missed a huge aspect of when to choose between SQL and NoSQL. At the end of the day, it's more about picking the right tool for the job and less about the scale of the data. If you need ACID transactions, use SQL. If you're okay with eventual consistency, then NoSQL might be easier. The scale of your data is just one minor aspect you need to consider when building a system. I'd argue you can scale both types of DBs just as much as each other, but it all comes down to what kind of trade-offs you're willing to make. It's not uncommon to have a combination of both NoSQL and SQL DBs; they are not mutually exclusive.
@arbazadam3407 • 19 days ago
I have faced issues while accessing data in nosql databases. 🥲
@ibrahimkoz1983 • 20 days ago
PostgreSQL offers the same power as MongoDB when it comes to JSON, so there's no reason to choose MongoDB even if you want to store your data in denormalized form.
@ibrahimkoz1983 • 19 days ago
@codingwithjamal nope. Postgres stores json as jsonb and it is blazingly fast.
@interviewpen • 5 days ago
From my understanding, postgres doesn’t have the same ability to select or filter on json fields. But regardless, using a solution that was designed from the ground up for this use case will generally result in a better experience. Thanks for the insight!
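(For readers following this thread, a small sketch of the jsonb selecting/filtering being debated, using psycopg2; the connection string and schema are placeholders.)

```python
import psycopg2  # assumes a reachable Postgres instance; credentials are placeholders

conn = psycopg2.connect("dbname=shop user=app password=secret host=localhost")
cur = conn.cursor()

# A denormalized table with a jsonb column (hypothetical schema).
cur.execute("CREATE TABLE IF NOT EXISTS orders (id serial PRIMARY KEY, doc jsonb NOT NULL)")

# ->> extracts a field as text, @> tests JSON containment; a GIN index on doc would speed this up.
cur.execute("""
    SELECT doc ->> 'customer' AS customer
    FROM orders
    WHERE doc @> '{"status": "shipped"}'
""")
print(cur.fetchall())
conn.commit()
```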
@soumenmondal7294 • 20 days ago
Hi @interviewpen, it's Diwali! Can I get a coupon code so that I can buy your system design course? Love from India.
@vivekgujari9118 • 20 days ago
I have a question. Every time a user logs in, the system generates a session token regardless of whether it is a good user or a bad user. How does this session token differentiate between the good and the bad user?
@interviewpen • 6 days ago
When we log in, our authentication system has to verify the user’s credentials and generate a session token. If the user doesn’t have the right credentials, the system will reject the user and will not generate a token. Hope that helps!
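(A bare-bones sketch of that flow; the user store and hashing scheme here are assumptions for illustration, not the video's implementation.)

```python
import hashlib
import hmac
import secrets

# Hypothetical user store: username -> salted password hash.
USERS = {"alice": hashlib.sha256(b"salt" + b"correct-horse").hexdigest()}
SESSIONS = {}  # session token -> username

def login(username: str, password: str):
    """Return a session token only if the credentials check out; otherwise None."""
    expected = USERS.get(username)
    supplied = hashlib.sha256(b"salt" + password.encode()).hexdigest()
    if expected is None or not hmac.compare_digest(expected, supplied):
        return None  # bad credentials: rejected, no token is ever generated
    token = secrets.token_urlsafe(32)
    SESSIONS[token] = username
    return token

print(login("alice", "wrong-password"))  # -> None
print(login("alice", "correct-horse"))   # -> a fresh session token
```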
@Fikusiklol • 22 days ago
Good stuff! However, what is also important apart from the schema is that relational databases have relatively slow writes/updates because of their index structure (B-tree) and storage mechanisms. Apache Cassandra, afaik, uses an LSM tree and memtables. MongoDB has async I/O journaling and also offers an LSM tree (or does it?) in addition to a B-tree. Personally, I don't like any relational DB, but "right tool for the right job" :)
@charliebitmyfinger7124 • 21 days ago
Relational databases are also ACID compliant and are by far the most reliable types of databases.
@Fikusiklol • 21 days ago
@charliebitmyfinger7124 NoSQL databases are also ACID compliant. It depends on what guarantees they provide, as every engine provides different guarantees.
@interviewpen • 21 days ago
Really great insights, thanks!
@ehm-wg8pd • 22 days ago
this is gold
@interviewpen • 21 days ago
Thanks for watching!
@goldenlin9528 • 22 days ago
Do most large-scale companies use NoSQL, then, in favor of ease of storing and accessing data?
@interviewpen • 21 days ago
Yep--companies that need to query massive amounts of transactional data typically use NoSQL databases for flexibility when distributing data across a large cluster.
@professionalsd9647 • 23 days ago
Am I missing something, or did he not design the model in NoSQL?
@ssefrhd • 23 days ago
Exactly my thought as well..
@interviewpen • 22 days ago
The schema shown is just an example to show the limitations of the relational model. Point is, NoSQL databases allow us to design around query patterns. The same logical data could result in very different schemas depending on the requirements or which database we choose. If you want to learn about non-relational schema design, you might want to look at our course on interviewpen.com :)
@mindrust2032 • 2 days ago
@interviewpen True, but I think I speak for most of us when I say we were expecting to see how you would design the NoSQL data model to solve the issues you mentioned, with regards to querying products in an order and getting the number of times a product appears across all orders.
@KENTOSI • 24 days ago
Hey, nice video, but I'm confused about one thing: you mentioned that the CDNs are caches themselves, but then we're adding an additional cache layer? This seems redundant, no? Even if you go with the requirement of keeping NY data close to NY users, why can't you still implement this algorithm at the CDN level?
@interviewpen • 22 days ago
The CDN hosts static content such as images, but we can't cache an API like that, since the data is dynamic. So we still need an API, but caching this dynamic data will lower latency. Thanks for watching!
@KENTOSI • 22 days ago
@interviewpen Got it, thanks for explaining.
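(A minimal cache-aside sketch for that kind of dynamic API data; Redis and the fetch_from_db helper are assumptions for illustration.)

```python
import json

import redis  # assumes a Redis instance at localhost:6379

cache = redis.Redis(host="localhost", port=6379)

def fetch_from_db(user_id: str) -> dict:
    # Stand-in for the real database query.
    return {"user_id": user_id, "feed": []}

def get_user_feed(user_id: str) -> dict:
    """Cache-aside: serve from the cache when possible, fall back to the database."""
    key = f"feed:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    data = fetch_from_db(user_id)
    cache.set(key, json.dumps(data), ex=60)  # short TTL keeps the dynamic data reasonably fresh
    return data
```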
@四伊柳 • 26 days ago
Amazing video, thanks a lot ❤
@interviewpen • 26 days ago
Thanks for watching!
@smaug9833 • 29 days ago
Great video! Would like to have seen NATS included in the comparison.
@guoard • 1 month ago
What you said is not correct for MinIO. MinIO features a data sharding concept that splits files into specified parts and stores each part on a separate drive.
@interviewpen • 26 days ago
Yep, we're not going into detail on erasure coding in this video, but we have content on this on interviewpen.com :)
@jreamscape • 1 month ago
thank you bro
@interviewpen • 1 month ago
of course :)
@Gallardo994 • 1 month ago
2:40 is kind of incorrect. RabbitMQ uses at-least-once delivery, as it requires messages to be acknowledged by the application (either automatically or manually after doing some work). This way, a failure of an app instance that processes a message but doesn't ack it, or just acks it too late, may result in double processing. The consumer still has to have some level of idempotency to keep the resulting data in a consistent state.
@interviewpen • 1 month ago
Yes! That is true of both types of systems. Thanks for the insight :)
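(A small sketch of the idempotent-consumer idea above, using pika with manual acks; the in-memory dedup set and the producer-set message IDs are assumptions purely for illustration.)

```python
import pika

PROCESSED_IDS = set()  # in production this would be a durable store, e.g. a database table

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()
channel.queue_declare(queue="orders", durable=True)

def handle(ch, method, properties, body):
    message_id = properties.message_id  # the producer is assumed to set a stable ID
    if message_id in PROCESSED_IDS:
        # A redelivery (e.g. a previous consumer died before acking): skip the work, just ack.
        ch.basic_ack(delivery_tag=method.delivery_tag)
        return
    print("processing", body)            # stand-in for the real business logic
    PROCESSED_IDS.add(message_id)
    ch.basic_ack(delivery_tag=method.delivery_tag)  # ack only after the work succeeded

channel.basic_consume(queue="orders", on_message_callback=handle)
channel.start_consuming()
```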
@cya3mdirl158 • 1 month ago
Have you ever heard about Hermes, which is used by Allegro, one of the biggest e-commerce companies in Poland?
@Otomega1 • 1 month ago
So does the BLOB system assume every node has the exact same storage capacity? Does it rehash if a node is already full and can't accept more data? What happens if a node or multiple nodes are down? What happens if a node has too many connections for reads? Does the network get congested, or does it load balance in a specific way? Are reads cached somewhere if some data is accessed far more often than the rest? If multiple clients fetch the same data at approximately the same time (in the same window of time), are both queries optimized to use fewer resources together?
@interviewpen • 26 days ago
If you want to learn more about these edge cases, check out our full video on interviewpen.com :D
@uzairsultan • 1 month ago
Which whiteboarding software do you use?
@interviewpen • 26 days ago
We use GoodNotes on an iPad. Thanks!
@sh0dan2 • 1 month ago
Very nice vid! Great job explaining something this complex in 5 mins! A minor note is that S3 clients don't know about the topology behind the scenes, so it is not the client but "something" in between. For MinIO it is the server itself that does the hash and reaches out to the nodes with the data and collects it to serve the request. But the principle is exactly as you described! (I work @ MinIO) Having the requester know the remote host topology makes the clients more complex, and also makes serving via reverse proxy/load balancing hard. But of course you avoid a forwarding hop, so pros and cons. Memcached and Aerospike do this, for example.
@interviewpen • 26 days ago
Yes--thanks for pointing out this nuance!
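(A toy illustration of server-side key-to-node placement as discussed above; this is a generic hashing sketch, not MinIO's actual erasure-coding layout.)

```python
import hashlib

NODES = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical storage nodes

def node_for_object(key: str) -> str:
    """Map an object key to a node deterministically, as a stand-in for real placement logic."""
    digest = hashlib.sha256(key.encode()).digest()
    return NODES[int.from_bytes(digest[:8], "big") % len(NODES)]

# Any server computes the same placement, so the client never needs to know the topology.
print(node_for_object("photos/cat.jpg"))
```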
@dimit9 • 1 month ago
It would be cool to have a video on the design of a URL shortener.
@interviewpen • 26 days ago
We have one, check out our systems end-to-end course on interviewpen.com!
@vikramj22 • 1 month ago
Thank you for a nicely detailed explanation of the mechanisms of Kafka.
@interviewpen • 26 days ago
Thanks for watching!
@OmkarMondkar-cw8gj • 1 month ago
What software is he using to draw?
@interviewpen • 26 days ago
We're using GoodNotes on an iPad
@ithaca2076 • 1 month ago
Please keep doing these system design videos where you go over a description of operations and explain the architectures behind them. This is great.
@interviewpen • 26 days ago
Glad you liked it!
@hyperborean72 • 1 month ago
Could you please explain how different Kafka consumers are assigned to one consumer group or another? I guess this is the key to creating a balanced model with multiple consumers in Kafka.
@interviewpen • 26 days ago
Sure, Kafka consumer groups just make sure that each partition has exactly one consumer (but a consumer can be connected to multiple partitions). There are various algorithms (such as round-robin) that can be used internally. Thanks!
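(A small sketch of joining a consumer group with the kafka-python client; the broker address, topic, and group name are placeholders. Every process started with the same group_id shares the topic's partitions.)

```python
from kafka import KafkaConsumer  # kafka-python client, assumed installed

# Each process started with the same group_id joins the same consumer group,
# and Kafka assigns every partition of the topic to exactly one of them.
consumer = KafkaConsumer(
    "orders",                        # topic name (placeholder)
    bootstrap_servers="localhost:9092",
    group_id="order-processors",     # the consumer group
    auto_offset_reset="earliest",
)

for message in consumer:
    print(f"partition={message.partition} offset={message.offset} value={message.value!r}")
```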
@hyperborean72 • 1 month ago
What exactly do you mean by "consumers in RabbitMQ have control over what messages they are consuming"? Whether they are bound to a direct queue or a topic, they always consume messages, don't they?
@interviewpen • 26 days ago
The difference we're pointing out is that RabbitMQ consumers are responsible for polling the queue and getting the next record, while in Kafka, the messages are already destined for a specific partition when they're produced.
@hyperborean72 • 1 month ago
Strange... AMQP (and RabbitMQ) has topics that let you distribute messages to multiple listeners as well. It was never mentioned.
@interviewpen • 26 days ago
Yes, this is done using exchanges--accomplishing this in RabbitMQ requires fanning out the message to multiple queues.
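(For reference, a minimal sketch of that fanout pattern in pika; the exchange and queue names are made up. Each bound queue receives its own copy of every published message.)

```python
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters(host="localhost"))
channel = connection.channel()

# A fanout exchange copies every published message to all queues bound to it.
channel.exchange_declare(exchange="events", exchange_type="fanout")

for queue_name in ("email-service", "analytics-service"):
    channel.queue_declare(queue=queue_name, durable=True)
    channel.queue_bind(exchange="events", queue=queue_name)

# Both queues (and hence both consumer services) receive this message.
channel.basic_publish(exchange="events", routing_key="", body=b"user.signed_up")
connection.close()
```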
@nader2560 • 1 month ago
Honestly one of the best videos on the internet for system design!
@interviewpen • 26 days ago
Thank you!
@daitedve1984 • 1 month ago
Go with Kafka - Kafka is tastier! 😆
1 month ago
Are we basically talking about queue vs. topic? ... Those are just implementations of each type, no?
@interviewpen • 1 month ago
A Kafka “topic” is essentially an abstraction around a group of individual queues (called partitions) that allows producers to distribute messages easily. Hope that helps :)
@Copainbig • 1 month ago
@interviewpen Yeah, in the concepts of Kafka, but in architecture terms, a "topic" is the role that Kafka serves as a whole here, while a "queue" is what RabbitMQ is here. So what is described here is the opposition between a "topic" (of which Kafka is an implementation, like SNS is too) and a "queue" (RabbitMQ, SQS…). No?
1 month ago
@interviewpen But it allows producers to spread messages across multiple consumers while making sure that each consumer can get each message at least once. This kind of component, in software architecture, is called a "topic". RabbitMQ never had this ambition, because it is actually a "queue".
@RandomDGuy21 • 1 month ago
I think it would be great to have a follow up video covering AWS IoT core
@interviewpen • 1 month ago
Ok, we’ll add it to the list.
@kamalrathod25 • 1 month ago
1:44 Haven't you heard of pub/sub in RabbitMQ? Saying that RabbitMQ delivers each message to only one consumer is such a wrong statement, dude... I implemented this exact use case 8 years ago with Rabbit.
@ocean645 • 1 month ago
Hi, what exactly is this subject? Is it data science?
@interviewpen • 1 month ago
This is system design-we’re considering what services and infrastructure to use to solve a high-level problem. Thanks for watching!
@jack.klimov • 1 month ago
I thought the whole idea of Google Drive and Dropbox was to focus on distributed storage rather than just using ready-made cloud solutions like S3. In my opinion, that's the most interesting aspect of such a task.
@interviewpen • 1 month ago
For sure :) We have other videos about how BLOB storage systems are designed on interviewpen.com
@jack.klimov • 1 month ago
@interviewpen Oh great! Thank you, I will have a look.
@TheRydde • 1 month ago
Hasn't everyone moved to Pulsar yet?
@PoRBvG • 1 month ago
Some parts were misleading and confusing, unlike the other videos. I'd suggest redoing this video to match the quality of the other videos :) Thanks for all the great content!
@interviewpen • 1 month ago
Could you elaborate on what you thought was confusing?