When to Use Kafka or RabbitMQ | System Design

129,728 views

Interview Pen

A day ago

Comments: 98
2 months ago
"If a consumer disconnects before committing the offset, Kafka will automatically send the data to another consumer." Actually, this part is a bit misleading, as Apache Kafka doesn't 'send' anything because it's a pull-based model which means when a new consumer is assigned to a partition, it will start reading data from the last saved offset for that partition. If the previous consumer didn't save its progress, the new consumer will start from the last saved point and might process some messages again.
@interviewpen 2 months ago
I can see how this was misleading. In terms of data flow, the message moves to an available consumer, but yes the consumer first must poll for new records. Traditional queues work this way as well. Thanks for clarifying :)
@rahulsangvikar7973 2 months ago
@interviewpen RabbitMQ is push-based: the broker delivers each message to a consumer of the queue it was routed to. That is different from how Kafka works.
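The pull-and-commit behaviour described at the top of this thread can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: `Partition`, `poll`, and `commit` are hypothetical names invented for the sketch, not the Kafka client API, and a real broker tracks far more state.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    """A toy append-only log with one committed offset per consumer group."""
    records: list = field(default_factory=list)
    committed: dict = field(default_factory=dict)  # group name -> next offset to read

    def poll(self, group, max_records=2):
        """Consumers pull: return records starting at the group's committed offset."""
        start = self.committed.get(group, 0)
        return start, self.records[start:start + max_records]

    def commit(self, group, offset):
        """Persist the position the group should resume from."""
        self.committed[group] = offset


partition = Partition(records=["m0", "m1", "m2", "m3"])

# Consumer A polls, processes two records, then commits its progress.
start, batch = partition.poll("logging")
processed_by_a = list(batch)
partition.commit("logging", start + len(batch))

# Consumer A polls again and processes m2 and m3, but crashes before committing.
start, batch = partition.poll("logging")
processed_by_a += batch

# Consumer B takes over the partition and resumes from the last committed offset.
start, batch = partition.poll("logging")
processed_by_b = list(batch)

print(processed_by_a)  # ['m0', 'm1', 'm2', 'm3']
print(processed_by_b)  # ['m2', 'm3']
```

Nothing is pushed to consumer B here. It simply asks for records from the last saved position, which is why `m2` and `m3` are processed twice.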
@maf_aka a month ago
I feel like what's not talked about enough in this kind of comparison is the tradeoffs:
1. Kafka is much more expensive to run, especially because of the persistence.
2. RabbitMQ is generally slower at processing messages (hence unsuitable for streaming).
3. Kafka has a lot more plugins for different backend services.
4. RabbitMQ is conceptually simpler, hence faster to integrate.
@interviewpen a month ago
Thanks for sharing :)
@maksimmuruev423 24 days ago
Not true. RabbitMQ is much faster until it hits 100%. It outruns Kafka very significantly and consumes fewer resources. So it's not true that RabbitMQ is slower.
@alokrao9921 a month ago
Okay, I think a few people have pointed it out already, but each consumer of a Kafka topic does _not_ get a copy of the same message. It is one consumer in each consumer group that gets a copy of the message. A consumer group is usually assigned to an application, and each instance of that application is a consumer within that group. If each consumer ended up receiving a copy of the message, then it would lead to a lot of duplicate processing. The example the video used of logs, data analysis, etc. is sound. Each of those would be a consumer group though, and e.g. the logs service would have multiple instances, where each instance would be a consumer.
@interviewpen a month ago
This is a common misunderstanding :) Every consumer assigned to a partition in Kafka gets a copy of every message sent to that partition. All consumer groups do is distribute consumers to different partitions so that each partition only has a single consumer. This causes difficulties for traditional queueing use cases where we want to load balance on the consumer side and/or need to handle consumer disconnects without duplicating messages.
@alokrao9921 a month ago
@interviewpen Yes, the job of the consumer group is to distribute the consumers under it so that each partition gets no more than one consumer attached to it. So in your example, "logging" would be a consumer group. E.g. if the topic has 4 partitions, and you have two instances of the logging application, then the consumer group would assign each instance (i.e. each consumer) to two different partitions. And a new application for, say, "analytics" that wants to listen to the topic would need a different consumer group to assign its instances (consumers) to partitions. And each message that lands in a topic goes to a single partition, and is then read by a single consumer from each consumer group. So the fan-out is happening across consumer groups, not individual consumers. Among the two instances of my logging application, only one instance receives the message. And since each instance is reading from two different partitions, we also get good load balancing. The caveat is to not have more instances than partitions.
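The 4-partition example above can be sketched as a small simulation. This is a minimal illustration, assuming a plain round-robin strategy; `assign_round_robin` and `receivers` are hypothetical helper names, and real Kafka assignors handle rebalancing and multiple topics.

```python
def assign_round_robin(partitions, consumers):
    """Give each partition exactly one owner within a single consumer group."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = [0, 1, 2, 3]
groups = {
    "logging": ["logging-1", "logging-2"],
    "analytics": ["analytics-1"],
}
# Each group gets its own independent assignment over the same partitions.
assignments = {
    name: assign_round_robin(partitions, members)
    for name, members in groups.items()
}


def receivers(partition):
    """Every consumer, across all groups, that reads the given partition."""
    return [
        consumer
        for assignment in assignments.values()
        for consumer, owned in assignment.items()
        if partition in owned
    ]


print(assignments["logging"])  # {'logging-1': [0, 2], 'logging-2': [1, 3]}
print(receivers(2))            # ['logging-1', 'analytics-1']

# The caveat from the comment: with more consumers than partitions, one sits idle.
print(assign_round_robin([0], ["a", "b"]))  # {'a': [0], 'b': []}
```

A message in partition 2 reaches one logging instance and one analytics instance, so the fan-out is per group rather than per consumer.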
@pratikshitsingh415 2 months ago
Awesome explanation of the difference in use cases. I've worked with Kafka but am kind of new to RabbitMQ, and this video helped me get some brief insight into both. Thanks
@interviewpen 2 months ago
Glad it helped. Thanks for watching!
@Fikusiklol 2 months ago
Both systems are designed to decouple. Both systems have routing/filtering/processing capabilities that are either built in or part of their ecosystem. The main difference is that Kafka is designed for events and RabbitMQ is designed for commands, and the ordering and persistence guarantees follow from that.
@asian1599 a month ago
What are events and commands?
@Fikusiklol a month ago
@asian1599 Both are messages. Events are facts about something that has already happened, e.g. OrderPlacedEvent. Events can be internal or external (e.g. domain or integration events), depending on scope, and are usually part of systems that implement an event-driven architecture (EDA). Commands, on the other hand, are messages that tell some service to do something, e.g. ValidateCustomerCommand or ShipOrderCommand. Those may or may not be part of EDA systems, but they are certainly part of orchestration (sagas). Events are meant to be 1-to-many (1 event, many or no listeners), thus Kafka streams. Commands are meant to be 1-to-1 (point to point), thus RabbitMQ queues.
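The 1-to-many versus 1-to-1 distinction above can be sketched in a few lines. This is an illustrative model only: `EventBus` and `CommandBus` are hypothetical in-process classes, not part of Kafka or RabbitMQ, and the message names are taken loosely from the comment.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    """Event: a fact about something that has already happened."""
    order_id: str


@dataclass(frozen=True)
class ShipOrder:
    """Command: a request for one specific service to do something."""
    order_id: str


class EventBus:
    """One event, zero or more listeners."""

    def __init__(self):
        self._listeners = defaultdict(list)

    def subscribe(self, event_type, listener):
        self._listeners[event_type].append(listener)

    def publish(self, event):
        # Every subscribed listener sees the event; none is also fine.
        return [listener(event) for listener in self._listeners[type(event)]]


class CommandBus:
    """One command, exactly one handler."""

    def __init__(self):
        self._handlers = {}

    def register(self, command_type, handler):
        if command_type in self._handlers:
            raise ValueError(f"{command_type.__name__} already has a handler")
        self._handlers[command_type] = handler

    def send(self, command):
        return self._handlers[type(command)](command)


events = EventBus()
events.subscribe(OrderPlaced, lambda e: f"stats updated for {e.order_id}")
events.subscribe(OrderPlaced, lambda e: f"inventory updated for {e.order_id}")

commands = CommandBus()
commands.register(ShipOrder, lambda c: f"shipping {c.order_id}")

event_results = events.publish(OrderPlaced("o-1"))
command_result = commands.send(ShipOrder("o-1"))
print(event_results)   # ['stats updated for o-1', 'inventory updated for o-1']
print(command_result)  # shipping o-1
```

Registering a second handler for `ShipOrder` raises an error, which is the point-to-point constraint expressed in code.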
@snekbaev a month ago
@asian1599 An event would be OrderPlaced: you can push all the relevant details associated with it to any consumer that may be interested, say, to update total sales stats, inventories and whatnot. A command would be PlaceOrder: you are basically requesting that your system start taking the steps required to handle the order. Once every required step is done, you can push the aforementioned OrderPlaced event as a notification of something that has already taken place.
@kebien6020 a month ago
@asian1599 I'd say that events are things that happened, and you want to notify "to whom it may concern". The receiver of the event decides if it has to do something or not based on the contents of the event. Commands are things that you want to happen. So you put it in a backlog, and hopefully some other system picks it up and does the thing.
@ashutoshpanda4336 2 months ago
There are so many things wrong with this video. Kafka doesn't directly fan out. Each consumer is assigned one or more partitions based on Kafka's configuration. Once a message is read, it isn't sent to all the consumers, since each consumer is assigned to different partitions. This is just misleading. I don't know why people aren't calling it out. There is also another comment pointing out that "Kafka doesn't send anything, it's pull-based", which the video also gets wrong.
@interviewpen 2 months ago
That’s not true. Each message delivered to a partition is fanned out to every consumer assigned to that partition. And yes, consumers poll for new messages, as with traditional queues. This might help: docs.confluent.io/platform/current/clients/consumer.html
@adarsh.hatwar 2 months ago
I think there is some confusion here... There seems to be confusion around consumers and consumer groups.. As @ashutosh mentioned, consumers of the same consumer group are assigned one or more partitions of a topic. Multiple consumer groups can read from the same topic. Ex: logging, notification, alerting etc can all be three different consumer groups reading from same topic and doing their own independent things.
@juanitoMint 28 days ago
@adarsh.hatwar Yes! This is not well explained. The video needed to introduce consumer groups and consumers. For those of us who know Kafka it's clear, but for a newcomer it's confusing.
@kaixu1026 2 months ago
You misunderstood the difference between a consumer group and a consumer in Kafka. Within a consumer group, a particular message can ONLY be delivered to one consumer, the same as in RabbitMQ. But consumers in different consumer groups can consume the same message independently.
@interviewpen 2 months ago
Hi! Consumer groups work around this issue, but at its core, Kafka is in fact fan-out. Two consumers assigned to the same partition will both receive the same messages. All consumer groups do is assign consumers to different partitions within a topic. This is a commonly misunderstood nuance :)
@syphondu 2 months ago
@interviewpen Agree with @kaixu1026. Within a consumer group, you get a similar abstraction to RabbitMQ without the scalability ceiling. The audience will have a difficult time discerning that, however.
@kaixu1026 2 months ago
@interviewpen You cannot assign two consumers in the same consumer group to the same partition at runtime in Kafka, right? Then, technically, it is impossible for two consumers to consume the same message if they belong to the same consumer group. There is no doubt Kafka is in fact fan-out. But your audience may be confused if they believe Kafka can assign two consumers to consume the same message without any restriction. The consumer group is a key core concept of Kafka in my mind. :)
@smaug9833 a month ago
Great video! I would have liked to see NATS included in the comparison.
@muhammedashkar692 2 months ago
Nice content. I was doing a microservice project where I had chosen Kafka over RabbitMQ without really knowing what Kafka is for. I needed the traditional RabbitMQ behaviour for my current microservice project, and I thought Kafka would be better for that as I heard it had high throughput. Now I know the whys. Gonna go with RabbitMQ.
@interviewpen 2 months ago
Nice, glad we could help :)
@PoRBvG a month ago
Some parts were misleading and confusing, unlike the other videos. I'd suggest redoing this video to match the quality of the others :) Thanks for all the great content.
@interviewpen a month ago
Could you elaborate on what you thought was confusing?
@Gallardo994 a month ago
2:40 is kind of incorrect. RabbitMQ uses at-least-once delivery, as it requires messages to be acknowledged by the application (either automatically or manually after doing some work). This way, a failure of an app instance that processes a message but doesn't ack it, or acks it too late, may result in double processing. The consumer still needs some level of idempotency to keep the resulting data in a consistent state.
@interviewpen a month ago
Yes! That is true of both types of systems. Thanks for the insight :)
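The idempotency requirement raised in this thread can be sketched as a consumer that remembers which message IDs it has already applied. This is a minimal illustration, assuming every message carries a unique `id` field; `IdempotentConsumer` is a hypothetical class, and a real system would keep the seen-ID set in durable storage, updated in the same transaction as the side effect.

```python
class IdempotentConsumer:
    """Applies each message at most once, even if the broker redelivers it."""

    def __init__(self):
        self.seen_ids = set()  # assumption: in production this lives in durable storage
        self.balance = 0

    def handle(self, message):
        """Return True if the message was applied, False if it was a duplicate."""
        if message["id"] in self.seen_ids:
            return False
        self.balance += message["amount"]
        self.seen_ids.add(message["id"])
        return True


consumer = IdempotentConsumer()
deliveries = [
    {"id": "tx-1", "amount": 50},
    {"id": "tx-2", "amount": 25},
    {"id": "tx-1", "amount": 50},  # redelivered because the first ack was lost
]
results = [consumer.handle(message) for message in deliveries]
print(results, consumer.balance)  # [True, True, False] 75
```

With at-least-once delivery the duplicate still arrives, but it no longer changes the result.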
@vaishnavpureddiwar3091 12 days ago
really nice :), thanks
@四伊柳 27 days ago
Amazing video, thanks a lot ❤
@interviewpen 26 days ago
Thanks for watching!
@atulshah1045 2 months ago
Great breakdown of the queues and side-by-side comparison. And I can’t believe I’m the first one to comment here!
@interviewpen 2 months ago
Thanks for watching!
@whoknows3679 2 months ago
Kafka can be used as a queue though: share the same consumer group and increase the number of partitions to match the number of consumers needed to balance messages.
@interviewpen 2 months ago
It certainly can, but that doesn't mean you should...it's really not designed for that. There's a lot of strange issues that arise from doing this--you need lots and lots of partitions, group rebalances can cause message duplication, brokers will time out clients whose jobs run long, etc.
@AlanD20 2 months ago
@interviewpen What you are saying is misleading. Kafka is indeed built for both pub-sub and queue systems. Kafka is meant to handle heavy partitioning; in fact, that's the whole purpose of Kafka. Group rebalancing behaviour and broker timeouts depend on your configuration, and there isn't any harm in adjusting them. Kafka's configuration is flexible enough to adapt to these models and cover the majority of cases; it's mostly up to how you configure it for your use case. I'm not sure what you mean by "group rebalancing can cause message duplication". If a message hasn't been committed, it can be picked up by another consumer after the group rebalances, and if your implementation does not handle idempotency when processing messages, the issue is the implementation, not Kafka. Even if you have database records, you can still handle these cases with transactions. And the whole idea behind the consumer group is to act as pub-sub or as a queue, depending on how you want to use it. Hope this helps!
@hyperborean72 a month ago
Could you please explain how different Kafka consumers are assigned to one consumer group or another? I guess this is the key point in creating a balanced model with multiple consumers in Kafka.
@interviewpen 26 days ago
Sure, Kafka consumer groups just make sure that each partition has exactly one consumer (but a consumer can be connected to multiple partitions). There are various algorithms (such as round-robin) that can be used internally. Thanks!
@sanchitbatra4431 a day ago
RabbitMQ is a push-based model: the consumer does not poll, and RabbitMQ tries to push a message as soon as it receives it.
@cnkumar20 2 months ago
Kafka is like a lake and RabbitMQ is like a river
@bhanuprakashrao1460 2 months ago
Do consumers in Kafka poll to get the events, or is it a push-based model where Kafka pushes the events to consumers? For traditional queues, I know that it is a pull-based model, where identical consumers (i.e. replicas) do the polling.
@interviewpen 2 months ago
Kafka consumers poll for new records (same for traditional queues).
@cya3mdirl158 a month ago
Have you ever heard of Hermes, which is used by Allegro, one of the biggest e-commerce companies in Poland?
@eduardmart1237 2 months ago
Nice video
@interviewpen a month ago
Thanks!
@mageexian 2 months ago
This is well articulated, 🙏
@interviewpen 2 months ago
Thanks!
@ryzlesalt 2 months ago
this is a great summary of them!
@interviewpen 2 months ago
Thank you!
@FirstnameLastname-cl4op 2 months ago
So for handling Stripe webhooks I should use RabbitMQ, but I have often seen people use Kafka in their system designs.
@interviewpen a month ago
Yep, common misconception about the use cases of the two systems!
@VaibhavShewale 2 months ago
ooh damn, it was interesting
@interviewpen 2 months ago
Thanks for watching!
@shis10 2 months ago
Amazing video 💯
@interviewpen 2 months ago
Thanks!
@iCrimzon a month ago
I prefer..... NATS
@hyperborean72 a month ago
What exactly do you mean by "consumers in RabbitMQ have control over what messages they are consuming"? Whether they are bound to a direct queue or a topic, they always consume messages, don't they?
@interviewpen 26 days ago
The difference we're pointing out is that RabbitMQ consumers are responsible for polling the queue and getting the next record, while in Kafka, the messages are already destined for a specific partition when they're produced.
@0xtz_ 2 months ago
Wait, I can't believe this is free content 😮 Well explained 👏👏
@interviewpen 2 months ago
Thanks!
@martonmagyar541 a month ago
Use NATS
@OutreGeek 2 months ago
Brilliant! Glad I chanced upon this gem. My current project has a requirement to process real-time location co-ords continuously and by different consumers. I now fully understand why the Architect chose Kafka over alternatives. 👏
@interviewpen 2 months ago
Sweet. Thanks!
@hyperborean72 a month ago
Strange... AMQP (and RabbitMQ) has topics that let you distribute messages to multiple listeners as well. It was never mentioned.
@interviewpen 26 days ago
Yes, this is done using exchanges--accomplishing this in RabbitMQ requires fanning out the message to multiple queues.
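The exchange-based fan-out described in this reply can be pictured with a toy in-memory model. This is a sketch under stated assumptions: `FanoutExchange`, `bind`, `publish`, and `consume` are hypothetical names invented for the illustration, not the RabbitMQ client API.

```python
from collections import deque


class FanoutExchange:
    """Copies every published message into each bound queue."""

    def __init__(self):
        self.queues = {}

    def bind(self, queue_name):
        """Create an empty queue and attach it to this exchange."""
        self.queues[queue_name] = deque()

    def publish(self, message):
        # Fan-out happens here: one publish, one copy per bound queue.
        for queue in self.queues.values():
            queue.append(message)

    def consume(self, queue_name):
        """Remove and return the next message from one queue, or None if it is empty."""
        queue = self.queues[queue_name]
        return queue.popleft() if queue else None


exchange = FanoutExchange()
exchange.bind("logging")
exchange.bind("analytics")
exchange.publish("order-1")

first_logging = exchange.consume("logging")      # 'order-1'
second_logging = exchange.consume("logging")     # None: consumed messages are gone
first_analytics = exchange.consume("analytics")  # 'order-1': its own copy
print(first_logging, second_logging, first_analytics)
```

Each listener needs its own queue, and a message disappears from a queue once consumed, which is the contrast with Kafka's retained log.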
@shauryavardhansingh7917 2 months ago
Give me a video on Kafka and NATS
@gabrielg9592 a month ago
5:01 Why is bursty data better handled by traditional queues instead of Kafka? Why do you say Kafka is better suited for "messages that take uniform (and short) time to process"?
@interviewpen a month ago
Kafka tends to have a bit higher latency. The connection handling also works better for consistent throughput (Kafka will time out idle clients and whatnot)
@gabrielg9592 a month ago
@interviewpen awesome, that makes sense. Thanks for answering!
@RandomDGuy21 a month ago
I think it would be great to have a follow up video covering AWS IoT core
@interviewpen a month ago
Ok, we’ll add it to the list.
@satish1012 a month ago
I think you use the "topic" terminology for Kafka
@OmkarMondkar-cw8gj a month ago
What software is he using to draw?
@interviewpen 26 days ago
We're using GoodNotes on an iPad
@debangshubanerjee1311 2 months ago
Excellent video. One request: can you also compare Solace queues with RabbitMQ and Kafka?
@jephthahsnr6558 2 months ago
🔥🔥🔥👌🏾
@interviewpen 2 months ago
Thank you :)
@rishiraj2548 2 months ago
thank you
@interviewpen a month ago
Thanks for watching!
@h.l.a.ondoua8514 11 days ago
What about emqtt
@surenderthakran6622 11 days ago
You say consumers when you should be saying consumer groups when talking about Kafka fanout. Otherwise a nice explanation
@matveyshishov 2 months ago
I really like your videos (and even bought your course because of that), but this one seems to be below your usual standard. You make it seem like the difference is mostly topological, or maybe API-wise. But in reality the systems are very different in purpose and use cases. RabbitMQ is an implementation of AMQP, full stop. Just go read John O'Hara's article "Toward a commodity enterprise middleware". Being an AMQP implementation, all it does, and nothing else, is deliver messages to the specified recipient. Again, full stop. As soon as the message is in the memory buffer, it has done its job. Kafka, on the other hand, is CDC on steroids. A distributed WAL. There are no messages and therefore no addressee; there are only "data changes", which we usually call "events". I could wax poetic for much longer and do comparisons with this and that, but honestly, I don't see the point, as we live in a time when you can literally ask the person who created the system what they wanted to do ;) .
@dan_le_brown 2 months ago
Thanks for the clarification
@interviewpen 2 months ago
Yeah this is a good explanation of the implementation differences between the two systems. This video just covered the practical use cases and functionality, so thanks a lot for sharing this perspective!
a month ago
Are we basically talking about queue vs topic?... Those are just implementation of each type no?
@interviewpen a month ago
A Kafka “topic” is essentially an abstraction around a group of individual queues (called partitions) that allows producers to distribute messages easily. Hope that helps :)
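The idea of a topic as a group of partitions can be sketched as follows. This is an illustration only: the `Topic` class is hypothetical, and CRC32 is a stand-in hash chosen for the sketch, not the partitioner a real Kafka client uses.

```python
import zlib


class Topic:
    """A named group of append-only partitions."""

    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key):
        # Assumption: CRC32 stands in for the client's real key hash.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def produce(self, key, value):
        """Append the value to the partition chosen by the key and return its index."""
        index = self.partition_for(key)
        self.partitions[index].append(value)
        return index


topic = Topic("user-activity", num_partitions=4)
first = topic.produce("user-42", "login")
second = topic.produce("user-42", "logout")

print(first == second)          # True: the same key always maps to the same partition
print(topic.partitions[first])  # ['login', 'logout']
```

Because the destination is fixed when the message is produced, ordering holds within a partition but not across the topic as a whole.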
@Copainbig a month ago
@interviewpen Yeah, in Kafka's own terminology, but in architecture terms a “topic” is the role that Kafka as a whole serves here, while a “queue” is what RabbitMQ is here. So what is described here is the contrast between a “topic” (of which Kafka is an implementation, as SNS is too) and a “queue” (RabbitMQ, SQS…). No?
a month ago
@interviewpen But it allows producers to spread messages across multiple consumers, while making sure that each consumer can get each message at least once. This kind of component, in software architecture, is called a "topic". RabbitMQ never had this ambition, because it is actually a "queue".
@daitedve1984 a month ago
Go with Kafka, Kafka is tastier! 😆
@TheRydde a month ago
Hasn't everyone moved to Pulsar yet?
@augustinemunene3469 2 months ago
Would you kindly do a system design video on how to build a backend as a service?
@interviewpen 2 months ago
Yeah we might do that soon :)
@throwaway-lo4zw 2 months ago
why are you holding a pen
@sproutboot a month ago
?