Lesson 198 - Swarm of Gnats Event AntiPattern

Views: 2,446

Software Architecture Monday

1 day ago

Comments: 23

@mahdi5796 2 months ago

Excellent video, as always. Thank you, Mark, for generously sharing your knowledge.

@markrichards5014 1 month ago

Glad you are finding it useful!

@karimfouad2145 2 months ago

Thanks for sharing this valuable information.

@dimitrikalinin3301 1 month ago

Delta updates represent a more complex approach compared to full updates, particularly when system capacity is a critical quality attribute and concurrency is involved. When using delta updates, it’s essential to ensure that all events are processed in the correct chronological order, which increases system complexity and can degrade performance. In some cases, it may be simpler and more efficient to send the entire object with its final state in the event. This allows clients to determine whether they already have a newer version, reducing the need for managing complex event sequencing. But back to delta updates: while changes to the address and birthdate can be treated as separate events, where the order of processing might not matter, two consecutive changes to the address must be processed in the correct order to maintain consistency. If various types of changes are grouped into change events without distinction, we lose the ability to control them individually, and all events must then be processed in strict sequence. This adds further complexity to handling delta updates, as out-of-order events could lead to inconsistencies. It's difficult to make a general statement about which approach is better in a given situation.
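The "send the entire object and let the client decide whether it already has a newer version" idea can be sketched as follows. This is only an illustration under stated assumptions: the `version` field, `ProfileSnapshot`, and `ProfileStore` are hypothetical names that do not come from the video, and the sketch assumes the producer assigns a monotonically increasing version per profile.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProfileSnapshot:
    """Full-state event: the whole profile plus a version (assumed field)."""
    profile_id: str
    version: int  # assumption: the producer increments this on every change
    address: str
    birthdate: str


class ProfileStore:
    """Consumer-side store that tolerates out-of-order delivery."""

    def __init__(self) -> None:
        self._latest: dict[str, ProfileSnapshot] = {}

    def apply(self, event: ProfileSnapshot) -> bool:
        """Keep the event only if it is newer than what is already stored."""
        current = self._latest.get(event.profile_id)
        if current is not None and current.version >= event.version:
            return False  # stale or duplicate snapshot: safe to drop
        self._latest[event.profile_id] = event
        return True

    def get(self, profile_id: str) -> ProfileSnapshot:
        return self._latest[profile_id]
```

Because each snapshot carries the final state, a late-arriving older event is simply discarded instead of being replayed in order. With delta events the same late arrival would have to be buffered or reordered before it could be applied.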

@BlindVirtuoso 2 months ago

Hi Mark. Excellent as always. Thanks, I appreciate it. A question though. Would you create a separate queue/topic for each event type FraudDetected and NoFraudDetected or a single one?

@BrianKlausen 1 month ago

I think this is one of the "funny" questions to consider, where guidance on patterns actually depends on, or can be supported or countered by, your internal application architecture and your infrastructure. Individual topics quickly become a "swarm of gnats topics" anti-pattern: if separation by topic is your only way of sparing the consuming services from having to investigate and process the message, you haven't achieved anything, I'd say. But depending on your messaging infrastructure, the publish/subscribe pattern differs a little. Personally I am mostly familiar with Kafka, which is all about "smart endpoints, dumb pipes".

Notice first of all that I wrote "process the _message_" - not the payload - and this is where your application architecture, as opposed to systems architecture, can help IMO. A couple of suggestions:

1. If you do something like Uncle Bob's Clean Architecture, then maybe it is not so bad to leave payload inspection in a port/adapter layer for filtering. Or you could externalize the filtering even more, to keep the core of the consuming service "clean" of filtering out stuff from the producing service that is of no relevance to this context.
2. You might want to consider the design of the messages you transmit on your event infrastructure. On something like Kafka, messages are key/value pairs, with the value being the actual payload, and both key and value can be complex multi-level data structures. So you might want to make the event type/name part of your key - or, for your value/payload, use an envelope style, where the ACTUAL payload is put into an envelope with various metadata on it, one item being for instance the event type/name.

Now, is this still "inspecting and processing the payload"? Well - you could certainly argue that it is; it is just a question of "which part" - and so my suggestion would be to add a concept like Clean Architecture to encapsulate or even externalize the filtering. Kafka stream processing libraries are highly efficient, so if you encapsulate the filtering based on something like an event name, it is highly performant, and the domain logic only has to run for the events relevant to that particular service.

Now, make no mistake: I am certainly not saying that swarm of gnats as an anti-pattern is not a thing. Plenty of design decisions can lead to it manifesting - so there is no excuse for not thinking carefully about granularity. Domain-Driven Design's concepts of aggregates, aggregate roots, etc. give you ways of discovering how to group things. In Mark's example about "profile", maybe "contact info" is an aggregate of email, phone, and postal address, and so you have an event "ProfileContactInfoUpdated" - more likely than not, from a business point of view, the granularity of different types of contact info is not relevant. And if it is relevant to a single consumer, that consumer can implement its own logic to keep track of which part of the ContactInfo aggregate changed and react accordingly, sparing all the other consumers from having to deal with that complexity.

As always, there's no absolute right or wrong here. If 80% of your consumers need high granularity, you should probably implement it at that level, and it wouldn't be an irritating swarm of gnats - rather, it would be useful information. But if only 5% of your consumers need it, then it probably IS a bad idea... and as always, your requirements may change over time, so make sure you are ready for refactoring :).
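The envelope-plus-adapter idea described in this comment could look roughly like the sketch below. It deliberately avoids any real Kafka client API; `Envelope` and `make_filtering_adapter` are hypothetical names chosen for illustration, and the envelope layout (an `event_type` next to the payload) is an assumption rather than a prescribed format.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Envelope:
    """Hypothetical message envelope: metadata wrapped around the payload."""
    event_type: str
    payload: dict[str, Any]


def make_filtering_adapter(
    wanted: set[str],
    handler: Callable[[dict[str, Any]], None],
) -> Callable[[Envelope], bool]:
    """Build an adapter-layer callback that filters on envelope metadata.

    The domain handler never sees irrelevant events and never inspects
    the event type itself; that concern stays in the adapter.
    """

    def on_message(envelope: Envelope) -> bool:
        if envelope.event_type not in wanted:
            return False  # dropped before reaching the domain logic
        handler(envelope.payload)
        return True

    return on_message
```

A consumer interested only in `NoFraudDetected` would build its adapter with `make_filtering_adapter({"NoFraudDetected"}, handle_clean_order)`, where `handle_clean_order` is whatever domain function it already has. An event type the producer adds later falls through the filter without breaking this consumer.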

@BrianKlausen 1 month ago

Oh, and just to add: I think a good practice (I'm not a fan of "best" practice, as there is often no single best practice to always apply - just things that make more sense in some contexts and others that make more sense in other contexts) is to go the whole DDD route with aggregates, aggregate roots, etc., and adopt the principle that "An event is something that happened to an information object - i.e. aggregate" (not a data object!) - and then build your topic topology around your aggregate structure, whatever it is for your specific context. So if a customer is a simple thing in your enterprise, I would just have a "customer" topic carrying customerUpdated, customerCreated, and customerDeleted, if you mostly need to transfer state about a core object like customer. If your analysis shows that you need a bit more granularity - for instance, you identify "ContactInfo" as a separate aggregate related to "customer" - then you have a decision to make: stay with the single "customer" topic and add "CustomerContactInfoUpdated" to the list of events to emit, or create a separate customer.contactinfo topic. Note that in this case you also have to decide whether the main "customer" topic will get "customerUpdated" events when only the ContactInfo part of the aggregate is updated. Personally I think that is too complicated to deal with for something like state transfer, so I would go VERY far to put as much as possible into one topic, be careful not to add an event type for everything (i.e. consolidate), and then encapsulate the filtering in the consumers. It also makes it much easier for the consumers to respond to a change in the list of emitted event types - done right, adding a new event type is non-breaking this way.

@BlindVirtuoso 1 month ago

@BrianKlausen "An event is something that happened to an information object - i.e. aggregate". The events in the example are FraudDetected and NoFraudDetected. How can you tie them to an aggregate?

@BlindVirtuoso 1 month ago

@BrianKlausen I mean the fraud detection service checks for fraud and, when fraud is detected, creates a Fraud aggregate and emits a FraudDetected event. But when no fraud is detected, no aggregate is created, so we cannot tie NoFraudDetected to any aggregate.

@BlindVirtuoso 1 month ago

@BrianKlausen Another thing: in the video Mark says that if a service has to interrogate the payload in order to determine whether it should respond and take action, that's also a good indicator that we probably want more events rather than fewer. Now imagine that we have a single topic for the two event types FraudDetected and NoFraudDetected, and a consumer that only needs to process NoFraudDetected events. The consumer has to inspect each event's payload to determine its type (in our case NoFraudDetected). Isn't that an indication that we should have a topic per event type here?

@ihorgolovatskiy2512 1 month ago

I have another option to overcome this anti-pattern. Let’s say I have a 'FraudChecked' event. I could supply a custom header on that event, such as 'IsFraudDetected', so that consuming services can use filtering/routing to listen only to the events they are interested in. The downsides: 1) the 'IsFraudDetected' header is specific to this one event; 2) not every messaging bus supports filtering/routing by header.
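To make the header-routing proposal concrete, here is a toy in-memory bus that delivers a message only to subscribers whose required headers match. It is a sketch, not any real broker's API: `HeaderRoutingBus`, `subscribe`, and `publish` are hypothetical, and real brokers that support header-based routing expose it in their own ways.

```python
from typing import Callable

Headers = dict[str, str]
Callback = Callable[[Headers, dict], None]


class HeaderRoutingBus:
    """Toy bus for one 'FraudChecked' stream with broker-side header filters."""

    def __init__(self) -> None:
        self._bindings: list[tuple[Headers, Callback]] = []

    def subscribe(self, required: Headers, callback: Callback) -> None:
        """Bind a callback that fires only when all required headers match."""
        self._bindings.append((dict(required), callback))

    def publish(self, headers: Headers, body: dict) -> int:
        """Deliver to every matching binding; return the delivery count."""
        delivered = 0
        for required, callback in self._bindings:
            if all(headers.get(key) == value for key, value in required.items()):
                callback(headers, body)
                delivered += 1
        return delivered
```

A service that cares only about clean orders would call `bus.subscribe({"IsFraudDetected": "false"}, handler)`. Note that in this sketch every interested service still binds to the same `FraudChecked` stream; only the delivery is narrowed by the header.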

@markrichards5014 1 month ago

Again, this is an anti-pattern IMHO in that EVERY event processor must subscribe, whether they are filtering on header information or not. This unnecessarily statically couples services, and requires services to subscribe unnecessarily.

@stephendgreen1502 2 months ago

Isn’t an event which focusses on outcome actually now more of a command than an event?

@markrichards5014 1 month ago

Not at all! The outcome of fraud being detected is "fraud detected". That's not a command; it is the service advertising something it found. The same is true of "no fraud detected".

@Sousleek 2 months ago

Would it be correct to say that in the user update scenario we have arrived at CRUD-based semantics, which can be called an anti-pattern in itself? In my company we have a topic called "users.update". Inside our services we have non-explicit code that looks like: if the user was updated and the name or birthday changed, reload all medical records; if the user was updated and the name or birthday changed, re-upload all medical documents. User is a big aggregate that can be updated for many reasons, and all this logic applies ONLY to personal data changes. It feels like an under-abstracted solution. You can't reason about what a particular service does just from seeing that it is interested in user updates. Moreover, you can't even tell from the event processor mappings inside a service. You only begin to understand what is actually happening when you read the body of a processor. But at least we have very "simple" CRUD events on a user.
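One way a single consumer could make that hidden "name or birthday changed" rule explicit, without asking the producer for finer-grained events, is to derive the change locally and give it a name. The sketch below assumes the "users.update" event carries the full current user as a dictionary with an `id` key; that shape, and all the names used (`PERSONAL_FIELDS`, `MedicalRecordsConsumer`, `on_personal_data_changed`), are hypothetical.

```python
PERSONAL_FIELDS = frozenset({"name", "birthday"})


def changed_fields(previous: dict, current: dict) -> set[str]:
    """Return the keys whose values differ between two user snapshots."""
    keys = previous.keys() | current.keys()
    return {key for key in keys if previous.get(key) != current.get(key)}


class MedicalRecordsConsumer:
    """Consumes generic user updates but reacts through a named handler."""

    def __init__(self) -> None:
        self._last_seen: dict[str, dict] = {}
        self.reloaded: list[str] = []  # stands in for the real reload work

    def on_user_updated(self, user: dict) -> None:
        user_id = user["id"]
        previous = self._last_seen.get(user_id)
        self._last_seen[user_id] = dict(user)
        # The first sighting has nothing to compare against, so it is skipped.
        if previous is not None and changed_fields(previous, user) & PERSONAL_FIELDS:
            self.on_personal_data_changed(user_id)

    def on_personal_data_changed(self, user_id: str) -> None:
        """Explicitly named reaction: reload medical records for this user."""
        self.reloaded.append(user_id)
```

The producer's event stays a plain state transfer, while the consumer's intent is readable from the handler name rather than buried in a conditional inside a generic update processor.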

@karimfouad2145 2 months ago

As I understood from the last sentence in the video, your case is an indicator to have more events.

@BrianKlausen 1 month ago

@karimfouad2145 I would say that it depends on the bigger picture. Let's say there are 40 other services that just need info about the whole aggregate, consume it idempotently, etc. - then adding more granularity has a cascading effect of unnecessary complexity on all these consumers. You probably then want to isolate that complexity to where it is needed. If it is your single subscriber... well, that's a different story - but in that case, why even consider event-driven? Commands in a more monolithic style of some sort would work just as well, and save you the hassle of dealing with the complexity of a distributed architecture. I'd recommend reading Martin Fowler's article about different styles of event-driven architecture. The CRUD semantics is what he calls "Event-Carried State Transfer", and it can make perfect sense for very "core data objects" that are essential to many specific services in your landscape. At the polar opposite end is pure event sourcing, where your events are - at the most extreme - natural business language about "something that happened" in the real world, outside the system in some cases. This is where the replayability of event-driven systems really shines, if you have a good event store.

@karimfouad2145 1 month ago

@BrianKlausen Even if you have another 40 services interested in the "user.update" event, you can keep it and add new events such as "user.name.update" and "user.birthday.update"; then each service can pick up whichever events it is interested in.

@BrianKlausen 1 month ago

@karimfouad2145 Yes - you can certainly do that. But again, you would have to look at it in context, and your topic topology becomes interesting (there is another thread above about one topic per event type versus one topic for all events). If you put everything in one topic, it's easy to choose your consumption strategy - but then you are definitely filtering a lot, everywhere. Not that I am opposed to that. Designed right, where you filter on headers, an envelope, or other metadata, I actually think it is not that big of a deal in a reasonably modern technology stack. But if that for some reason overwhelms your consumers - or some of them - and you have to separate by topic, then you have to look at how much these topics overlap. So if the name is updated, do you emit BOTH a user.update AND a user.name.update? That is actually extra work on the producer's side, and you run into potential issues with ordering again. Also, I personally think this design is not very explicit. To know the answer to this, you would have to look carefully in the documentation. The event naming scheme itself is ambiguous on this point, which makes it prone to misinterpretation by those implementing or consuming it. And at this point, your use case for the specific, highly granular events is reduced to consumers interested ONLY in those events and NEVER in the more generic ones. The second a consumer needs something from the generic update event, it has to consume and process every single one anyway - or you have to add a new specific event, which is a potential slippery slope towards the swarm of gnats... Again, I'm not saying that there isn't a valid case for something like this - I would just think it through very carefully, and I would venture the guess that in more than 80% of cases it is not really worth it - but that's a guess, of course.

@BrianKlausen 1 month ago

@Sousleek I think your objection is based on a misconception of the point of events. An event should NEVER tell you ANYTHING about what a consumer intends to do in reaction to the event - in that case it becomes a command. An event is a fact about something that _has_ _happened_ - nothing more. The logic about what happens in reaction to it belongs to the consumer, and the producer should never be bothered with specific requirements such as those - that would be domain leakage and hard coupling. So if you lack insight into what happens as a result of consuming a given event, I think you should look into something like tracing, with transaction and span IDs. Or I could just be completely misunderstanding your comment - in which case I apologize :).