I designed this into my data pipeline last year and never knew it had a pattern name. Amazing :)
@CodeOpinion 3 years ago
Cool. I think we all end up using different patterns over the years that we didn't realize had a name!
@terencekent9394 2 years ago
The video is very clear! However, the design comes with a big gotcha that isn't called out: the double-write problem. The application code writes to the object storage and then to the message bus. That leaves the possibility of content being written to object storage without a message being added to the bus. You either need an ACID store to track the progress of both writes (and ensure retries), or you rely on some integration between the object storage and the message bus. Most cloud providers (AWS/GCP/Azure) have integrations between their object storage and messaging services that manage reliable message delivery for you.
@CodeOpinion 2 years ago
Yes. If you write to storage, which you should be doing first, and then fail to send the message, you return a failure to the client. At worst you have a file in blob storage that should be removed by a policy/expiry.
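A minimal sketch of that ordering, using in-memory stand-ins for the blob storage and the queue (all names here are hypothetical, not any specific SDK):

```python
import uuid

# In-memory stand-ins for blob storage and a message queue; real code
# would use an SDK such as boto3 or Azure.Storage.Blobs.
blob_storage = {}
queue = []

def accept_upload(payload: bytes) -> dict:
    # 1. Write the large payload to blob storage FIRST.
    blob_key = str(uuid.uuid4())
    blob_storage[blob_key] = payload

    # 2. Only then publish a small message carrying the claim check (the key).
    try:
        queue.append({"type": "FileUploaded", "blob_key": blob_key})
    except Exception:
        # Publish failed: report failure to the client. The orphaned blob
        # is harmless and can be removed later by an expiry policy.
        return {"status": "error"}

    return {"status": "accepted", "blob_key": blob_key}
```

If the order were reversed, a consumer could receive a message pointing at a blob that was never written; this way the only failure residue is an orphaned file.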
@christopherenriquez9765 2 years ago
Didn't know that this pattern has a name. Thanks for sharing this one :)
@CodeOpinion 2 years ago
Thanks for watching! Now you know the name!
@moriazizi A year ago
Can you share a .NET implementation of this pattern, please?
@trocomerlo 3 years ago
Thanks for sharing. The first stage of the solution is kind of a two-phase commit: a distributed transaction to deliver the file and the message to the queue. How would you avoid this?
@CodeOpinion 3 years ago
It is, but not really a concern. If you're receiving the file from the client and you upload/save it to blob storage, you need that to complete before you create the message. If you're able to save to blob storage but publishing the message fails, then you'll handle that and return something appropriate back to the user. At that point you have an orphaned file, which isn't problematic. I didn't mention in this video that you ultimately have to clean up blob storage after you're done processing. Having an orphaned file isn't problematic because it isn't state and doesn't modify your system. It's temporary.
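The cleanup half mentioned above can be sketched on the consumer side: exchange the claim check for the payload, process it, then delete the blob since it's only temporary. Again, in-memory stand-ins with hypothetical names:

```python
# In-memory stand-in for blob storage; real code would use a broker SDK
# and a blob client.
blob_storage = {}

def handle_message(message: dict) -> bytes:
    # Exchange the claim check for the actual payload.
    payload = blob_storage[message["blob_key"]]
    processed = payload.upper()  # stand-in for the real processing step
    # The blob is temporary, not system state: delete it once processed.
    del blob_storage[message["blob_key"]]
    return processed
```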
@JCArtuso 3 years ago
Congrats! Thanks for so much interesting content. I use this with many ETL processes, for sure!
@CodeOpinion 3 years ago
Thanks for watching!
@jacksonlloyd1519 3 years ago
Awesome summary. Out of curiosity, if the producer and consumer were hosted as different services, how would you handle authentication for the blob? Would they share the same blob connection?
@CodeOpinion 3 years ago
It really depends on the storage you're using. If you were using something like AWS S3, it's just a matter of having a bucket with the relevant access/IAM policies for both the producer and the consumer.
@sathyajithps013 3 years ago
Oh, I have used something like this before. I didn't know that it was a pattern. I have an API that accepts an Excel file, validates it (size only), and stores it in a folder. Then an OK status is returned. The upload is marked as under processing in the db to track the processing. Immediately after uploading, an event with the upload details is emitted and handled/processed (validating rows, columns, data types, etc.); the result of the processing is updated in the db and the Excel file is archived.
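That variant adds a status record in the db alongside the claim check. A rough sketch of the flow described (dicts and lists standing in for the db, blob folder, and event bus; all names hypothetical):

```python
# Stand-ins: db tracks processing status, blob_storage holds the file,
# queue carries the emitted event.
db = {}
blob_storage = {}
queue = []

def accept_file(upload_id: str, contents: bytes) -> str:
    if len(contents) > 1_000_000:           # size-only validation up front
        return "rejected"
    blob_storage[upload_id] = contents      # store the file
    db[upload_id] = "processing"            # mark as under processing
    queue.append({"upload_id": upload_id})  # emit event with upload details
    return "ok"                             # OK status returned immediately

def process_next() -> None:
    event = queue.pop(0)
    upload_id = event["upload_id"]
    # Full validation (rows, columns, data types, ...) would happen here.
    db[upload_id] = "done"                  # record the result, then archive
```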
@CodeOpinion 3 years ago
Sure is the claim check!
@sathyajithps013 3 years ago
@CodeOpinion Thanks for confirming. Looking forward to more videos like this.
@acegame1452 3 years ago
Amazing as always! Keep up the great work.
@CodeOpinion 3 years ago
Thanks! Will do!
@danku1013 3 years ago
Sorry, but I still didn't catch one point. To publish the file to blob storage, the producer should already have the file on the server, right? So in that case we still have to wait while the large file travels over HTTP to the server (the producer), even if we don't call it an "upload". I just haven't worked with really large files/objects. I'd appreciate it if someone could explain that.
@CodeOpinion 3 years ago
If your client is sending the file to the producer, via HTTP for example, then you would want to send the file to blob storage and send the message to the queue for processing. At that point you can return back to the client that the file was accepted. You could do so earlier, but you wouldn't be able to tell them you actually accepted it, because there could be a failure uploading to blob storage or producing the message.
@johncerpa3782 3 years ago
Glad I found your channel, great videos
@CodeOpinion 3 years ago
Glad you like them!
@xilconic 3 years ago
Maybe I missed something: if the file is so large that you wouldn't want to synchronously upload it to a service, how is this pattern not just moving the problem away from the service and towards the message broker? And if the answer is that it's sent async: what prevents sending it async to the service, and why wouldn't that be a viable solution?
@CodeOpinion 3 years ago
The workflow could be different if you can upload directly to blob storage. You would then need to send a request to the HTTP API with a reference to the location so that you can process the image/file or whatever you wanted to do with it. The issue is that, most times, for auth reasons you can't upload it directly but need to proxy it through your HTTP API. But there are various other scenarios, such as your app pulling a file from another location for processing rather than it being pushed to you.
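The direct-upload variant can be sketched like this: the client writes to blob storage itself (in real systems often via a pre-signed URL), and the API only ever receives the reference. In-memory stand-ins, hypothetical names:

```python
# Stand-ins for blob storage and the processing queue.
blob_storage = {}
queue = []

def client_direct_upload(blob_key: str, payload: bytes) -> None:
    # Client-side: upload straight to storage, bypassing the HTTP API.
    blob_storage[blob_key] = payload

def notify_api(blob_key: str) -> dict:
    # API-side: receive only the reference and enqueue it for processing.
    if blob_key not in blob_storage:
        return {"status": "not_found"}
    queue.append({"type": "ProcessFile", "blob_key": blob_key})
    return {"status": "queued"}
```

The trade-off mentioned above is auth: giving clients direct write access to storage takes more setup than proxying through the API, which is why the proxy form is more common.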
@robadobdob 3 years ago
I’m trying this pattern out on an existing project. My plan is to have multiple consumers to service the messages. I’m calling it the “Hungry Hippos” pattern.
@CodeOpinion 3 years ago
😂 I like it!
@robadobdob 3 years ago
@CodeOpinion I gather the actual name for it is the "competing consumers" pattern.
@c4m4l340 3 years ago
Hold on. Does RabbitMQ support this out of the box?
@CodeOpinion 3 years ago
No. Generally I recommend using a library on top of the broker SDK. They'll generally provide implementations of various common patterns, such as the claim check. In this example I'm using NServiceBus, but MassTransit also supports it.
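Conceptually, what such a library layers on top of the broker is a send/receive wrapper that swaps large bodies for a storage reference. This sketch shows the idea only; it is not NServiceBus's or MassTransit's actual API, and the threshold and names are made up:

```python
# Hypothetical claim-check-aware send/receive wrapper over a broker.
THRESHOLD = 256    # bytes; real libraries make this configurable
blob_storage = {}
queue = []

def send(body: bytes) -> None:
    if len(body) > THRESHOLD:
        # Large payload: park it in blob storage, send only the claim check.
        key = f"claim-{len(blob_storage)}"
        blob_storage[key] = body
        queue.append({"claim_check": key})
    else:
        queue.append({"body": body})

def receive() -> bytes:
    msg = queue.pop(0)
    if "claim_check" in msg:
        # Transparently exchange the claim check for the payload.
        return blob_storage[msg["claim_check"]]
    return msg["body"]
```

The point is that callers keep sending and receiving plain payloads; the claim-check plumbing is hidden in the library layer.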