Great talk. That's some mind-boggling scale too, at 400 TB of compressed logs daily.
@jfltech 9 years ago
Very good talk. John Graham-Cumming shows off CloudFlare's awesome architecture, which processes 4 million log lines per second (~400 TB/day) efficiently using open-source tools like Nginx+Lua, Cap'n Proto, Redis, Go, Kafka, and PostgreSQL, without being burdened by bloated JVM application frameworks or pricey databases.
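To make the pipeline shape concrete, here is a minimal Go sketch of that fan-out style of log processing: many workers parse raw lines off a channel and feed a single aggregation stage. The stage layout and the "status path" record format are assumptions for illustration; the real system reads from Kafka and decodes Cap'n Proto messages rather than splitting strings.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

func main() {
	lines := make(chan string, 1024)
	counts := make(chan string, 1024)

	// Fan-out: several parser workers pull raw log lines concurrently.
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ { // 8 workers; tune to core count
		wg.Add(1)
		go func() {
			defer wg.Done()
			for line := range lines {
				// Hypothetical "status path" records; keep only the status code.
				fields := strings.Fields(line)
				if len(fields) > 0 {
					counts <- fields[0]
				}
			}
		}()
	}

	// Close the aggregation channel once all parsers finish.
	go func() {
		wg.Wait()
		close(counts)
	}()

	// Feed some fake log lines (stand-in for a Kafka consumer).
	go func() {
		for _, l := range []string{"200 /", "404 /missing", "200 /index"} {
			lines <- l
		}
		close(lines)
	}()

	// Fan-in: single-threaded aggregation stage.
	totals := map[string]int{}
	for c := range counts {
		totals[c]++
	}
	fmt.Println(totals) // e.g. map[200:2 404:1]
}
```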
@swagv 9 years ago
This general problem is actually old hat in high-energy physics; you shouldn't have had to reinvent the wheel to solve it. With real-time, near-light-speed event tracking and processing on a particle collider, there are millions of events per second, each with ~10 MB or so of logged data (in modern collider terms). There's tons of preprocessing (streaming algorithms) that identifies, in real time, which events should be recorded versus the thousands and millions that should be discarded: something with strong parallels to what CloudFlare deals with in identifying potential threats.
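The trigger idea sketched below is the core of that parallel: a cheap, constant-time streaming predicate sees every event once and decides immediately whether it's worth the cost of full logging. The Event shape and the threshold are made up for illustration; real triggers chain several such predicates of increasing cost.

```go
package main

import (
	"fmt"
	"math/rand"
)

// Event stands in for one collision readout (or one HTTP request log).
type Event struct {
	Energy float64 // the cheap feature the trigger inspects
}

// trigger decides in O(1), per event, whether to keep or discard it.
func trigger(e Event) bool {
	return e.Energy > 0.999 // keep roughly 1 in 1000 events
}

func main() {
	kept := 0
	const n = 1_000_000
	for i := 0; i < n; i++ {
		if trigger(Event{Energy: rand.Float64()}) {
			kept++ // in real life: serialize and persist the full event
		}
	}
	fmt.Printf("kept %d of %d events\n", kept, n)
}
```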
@calebhyde1655 9 years ago
Go on... Do you blog? Can you provide some reference papers with implementation details, how to scale up, hardware requirements, etc.?
@movax20h 5 years ago
Well, it's obvious Cap'n Proto will be faster than JSON; anything binary will be faster than XML or JSON, and JSON is a poor fit for this kind of workload. The more interesting question is how Cap'n Proto compares to protobuf. I've looked at Cap'n Proto and flatbuffers, and honestly they're ugly: you gain quite a bit of performance in special cases, but you have to write way more logic in your application to deal with them. Protobuf is far more straightforward. For complex apps, I don't think Cap'n Proto really comes out ahead of protobuf once you take everything into consideration. Processing a few million protobufs per second is extremely easy and routine.
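The "binary beats JSON" claim is easy to sanity-check with the Go standard library alone. The micro-benchmark sketch below uses a hand-rolled fixed binary layout as a stand-in for Cap'n Proto's zero-parse reads (protobuf sits in between: binary, but still decoded field by field), so treat the exact numbers as illustrative rather than a real format comparison.

```go
package main

import (
	"encoding/binary"
	"encoding/json"
	"fmt"
	"time"
)

type Record struct {
	Status uint32 `json:"status"`
	Bytes  uint32 `json:"bytes"`
}

func main() {
	const n = 1_000_000
	var r Record

	// JSON path: fully parse the same record n times.
	js := []byte(`{"status":200,"bytes":512}`)
	start := time.Now()
	for i := 0; i < n; i++ {
		json.Unmarshal(js, &r)
	}
	fmt.Println("json:  ", time.Since(start))

	// Fixed binary layout: two little-endian uint32s, read in place.
	// This mimics the zero-copy access that Cap'n Proto/flatbuffers exploit.
	buf := make([]byte, 8)
	binary.LittleEndian.PutUint32(buf[0:], 200)
	binary.LittleEndian.PutUint32(buf[4:], 512)
	start = time.Now()
	for i := 0; i < n; i++ {
		r.Status = binary.LittleEndian.Uint32(buf[0:])
		r.Bytes = binary.LittleEndian.Uint32(buf[4:])
	}
	fmt.Println("binary:", time.Since(start))
}
```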
@bobdvd 8 years ago
ARGH!!! It's 4 MHz, not 4Mhz; you're supposed to be technical people!