The Whys and Hows of Database Streaming

  Рет қаралды 5,938

InfoQ

InfoQ

Күн бұрын

Пікірлер: 6
@allmhuran
@allmhuran 5 жыл бұрын
Why do you have to rely on your applications to make sure rows don't get deleted? Tell the DBA's to revoke delete permissions to the application service accounts (or users, if you're delegating credentials). There is another issue with any system which turns every change into an individual event which is then sent to a database (ie, event sourcing): It means turning a set based database change into a row-by-agonising -row database change, which is orders of magnitude slower. If someone makes a change which needs to update thousands of rows (large batch processes like stocktakes, depreciation, etc), you're going to be creating a transaction and issuing a DML statement for every change, instead of a single transaction and DML execution to do all the changes at once. Every RDBMS handles set based updates blindingly fast - thousands or even tens of thousands of rows are one-or-two-digit-millisecond operations. A transaction for every individual row easily blows that out 100 or 1000 times. Putting anything message-oriented in between two RDBMS's creates the same problem. Your application (microservice, etc) writes through its DAL and can do it in a set based manner. Good so far. But then you break that one large data change into multiple messages. Kafka handles this just fine, it can handle absurdly large message volumes. But where is kafka sending those messages? If the answer is "another database", you're dead in the water. The target database will become an extreme bottleneck as it takes those thousands of changes and tries to DML them into the system one by one.
@allmhuran
@allmhuran 5 жыл бұрын
@Joy Gao Yeah, we use CDC in somewhat similar ways, fortunately for us anything with large data volumes is typically nightly (data warehouse), and anything that needs to be near/real time is relatively small, so it either goes through a simple windows service or via sub-hourly SSIS, which allows set based operations. You can actually stream data into SQL Server using a table valued parameter to a stored procedure, backed by an IEnumerable object used by a .net call to the SQLCommand object, but that's obviously a platform specific solution.
@terafl
@terafl 5 жыл бұрын
I think DynamoDB supports something similar out of the box. They call it DynamoDB Streams. It can send new+old values together.
@RonJohn63
@RonJohn63 5 жыл бұрын
2:19 But that's what plain old views *are.* Rdb/VMS has had "streaming" for 15 years, by using a tool to read the redo logs and create CSV files with the old and new values.
@uybabayun
@uybabayun 5 жыл бұрын
Utterly relevant topic, thank you!
@cafeliu5401
@cafeliu5401 5 жыл бұрын
小姐姐帅帅哒!
Patterns of Streaming Applications
47:35
InfoQ
Рет қаралды 7 М.
Algorithms behind Modern Storage Systems
49:03
InfoQ
Рет қаралды 25 М.
Kluster Duo #настольныеигры #boardgames #игры #games #настолки #настольные_игры
00:47
ДЕНЬ УЧИТЕЛЯ В ШКОЛЕ
01:00
SIDELNIKOVVV
Рет қаралды 4,1 МЛН
She's very CREATIVE💡💦 #camping #survival #bushcraft #outdoors #lifehack
00:26
哈哈大家为了进去也是想尽办法!#火影忍者 #佐助 #家庭
00:33
火影忍者一家
Рет қаралды 127 МЛН
Complex Event Flows in Distributed Systems
49:30
InfoQ
Рет қаралды 8 М.
The Anatomy of a Distributed System
37:44
InfoQ
Рет қаралды 37 М.
Kafka: A Modern Distributed System
52:25
InfoQ
Рет қаралды 39 М.
Scaling Patterns for Netflix's Edge
49:46
InfoQ
Рет қаралды 6 М.
Stream Processing 101 | Basics
10:17
Code with Irtiza
Рет қаралды 10 М.
Thomas J. Fan - Time Series EDA with STUMPY
26:24
PyData NYC
Рет қаралды 1,6 М.
The Evolution of Reddit.com's Architecture
33:11
InfoQ
Рет қаралды 91 М.
Kluster Duo #настольныеигры #boardgames #игры #games #настолки #настольные_игры
00:47