1:13:00 You need an acquire (or acquire-release) fence after reading the data; otherwise, in the current thread, the data reads could be reordered after the seq2 load. You want to make sure you reload the seq after finishing reading the data, otherwise you might get a return of true with torn data.
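A minimal sketch of that read/write protocol (illustrative names, not the talk's exact code; the payload uses relaxed atomics so the sketch is race-free under the C++ memory model):

```cpp
#include <atomic>
#include <cstdint>

// Sketch of a seqlock. Odd seq = write in progress. Single writer assumed.
struct SeqLock {
    std::atomic<std::uint32_t> seq{0};
    std::atomic<std::uint64_t> a{0}, b{0};  // the "payload"
};

void write(SeqLock& s, std::uint64_t x, std::uint64_t y) {
    auto v = s.seq.load(std::memory_order_relaxed);
    s.seq.store(v + 1, std::memory_order_relaxed);       // mark write in progress
    std::atomic_thread_fence(std::memory_order_release); // order seq+1 before data
    s.a.store(x, std::memory_order_relaxed);
    s.b.store(y, std::memory_order_relaxed);
    s.seq.store(v + 2, std::memory_order_release);       // publish
}

bool try_read(SeqLock& s, std::uint64_t& x, std::uint64_t& y) {
    auto seq1 = s.seq.load(std::memory_order_acquire);
    if (seq1 & 1u) return false;                         // writer active
    x = s.a.load(std::memory_order_relaxed);
    y = s.b.load(std::memory_order_relaxed);
    // This fence keeps the data loads above from being reordered past the
    // seq2 load below; without it we could return true with torn data.
    std::atomic_thread_fence(std::memory_order_acquire);
    auto seq2 = s.seq.load(std::memory_order_relaxed);
    return seq1 == seq2;
}
```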
@Roibarkan 1 year ago
11:03 On Intel platforms there are also specific instructions and mechanisms for cache interaction, such as prefetch, cache-line demote, and cache bypassing. There's also CAT, "Cache Allocation Technology", which can be used to control what the caches can or cannot hold.
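For illustration, a hedged sketch of software prefetching using the portable GCC/Clang builtin `__builtin_prefetch` (the Intel-specific forms are exposed via intrinsics such as `_mm_prefetch`); the look-ahead distance here is an assumption to tune per platform:

```cpp
#include <cstddef>
#include <vector>

// Issue a software prefetch a few iterations ahead while summing an array.
// __builtin_prefetch is a GCC/Clang builtin; arguments are
// (address, rw: 0 = read, locality: 0..3).
double sum_with_prefetch(const std::vector<double>& v) {
    constexpr std::size_t ahead = 16; // illustrative; tune per platform
    double total = 0.0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (i + ahead < v.size())
            __builtin_prefetch(&v[i + ahead], /*rw=*/0, /*locality=*/3);
        total += v[i];
    }
    return total;
}
```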
@OlliS71 4 months ago
I think that doesn't help with lock-free queues, since by the time you know what data to prefetch you'd have to load it immediately anyway.
@Roibarkan 1 year ago
7:21 Sergey Slotin’s cppcon talk where such techniques are discussed: kzbin.info/www/bejne/Z4OsgYCHhrd1jc0
@paulluckner411 9 months ago
48:00 There is some name shadowing of `new_coeffs` within `update_coeffs()`. I believe all but the one on the first line should be named something else, e.g. `new_storage`. Otherwise, thank you for this excellent talk!
@OlliS71 4 months ago
There's no need for lock-free queues, since enqueuing and dequeuing take only a very short time and there is little chance of an overlap, because most of the time is spent producing and consuming the data.
@retropaganda8442 1 year ago
I would have expected a chapter on what operating systems have to offer to allow deadline-constrained real-time requirements to be met.
@matias-eduardo 1 year ago
For HFT, ideally there'd be no OS at all. For a space-shuttle OS with "hard real-time" requirements, it's less about getting the lowest latency and more about making sure the OS always runs predictably at the worst case. So if you have a theoretical main loop that calls "OS_UpdateState(os_state)" to update the entire OS state on each iteration, what you want is for that call's timing to be consistent regardless of what data you give it or what resources you create or delete that tick.
@movax20h 1 year ago
Timur, about double buffering and SeqLock: there is another way. I work in HFT, and we use it for monitoring of big values. It is called the Chen algorithm, or triple buffering. It makes the writer wait-free and the reader wait-free. With a single reader the code is very simple. With more readers it gets more complicated (and the writer's overhead scales with the number of readers), but it is also doable. For multiple readers, we just take a lock, so only one reader actually reads.
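A minimal sketch of the single-writer/single-reader case described above (illustrative; this packs a "fresh" flag into the slot index, which is one common way to implement triple buffering, not necessarily the exact algorithm the commenter uses):

```cpp
#include <array>
#include <atomic>

// Single-writer/single-reader triple buffer. One atomic holds the index of
// the "middle" slot owned by neither side; bit 7 flags fresh data.
// Both write() and read() are wait-free (a single exchange each).
template <typename T>
class TripleBuffer {
    std::array<T, 3> buf{};
    std::atomic<unsigned> middle{1};  // slot owned by neither side
    unsigned back = 0;                // writer's private slot
    unsigned front = 2;               // reader's private slot
    static constexpr unsigned fresh = 0x80;
public:
    void write(const T& value) {      // call from the writer thread only
        buf[back] = value;
        // Swap our slot with the middle slot, marking it fresh.
        back = middle.exchange(back | fresh, std::memory_order_acq_rel) & 0x7f;
    }
    bool read(T& out) {               // call from the reader thread only
        if (!(middle.load(std::memory_order_relaxed) & fresh))
            return false;             // nothing new since the last read
        front = middle.exchange(front, std::memory_order_acq_rel) & 0x7f;
        out = buf[front];
        return true;
    }
};
```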
@matias-eduardo 1 year ago
Is the idea that you have one buffer for the writer and then two buffers per reader? So if you have one writer thread and three reader threads, you can use seven buffers for a wait-free sync?
@AhmedTahagg 1 year ago
Great presentation! At timestamp 48:52, do we not have a race condition in the case of concurrent updaters?
@Roibarkan 1 year ago
Yes. I believe this code only supports a single updater thread (the fact that `storage` isn't atomic or synchronized in any way is an indication). I guess if there are multiple updater threads and none of them is on the hot path, they can synchronize amongst themselves using a mutex.
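A hedged sketch of that suggestion (illustrative names, not the talk's code): updaters serialize on a mutex so only one thread at a time writes the non-atomic storage, while the reader stays wait-free. Real code must also ensure no reader is still using the inactive buffer before it gets overwritten on a later update (the talk addresses this); that part is omitted here:

```cpp
#include <array>
#include <atomic>
#include <mutex>

struct Coeffs { double a = 0, b = 0; };

class DoubleBuffered {
    std::array<Coeffs, 2> storage{};       // not atomic: single writer at a time
    std::atomic<Coeffs*> current{&storage[0]};
    std::mutex update_mutex;               // taken only by (cold-path) updaters
public:
    Coeffs read() const {                  // hot path, wait-free
        return *current.load(std::memory_order_acquire);
    }
    void update(Coeffs c) {                // cold path, any number of threads
        std::lock_guard<std::mutex> lock(update_mutex);
        Coeffs* cur = current.load(std::memory_order_relaxed);
        Coeffs* next = (cur == &storage[0]) ? &storage[1] : &storage[0];
        *next = c;                         // caveat: a reader may still hold
                                           // `next` from two updates ago
        current.store(next, std::memory_order_release);
    }
};
```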
@qwertyuiop-cu2ve 1 year ago
Yes, this can only work with a single producer because `storage` is not atomic. I also noticed a missing `)` in the for loop in `update_coeffs()`, which makes it a bit confusing to read.
@Roibarkan 1 year ago
55:53 [slide 132] I believe that the find() method is missing “while (p && p->t != t)” right before “p = p->next;”
@qwertyuiop-cu2ve 1 year ago
Yes, this find() will just return head->next. Also, I think the _front suffix can be dropped from push/pop, because it is implicit from the thing being a stack, and push_front makes an extra copy as it takes its input by value.
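A hedged sketch of the corrected stack (illustrative, simplified to push and find only, so the ABA issues that come with pop don't arise):

```cpp
#include <atomic>

// Linked-list stack with a lock-free (Treiber-style) push, the find() loop
// the slide was missing, and the argument taken by const reference to
// avoid the extra copy.
template <typename T>
class Stack {
    struct Node { T t; Node* next; };
    std::atomic<Node*> head{nullptr};
public:
    void push(const T& t) {
        Node* n = new Node{t, head.load(std::memory_order_relaxed)};
        while (!head.compare_exchange_weak(n->next, n,
                   std::memory_order_release, std::memory_order_relaxed))
            ;   // on failure, n->next is refreshed with the current head
    }
    T* find(const T& t) {
        Node* p = head.load(std::memory_order_acquire);
        while (p && p->t != t)   // the missing "while (p && p->t != t)"
            p = p->next;
        return p ? &p->t : nullptr;
    }
};
```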
@surters 10 months ago
Undefined behaviour should have been SG13 ...
@TheMoQingbird 1 year ago
Bad sound again: every cough in the audience, weird clicks and scratches, Timur echoing :(
@j777 1 year ago
Can't have a talk related to quality audio processing without crappy sound.