Implementing Fast Calendar Algorithms: Speeding Date - Cassio Neri - C++ on Sea 2023

  Рет қаралды 8,446

cpponsea

cpponsea

8 ай бұрын

cpponsea.uk/
---
Implementing Fast Calendar Algorithms: Speeding Date - Cassio Neri - C++ on Sea 2023
Given that N days have elapsed since 1st January 1970, what day is today?
Hummm... Since years and months have variable number of days, this problem seems difficult. Breaking down time is easier though: given that N seconds have elapsed since midnight, what time is it? A couple of divisions and modulus operations by 60 gives the answer. What if I tell you the first problem is quite similar to the second one? Indeed using a simple mathematical generalisation of division allows solving both problems in the same way.
That's what I did: I used mathematical results to developed calendar algorithms that, in addition to the cool maths, proved to be faster than alternatives in libc++, boost, OpenJDK, Android phones and other popular open source libraries. So much that these algorithms -- the subject of this talk -- now feature in the Linux Kernel, libstdc++ (<chrono>) and Microsoft's .NET.
We shall see a gentle introduction to the underlying mathematical results and how the optimisations based on them make these algorithms so suited for modern CPUs. The same optimisations might be useful elsewhere. Also in the talk:
1) An implementation of the humble std::chrono::year::is_leap that's usually 3x faster than virtually every other implementation out there.
2) A one-line C/C++ expression that yields the last day of the month M for 1 <= M <= 12 and M != 2 (i.e., disregards February -- the ugly duckling of the months). It doesn't use either branches (if or switch) or look-up tables just... (Shush! No spoilers here.)
3) A brief history of our most used calendar, from its Roman origins, its Greek and Egyptian influences, up to its final form taken in the 16th century. Decisions made hundreds of years ago have a deep impact on the performance of calendar algorithms that run today. A fascinating subject, full of story telling that can amuse guests at any dinner party.
---
Slides: github.com/philsquared/cppons...
Sponsored by think-cell: www.think-cell.com/en/
---
Cassio Neri
I hold a PHD in Applied Mathematics from University of Paris Dauphine. I have been professionally coding in C++ for more than 15 years but my coding experience has started far earlier. I currently work on the financial industry in London but had previously worked in academia for more than a decade.
I am the author of a few research articles and I have published them on peer-reviewed journals on Mathematics (there's an equation named after me), Finance, Computer Science and C++.
---
C++ on Sea is an annual C++ and coding conference, in Folkestone, in the UK.
- Annual C++ on Sea, C++ conference: cpponsea.uk/
- 2023 Program: cpponsea.uk/2023/schedule/
- Twitter: / cpponsea
---
KZbin Videos Filmed, Edited & Optimised by Digital Medium: events.digital-medium.co.uk
#cpp​ #cpponsea​ #cppprogramming #calendar

Пікірлер: 25
@Voltra_
@Voltra_ 8 ай бұрын
Amazing talk! Now I'll just scan through every implementation in other languages to see if we can get better performance out of our date objects
@PeteC62
@PeteC62 7 ай бұрын
Its an interesting exercise in optimization, but I'm having trouble imagining a situation where these sorts of calendar operations are on such a hot path that even a 5x speed improvement would materially affect the overall performance.
@ayrendraganas8686
@ayrendraganas8686 7 ай бұрын
Databases are a likely candidate.
@coolcax99
@coolcax99 6 ай бұрын
Many applications accessed by users with a front end typically have dates associated with data which may need these operations done every second. Think about messaging apps, email apps that constantly try to fetch data then perform a local sort assuming data is not necessarily in order. To be consistent with different timezones, usually a full timestamp with dates are used. Or, think about long running log applications that keep logging with a timestamp every second or so, and you have to now process these logs to find specific regions. Journalctl for example would have to figure out a fast way to search through logs
@leonardofg
@leonardofg 7 ай бұрын
Nice to see Cassio, he was my professor 20+ years ago at Federal University of Rio de Janeiro :)
@cassioneri
@cassioneri 7 ай бұрын
That brings me very good memories. Um abraço 🤗.
@TerjeMathisen
@TerjeMathisen 7 ай бұрын
I'm starting to write this during the video introduction because I believe my own calendar code is very close to the fastest possible, taking 20-40 clock cycles (depending upon cpu version) to convert from Julian day number to Y-M-D, since this is the slowest of all the common date conversion functions. 🙂 EDIT: 20 clock cycles would be 5 ns on a 4 GHz cpu, so my almost 30 years old 32-bit 386 asm (with C equivalent) code is similar in speed to what Cassio Neri is showing here, but I'll admit that he has found a couple of tricks which I never discovered!
@pawello87
@pawello87 6 ай бұрын
Good talk, really nice introduction :) A lot of numbers. After a few minutes, I no longer remember where the number came from and why it has that value :D
@chrismoritz6706
@chrismoritz6706 7 ай бұрын
Brilliant and entertaining. So should programming, be done. In other words, math is the toolkit for the ones, that are to lazy to calculate :)
@musaran2
@musaran2 7 ай бұрын
I get the idea, even if many details flew over my head. In an ideal world, the compiler or IDE would know to do these transforms. Someday maybe…
@blacklion79
@blacklion79 7 ай бұрын
«every day has 24 hours» - leap seconds leave the chat
@cassioneri
@cassioneri 7 ай бұрын
I always expect a question about leap seconds, time zones and Daylight Savings Time 🙂. You're right: not every minute has 60 seconds and because of DST, not every day has 24 hours either. However, these things are usually dealt at a higher levels. These aspects cannot be treated purely in mathematical terms and need information from data bases. At lower levels, the problem is simplified. For instance, std::chrono::minutes is defined as 60 seconds (similarly, std::chrono::hours and std::chrono::days have a constant number of seconds.) My talk lives in this realm.
@Y2B123
@Y2B123 7 ай бұрын
UNIX time does not count leap seconds, UTC does. I think to account for that you just need a continuously updated table to determine if you crossed leap-second boundaries.
@MikkoRantalainen
@MikkoRantalainen 7 ай бұрын
You cannot handle leap seconds without a lookup table because by definition those are added as a result of political decision making. Also, timezones are totally political, especially in Egypt and Israel, so you definitely need lookup tables for those, too.
@TerjeMathisen
@TerjeMathisen 7 ай бұрын
At 49:30 he claims that (5*N_Y+461)/153 needs 2 MUL operations, but in reality that first part in () can (and will be optimized this way in a good compiler!) be implemented in a single cycle with LEA: ;; N_Y in rax LEA RAX,[RAX+RAX*4+461] ; Load Effective Address Slightly before he quotes a Granlund paper from 1994 that shows how to use reciprocal multiplication to replace any division by a constant: Afaik, this was first proven by DEC researchers and then (independently) by Agner Fog and myself. I actually did an exhaustive test of all unsigned 32-bit divisors, which would of course be impossible on 64 bits.
@cassioneri
@cassioneri 7 ай бұрын
Thanks for the comment. It's very true that on x86_64 platforms the expression 5*N_Y+461 can be evaluated by a single LEA instruction. Notice, however, that I don't use the word MUL in this part of the talk and, instead, I say "multiplications" (in the mathematical sense). I also make the common air quote gesture when I says that "the number of multiplications becomes a proxy for performance". I guess what I'm trying to say is that, due to time constraints and for the sake of didactics, the aim of the talk is to give the overall idea behind the results whereas the peer-reviewed published paper and the project github page go deeper into details. In particular, they show the use of LEA but also that our technique is very effective in reducing the number of cycles when **both** quotient and modulus are required. (If just the evaluation of 5*N_Y+461 in isolation is required, then one should not use our transformation.)
@PeteC62
@PeteC62 7 ай бұрын
A coincidence that he used a Casio watch as an example? I think not!
@cassioneri
@cassioneri 7 ай бұрын
You're very right. 😅
@cassioneri
@cassioneri 7 ай бұрын
Think also about that: 1) A leader of the plot to assassinate Julius Caesar was the Roman Senator Gaius Cassius Longinus. 2) In the Eastern Orthodox Church the leap day, February 29th, is the day of Saint John Cassian. (See Wikipedia.) Perhaps, my name is what got me into calendar algorithms at the first place. 😅
@sudeepthought1921
@sudeepthought1921 7 ай бұрын
What about instead of using the reminder of a division by 4 to get into the 75% branch, you just look at the two least significant bits in the least significant byte of the int where the year is stored, i.e. you do an AND 00000011 and then a JMP NZ?
@cassioneri
@cassioneri 7 ай бұрын
This is one of many transformations that compilers do for you. That is, you write n % 4 in your source code but the compiler replaces it with n & 3. So you don't gain anything by doing it manually. The talk shows other optimisations that compilers don't do.
@_lapys
@_lapys 7 ай бұрын
Speed dating 🐢
@ShengLiangSong
@ShengLiangSong 7 ай бұрын
How about version 3 of isLeayyear()? bool isLeapYear3(int y) { int a = (y % 100) == 0; int d = 1 3 ^ m); bool isLeapYear1(int y) { return y %4 == 0 && (y%100 != 0 || y%400 == 0); } bool isLeapYear2(int y) { int d = y % 100 != 0 ? 4 : 16; return (y & (d-1)) == 0; }
@cassioneri
@cassioneri 7 ай бұрын
I did try this very same idea at the time I was writing it for gcc. My measurements at that time showed that isLeapYear3 was slower than isLeapYear2 (which is, in essence, what I end up implementing). I no longer have the results of this benchmarks. You can see at compiler explorer / z / cb3Phjnno (link obscured to trick KZbin to accept this post) that gcc and clang translate isLeapYear3 to 14 assembly instructions and isLeapYear2 to 10. Msvc translates both to 14 instructions. It's worth recalling that, although very correlated, the number of assembly instructions is not a perfect representation of performance. Its always best to measure.
I Built a Shelter House For myself and Сat🐱📦🏠
00:35
TooTool
Рет қаралды 21 МЛН
Sigma Girl Education #sigma #viral #comedy
00:16
CRAZY GREAPA
Рет қаралды 94 МЛН
Шокирующая Речь Выпускника 😳📽️@CarrolltonTexas
00:43
Глеб Рандалайнен
Рет қаралды 11 МЛН
You don't need libraries to write a game engine in C++ | OpenGL | Devlog
2:50
Fast Inverse Square Root - A Quake III Algorithm
20:08
Nemean
Рет қаралды 4,9 МЛН
How AI Discovered a Faster Matrix Multiplication Algorithm
13:00
Quanta Magazine
Рет қаралды 1,3 МЛН
So You Think You Know Git - FOSDEM 2024
47:00
GitButler
Рет қаралды 967 М.
Mastering Dynamic Programming - How to solve any interview problem (Part 1)
19:41
New Algorithms in C++23 - Conor Hoekstra - C++ on Sea 2023
1:25:20
ПК с Авито за 3000р
0:58
ЖЕЛЕЗНЫЙ КОРОЛЬ
Рет қаралды 1,8 МЛН
Mi primera placa con dios
0:12
Eyal mewing
Рет қаралды 525 М.
Will the battery emit smoke if it rotates rapidly?
0:11
Meaningful Cartoons 183
Рет қаралды 3,5 МЛН
How To Unlock Your iphone With Your Voice
0:34
요루퐁 yorupong
Рет қаралды 18 МЛН