Amazing talk! Now I'll just scan through every implementation in other languages to see if we can get better performance out of our date objects
@PeteC62 Жыл бұрын
Its an interesting exercise in optimization, but I'm having trouble imagining a situation where these sorts of calendar operations are on such a hot path that even a 5x speed improvement would materially affect the overall performance.
@ayrendraganas8686 Жыл бұрын
Databases are a likely candidate.
@coolcax99 Жыл бұрын
Many applications accessed by users with a front end typically have dates associated with data which may need these operations done every second. Think about messaging apps, email apps that constantly try to fetch data then perform a local sort assuming data is not necessarily in order. To be consistent with different timezones, usually a full timestamp with dates are used. Or, think about long running log applications that keep logging with a timestamp every second or so, and you have to now process these logs to find specific regions. Journalctl for example would have to figure out a fast way to search through logs
@leonardofg Жыл бұрын
Nice to see Cassio, he was my professor 20+ years ago at Federal University of Rio de Janeiro :)
@cassioneri Жыл бұрын
That brings me very good memories. Um abraço 🤗.
@pawello87 Жыл бұрын
Good talk, really nice introduction :) A lot of numbers. After a few minutes, I no longer remember where the number came from and why it has that value :D
@TerjeMathisen Жыл бұрын
I'm starting to write this during the video introduction because I believe my own calendar code is very close to the fastest possible, taking 20-40 clock cycles (depending upon cpu version) to convert from Julian day number to Y-M-D, since this is the slowest of all the common date conversion functions. 🙂 EDIT: 20 clock cycles would be 5 ns on a 4 GHz cpu, so my almost 30 years old 32-bit 386 asm (with C equivalent) code is similar in speed to what Cassio Neri is showing here, but I'll admit that he has found a couple of tricks which I never discovered!
@blacklion79 Жыл бұрын
«every day has 24 hours» - leap seconds leave the chat
@cassioneri Жыл бұрын
I always expect a question about leap seconds, time zones and Daylight Savings Time 🙂. You're right: not every minute has 60 seconds and because of DST, not every day has 24 hours either. However, these things are usually dealt at a higher levels. These aspects cannot be treated purely in mathematical terms and need information from data bases. At lower levels, the problem is simplified. For instance, std::chrono::minutes is defined as 60 seconds (similarly, std::chrono::hours and std::chrono::days have a constant number of seconds.) My talk lives in this realm.
@Y2B123 Жыл бұрын
UNIX time does not count leap seconds, UTC does. I think to account for that you just need a continuously updated table to determine if you crossed leap-second boundaries.
@MikkoRantalainen Жыл бұрын
You cannot handle leap seconds without a lookup table because by definition those are added as a result of political decision making. Also, timezones are totally political, especially in Egypt and Israel, so you definitely need lookup tables for those, too.
@chrismoritz6706 Жыл бұрын
Brilliant and entertaining. So should programming, be done. In other words, math is the toolkit for the ones, that are to lazy to calculate :)
@musaran2 Жыл бұрын
I get the idea, even if many details flew over my head. In an ideal world, the compiler or IDE would know to do these transforms. Someday maybe…
@TerjeMathisen Жыл бұрын
At 49:30 he claims that (5*N_Y+461)/153 needs 2 MUL operations, but in reality that first part in () can (and will be optimized this way in a good compiler!) be implemented in a single cycle with LEA: ;; N_Y in rax LEA RAX,[RAX+RAX*4+461] ; Load Effective Address Slightly before he quotes a Granlund paper from 1994 that shows how to use reciprocal multiplication to replace any division by a constant: Afaik, this was first proven by DEC researchers and then (independently) by Agner Fog and myself. I actually did an exhaustive test of all unsigned 32-bit divisors, which would of course be impossible on 64 bits.
@cassioneri Жыл бұрын
Thanks for the comment. It's very true that on x86_64 platforms the expression 5*N_Y+461 can be evaluated by a single LEA instruction. Notice, however, that I don't use the word MUL in this part of the talk and, instead, I say "multiplications" (in the mathematical sense). I also make the common air quote gesture when I says that "the number of multiplications becomes a proxy for performance". I guess what I'm trying to say is that, due to time constraints and for the sake of didactics, the aim of the talk is to give the overall idea behind the results whereas the peer-reviewed published paper and the project github page go deeper into details. In particular, they show the use of LEA but also that our technique is very effective in reducing the number of cycles when **both** quotient and modulus are required. (If just the evaluation of 5*N_Y+461 in isolation is required, then one should not use our transformation.)
@PeteC62 Жыл бұрын
A coincidence that he used a Casio watch as an example? I think not!
@cassioneri Жыл бұрын
You're very right. 😅
@cassioneri Жыл бұрын
Think also about that: 1) A leader of the plot to assassinate Julius Caesar was the Roman Senator Gaius Cassius Longinus. 2) In the Eastern Orthodox Church the leap day, February 29th, is the day of Saint John Cassian. (See Wikipedia.) Perhaps, my name is what got me into calendar algorithms at the first place. 😅
@sudeepthought1921 Жыл бұрын
What about instead of using the reminder of a division by 4 to get into the 75% branch, you just look at the two least significant bits in the least significant byte of the int where the year is stored, i.e. you do an AND 00000011 and then a JMP NZ?
@cassioneri Жыл бұрын
This is one of many transformations that compilers do for you. That is, you write n % 4 in your source code but the compiler replaces it with n & 3. So you don't gain anything by doing it manually. The talk shows other optimisations that compilers don't do.
@amoskevitz3 ай бұрын
But what calender system is bce and ce part of? Cause it aint the gregorian?
@cassioneri3 ай бұрын
The BCE/CE (Before Common Era/Common Era) notation is related to both Julian and Gregorian calendar (although for certain dates the year in these two calendars differ by one.) It's a zero based system counting forward from the year that was wrongly believed to be the year of Jesus Christ birth. Its roots are in the works of Johannes Kepler who knew that the previous AD/BC (Anno Domini/Before Christ) system created by Dionysius Exiguus was wrong in assuming that Jesus Christ was born on AD 1. Currently expert historians mostly believe that Jesus Christ was born no later than 4 BC. Hence, according to the old system Jesus was born at least 4 years before his own birth. Despite being very Christian, Kepler knew the nonsense of the old system and decided to "refactor" it. Pragmatically, he kept the year numbers that people were used to (hence the "common" in the name and the relation with the Gregorian calendar) and renamed AD to CE. Kepler also made it more "algorithmic friendly" by setting year zero to the old 1 BC so that the year counting follows integer numbers more closely instead of jumping from 1 BC to AD 1. Finally, the AD/BC notation has another weirdness: AD is a prefix and BC is a suffix. In the CE/BCE system, both are suffixes (which simplifies parsing).
@_lapys Жыл бұрын
Speed dating 🐢
@ShengLiangSong Жыл бұрын
How about version 3 of isLeayyear()? bool isLeapYear3(int y) { int a = (y % 100) == 0; int d = 1 3 ^ m); bool isLeapYear1(int y) { return y %4 == 0 && (y%100 != 0 || y%400 == 0); } bool isLeapYear2(int y) { int d = y % 100 != 0 ? 4 : 16; return (y & (d-1)) == 0; }
@cassioneri Жыл бұрын
I did try this very same idea at the time I was writing it for gcc. My measurements at that time showed that isLeapYear3 was slower than isLeapYear2 (which is, in essence, what I end up implementing). I no longer have the results of this benchmarks. You can see at compiler explorer / z / cb3Phjnno (link obscured to trick KZbin to accept this post) that gcc and clang translate isLeapYear3 to 14 assembly instructions and isLeapYear2 to 10. Msvc translates both to 14 instructions. It's worth recalling that, although very correlated, the number of assembly instructions is not a perfect representation of performance. Its always best to measure.