Want to master Clean Architecture? Go here: bit.ly/3PupkOJ Want to unlock Modular Monoliths? Go here: bit.ly/3SXlzSt
@iteospace5 ай бұрын
Milan, How about "long" Snowflake Id next?
@P1n3apqlExpr3ss5 ай бұрын
I'd highly recommend Jeff Moden's Guid fragmentation talk that dives deeper into this topic looking at page splits, fill factors, and gradually evolving data. It offers a perspective that is much different to the high count one-off insert/read demo shown here, and should be closer to the reality of many production systems
I believe it is this one kzbin.info/www/bejne/p5e0pYxnbMeDh5Ysi=Ls6A110rSMKR5Qbn @MilanJovanovicTech
@vasiovasio2 ай бұрын
Great video, Milan! The fragmentation difference is Insane! Поздрави от България! 🇧🇬❤️🇷🇸
@MilanJovanovicTech2 ай бұрын
Пуно хвала! :)
@WolfspiritMagic5 ай бұрын
While everything in the video is true, there is one thing that somehow nobody mentions when talking about these time based ids: Security. In my opinion there are cases where exposing the creation Date/Time of a database record can lead to security issues. For example with these kind of IDs users will always be able to figure out if they have been created before or after another user if they get their IDs. That's sometimes the reason why you use GUIDs instead of auto incrementing numbers in the first place.
@MilanJovanovicTech5 ай бұрын
Makes sense, but a timestamp leaks less information than an INT/LONG id.
@Whojoo5 ай бұрын
Asume I know too little about security. Why would it matter if I as a user know if my account is created before or after an other user?
@WolfspiritMagic5 ай бұрын
@@Whojoo Before/After might not a big issue but it's still data that you might not want to leak. For example if you have a game and User A invites User B to your game, might even spend some money for User B to give them a starting credit and then figures out based on the ID that User B was already member of that game before User A.
@WolfspiritMagic5 ай бұрын
@@MilanJovanovicTech Actually I'm not sure about that. I think it depends and a timestamp might in some cases be even worse. Lets say you have a database of patients for a doctor treating a specific illness. Knowing the timestamp could give people insight into when that person actually got that illness. Or lets say you're looking for another job and join a job site for that. Your current employer finds you on that website. With an ID in itself of int/long (and no other IDs to compare) they won't have much informations (unless the website provides that to them) and you could be member of that size for years. With an ID based on a timestamp they will be able find out that you just recently started looking for a new job. Also most of randomness is based on time. Having the exact creation time in milliseconds with the entry might in some cases help an attacker to limit the amount of bruteforce required. That's for example how they recently hacked a crypto wallet.
@JohnathonSmith035 ай бұрын
@@WolfspiritMagicstill don’t see how making it harder for users to lie and deceive each other would be a security issue… not to mention it isn’t even reliable at all short scale anyway otherwise time based keys around be guid anyway. So you get a small amount of ordering info which is can think of very few circumstances being an actual security issue. Just makes it harder to lie about sequencing.
5 ай бұрын
This video is really great, and there are so many bad videos on this topic. The only thing I don't think you mentioned (or I might have missed it), is that SQL Server saves and order a UNIQUEIDENTIFIER in a different byte order than C# (and some other databases). That means even with UUID 7, if we use this the UNIQUEIDENTIFIER datatype, there will be fragmentation issues. Storing as string or binary will not have the same type off issues though.
@RobertMcLaws5 ай бұрын
That only happens when you have a clustered UNIQUEIDENTIFIER. Create the ID column as NONCLUSTERED and create a CLUSTERED INDEX on the DateCreated instead, and your database will have almost the exact same performance as INT IDs.
@MilanJovanovicTech5 ай бұрын
"there are so many bad videos on this topic" - I'm not sure why some videos focused on the how fast the ID creation time is. 🤷♂️
@dcernach5 ай бұрын
Never thought about it. Our DBAs complain a lot about our using GUID as a PK. If I understood correctly, you meant to create a table with an ID as a UNIQUEIDENTIFIER and add a column like CreatedOnUtc. Then, create a CLUSTERED INDEX using both columns, or create a clustered index on CreatedOnUtc and a non-clustered index on ID?
@daveanderson83485 ай бұрын
@@RobertMcLaws How about using that ID column as a Foreign Key in other tables? What is the impact of that?
@daveanderson83485 ай бұрын
@@dcernach The GUID type column as PK should be non-clustered index. Otherwise the fragmentation of GUID would have negative performance impact when your table has at least millions of rows.
@robadobdob5 ай бұрын
I think using UUIDs in databases is one of those topics which probably doesn't have a measurable impact to most of us at the scales we work at.
@MilanJovanovicTech5 ай бұрын
Probably true
@AdonikamVirgo4 ай бұрын
We've had issues in a 1mil+ row events table; Jeff Moden's talk is verbose, but boils down to not using the defaults - 90% fill factor for random ids causes expensive page splits, clustered PK is a waste and sequential inserts are not the enemy UNTIL you update and grow the row in some way. kzbin.info/www/bejne/qKe9qICEf92kjNE
@iteospaceАй бұрын
guid 10x slower
@robadobdobАй бұрын
@ it might be but unless your customers are ringing you up complaining about slow database queries, I wouldn’t be losing sleep over it.
@ucretsiztakipci66125 ай бұрын
Adding Postgress to this benchmark would be interesting. Wher is the repo?
@MilanJovanovicTech5 ай бұрын
Was planning to, but didn't have time to add a Postgres benchmark
@ahmettarksahin37775 ай бұрын
When we use ULID or UUID as the primary key in the database, should we check whether ULID or UUID has been created before when adding data to the database?
@MilanJovanovicTech5 ай бұрын
No need, they're random enough that you're unlikely to see a collision. And even then, you'd get an exception for duplicate key values.
@christianrazvan2 ай бұрын
But the sql server knows about the different type of uuids? What if I want to sort in the database query and not in code
@MilanJovanovicTech2 ай бұрын
I didn't test out that part. I think SQL Server does do something strange to UUIDs, and Postgres doesn't
@SKotekarАй бұрын
Using an auto incrementing number as the clustering key instead of datetime might be slightly faster
@MilanJovanovicTechАй бұрын
There is an example of that
@marikselazemaj34285 ай бұрын
10:45 What are those icons before class names?
@MilanJovanovicTech5 ай бұрын
ReSharper detects they're entities in an EF DbContext
@jodainemoore83005 ай бұрын
Great Videos, Please can you do one on Composition and Or with Strategy Pattern, best struggling with these for months.
@MilanJovanovicTech5 ай бұрын
Allright
@FranciscoCastellanodePablo5 ай бұрын
Be carefully with int, it can only support database generation. I have faced problems when your application (not planned in case) began to support external applications that has there own database like mobile applications that need to work with no connection.
@MilanJovanovicTech5 ай бұрын
There's also the HiLo strategy
@FranciscoCastellanodePablo5 ай бұрын
@@MilanJovanovicTech good point, never the less in my experience its better to go with UUID, this Hilo and others strategies that could exists could be needed if UUID creates other issues like performance as mention in the video. But for the majority of the cases I presume UUID it is the better option.
@RomanTurovskyy5 ай бұрын
If using GUIDs introduces less then 2x slowdown I do not see this as an issue. 10x performance penalty would have been a problem, but 2x can be easily covered by the modern hardware for the benefits GUIDs provide.
@MilanJovanovicTech5 ай бұрын
True. Most problems in engineering are amplified by scale.
@thedacian1234 ай бұрын
Nice,but about database sequences?
@MilanJovanovicTech4 ай бұрын
Option 1?
@AhmedKhaled-g8t5 ай бұрын
what is the best order to study clean arc
@MilanJovanovicTech5 ай бұрын
There's a million sources, watch them all. But the source of truth is Uncle Bob's book.
@enricoroselino7557Ай бұрын
aaaaand V7 still causing fragmentation because big / little endian thing in mssql. But it should be same or similar performance as ULID isnt it?
@MilanJovanovicTechАй бұрын
SQL Server 😅
@theeusftn5 ай бұрын
But why I need to use timestamp in my uuid? i dont got it
@MilanJovanovicTech5 ай бұрын
Clustered index
@theeusftn5 ай бұрын
@@MilanJovanovicTech interesting
@DavidSmith-ef4eh5 ай бұрын
do they slow down queries and db in general? an eleven length unsigned int seems much more preferable than a 32-length char entry lol. I assume the index size and look up speed are much slower on the latter, on huge databases.
@DavidSmith-ef4eh5 ай бұрын
although, on the other hand, you don't need a created_at column anymore. I can see msyelf using them, for things that require unique ids accross a system
@Rob_III5 ай бұрын
You shouldn't store a ULID (or (G/U)UID for that matter) as (var)char but as (G/U)UID. And, yes, a ULID can be stored as a (G/U)UID.
@RobertMcLaws5 ай бұрын
Using the NONCLUSTERED UniqueIdentifer + Clustered DateCreated is nearly as fast as integers and has been tested on huge databases with negligible performance impact.
@MilanJovanovicTech5 ай бұрын
They have an impact on table/index size, for sure
@casperhansen8265 ай бұрын
It's 16 bits and the main benefit is that it can be generated on the client and written to the database. The problem with GUID and UUID is that the order of them is more or less random so writing it to the database as a clustered index will cause some heavy manipulation to make it work. ULID should solve that problem, still generating unique values in almost sorted order making it possible to just place the new record at the end of the table
@kenwarner5 ай бұрын
It's probably just me but having the SSMS toolbars on two rows instead of one stuck out like a sore thumb. I always combine to a single row whenever I have to install it. 😬
@MilanJovanovicTech5 ай бұрын
Never bothered me 🤷♂️
@vijayarajan-bt5fk5 ай бұрын
15.26 where to get this benchmark...? Shell script;
@MilanJovanovicTech5 ай бұрын
I share the source code on Patreon
@casperhansen8265 ай бұрын
The problem with ULID it is not compatible with UUID/GUID so it will require a lot of changes to make it work in an existing system If you let the SQL server database create the guid it will be generated in sorted order and work just as well
@MilanJovanovicTech5 ай бұрын
SQL Server isn't the only database people commonly use, so we should look at this from a broader perspective. I'll upload a Postgres benchmark soon!
5 ай бұрын
If you use NEWID and not NEWSEQUENTIALID, then you should have the same problem if MSSQL generates it. NEWSEQUENTIALID is closer to what UUIDv7 does, but compatible with the UNIQUEIDENTIFIER type without getting to much fragmentation. ULID is 128 bits, same as UUID so not sure what you mean with "not compatible", there is always the little-big-endian issues when working with UUID/GUID/ULID, so if your working on a bit-level, you can get confused really fast... For the text representation, they are not presented in the same way. But if you store it as CHARs and not BINARY or UNIQUEIDENTIFIER, then you have about doubled the size already... if that's important for your usecase.
@pavolslamka8219 сағат бұрын
This performance test is not very informative in my opinion, because your tables are empty! I think the point should be that option Table2 should be SLOWER than option Table5, because the entire table data is sorted by random data since it is clustered by random GUID (and you shuffle all the table data on every insert), whereas in Table5 the data is clustered by increasing DateTime, although it has additional NON-CLUSTERED index on Guid (which is shuffled a lot, but is just one GUID so not too large). But because your tables have no additional data, it does not matter whether the index is clustered or not. Of course I could be wrong. If so, I would appreciate a correction ;)
@MilanJovanovicTech9 сағат бұрын
@@pavolslamka821 What do you mean by empty? Can't recall the exact setup, but there are at least 1M rows per table
@sunzhang-d9v5 ай бұрын
SnowFlake ldGenerator
@DavidSmith-ef4eh5 ай бұрын
lol, thats an unfortunate name. does it contain the pronoun of the id as well?
@Rob_III5 ай бұрын
@@DavidSmith-ef4eh Snowflake predates woke stuff by at least a decade; Twitter introduced it in 2010.
@DavidSmith-ef4eh5 ай бұрын
@@Rob_III yeah, read upon it. seems to be used across all the major socials, apparently. must be fastest I guess.
@MilanJovanovicTech5 ай бұрын
Cool, will see if I can include it in some future discussions
@sunzhang-d9v5 ай бұрын
@@MilanJovanovicTech what? feel ,long long ago
@AravindhKumar007Ай бұрын
You should have used snowflake id and called it a day 😅