Notion hit 100 million users recently, so I wanted to do a quick overview of their database evolution. Hope you get to learn something from this. Thank you again for taking the time to watch this video and for your continued support!
@kevinwu7497Ай бұрын
What tools do you use to make your videos? I love them!
@jackdavenport50112 ай бұрын
This video has made me realise how much of a nightmare it must be to scale up a database in production. But Notion is insanely fast now so it obviously paid off well.
@kikisbytes2 ай бұрын
hahah definitely!!
@Ergydion2 ай бұрын
In which world is notion fast? Always takes a notable amount of time to just load my shopping list
@user-jt4hx2 ай бұрын
notion is many things but not fast 🤣
@everton000rsc2 ай бұрын
I'm going through this nightmare in my company right now, in our case we're gonna migrate to TiDB instead of sharding
@jackdavenport50112 ай бұрын
@@Ergydion I find the initial load can take a second or two but making edits is basically instant
@smithwillnot2 ай бұрын
What do you want to shard? Notion engineers: YES
@GuRuGeorge032 ай бұрын
coming up with this solution is tough for sure but the real challenge is orchestrating all the teams and people involved in this. that job is incredible and I bet there were a few key people who managed all this and had to do a lot of overtime to achieve it, especially when critical errors & bugs popped up
@taylorjohnsonctАй бұрын
Exactly what I was thinking... I was reminiscing over what my company went through when we converted from a monolith to a micro-services architecture, but this... this is something you can't do without investor money, the literal best talent, and some of the best management in the world. Whoever these engineers and project managers are should be incredibly proud. Also, can you imagine being a new backend dev or database guy at Notion :D:D:D:D:D:D
@bhaaratsharma6023Ай бұрын
We recently did a db upscale with around 12 TB of data which is just a fraction of what Notion did and it was already a Herculean task for us. It took us weeks of planning and work to make it a success. Working with data is one of the most challenging things in IT
@тимур_атмосферный2 ай бұрын
bro casually dropped 1mil+ youtuber level content
@kikisbytes2 ай бұрын
hahaha that's so nice of you to say!!
@roninreilly2 ай бұрын
I was thinking, nice comment, and on top of that it's from a fellow Russian speaker
@foreverskeptical12 ай бұрын
Your videos are so short and clean. Even though I am just a recent grad I get a lot of value from these vids. also didn't realize you could scale so much with postgresql
@kikisbytes2 ай бұрын
yay I'm so glad! As long as you can learn something new I'm happy!! Are you currently job hunting or already working?
@anirbanpatra30172 ай бұрын
@@kikisbytes I am job hunting. 😂Help me get a job
@andrefu41662 ай бұрын
insanely underrated channel, you're gonna be huge
@kikisbytes2 ай бұрын
hahah thank you!! Just want to make videos that are educational and fun to watch :)
@Kasukke2 ай бұрын
Agreed. I wish there was more of this type of content. In-depth, real problem solving.
@code58342 ай бұрын
Engineering team at notion did a fantastic job !
@kikisbytes2 ай бұрын
for sure!
@Flocksta2 ай бұрын
Yea they did an amazing job hiring young freelancers and underpaying them by a factor of 2/3.
@code58342 ай бұрын
@@Flocksta yeah that is very true ! i totally agree with you. Talent is used to the maximum but compensation is kept at a minimum to improve the profit margins, sad reality!
@siddair2 ай бұрын
Great video!! Loved this level of detail along with the animations. This is a differentiating factor from many other videos on such topics that don't go into detail but cover such topics at a very high level. You could link to explanations of some of the concepts mentioned for understanding but continue keeping this level of detail as that is what makes it great in the first place!
@captdev2 ай бұрын
This is crazy good content dude! You will be at 1+ million views in no time
@kikisbytes2 ай бұрын
awhhh thank you, I appreciate that!! 😭
@ask_carbon2 ай бұрын
Good god I feel tired just going through this can't even imagine the stress on DBAs and System architects in Notion
@kiro_f2 ай бұрын
These videos are always so good, always happy to see when a new one is posted :)
@kikisbytes2 ай бұрын
Awhh thank you so much for your support! I truly appreciate that!
@SPOOKEXE2 ай бұрын
Watched a couple vids and they're wicked! Love the newer videos you've been uploading!
@kikisbytes2 ай бұрын
Thank you!!!
@justlovecode25222 ай бұрын
nice English subtitles, wow. you deserve a like!
@kikisbytes2 ай бұрын
Thank you!
@kaffii12382 ай бұрын
Great video! Just want to appreciate your videos as no one else does good summaries of engineering blogs or writeups, and I appreciate the lack of dilution of the concepts since there's just way too much content catered to beginners and not enough of more mid-level content like yours (digestible, consumable summaries of interesting solution architecture writeups) out there on YouTube.
@kikisbytes2 ай бұрын
Thank you for letting me know! It’s definitely a goal to make videos for people with experience. I was also worried that people wouldn’t be able to follow. But I’m glad that intermediate folks are okay with the pace
@GlynnPowellАй бұрын
This is great PR for Notion. I loved Notion when it arrived, went all in, then it slowed to a painful pace so I jumped to Obsidian.... This has got me buzzed to come back to Notion! Great video
@TokuyuuTV2 ай бұрын
so educational and entertaining at the same time!! i know nothing about systems but the video was so well-paced and funny I kept watching
@kikisbytes2 ай бұрын
Thank you Tokuyuu I'm going to cry now😭 Awaiting your next release!
@t3chnicolor2 ай бұрын
How did you make this video? Was it all AFX from scratch, or something like Prezi?
@69k_gold2 ай бұрын
I think Notion is still pretty slow for a majorly text-oriented application. I mean yes it does support non-text objects, but it's majorly text-based, and it's as slow as OneNote sometimes. Should text really take that long to load? Idk
@F7gi87j63aZq2 ай бұрын
Go, Obsidian, go!!!
@user-dc9zo7ek5j2 ай бұрын
@@F7gi87j63aZq Obsidian is a local application that works with files, while Notion is a shared application that works with databases across millions of users.
@user-dc9zo7ek5j2 ай бұрын
It's slow not because there is a lot of text, but because they have a lot of abstractions and services that they query to get your data into a presentable format. Just like any other big company app, making many requests to many things at once seems like a fine approach. This is probably so that they allow large teams to work independently. I remember a Doordash developer interview that said they have around 500 microservices which is a bit too much for me. A good, performant alternative to Notion is MediaWiki. Its design is "old-school" and it runs very quickly.
@veryCreativeName0001-zv1ir2 ай бұрын
if you value time you use notion else use obsidian
@yash_renaissance_athlete2 ай бұрын
@@veryCreativeName0001-zv1ir lol that's the stupidest comparison between Notion and Obsidian. I have been using Obsidian aggressively for more than a year, I can't be shifted to any other platform.
@Redyf2 ай бұрын
What an amazing video, production quality at its highest level. 😁
@danser_theplayer012 ай бұрын
Were those inconsistent size blocks within blocks within blocks stored out in the wild instead of belonging to a specific user? Also, having an id for each and every action must be a nightmare especially since they didn't do ULIDs.
@archamondearchenwold8084Ай бұрын
How are these animations made if you dont mind sharing? They are glorious! :) is it motion canvas??
@JeffParker452 ай бұрын
Amazing Video! I'll have to rewatch this over and over to understand it more.
@davidmata31042 ай бұрын
Wouldn't it be easier to use a No-SQL database like Cassandra? Cassandra already manages all the logic to distribute the data in partitions. It also distributes the data into the different nodes and by its nature it scales horizontally.
@cestlacroix2 ай бұрын
that's exactly what i suggested
@DeepDarkier2 ай бұрын
or easier, they could use YugabyteDB or CockroachDB, they are almost 100% postgres compatible and scale horizontally by automatically sharding the data
@Aramik-lp5fn2 ай бұрын
My guess is that in their core product they are relying heavily on some sql features that they couldn’t afford to lose and that’s why they chose extreme sharding over no-sql
@alexander_farkasАй бұрын
Their data is relational, why would they use non-relational database?
@davidmataviejo3313Ай бұрын
@@alexander_farkas you are right. Why would someone want to use a hammer to drive a nail if they already have a drill? 😂
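For anyone following the thread above on manual sharding versus a database that partitions data for you, here is a minimal sketch of what application-level shard routing can look like. It is not Notion's actual code: the shard key (a workspace ID), the hash function, and the count of 480 logical shards are assumptions made for illustration; only the figure of 96 database instances comes from the comments here.

```python
import hashlib

# Illustrative numbers only. 96 hosts is mentioned in the comments;
# 480 logical shards is an assumption chosen so each host holds 5 shards.
LOGICAL_SHARDS = 480
PHYSICAL_HOSTS = 96


def logical_shard(workspace_id: str) -> int:
    """Map a (hypothetical) workspace ID to a stable logical shard number."""
    digest = hashlib.sha256(workspace_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % LOGICAL_SHARDS


def physical_host(shard: int) -> int:
    """Map a logical shard to a host index using contiguous ranges."""
    if not 0 <= shard < LOGICAL_SHARDS:
        raise ValueError("shard out of range")
    return shard * PHYSICAL_HOSTS // LOGICAL_SHARDS


if __name__ == "__main__":
    shard = logical_shard("workspace-123")
    print(f"workspace-123 -> logical shard {shard} -> host {physical_host(shard)}")
```

Keeping more logical shards than hosts is the usual reason for the two-step mapping: moving to more hosts only changes `physical_host`, while every row keeps its logical shard. Systems like Cassandra or CockroachDB do this kind of routing internally, which is the trade-off being argued about above.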
@ruslan_yefimov2 ай бұрын
Great animations! Don't stop this
@kikisbytes2 ай бұрын
ty ty glad you enjoyed this video!
@rigveddesai58432 ай бұрын
amazing engineering and a great video explaining it all, just wondering why you would be happy with ~20% cpu utilization during peak hours, sorry if it sounds like a noob question but i genuinely don't get it
@sakamad48562 ай бұрын
CPU utilization is the amount of the CPU that the application is using up. So high CPU utilization is a bad thing. CPU utilization at 100% means your application taking up all the computational power of the CPU, which is bad because now no other programs can run
@rigveddesai58432 ай бұрын
@@sakamad4856 i assume notion would be running their dbs on dedicated servers? i get why 100% would be bad, but 20 seems too low lol
@jackdanielson19972 ай бұрын
I think what they're saying is that what used to be 90-100%+ utilization is now 20%, not that 20% is some magical number they landed on
@kikisbytes2 ай бұрын
This is a good question! I used to work on a team where our postgres instance was nearing and sometimes hitting full utilization. This was scary because we were running some critical services and our db performance was so bad that our queries were super slow, to the point where requests were being dropped. So I can see why notion was happy that it dropped to ~20% and that they no longer have to deal with these types of issues. On the plus side, it gives room for future growth that they won't have to worry about for a while.
@user-dc9zo7ek5j2 ай бұрын
They allowed the utilization to go down because of the optimization they did. Keeping your utilization high can be dangerous because peak usage can cause bottlenecks and even cascading failures from time constraints. I had a project that was using 10% for 22 hours, but the other 2 hours it was taking 80% CPU. It is always better to have more headroom than you need. Plus, at the scale they are operating at, the cost and wastefulness do not really matter.
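The headroom argument in these replies can be put into rough numbers. The sketch below assumes CPU use scales linearly with load, which is a simplification, and treats 80% as the ceiling you would not want to cross at peak; that ceiling is an assumed threshold, not a figure from the video.

```python
def growth_headroom(current_peak_pct: float, ceiling_pct: float) -> float:
    """Return how many times peak load can grow before hitting the ceiling.

    Assumes CPU utilization grows linearly with load.
    """
    if not 0 < current_peak_pct <= ceiling_pct <= 100:
        raise ValueError("expected 0 < current_peak_pct <= ceiling_pct <= 100")
    return ceiling_pct / current_peak_pct


if __name__ == "__main__":
    # ~20% at peak with an assumed 80% ceiling leaves room for 4x the load.
    print(growth_headroom(20, 80))  # 4.0
    # At 90% peak there is no room left under the same ceiling.
    try:
        growth_headroom(90, 80)
    except ValueError as err:
        print("already past the ceiling:", err)
```

Read this way, ~20% at peak is less about wasted hardware and more about being able to absorb roughly four times the traffic before the next resharding.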
@SolomonSunder2 ай бұрын
A company I worked for faced similar issues during Covid. We were IOPS heavy, relied on SMB, Windows nested folders additionally. It was fixed using a technique similar to what Notion did here.
@glowingone1774Ай бұрын
Heh the profile Pic explains why it's so
@groli333516 күн бұрын
Cuz they have a very low sync freq, when you update something it doesn't appear for another user for minutes.
@mailtochung2 ай бұрын
A document-based database seems like the best data architecture here. Notion is very document-centric. Having 1 document as a doc in the db makes so much sense. Sharding and clustering would be a lot easier because the relationships between documents would be minimized. I guess they had the wrong architecture in the first place and it's too hard to change in the middle of exponential growth.
@leehannigan43742 күн бұрын
At the time they did this I would have most definitely rearchitected for a NoSQL database like DynamoDB. However, it seems like they now have a perfect candidate for Aurora DSQL which essentially abstracts all the sharding complexities for you.
@hooch9122 ай бұрын
I’m curious if any in-memory caching was considered or also used on this expansion odyssey. Not every read needs to go to the database.
@mrexplorerrishabh21852 ай бұрын
Great video. Very nicely explained. Which software do you use to create these kind of animated videos ?
@addie42412 ай бұрын
Very interesting video with some cool networking and ideas related to breaking up problems relating to their datastructures
@kittoh_2 ай бұрын
Awesome content! What did you use for that animation? Very smooth.
@hecker6882 ай бұрын
awesome information so in-depth, would be great if you could explain the research that went behind learning about how they did it and why they did it! insane video 💯
@kikisbytes2 ай бұрын
Thank you for the feedback!! Yeah I definitely cut down some details to try to fit within the time limit but will keep that in mind for the future
@nlama-i7y2 ай бұрын
how to take backup and restore in case of sharding
@vorandrew2 ай бұрын
how do you do your animations?
@ARed11Ай бұрын
how you edit videos
@rockshankar2 ай бұрын
timeline and team size would be nice to know
@tarun-hacker23 күн бұрын
Very well explained, on the scaling aspects.
@kikisbytes18 күн бұрын
Thank you!
@RatonBroyeurАй бұрын
Great video. Great topic. Adapting your infrastructure to your customer growth is one of the hardest things to do. Sooo many constraints. Great job notion !
@googleaccount72522 ай бұрын
Really nice how do you edit your videos?
@lukasnel4828Ай бұрын
Why didn't they use a database layer like Redis for caching?😊
@ODoyleRules-uh4hcАй бұрын
It's quite possible they wouldn't see enough benefit from caching to justify using it. There might not be enough people sharing the same documents to see much performance improvement, and every time someone made an edit to a document the cache would need to be updated. I suppose it depends on how they check for updates, etc.
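The trade-off described in this reply, cache hits on reads versus invalidation on every edit, can be sketched in a few lines. This is only an illustration: both the "database" and the "cache" are plain dictionaries standing in for Postgres and Redis, and the class and method names are made up for the example.

```python
class CachedBlockStore:
    """Read-through cache in front of a slower store, invalidated on write."""

    def __init__(self, database):
        self.database = database  # stand-in for Postgres
        self.cache = {}           # stand-in for Redis
        self.db_reads = 0         # counts reads that reached the database

    def read(self, block_id):
        """Serve from the cache if possible, otherwise load and remember."""
        if block_id in self.cache:
            return self.cache[block_id]
        self.db_reads += 1
        value = self.database[block_id]
        self.cache[block_id] = value
        return value

    def write(self, block_id, content):
        """Write to the database and drop the stale cache entry."""
        self.database[block_id] = content
        self.cache.pop(block_id, None)


if __name__ == "__main__":
    store = CachedBlockStore({"block-1": "hello"})
    store.read("block-1")
    store.read("block-1")               # served from the cache
    store.write("block-1", "hello!")    # invalidates the cached copy
    store.read("block-1")               # goes back to the database
    print("database reads:", store.db_reads)
```

If most blocks are edited about as often as they are read, nearly every read follows an invalidation and the cache saves little, which is the point being made above.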
@JulioHOR2 ай бұрын
Congrats for the content Kiki!
@kikisbytes2 ай бұрын
Thank you!
@rulofmgАй бұрын
if their user base explodes again then they would need to do the same thing again right? is this the industry standard for scaling the database or are they just stuck on this tech? I feel like this kind of scaling will hit a wall sometime soon
@BenHouston3D2 ай бұрын
Just continually sharding their DB across more and more machines seems like a linear solution to their exponential user growth. Isn't there something they can change in their architecture to avoid needing 96 separate DB instances? That is sort of ridiculous.
@KenSnyder12 ай бұрын
My thought too. I suspect they could make the application much smarter by putting in-progress work into a non-sql database to avoid frequent writes to postgres. Also, one row for each text block seems over normalized. End armchair analysis.
@user-dc9zo7ek5j2 ай бұрын
Their team is big (It says that they are around 500 total employees), probably around 200, working on different parts of the app. Most of them probably fall into "this is not my job" or "I don't have enough power to say" type of situation and they keep patching.
@Zuriki092 ай бұрын
@@KenSnyder1 seems like it would just shift the problem to another system. OK, your pgsql isn't getting hammered with writes, but now your redis, mongodb, etc. is and then it's still going to push all that data to pgsql anyway and also you have to pull down from both pgsql for committed data and then reconcile that with uncommitted data in your intermediate store in order to get consistency for the user. For users they also tend to notice read delays more than write delays unless the write delay is substantial or catastrophically fails. Besides which, this video is narrowly focused on how they fixed specifically a database problem. We don't know if they already had other performance solutions in place such as caching unchanged blocks or whole documents to avoid database reads.
@JoãoLinharesGomes2 ай бұрын
Yeah, it kind of seems like they should've stuck with writing to a NoSQL database like Dynamo and streamline everything to be stored in the postgres database, maybe. Maybe they didn't do it because Notion needs immediate reads after writing data as events, but that would be probably faster using Kafka. But who am I to tell this is the best solution. That's not easy at all Imao.
@RatonBroyeurАй бұрын
@@JoãoLinharesGomes One of their goal was also to reduce cost. Introducing Dynamo to such a large model would certainly not do that :D
@codingprojects18672 ай бұрын
Thank you for the Heavenly Path cameo!
@atomiccoding2 ай бұрын
Awesome video! How do you make such awesome animations?
@blubblurb2 ай бұрын
Can and do they do backups?
@zweitekonto96542 ай бұрын
Is this the same as db normalisation.
@somedayitsgonnamakesense2 ай бұрын
as a newbie Sol Archi. my brain hurts lmao
@kikisbytes2 ай бұрын
Haha dw some day it’s gonna make sense 😉
@kikisbytes2 ай бұрын
Jokes aside how was your transition to solution architect?
@whistlingtree87562 ай бұрын
beautifully illustrated
@Oakbit2 ай бұрын
This is amazing!
@gibzrival15652 ай бұрын
would mongoDB solve scaling issues or a hybrid of both🤔
@kikisbytes2 ай бұрын
yeah possible but I think with MongoDB you'll eventually hit issues as well. Maybe something like cassandra/scylladb which is what discord uses and they have crazy performance. But then again notion document has a lot of relationships, so might be hard.
@kyratking46732 ай бұрын
Noting to never interview for Notion XD.. But jokes aside, it's a huge effort collaborating with the team all the while maintaining the development of such a feature.. kudos to the team
@scottzeta3067Ай бұрын
This video overwhelms me🤯
@raghavmahajan33412 ай бұрын
tldw: sharding + better connection pooling + pub-sub based migration
@mortal_coder4869Ай бұрын
Hi Kiki. I enjoyed this video. In the future try to slow down a little during presentation & graphics for a better learning experience.
@mohitkumar-jv2bx2 ай бұрын
Awesome video. and definitely great engineering. But i am not getting why they are storing the actual "content" in postgres. I mean they could store the metadata in the postgres but the actual content in something like s3 or maybe minio? could it be because they were trying to use postgres to handle "concurrent updates" to the same blob? I would definitely like to understand their reasoning to do so
@jackdavenport50112 ай бұрын
I assume images and file attachments would probably be in S3 or something, but I'm guessing blocks that only contain text can probably just go directly into the database.
@mohitkumar-jv2bx2 ай бұрын
@@jackdavenport5011 ok. Could be. In the video, there was a “content” column showing jpg, text etc. that's why i asked.
@kikisbytes2 ай бұрын
Jack gave a good explanation here! Yeah mb I definitely should have used something else to reference the image. I was too focused on subscribe to kiki's bytes -.-
@mohitkumar-jv2bx2 ай бұрын
@@kikisbytes got it. I was just trying to understand. but a great video. i have this video saved so that i can use it to revise the partitioning and sharding a relational database.
@kikisbytes2 ай бұрын
@@mohitkumar-jv2bx awesome!! Don't mean to self plug here but my last video goes a bit more in detail with partitioning and sharding if you're curious. It just talks about how we can take a single server and scale it up and take a look at system design topics. Not sure if that will be useful at all to you but I'll leave the link below if you're curious :) Thank you again for taking the time to watch this video! ☺ kzbin.info/www/bejne/fKmkoKBobrR4gacsi=DNWcmKsaq0nBIyby
@adamjones960015 күн бұрын
Such a great video. How do you get this info? I'm curious. Subscribed.
@mathesukk14 күн бұрын
the refs are in the description
@PatMofRockies2 ай бұрын
You deserve more subscribers.
@mlocateАй бұрын
Having a record for each block of the document is crazy, I wonder what was the reason behind this decision.
@mateuszz75412 ай бұрын
Why did they choose postgres?
@shauryatomer10582 ай бұрын
awesome video dude, thanks for this great video
@kikisbytes2 ай бұрын
Thank you for taking the time to watch this video!
@suttanab17632 ай бұрын
What about firestore ? 😮
@Friendry2 ай бұрын
Really enjoying your videos, keep them up!
@reggielj2 ай бұрын
I'm not smart enough to be here.
@kratosgodofwar7772 ай бұрын
Bro for real I'm gonna shard myself in a minute
@lukababu2 ай бұрын
@@kratosgodofwar777 "Go shard yourself" might be the most CS insult ever
@hd_y2 ай бұрын
yeah same, i'm just nodding the entire time like i know what i'm watching
@mohagungnursalim82192 ай бұрын
Great channel 🎉
@PostMasterNick2 ай бұрын
The things that come to mind when I see this: replication and upgrades. Good luck Notion!
@kiran13792 ай бұрын
What is the Team count at that time ?
@kikisbytes2 ай бұрын
Don't quote me on this but iirc back in 2021 they had around 150 employees. So their engineering team is probably 50-70 I wanna say. I could be wrong
@ayonsamajder2 ай бұрын
Another top level video
@kikisbytes2 ай бұрын
Thank you so much!
@blue_lobster_Ай бұрын
thank you for this good explanation
@mikebean.2 ай бұрын
I wonder why they did not use a document database from the get go
@Alphabet123332 ай бұрын
Notion's dark read testing? 7:30 this term doesn't exist
@Pipe04812 ай бұрын
That was an awesome explanation, I almost understood some of it! Not your fault though, I'm not the brightest
@kikisbytes2 ай бұрын
Thank you for watching and please let me know how I can improve to make it even easier to understand!
@jackdanielson19972 ай бұрын
@@kikisbytes I personally think this video was perfectly paced and is the right length of time for what it covered. You obviously need some background in the concepts to understand them, so making it easier to understand would be to actually teach the concepts / technologies as well which would be an entirely different video, in my opinion.
@leomysky2 ай бұрын
Thanks for the video Amazing job
@yoursweetyguyАй бұрын
how do you know?
@vedangmirashi2 ай бұрын
Awesome in-depth video. As stated in some other feedback comment, it might be a bit overwhelming for beginners or people with a non-expert level of tech understanding (who are the majority of the target audience on YouTube). You could maybe incorporate some short explanations of a concept (shard, pgbouncer, etc.). People who are interested in learning that concept can always go to a more detailed in-depth video (you can also route them to your topic-related videos if available). More power to you and good luck! Subscribed
@xetera2 ай бұрын
I disagree, it's nice to see a channel just tell an animated story like an engineering blog without watering everything down to a tutorial like every other channel
@tajniak0811Ай бұрын
Why not noSQL?
@stxnw2 ай бұрын
Why would you only want 20% utilisation on your instances? Isn’t that an underutilisation of resources?
@rajumondal42832 ай бұрын
It's left for unwelcomed spikes
@kikisbytes2 ай бұрын
yeah good question! I think the reason for that is they wanted each instance to have lower load. The 20% utilization was post sharding so it's definitely going to go up as data starts to fill each shard more in the future. Considering how much they've grown, it's definitely more than 20% now.
@aleksandrephatsatsia45302 ай бұрын
Hello, that's amazing content!!! keep going and you will become a 10M channel soon!!! what do you use for animations?
@kengreeff2 ай бұрын
Amazing video!
@kikisbytes2 ай бұрын
Ken!!! Omg thank you for taking the time to watch this video!!
@aadarsh83062 ай бұрын
Awesome, make more videos explaining this stuff
@kikisbytes2 ай бұрын
Thank you, will do for sure!!
@Daniel-i8v2iАй бұрын
why not just use cockroachdb instead of manually sharding
@thiagomiranda39 күн бұрын
I don't get why they use this block model. Aren't those blocks only used in a single document? Why put this in multiple rows instead of creating a single document row that contains all blocks? This would decrease the number of requests to the database so much, they maybe wouldn't even need to scale up so much
@Gabzes17 күн бұрын
Definitely seems to me like a document database would be the solution to store that type of data
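One possible reason for a row per block, offered as a guess rather than Notion's stated rationale, is write size: editing one block rewrites one small row instead of the whole document. The sketch below compares the two models using plain dictionaries and JSON; it ignores everything a real storage engine does (MVCC, WAL, compression), and the function names are hypothetical.

```python
import json


def bytes_written_row_per_block(blocks, block_id, new_text):
    """Row-per-block model: only the edited block's row is rewritten."""
    blocks[block_id] = new_text
    return len(new_text.encode("utf-8"))


def bytes_written_single_document(document, block_id, new_text):
    """Document-per-row model: the whole serialized document is rewritten."""
    document[block_id] = new_text
    return len(json.dumps(document).encode("utf-8"))


if __name__ == "__main__":
    # A toy page with 100 short text blocks.
    page = {f"block-{i}": f"paragraph number {i}" for i in range(100)}

    per_block = bytes_written_row_per_block(dict(page), "block-7", "edited text")
    whole_doc = bytes_written_single_document(dict(page), "block-7", "edited text")

    print("row-per-block write:", per_block, "bytes")
    print("single-document write:", whole_doc, "bytes")
```

The flip side is exactly what the comment above points out: loading a page in the row-per-block model means fetching many rows instead of one.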
@kukukudoes45822 күн бұрын
You are a 1 million YouTuber in disguise
@kikisbytes18 күн бұрын
hahaha that means a lot! One day we'll get there!
@_prothegee2 ай бұрын
96 cpu still overwhelmed? w00t?
@srki222 ай бұрын
This shows why it was better to use a distributed DB in the first place. Cassandra, DynamoDB...
@quentin.aventure2 ай бұрын
Exactly, would be interesting to calculate the technical debt due to Postgres in that case vs using a distributed solution
@supersai41982 ай бұрын
"in the first place", oh wow we got a genius over here.
@taylorjohnsonctАй бұрын
Imagine being the new guy on the DB team at Notion...
@taffareldelimaoliveira2 ай бұрын
imagine going to the meeting with stakeholders and explaining to them why the billing jump 400% in one month.
@user-dc9zo7ek5j2 ай бұрын
It seems to me that they have overengineered their architecture and are solving problems the hard way, because they are smart enough to do it. KISS.
@Jumezki2 ай бұрын
Great video and channel overall! Just some feedback: I found the voice-over speed a bit too fast for educational content like this, which made it challenging to fully absorb all the information. Slowing down the player to 0.75x speed makes it too slow and isn't a practical solution. Perhaps a slight reduction in the speaking pace would enhance the learning experience. Hope this helps with finding the right pacing. Keep up the great work, you've just gained a new subscriber! 🤩 Edit: I would say the current speed feels like it's at 1.05x when it should be at 1.00x, just a touch too fast.
@Kylian192 ай бұрын
nah perfect for me
@kikisbytes2 ай бұрын
Thank you for the feedback!!! This is noted and I will try to make the pacing better for the next video.
@shadowpenguin34822 ай бұрын
I watched this at 2x like most content and I considered reducing the speed to 1.5x but ultimately wasn’t necessary
@abhaykrishna83682 ай бұрын
It was good enough speed
@rafael_nasАй бұрын
I could not disagree more, english is not even my native language and I had no trouble to get all the content at 1x
@halcyonramirez646914 күн бұрын
So they essentially implemented a b-tree?
@xsuritox10582 ай бұрын
What did I just listen to at 4 in the morning
@romeshjayawardene35512 ай бұрын
Great video thanks
@shreyashraj2 ай бұрын
Great video. To the point without any zig zag, but the audio does not feel natural.
@kikisbytes2 ай бұрын
Thank you for the feedback. I'm still trying to figure out audio so please bear with me while I get the right settings :)
@WaxPaxler2 ай бұрын
db migrations are always painful, great to see they had a solution
@adziak2 ай бұрын
Next level of DB scalability is Decentralized Storage solutions.