My average experience while watching Jordan's content: He starts explaining and in the first minute of the video he mentions something from his previous video, I pause and start watching his previous video and he again mentions something from his past content, and this loop goes on until I say enough is enough
@jordanhasnolife5163 7 months ago
My average answer: start from #1
@adrian333dev 7 months ago
@@jordanhasnolife5163 Saw this coming, but starting a 60-video series after finishing two system design courses on Udemy feels intimidating
@pakkunhatake 6 months ago
same
@shreesharao7261 1 month ago
@@jordanhasnolife5163 You should incorporate some quick recap just like in TV shows so that every video is self contained, unless your intention is to increase views to all your videos ;)
@jordanhasnolife5163 1 month ago
@@shreesharao7261 haha it depends, there can be a lot of recap
@Unstoppable_gaur 1 year ago
Great content, would like some more of this kind. I appreciate the effort and dedication you put into making these system design videos; they are helpful. These videos really made me fall in love with system design, and I keep reading blogs and looking out for your new videos to learn more.
@jordanhasnolife5163 1 year ago
Thanks Joseph, means a lot!
@mayankchhabra3070 12 days ago
If we take the example of building search on top of chats and we partition by chat_id, won't that lead to an uneven distribution of data? Elasticsearch has shards and tries to distribute data evenly across them, but if we explicitly route our data to a specific shard (using chat_id in our example), the distribution can become uneven, since one chat might be active while another is dormant. Just thinking out loud about how we would solve this :P (Probably distribute it evenly by using some composite key, but that would defeat the purpose of searching a chat from just one partition)
@jordanhasnolife5163 8 days ago
Using many small partitions and balancing them appropriately I believe tends to be the preferred approach here
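One way to picture "many small partitions, balanced appropriately" is a greedy placement of partitions onto nodes by size. This is only a rough sketch under assumed inputs: the function name `assign_partitions`, the partition names, and the sizes are all hypothetical, and it is not how Elasticsearch itself allocates shards.

```python
import heapq

def assign_partitions(partition_sizes: dict[str, int], num_nodes: int) -> dict[int, list[str]]:
    """Greedy balancing: place the largest partitions first, each on the
    node that currently holds the least data."""
    # Min-heap of (current_load, node_id); all nodes start empty.
    heap = [(0, node) for node in range(num_nodes)]
    heapq.heapify(heap)
    assignment: dict[int, list[str]] = {node: [] for node in range(num_nodes)}

    largest_first = sorted(partition_sizes.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in largest_first:
        load, node = heapq.heappop(heap)      # least-loaded node so far
        assignment[node].append(name)
        heapq.heappush(heap, (load + size, node))
    return assignment
```

Because each partition is small relative to a node, one hot chat can only skew a node by the size of its own partition, and the remaining partitions can be shuffled around it.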
@andyborch9886 4 months ago
Man I normally don't laugh at your jokes but this one actually made me laugh, I think it was mainly due to your stare at the end of the intro! 😆
@NghiaPham-o7x 11 months ago
Hi Jordan, great job, I've learned a lot from you! One thing I don't understand: you mention that the global index might be inefficient because we might need to send the document to many partitions. Why do we need to send the document to many partitions? What I'm thinking is, when a query comes in and a node handles it, that node gathers the document lists from the indexes, merges them into a set of document IDs, and then fetches those documents from the partitions. Or am I missing something?
@jordanhasnolife5163 11 months ago
Hey! I'm saying that when we upload a document, we have to write to multiple partitions. This is because the document has many words in it!
@NghiaPham-o7x 11 months ago
@@jordanhasnolife5163 I see, so you are talking about the write path. Out of curiosity, I'd like to discuss the options here, since with a global index we can take other approaches:
1. Write the same document to multiple partitions. As you said, that makes partitioning meaningless.
2. Save the document in one partition and update the global index on the other partitions with a distributed transaction (e.g. 2PC).
Is there any flaw in the second approach, or can it be used in a real system? The second approach is slow on writes, and it might be bad for a write-heavy system like logging, but I think it would benefit a light-write, read-heavy system. What do you think?
@jordanhasnolife5163 11 months ago
@@NghiaPham-o7x The second approach seems doable, but just consider what happens when we have to write the document to 10 partitions instead of just two haha
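The write fan-out being discussed can be sketched by counting which partitions a single document write touches under each scheme. This is a toy illustration, assuming a hypothetical 10-partition cluster and simple whitespace tokenization; the function names are made up for the example.

```python
import zlib

NUM_PARTITIONS = 10  # hypothetical cluster size

def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Stable hash, so the same key always maps to the same partition."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

def global_index_write_targets(doc_text: str) -> set[int]:
    """Global (term-partitioned) index: each distinct word is owned by one
    partition, so a write touches every partition that owns one of its words."""
    terms = set(doc_text.lower().split())
    return {partition_for(term) for term in terms}

def local_index_write_targets(doc_id: str) -> set[int]:
    """Local (document-partitioned) index: the whole document and all of its
    terms live on the single partition chosen by the document ID."""
    return {partition_for(doc_id)}
```

With a global index, a document with many distinct words tends to hit most partitions on every write, which is where a distributed transaction across all of them gets expensive.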
@msebrahim-007 3 months ago
I'm not really understanding the difference between using a local index and a global index. It sounds like the reason not to use a global index is that a document may be duplicated to multiple partitions, so instead a local index is used with a pointer to a document in memory. This is where my confusion lies. It doesn't sound like a local index addresses the issue of the document being duplicated onto multiple partitions; it just references the document locally (though it is potentially on multiple partitions) by using a pointer. If in both cases the document will be duplicated to multiple partitions, why not just use a pointer in the global index case? That way there is no scatter-gather required for a particular word.
@jordanhasnolife5163 3 months ago
To be clear, we're denormalizing the documents. It's not a pointer to the document in the local index case, you're actually storing the document itself there. Otherwise I'd agree with you.
@sahilguleria6976 3 months ago
@@jordanhasnolife5163 can you please explain what denormalizing the documents means here?
@jordanhasnolife5163 3 months ago
@@sahilguleria6976 I'm not just holding a document id in the search index, I'm holding a decent amount of document data in it
@sahilguleria6976 3 months ago
Elasticsearch partitioning section: In partition 1 we have cherry: 47, 39. So this partition has these documents in memory. Now do the two documents 47 and 39 stay only in partition 1? If yes, is this how we prevent duplication? Also, do all the other tokens in 47 and 39 reside in the same partition?
@jordanhasnolife5163 3 months ago
Yes, those documents just stay in that partition, as do the other tokens in 47, 39. Confused what you mean - the same document will always be hashed to the same partition, ideally.
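A minimal in-memory sketch of that layout, assuming hypothetical class names (`Partition`, `LocalIndexCluster`) and whitespace tokenization: each document is hashed to exactly one partition, that partition stores the document data alongside an index of only its own documents, and a search has to ask every partition.

```python
import zlib
from collections import defaultdict

class Partition:
    def __init__(self):
        self.docs = {}                  # doc_id -> document text (denormalized, stored here)
        self.index = defaultdict(set)   # term -> IDs of documents on THIS partition only

    def add(self, doc_id, text):
        self.docs[doc_id] = text
        for term in set(text.lower().split()):
            self.index[term].add(doc_id)

    def search(self, term):
        hits = self.index.get(term.lower(), ())
        return {doc_id: self.docs[doc_id] for doc_id in hits}

class LocalIndexCluster:
    def __init__(self, num_partitions=4):
        self.partitions = [Partition() for _ in range(num_partitions)]

    def _route(self, doc_id):
        # The same document ID always hashes to the same partition.
        return zlib.crc32(str(doc_id).encode("utf-8")) % len(self.partitions)

    def add(self, doc_id, text):
        self.partitions[self._route(doc_id)].add(doc_id, text)

    def search(self, term):
        results = {}
        for partition in self.partitions:           # scatter to every partition
            results.update(partition.search(term))  # gather the partial results
        return results
```

A write touches one partition no matter how many words the document has; the price is that a read for a single word fans out to all partitions.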
@theblobinc 1 month ago
I too have no life, that's probably why I find myself here learning about Elasticsearch....
@GANJIMAN123 4 months ago
not clear if elastic search uses local index or global index?
@jordanhasnolife5163 4 months ago
Local
@ryan-bo2xi 1 year ago
Great job sir !!
@Summer-qs7rq 10 months ago
Amazing video. Thanks for these informative videos. However, I have a question about Elasticsearch. Given that scatter-gather is difficult to avoid in Elasticsearch, how much data can it scale to? For example, if I want to build search on Twitter, the data is growing at a rapid pace. Will it be okay to store all the tweets in Elasticsearch? Or if we need retention, what happens to the tweets that are no longer found in Elasticsearch? Could you please help answer these questions?
@jordanhasnolife5163 10 months ago
I think that for Twitter, for example, what they would do is index data by timestamp. That way, when you search for something in Elasticsearch, it'll mainly hit the indexes for the last couple of days of data. That way there are fewer posts and fewer things to perform a "scatter/gather" for. You basically just have to be clever about how you want to shard your data.
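As a rough sketch of indexing by timestamp, assume one index per day with a hypothetical `tweets-YYYY-MM-DD` naming scheme. A search then only needs to name the indexes for its lookback window, newest first, rather than every index ever written.

```python
from datetime import date, timedelta

def index_name_for(day: date, prefix: str = "tweets") -> str:
    """Name of the per-day index a post written on `day` would go into."""
    return f"{prefix}-{day.isoformat()}"

def indexes_to_search(today: date, lookback_days: int, prefix: str = "tweets") -> list[str]:
    """Per-day indexes covering the last `lookback_days` days, newest first."""
    return [
        index_name_for(today - timedelta(days=offset), prefix)
        for offset in range(lookback_days)
    ]
```

The scatter/gather still happens, but only across the shards of a few recent indexes; older days can be kept on cheaper storage or dropped according to a retention policy.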
@Summer-qs7rq 10 months ago
@@jordanhasnolife5163 In the timestamp case, are you suggesting we search for the keyword in the latest time range and then, if it is not found, look in the indexes for other time ranges? Wouldn't that be more time-consuming?
@Spyrie 7 days ago
Didn't know Kylo Ren does tech stuff
@jordanhasnolife5163 5 days ago
Surprisingly this is not the first Adam Driver comment I've gotten