A&DS S04E14. Parallel Algorithms

  Рет қаралды 9,700

Pavel Mavrin

Pavel Mavrin

Күн бұрын

Пікірлер: 21
@MehbubulHasanAlQuvi
@MehbubulHasanAlQuvi 2 жыл бұрын
This was an emotional moment :) Finally the end of a long, fully FREE course on A&DS. This is the best course there is! Pavel, we want more from you. This time contents about CP will be appreciated. Also, AtCoder evening streams are great. Please continue those whenever you get time.
@madhukiranattivilli2321
@madhukiranattivilli2321 2 жыл бұрын
I really loved this video. I will try to implement all algos in this lecture, and see the performance improvement of the parallel algos. This is very important as almost every mainstream application is multi-threaded
@HasanAhmed-jn8zs
@HasanAhmed-jn8zs 2 жыл бұрын
Will u discuss combinatorics ?
@luckytima2315
@luckytima2315 Жыл бұрын
Видео новые будут ? :( Можете на русском снимать ? Привет из Астаны
@gattuchandrashekar5963
@gattuchandrashekar5963 2 жыл бұрын
It will be helpful if you could post all the videos in english
@madhukiranattivilli2321
@madhukiranattivilli2321 2 жыл бұрын
Parallel Merge Sort Hi Pavel, I've implemented 2 variations of this algo -- a fake implementation -- i.e. a single threaded program only : (1) merge sort using additional memory (2) inline merge sort, with no extra memory Both worked gr8 But I realized that lot of corner cases have to be carefully handled. In comparison, even though the concept of FFT algo is more complex, FFT algo implementation is "relatively" easier than "Parallel Merge Sort" :)
@madhukiranattivilli2321
@madhukiranattivilli2321 2 жыл бұрын
Filter parallel OP Hi Pavel, I understood ur style, where we first apply the filter to build the binary array (0 for odd item, 1 for even item), and then build the prefix-sum array, and using the 2 array combination add even items at appropriate index positions in result array. So, 1st OP of filtering : W, D = O(n), O(1) 2nd OP of building segment tree : W, D = O(n), O(logn) 3rd OP of building prefix-sum array : W, D = O(n), O(logn) How is below approach? : 1st OP of filtering : W, D = O(n), O(1) Instead of filling 2nd array with 0s and 1s, add empty list for odd item, and list with even item index if even item 2nd OP of building segment tree : W, D = O(n), O(logn) Instead of adding 2nd array items 2 at a time, concatenate the 2 lists. When this 2nd OP ends, at the final level of segment tree having single node, we will have the list of all index positions of even items from the original 1st array And finally build the result array with the even items at those index positions. Time complexity is same in both approaches, but we avoid building the prefix-sum array (3rd OP) in my style. Is there any defect? Please check and advise. Thanks :)
@madhukiranattivilli2321
@madhukiranattivilli2321 2 жыл бұрын
Parallel 2-3 BST build() algo Hi Pavel, About the "Pipelining" improvisation… When the node at level k among those logm levels of input recursively calls parallel_add() twice, for it's left and right segments, in 2 parallel threads... each of those 2 parallel threads at levels k+1... while trying to move their items from level j to level j-1 among those logn levels of 2-3 BST, should proceed only if their parent thread at level k (of input) had already moved it's item to level j-2 or lower (in 2-3 BST), else be in wait() state, and come out of wait() when their parent does notify() to it's 2 child threads This might not be too straight forward to implement! Am I correct? Please clarify :) Code : parallel_add(left_index, right index, parent_thr_id) : middle_index = (left_index + right_index) / 2 run parallel_add(left_index, middle_index + 1, current_thr_id) on thread #1 run parallel_add(left_index, middle_index + 1, current_thr_id) on thread #2 add_23BST(sorted_array[middle_index], parent_thr_id, current_thr_id) add_23BST(item, parent_thr_id, current_thr_id) : init next_level = logn while true loop if next_level == -1 -- i.e. current level of 0 i.e. root level has been handled already by current_thr_id break if parent_thr_id at level ≥ next_level wait() -- comes out of wait() when parent_thr_id runs notify() else if parent_thr_id at level < next_level if next_level == logn get greatest leaf item item0 ≤ item let parent0 be item0's parent add item as parent0's parent, and immediate right sibling to item0 if item's parent has > 3 kids split parent into 2 parents split kids into 2 groups of atmost 3 kids each assign left kids-group to left parent assign right kids-group to right parent item = item's parent --next_level notify() -- to bring any child_thr_ids out of wait() state endloop
@madhukiranattivilli2321
@madhukiranattivilli2321 2 жыл бұрын
Parallel Merge Sort Hi Pavel, In order to merge a pair of logn sized blocks, we've to first get the starting indices of all blocks for both arrays. We already know 1 + n/logn indices (1 is for index 0) for each array. But through binary search, we get n/logn more indices for each array, which we've to insert amongst the already known indices. So, 1 + 2 * (n / logn) indices max, associated with each array = O(n/logn) indices If we insert the new index normally into the array of indices (i.e. do linear search to get position to insert, and move few indices to the right, and finally insert index), linear search takes O(n/logn) time, and n/logn indices to insert, and logn levels -- making W = O(n²/logn) But, if we use SBBST to store the indices associated with each of the 2 arrays (ex: TreeSet in java) -- search runs in O(log(n/logn)) = O(logn) time (assuming log(logn) is close to 1; a constant value; O(1)), and insertion of each new index runs in O(1) time. So, W = “O(logn) to add each index to SBBST” * “n/logn indices” * “logn levels” = O(nlogn) Only with above improvements the overall W=nlogn and D=log²n is feasible. So, we should get T(W,D) = O(nlogn, log²n) in each aspects : (1) binary search on sorted array... to find appropriate index positions in one of the 2 arrays for few values in the other array (n/logn values) (2) insert into SBBST... to insert the new indices (found in (1) above) into an SBBST associated with each of the 2 arrays (3) merge logn-sized segment pairs Could you please review and lemme know if I've complicated things, if there is a more easier way to implement? Thanks :)
@jashdkj4902
@jashdkj4902 11 ай бұрын
Здравствуйте, Павел Юрьевич, хотел узнать можно ли у вас взять персональные занятия, если да, то подскажите куда лучше написать что бы обсудить. Я работаю уже разработчиком и хочу повысить уровень по АИСД. Чувствую что не хватает знаний )
@Agent-kt7sv
@Agent-kt7sv 2 жыл бұрын
Добрый день,Павел,скажите,пожалуйста,в каком порядке корректен просмотр серии DSA?Благодарю Вас.
@madhukiranattivilli2321
@madhukiranattivilli2321 2 жыл бұрын
Speed of light Hi Pavel, You say 3GHz is upperlimit, and I understood ur logic. But I googled and there is the latest Intel processor of 5.5GHz !! Nothing can be faster than speed of light is the rule. If 5.5GHz is feasible, shouldn't it mean the physical speed of anything within the processor to be able to do 5.5 billion operations per second is still below the speed of light? -- like (obviously) the speed of fan, speed of signal transfer within the chip, etc But, I'm not confident with my logic above! Cud u please clarify?
@pavelmavrin
@pavelmavrin 2 жыл бұрын
Yes, of course, 5-6 GHz is still possible. I didn’t say 3GHz is the limit, i said it’s very close to the limit, meaning that it’s very unlikely to have processors much faster.
@madhukiranattivilli2321
@madhukiranattivilli2321 2 жыл бұрын
@@pavelmavrin ok
@floppa-fy2qh
@floppa-fy2qh Жыл бұрын
Здравствуйте, Павел Юрьевич, будут ли у Вас лекции по пространственным структурам данным ? И в идеале ещё cache oblivious структуры данных )
@pavelmavrin
@pavelmavrin Жыл бұрын
А пространственные это какие? Cache oblivious знаю, но они не поместились в курс(
@floppa-fy2qh
@floppa-fy2qh Жыл бұрын
@@pavelmavrin Вообще, может быть, "пространственные" это жаргонизм, просто я где-то однажды увидел такое и сам стал так называть :) Это структуры данных где используются алгоритмы т.н. "spatial subdivision" для их построения, например: квадро/окто дерево, AABB дерево, k-d дерево, R-дерево и т.д.
@pavelmavrin
@pavelmavrin Жыл бұрын
@@floppa-fy2qh о, вот это я плохо знаю кстати, надо почитать
@floppa-fy2qh
@floppa-fy2qh Жыл бұрын
@@pavelmavrin на самом деле, в моём понимании, они являются обобщением обычного бинарного дерева поиска с операциями lower/upper bound, только в случае бинарного дерева поиска n (количество измерений) = 1, а в пространственных структурах n > 1, а смысл операций тот же :)
@madhukiranattivilli2321
@madhukiranattivilli2321 Жыл бұрын
This is a fantastic lesson Thanks Pavel :) I watched 2nd time now, after 9 months gap -- time just flies. I thought I watched just 3-4 months ago
@nikhilgupta8941
@nikhilgupta8941 2 жыл бұрын
Use English
Making an Algorithm Faster
30:08
NeetCodeIO
Рет қаралды 89 М.
Rich Sutton, Toward a better Deep Learning
31:36
Amii
Рет қаралды 2,9 М.
Spongebob ate Michael Jackson 😱 #meme #spongebob #gmod
00:14
Mr. LoLo
Рет қаралды 10 МЛН
Миллионер | 1 - серия
34:31
Million Show
Рет қаралды 2,1 МЛН
Nastya and balloon challenge
00:23
Nastya
Рет қаралды 69 МЛН
إخفاء الطعام سرًا تحت الطاولة للتناول لاحقًا 😏🍽️
00:28
حرف إبداعية للمنزل في 5 دقائق
Рет қаралды 28 МЛН
АиСД S04E12. Быстрое преобразование Фурье
1:06:14
A&DS S01E10. Dynamic programming
1:31:30
Pavel Mavrin
Рет қаралды 12 М.
Top 7 Algorithms for Coding Interviews Explained SIMPLY
21:22
Codebagel
Рет қаралды 396 М.
Is Computer Science still worth it?
20:08
NeetCodeIO
Рет қаралды 348 М.
Fast Inverse Square Root - A Quake III Algorithm
20:08
Nemean
Рет қаралды 5 МЛН
How Binary Search Makes Computers Much, Much Faster
6:51
Tom Scott
Рет қаралды 1,4 МЛН
A Graphene Transistor Breakthrough?
15:23
Asianometry
Рет қаралды 132 М.
A&DS S01E13. DP on subsets, DP on profiles
1:31:43
Pavel Mavrin
Рет қаралды 7 М.
Spongebob ate Michael Jackson 😱 #meme #spongebob #gmod
00:14
Mr. LoLo
Рет қаралды 10 МЛН