Learn MapReduce with Playing Cards

343,472 views

Jesse Anderson

11 years ago

A special extended preview of my new MapReduce screencast, available for purchase at pragprog.com/screencasts/v-jam....
To get access to my updated and in-depth course, go to my site at www.jesse-anderson.com/big-dat... and sign up. You'll get a free mini-course and then have the option to purchase the full 8-week course.

Comments: 135
@kart00nher0 8 years ago
This is by far the best explanation of the MapReduce technique that I have come across. I especially like how the technique was explained with the least amount of technical jargon. This is truly an ELI5 definition for MapReduce. Good work!
@jessetanderson 8 years ago
+Subramanian Iyer Thanks!
@smushti 5 years ago
An innovative idea to use a pack of cards to explain the concept. Getting fundamentals right with an example is great ! Thank you
@djyotta 8 years ago
Very well done - not too slow, yet very clear and well structured.
@Useruytrw 10 years ago
Jesse may you get all SUCCESS and BLESSINGS
@doud12011990 8 years ago
really cool one. It is always nice to come back to the basics. Thanks for that one
@sukanyaswamy 9 years ago
Great presentation. The visualization makes it so much easier to understand.
@vivek3350 7 years ago
Really liked your way of presentation....."Simple" and "Informative". Thanks for sharing!!
@ekdumdesi 8 years ago
Great explanation !! You Mapped the Complexity and Reduced it to Simplicity = MapReduce :)
@vamsikrishnachiguluri8510 2 years ago
What a great effort, I am astonished by your teaching skills. We need teachers like you. Thanks for your excellent explanation.
@thezimfonia 6 years ago
That was very helpful Jesse. Thank you for sharing this!!
@aravindsingirikonda1569 5 years ago
Wonderful explanation ! Made it very simple to understand! Thanks a ton!
@prasann26 9 years ago
Wow.. You have made this look so simple and easy... Thanks a ton !!!
@menderbrains 4 years ago
Great explanation! This is how a tutor should simplify the understanding! Thanks
@ahmedatallahatallahabobakr8712 9 years ago
Your explanation is magic! Well done
@vscid 8 years ago
and that's how you explain any technical concept. simple is beautiful!
@mgpradeepa554 9 years ago
The explanation is wonderful.. You made me understand things easily.
@rodrigofuentealbafuentes695 3 years ago
Really good illustration.... really easy to understand for people like me who are not computer experts.. thanks
@abhishekgowlikar 10 years ago
Nice video explaining the Map Reduce Practically.
@tkousek1 7 years ago
Great explanation!! worth a bookmark. Thank you sir!
@rohitgupta025 9 years ago
Just wow...very nicely explained
@amitprakashpandeysonu 2 years ago
loved the idea. Now I understood how map reduce works. Thank you.
@rahulx411 9 years ago
an ounce of example is better than a ton of precept! --Thanks, this was great!
@davidy2535 3 years ago
amazing explanation! I love it. Huge Thanks!
@user-or7ji5hv8y 5 years ago
best explanation of mapReduce. Thanks!
@gboyex 6 years ago
Great video with good explanation technique.
@nkoradia 7 years ago
Brilliant approach to teach the concept
@arindamdalal3988 8 years ago
Really nice video; it explains the terms in a simple way...
@krupakapadia2498 7 years ago
Great Explanation! Thanks!
@rogerzhao1158 8 years ago
Nice tutorial! Easy to understand
@mahari999 8 years ago
Superb. Thank you Jesse Anderson
@AlexChetcuti 7 years ago
Thanks this really helped me for my exam !!
@anandsib 9 years ago
Good Explanation with simple example
@abdulrahmankerim2377 7 years ago
Very useful explanation.
@amandeepak8640 8 years ago
Thank You sir for such a wonderful explanation. :-)
@hazelmiranda8587 8 years ago
Easy to understand for a layman! So it's quite crucial to identify the basis of the grouping, i.e. the parameters that determine which data is stored on each node. Is it possible to revisit that at a later stage?
@scottzeta3067 2 years ago
The only video I've watched that clearly introduces MapReduce to a newbie
@go_better 4 months ago
Thanks! Great explanation
@patrickamato8839 9 years ago
Great summary - thanks!
@tichaonamiti4616 9 years ago
That's wonderful ..... you are a great teacher
@TheDeals2buy 10 years ago
Good illustration using a practical example...
@anmjubaer 4 years ago
Great explanation. Thanks.
@trancenut81 9 years ago
Excellent explanation!
@vincentvimard9019 8 years ago
just great explanation !
@Dave-lc3cd 4 years ago
Thanks for the great video!
@rodrigoborjas7727 3 years ago
Thank u very much for the explanation.
@victorburnett6329 2 years ago
If I understand correctly, the mapper divvies up the data among nodes of the cluster and subsequently organizes the data on each node into key-value pairs, and the reducer collates the key-value pairs and distributes the pairs among the nodes.
@jessetanderson 2 years ago
Almost. Hadoop divvies up the data, the mapper creates key value pairs, and the reducer processes the collated pairs.
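The three roles in that reply can be sketched in a few lines of Python. This is only a toy model, not Hadoop code: `split_input`, `mapper`, `shuffle`, `reducer` and `run_job` are hypothetical names, the deck is a made-up sample, and summing card values per suit is assumed as the job because that is how other comments describe the video.

```python
from collections import defaultdict

# A tiny made-up "deck" of (rank, suit) records; number cards only.
deck = [(2, "hearts"), (5, "spades"), (9, "hearts"), (3, "clubs"),
        (7, "diamonds"), (4, "spades"), (10, "clubs"), (6, "diamonds")]

def split_input(records, num_nodes):
    """Stand-in for the framework dividing the input among nodes."""
    return [records[i::num_nodes] for i in range(num_nodes)]

def mapper(record):
    """Turn one card into a (key, value) pair: suit is the key, rank the value."""
    rank, suit = record
    return (suit, rank)

def shuffle(mapped_pairs):
    """Collate values by key, which happens between the map and reduce phases."""
    grouped = defaultdict(list)
    for key, value in mapped_pairs:
        grouped[key].append(value)
    return dict(grouped)

def reducer(key, values):
    """Process all collated values for one key; here, sum them."""
    return (key, sum(values))

def run_job(records, num_nodes=2):
    splits = split_input(records, num_nodes)
    mapped = [mapper(r) for split in splits for r in split]
    grouped = shuffle(mapped)
    return dict(reducer(k, v) for k, v in grouped.items())

totals = run_job(deck)
print(totals)
```

In this sketch everything runs in one process; on a real cluster each split's mapper and each key group's reducer could run on a different machine.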
@sebon11 3 years ago
Great explanation!
@gypsyry 5 years ago
Best explanation. Thanks a lot
@grahul007 8 years ago
Excellent video explanation
@arnavanuj 2 years ago
Good illustration. 😃
@mmuuuuhh 8 years ago
To wrap this up: Map = split data; Reduce = perform calculations on small chunks of data in parallel; then combine the subresults from each reduced chunk. Is that correct?
@jessetanderson 8 years ago
+mmuuuuhh Somewhat correct. I'd suggest buying the screencast to learn more about the code and how it works.
@alphacat03 8 years ago
+mmuuuuhh merge-sort maybe?
@kemchobhenchod 7 years ago
divide and conquer
@BULLSHXTYT 5 years ago
Map transforms data too
@dennycrane2938 5 years ago
No no... Map = Reduce the Data, Reduce = Map the Data . .... ....
@vigneshrachha8362 7 years ago
Superb video....thanks a lot sir
@irishakazakyavichyus 6 years ago
thanks! that is an easy explanation!
@rrckguy 9 years ago
Great lesson. Thanks..
@piyushmajgawali1611 3 years ago
I actually did this with cards. Thanks
@Luismunoz-jf2zv 9 years ago
Now I get it, thanks!
@sarthakmane2977 4 years ago
great video by the way!!
@alextz4307 5 years ago
Very nice, thanks a lot.
@LetsBeHuman 5 years ago
4:51 - I'm kind of lost. You presented the two papers as two nodes: left is node 1 and right is node 2. Then you said, "I have two nodes, where each node has 4 stacks of cards". I also understood that you are merging two varieties of cards on node 1 and another two varieties of cards on node 2. "A cluster is made of tens, hundreds or even thousands of nodes all connected by a network". So in this example, let's say the two papers (nodes) are one cluster. The part where I get confused is when you say "the mapper on a node operates on that smaller part. the magic takes the mapper data from every node and brings it together on nodes all around the cluster. the reducer runs a node and knows it has access to everything with same key". So if there are two nodes A and B that have mapper data, then the reducer part will happen on two other nodes C and D. I'm confused when you say "on nodes all around the cluster".
@LetsBeHuman 5 years ago
When you say nodes and clusters, does an input file of 1 TB have to be processed on more than one computer, or can we install Hadoop on a single laptop and create nodes and clusters virtually?
@logiprabakar 9 years ago
Wonderful, you have used the right tool (cards) and made it simpler. Thank you. Am I correct in saying that, in this manual shuffle and sort, the block size is 52 cards, whereas on a node it would be 128?
@SamHopperton 7 years ago
Brilliant - thanks!
@devalpatel7243 5 years ago
Hats off, man. Very well understood
@AnirudhJas 5 years ago
Thanks Jesse! This is a wonderful video! I have 2 doubts. 1. Instead of sum, if it is a sort function, how will splitting it into nodes work? Because then every data point should be treated in one go. 2. The last part on scaling, how will different nodes working on a file and then combining based on key, be more efficient than one node working on one file? I am new to this and would appreciate some guidance and help on the same.
@jessetanderson 5 years ago
1. This example goes more into sorting: github.com/eljefe6a/CardSecondarySort
2. It isn't more efficient, but more scalable.
@AnirudhJas 5 years ago
@@jessetanderson Thank you!
@guessmedude9636 6 years ago
i like this technique nice keep it up
@ZethWeissman 8 years ago
The advantage might be a bit clearer if, instead of having the same person run the cards on each node sequentially, you had two people do it at the same time. Or go further and have four people show it. Then each person can grab all the cards of one suit from each node and sum their values up, again at the same time. Show a timer comparing how long it took one person to do everything on one node with the time for all four working at the same time.
@kabirkanha 3 years ago
Never trust a man whose deck of playing cards has two 7s of Diamonds.
@furkanyigitozgoren3847 2 years ago
It was very nice. But I could not find the video in which you showed the shuffling "magic part"
@bijunair3807 9 years ago
Good explanation
@bit.blogger 10 years ago
6:16 Got a question! Would you please elaborate more on how the data is moved? Since there are two separate reduce tasks on those two nodes, how do the two different reduce tasks combine? How do we choose which cards move to which node?
@jessetanderson 10 years ago
That is called the shuffle sort. See more about that here www.inkling.com/read/hadoop-definitive-guide-tom-white-3rd/chapter-6/shuffle-and-sort.
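On the question of which cards move to which node: a common scheme, and the idea behind Hadoop's default hash partitioner, is to hash the key and take the result modulo the number of reduce tasks, so every pair with the same key lands on the same reducer. A minimal Python sketch of that idea follows; the `partition` helper and the use of `zlib.crc32` are assumptions made for illustration, not what Hadoop itself calls.

```python
import zlib

def partition(key, num_reducers):
    """Pick a reducer index for a key; the same key always maps to the same index."""
    # crc32 is used because Python's built-in hash() for strings changes between runs.
    return zlib.crc32(key.encode("utf-8")) % num_reducers

# Made-up mapper output: (suit, rank) pairs from several nodes.
mapped = [("hearts", 2), ("spades", 5), ("hearts", 9),
          ("clubs", 3), ("diamonds", 7), ("spades", 4)]

num_reducers = 2
buckets = {i: [] for i in range(num_reducers)}
for key, value in mapped:
    buckets[partition(key, num_reducers)].append((key, value))

for index, pairs in buckets.items():
    print(index, pairs)
```

Because the choice depends only on the key, no reducer ever has to ask another one for the rest of "its" suit.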
@chandrakanthpadi 3 years ago
Does the actual data on the node move, or are copies of the data moved?
@IvanRubinson 6 years ago
Well, that explains the interview question: How would you sort a ridiculously large amount of data?
@user-ho2kf2xr7v 8 years ago
Great video
@hemanthpeddi4129 4 years ago
awesome explanation super
@hexenkingTV 6 years ago
So it follows mainly the principle of divide and conquer?
@jessetanderson 6 years ago
Following that analogy, it would be divide, reassemble, and conquer.
@MincongHuang 9 years ago
Great video, thanks for sharing !
@ajuhaseeb 9 years ago
Aiwa. Simply explained.
@__-to3hq 5 years ago
wow this was great
@paulfsch3108 6 years ago
Hi Jesse, can I use map reduce only on document-oriented DBs, or also e.g. on Graph databases?
@jessetanderson 6 years ago
Hessebub you can use it for both, but the processing algorithms are very different between them.
@paulfsch3108 6 years ago
Alright, thanks very much for answering & doing the video in the first place!
@iperezgenius 7 years ago
Brilliant!
@yash6680 6 years ago
awesome
@amirkazemi2517 9 years ago
Great video. Why are there performance issues with Hadoop, however?
@jessetanderson 9 years ago
I'm not sure what you mean by performance issues.
@ShoaibKhan-hy5nf 6 years ago
Does the magic part you mentioned in the video reside in the reducer or the mapper?
@jessetanderson 6 years ago
Shoaib Khan mostly in between those two phases
@urvisharma7243 1 year ago
What if the node with clubs and hearts breaks down during the reduce operation? Will data be lost? Or will the complete Map Reduce job be repeated using the replicated data?
@jessetanderson 1 year ago
The data is replicated and the reduce would be re-run on a different node.
@pamgg1663 9 years ago
excellent!!!
@RawwestHide 6 years ago
thanks
@niamatullahbakhshi9371 8 years ago
so nice
@moofymoo 9 years ago
huge 1Tb file.. anyone watching this in 2065?
@NuEnque 5 years ago
February 2019 (Go RAMS)
@anmjubaer 4 years ago
@@NuEnque July 21 2019
@devanshsrivastava4265 4 years ago
feb 2020
@jonathannimmo9293 4 years ago
more like 2025
@omrajpurkar 3 years ago
August 11, 2020!!
@mudassarm30 8 years ago
spade clubs ... I think you used the wrong suit names for them :)
@covelus 6 years ago
awesome
@abdellahi.heiballa 4 years ago
my friend: i wish i had ur calm we having an exam tomorrow you watching how playing cards....
@sarthakmane2977 4 years ago
dude, whats the name of that magic??
@lerneninverschiedenenforme7513 9 years ago
little bit long explanation. could be done faster (e.g. card-sorting). But after watching, you know what's happening. So all thumbs up!
@ZFlyingVLover 5 years ago
The 'scalability' of Hadoop has to do with the fact that the data being processed CAN be broken up and processed in parallel in chunks and then the results can be tallied by key. It's not an inherent ability of the tech other than HDFS itself. Like most technology, or jobs for that matter, the actual 'process' is simple; it's wading through the industry-specific terminology that makes it unnecessarily complicated. Hell, you can make boiling an egg or making toast complicated too if that's your intent.
@jessetanderson 5 years ago
Sorry, you misunderstood.
@ZFlyingVLover 5 years ago
@@jessetanderson I didn't misunderstand you. Your explanation was great.
@thiery572 6 years ago
Interesting. Now I want to request a bunny comes out from a hat.
@haroonrasheed9739 9 years ago
Great
@MuhammadFarhan-ny7tj 3 years ago
What music is this at the start of this video?
@jessetanderson 3 years ago
I'm not sure where they got it from.
@sumantabanerjee9728 6 years ago
Easiest explanation.
@varshamehra8164 4 years ago
Cool
@kart00nher0 8 years ago
IMO the key takeaway from the video is that MR only works when:
a. There is one really large data set (e.g. a giant stack of playing cards).
b. Each row in the data set can be processed independently (e.g. sorting or counting playing cards does not require knowing the sequence of cards in the deck; each card is processed based on the information on its face).
To process real-world problems using MR, the data sets will need to be massaged and joined to satisfy the criteria listed above. This is where all the challenges lie. MR itself is the easy part.
@jessetanderson 8 years ago
+Subramanian Iyer agreed MR is difficult, but the understanding of how to use and manipulate the data is far more complex. This is why I think data engineering should be a specific discipline and job title. www.jesse-anderson.com/big-data-engineering/
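The "each row can be processed independently" condition from this thread can be checked with a small Python sketch. The deck, the chunk boundaries and the `sum_by_suit` helper are all made up for illustration; the point is only that per-chunk results merge into the same answer as a single pass, which is what lets the chunks live on different nodes.

```python
from collections import Counter

# Made-up sample of (rank, suit) cards.
deck = [(2, "hearts"), (5, "spades"), (9, "hearts"), (3, "clubs"),
        (7, "diamonds"), (4, "spades"), (10, "clubs"), (6, "diamonds")]

def sum_by_suit(cards):
    """Sum card values per suit; each card is handled without looking at any other."""
    totals = Counter()
    for rank, suit in cards:
        totals[suit] += rank
    return totals

# One pass over the whole deck on a single "node".
whole = sum_by_suit(deck)

# The same deck cut into arbitrary chunks, each processed on its own.
chunks = [deck[:3], deck[3:6], deck[6:]]
partials = [sum_by_suit(chunk) for chunk in chunks]

# Merging the partial results by key gives the same totals.
merged = sum(partials, Counter())
print(merged == whole)
```

A job whose result depended on the order of cards in the deck would not split this cleanly, which is the kind of data massaging the thread refers to.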
@glennt1962 5 years ago
This is a great example video without the accent to deal with.