Apache Hadoop YARN: How YARN changed Hadoop from v1 to v2

Views: 34,555

Hortonworks

Days ago

Learn about the impact of Apache Hadoop YARN on Hadoop, and how it transforms Hadoop 2 into a Data Operating System.

Comments: 9

@roguelitedev 9 years ago

I literally have goosebumps I'm so excited!! :D

@jameskpl 11 years ago

Hortonworks, thank you so much for the video. A quick question: is there a way to manage the data that's going into HDFS, for example to check for duplicates? Say we upload data (several GB, all structured) for the day, and we are asked to upload data again after a couple of weeks. Is there a way to compare the data being uploaded now against the data that was uploaded before, so we don't end up with 6 copies of the same data (the limit being 3 with replication)? Would really appreciate any feedback. Thank you, James.

@dukegaming2231 7 years ago

jameskpl, in Hadoop 2, if over-replication happens among DataNodes, it will throw an overReplicatedBlock exception; therefore replication balancers should be run, i.e. define a threshold or specify the DataNodes.

@charleygrossman8368 9 years ago

One cluster to store them all.

@homoudalshammari9139 11 years ago

Hi, I like the question that jameskpl posted. I would add a simple point: if the source data file is the same and needs to be uploaded to the same NameNode, is that considered a duplicate, or is it overwritten? Thank you... Hamoud

@MAZEN_TAEMIN 8 months ago

here cuz i'm studying hadoop and its versions at the moment in 2024

@vivek2319 7 years ago

Arun looks pissed :D What's the matter Arun? Somebody give him Hadoop to play with ;) #IYKWIM :D

@sn20 11 years ago

For the life of me... I still cannot understand why in the hell they call YARN MR2. To me it sounds like a layer of abstraction for resource management. And now you have to go through YARN if you need something done on HDFS. (Maybe another secondary NameNode in the making...) In other words: dismantle the existing MR and reorg it. More importantly, open up the processing unit underlying HDFS to other applications. Let them all fight for CPU time via YARN.