One-hot Encoding explained

113,985 views

deeplizard

1 day ago

In this video, we discuss what one-hot encoding is, how this encoding is used in machine learning and artificial neural networks, and what is meant by having one-hot encoded vectors as labels for our input data.
🕒🦎 VIDEO SECTIONS 🦎🕒
00:00 Welcome to DEEPLIZARD - Go to deeplizard.com for learning resources
00:30 Help deeplizard add video timestamps - See example in the description
05:30 Collective Intelligence and the DEEPLIZARD HIVEMIND
💥🦎 DEEPLIZARD COMMUNITY RESOURCES 🦎💥
👋 Hey, we're Chris and Mandy, the creators of deeplizard!
👉 Check out the website for more learning material:
🔗 deeplizard.com
💻 ENROLL TO GET DOWNLOAD ACCESS TO CODE FILES
🔗 deeplizard.com/resources
🧠 Support collective intelligence, join the deeplizard hivemind:
🔗 deeplizard.com/hivemind
🧠 Use code DEEPLIZARD at checkout to receive 15% off your first Neurohacker order
👉 Use your receipt from Neurohacker to get a discount on deeplizard courses
🔗 neurohacker.com/shop?rfsn=648...
👀 CHECK OUT OUR VLOG:
🔗 / deeplizardvlog
❤️🦎 Special thanks to the following polymaths of the deeplizard hivemind:
Tammy
Mano Prime
Ling Li
🚀 Boost collective intelligence by sharing this video on social media!
👀 Follow deeplizard:
Our vlog: / deeplizardvlog
Facebook: / deeplizard
Instagram: / deeplizard
Twitter: / deeplizard
Patreon: / deeplizard
KZbin: / deeplizard
🎓 Deep Learning with deeplizard:
Deep Learning Dictionary - deeplizard.com/course/ddcpailzrd
Deep Learning Fundamentals - deeplizard.com/course/dlcpailzrd
Learn TensorFlow - deeplizard.com/course/tfcpailzrd
Learn PyTorch - deeplizard.com/course/ptcpailzrd
Natural Language Processing - deeplizard.com/course/txtcpai...
Reinforcement Learning - deeplizard.com/course/rlcpailzrd
Generative Adversarial Networks - deeplizard.com/course/gacpailzrd
🎓 Other Courses:
DL Fundamentals Classic - deeplizard.com/learn/video/gZ...
Deep Learning Deployment - deeplizard.com/learn/video/SI...
Data Science - deeplizard.com/learn/video/d1...
Trading - deeplizard.com/learn/video/Zp...
🛒 Check out products deeplizard recommends on Amazon:
🔗 amazon.com/shop/deeplizard
🎵 deeplizard uses music by Kevin MacLeod
🔗 / @incompetech_kmac
❤️ Please use the knowledge gained from deeplizard content for good, not evil.

Comments: 119
@deeplizard 6 years ago
Machine Learning / Deep Learning Tutorials for Programmers playlist: kzbin.info/aero/PLZbbT5o_s2xq7LwI2y8_QtvuXZedL6tQU
Keras Machine Learning / Deep Learning Tutorial playlist: kzbin.info/aero/PLZbbT5o_s2xrwRnXk_yCPtnqqo4_u2YGL
@harshtiwari9588 5 years ago
The KZbin recommendation system must be really broken, since it is not recommending this magic!
@deeplizard 5 years ago
Thanks Harsh! Hopefully soon 🦎🚀
@audioplatform6299 3 months ago
They are hypocrites... money, money, money... they have turned KZbin into a third-class blog site flooded with ads. This video is phenomenal, thank you
@bhang96 3 years ago
- One-hot encoding is one way of converting categorical data into a (sparse) vector.
- Each one-hot code represents the category of the data, so that the machine can interpret each word.
- As you add more categories, the length of the vector increases.
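The summary above can be sketched in a few lines of plain Python (a toy illustration; the category names are just the examples from the video):

```python
# Build a one-hot vector for a label given a fixed list of categories.
categories = ["cat", "dog", "lizard", "llama"]

def one_hot(label, categories):
    """Return a vector with a 1 at the label's index and 0s elsewhere."""
    vec = [0] * len(categories)       # vector length == number of categories
    vec[categories.index(label)] = 1  # exactly one position is "hot"
    return vec

print(one_hot("lizard", categories))  # [0, 0, 1, 0]
```

Note that adding a fifth category grows every vector to length 5, which is the point of the last bullet.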
@John-vm2zi 3 years ago
It reminds me of dummy variables.
@s3arajgupta 3 years ago
OMG. Such a brilliant explanation. I have been watching your videos for almost 3 hours and couldn't stop myself from watching the next one each time. Subscribed, liked, and commented. Thank you so much.
@fazankabir2029 5 years ago
OMG. Such a brilliant explanation. I have been watching your videos for almost an hour and couldn't stop myself from watching the next one each time. Subscribed, liked, and commented. Thank you so much. You are a lifesaver!
@NotBob223 6 years ago
Just went through the entire series to date and found it very helpful and easy to follow. I'm sure you have a plan all mapped out, but further topics might be why you would want/need to add layers, how many nodes/neurons you'd want in each layer and how to optimize those (i.e., find out you don't have enough or too many), finding out when you have hit local minima, and choosing different functions such as relu, tanh, etc. You're really on a good roll here and I look forward to more!
@deeplizard 6 years ago
Thanks for the suggestions, Brent! I'll definitely keep these topics in mind for future videos. I'm glad you're finding the videos helpful so far!
@odds8564 1 year ago
Within 3 minutes my mind's been completely opened
@Wopara 9 days ago
Thank God I came across your channel; scientists these days just like to sound smart
@Shkencetari 5 years ago
This was really well explained. Thank you very much.
@cryptorick8109 6 years ago
Love the channel!!! Killin it! Subscribed!
@deepcodes 4 years ago
Good quality in both the videos and the blogs. Really amazed by your work!!
@SaidElnaffar 4 years ago
A very high-quality video -- I got the idea in no time. Thanks.
@sterlinglozalee9926 5 years ago
While the concept itself is incredibly simple, I keep hearing so many people over-complicate it or not explain why it's specifically called "one-hot" encoding. Great video! Exceptionally well explained!
@deeplizard 5 years ago
Thank you, Sterling!
@ashita1130 4 years ago
Thank you so much :) Clear and to the point explanation!!!!!
@synthstatic9889 2 years ago
I'm in a coding bootcamp and this explained the concept far better than the lesson.
@tymothylim6550 3 years ago
Thank you so much for this video! Now I finally understand what this term means!
@muhammadtalhabaig4908 4 years ago
Thank you! You explain things really well! 😃
@cloudkungfu 4 years ago
This explanation was super clear! Awesomeee
@milenaramirez1341 6 years ago
Easy to understand, even for foreigners. Thanks a lot! :)
@princendhlovu2502 3 years ago
You did a great job explaining!
@shilpashreeshivappa8942 3 years ago
Very helpful... Thank you.
@yousraizountar5768 4 years ago
This video is just amazing, thank you very much
@John-wj4yf 1 year ago
very helpful- thank you!!
@1travel29 3 years ago
That was so helpful! Thanks so much for the animations and detail in the explanation
@qusayhamad7243 3 years ago
Thank you very much for this clear and helpful explanation.
@firdaousslotfi9616 4 years ago
Thank you, this was really helpful!!
@MohammadMahadiHassain 5 years ago
Best and clearest explanation
@carlasanchez4349 1 year ago
This was great, thank you for sharing
@galfrasian 3 years ago
Thank you, that really cleared up the concept for me. Big fan
@vatsal_gamit 3 years ago
Just Woow!! Thank You for this video!! Keep Doing the Good work :)
@billywhite1403 3 years ago
Great video! Really good teaching, simple and engaging. Bless you for making this
@mastermax7777 11 months ago
2:40 Watching the 10 seconds starting here is what made it click for me. Thanks
@broncioaguilar1757 1 year ago
😊perfect explanation! Loved the animals too 🐈‍⬛🐕🦎🦙 thanks!!
@sergijlebedyn6011 5 years ago
Brilliant explanation
@ujjwalkumar8173 3 years ago
To-the-point explanation. Thank you for this quality content :)
@agrisuccess6698 3 years ago
It's really very well explained
@ahmeddiderrahat3415 2 years ago
Thanks a lot. Saved my 3 hours.
@ylazerson 5 years ago
Great video - thanks!
@TD-ph3wb 3 years ago
Great explanation, thanks
@sushilarya6994 4 years ago
Thanks for explaining in layman's terms. Liked and subscribed!
@aydinahmadli7005 5 years ago
Great explanation!
@Waleed-qv8eg 6 years ago
Thumbs up! Thanks
@souravsarkar488 3 years ago
Thanks a lot!
@robind999 5 years ago
It answered my question, thanks.
@1matzeplayer1 2 years ago
Great content!!!
@xavierloux 6 years ago
Thanks!
@michelleli8751 5 years ago
Thank you :D
@nuwanchathuranga7226 5 years ago
Good explanation, keep making more videos :)
@cw9249 5 years ago
Great explanation
@adilm9857 4 years ago
Well explained!!
@narendraparmar1631 3 years ago
Thanks a lot😁
@noorhassanwazir8133 1 year ago
Nice explanation
@leesuiping9725 4 years ago
Well explained
@horace7763 5 years ago
No way, what a lovely video!!
@shiva_kondapalli 5 years ago
This is an awesome series. Clear, concise, and crisp. A question though: I trained a VGG19 network using PyTorch to classify species of flowers. There were 102 categories: 0 through 101. I didn't have to one-hot encode them, so when do you have to do this? I get that you may want to do it when you have multiple classes for the same input; you could then read off a vector like [1 0 0 1] as the 1st and 4th classes. The other reason given is that the model may give greater weight to a higher number, so does that mean it will predict the higher integer more often? Can anyone shed some light on this? I got decent results (~95% accuracy on validation) without one-hot encoding. Do you always have to one-hot encode? Thanks!
@bisnar1307 4 years ago
Thank youuu :D
@pmtycoon 5 years ago
subscribed. no questions asked ;)
@gaykarsamir 5 years ago
Nice video :)
@chann2467 3 years ago
4:26 I thought [0,0,0,1] was going to be a turtle 😂. Thanks again for the videos
@chawza8402 4 years ago
Is there an encoding type that uses all possible variations of binary numbers? For example, we can use 2 bits to represent 4 classes (00, 01, 10, 11). I just thought that would be more space-efficient.
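A scheme like the one asked about here does exist and is sometimes called binary encoding. A minimal sketch of the idea (not from the video):

```python
# Encode a class index as a fixed-width binary code:
# 4 classes fit in 2 bits (00, 01, 10, 11), vs. 4 slots for one-hot.
def binary_encode(index, n_bits):
    """Return the index as a list of bits, most significant first."""
    return [(index >> b) & 1 for b in reversed(range(n_bits))]

print([binary_encode(i, 2) for i in range(4)])
# [[0, 0], [0, 1], [1, 0], [1, 1]]
```

The trade-off is that classes now share bits, so a model can pick up spurious similarities between unrelated classes; that is one reason one-hot remains the default for output labels.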
@mattbutawan8764 3 years ago
What if you want to have both categorical and continuous inputs to the same NN?
@fernandojackson7207 2 years ago
Another great video. Just curious, though: I keep hearing that at least 80% of machine learning involves data cleaning. Is that the case? If so, do you have any video training on that?
@whalingwithishmael7751 4 years ago
Do gradient-boosted trees also use one-hot encoding?
@arshakzakaryan2773 5 years ago
Great explanation. You mentioned a previous video to check out. Can you post the link?
@deeplizard 5 years ago
Yes, here you go! kzbin.info/www/bejne/povSqpKgaL6mh8k
@silenta237 6 years ago
You rock!
@deeplizard 6 years ago
Appreciate that, silenta :)
@tribo327 3 years ago
I'm still not clear on why the image tags couldn't be encoded as numerical values just going up from 0. What is the benefit of this?
@arminharper510 4 years ago
Much easier than I expected. But why can't the NN take these inputs as decimal numbers, like 1, 2, 3, or as the sum of the ASCII equivalents of each character in their name?
@lifemystery7992 2 years ago
I have a question: if I have 4 columns in my dataset, the input layer of my NN will be [x1, x2, x3, x4], right? Now suppose one of the columns is categorical with 4 categories. My understanding is that when we apply one-hot encoding to the categorical column, we get 4 more columns [cate1, cate2, cate3, cate4], right? The question is: am I right, or will it be compressed into one label? If I'm right, that means our input layer in the NN will be [numerical columns + all categories].
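That reading is correct: the categorical column expands into one input per category. A toy sketch (hypothetical feature values; the `cate1`..`cate4` names come from the question above):

```python
# 3 numeric features + 1 categorical feature with 4 categories
# -> 3 + 4 = 7 inputs after one-hot encoding.
categories = ["cate1", "cate2", "cate3", "cate4"]

def expand_row(numeric, category):
    """Concatenate numeric features with the one-hot encoded category."""
    one_hot = [1 if c == category else 0 for c in categories]
    return numeric + one_hot

row = expand_row([0.5, 1.2, 3.0], "cate3")
print(row)  # [0.5, 1.2, 3.0, 0, 0, 1, 0]
```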
@isaquemelo8134 6 years ago
Thank you again for these videos! I would like to know if this is the best approach for large data. For instance, if I have 10,000 categories, which approach would be adequate?
@deeplizard 6 years ago
Hey Isaque - Yes, one-hot encoding is still feasible for a larger category space, like the one you mentioned.
@MrAwesomenes5 5 years ago
Cool, but what happens if we add a fifth element? What does the vector look like then? /s
@sushilarya6994 4 years ago
One more element of 0 or 1 is added, making the vector length 5. It could be [0,0,0,0,1]
@leonardodavinci4259 4 years ago
@@sushilarya6994 woooosh
@LightAura 4 years ago
Shouldn't we always leave one category out when one-hot encoding, since it impacts our models by introducing correlation? Is that not always the case, or was this left out so as not to over-complicate the concept? Thank you for the explanation and the whole series.
@theone3746 4 years ago
But how do you code this?
@anto1756 3 years ago
What about integer encoding?
@NisseOhlsen 5 years ago
Great vid, thanks, subscribed. Now, is that lizard ([0, 0, 1]) called Elizabeth?
@deeplizard 5 years ago
Haha no, it's not 🦎
@NisseOhlsen 5 years ago
@@deeplizard Hadda ask....
@danwe3237 4 years ago
I do understand one-hot encoding, but why is it necessary to use vectors for classification instead of a single variable? Why is it not sufficient to use a mapping of 1 for cat, 2 for dog, 3 for lizard, and so on?
@tanyamalhotra9505 4 years ago
Hey, thanks for the amazing explanation!! But could I know why we can't simply assign them integer values? For instance, map dog to 0, lizard to 1, cat to 2, and so on. Why is one-hot encoding needed?
@shahhuzaifa2746 2 years ago
Because these names of animals are categorical variables, i.e., there is no relationship between cat and dog; they are discrete names. The same goes for city names, e.g., Boston, New York, etc. So when you encode them as integers, it is a misrepresentation of the data, since the algorithm would interpret them as interval variables, i.e., you could say 3 is bigger than 2, and 2 is bigger than 1. So the algorithm will try to harness this relationship, which is not correct. Therefore we encode categorical variables with one-hot encoding.
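The misrepresentation described in this answer can be made concrete with a toy distance check (an illustration, not code from the video):

```python
# Integer codes impose an order and unequal distances between classes:
int_code = {"cat": 0, "dog": 1, "lizard": 2}
print(abs(int_code["cat"] - int_code["lizard"]))  # 2: cat looks "farther" from lizard than from dog

# One-hot vectors make every pair of distinct classes equidistant:
oh = {"cat": [1, 0, 0], "dog": [0, 1, 0], "lizard": [0, 0, 1]}

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

print(sq_dist(oh["cat"], oh["dog"]), sq_dist(oh["cat"], oh["lizard"]))  # 2 2
```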
@DrKhan-hd4cd 5 years ago
What about a snake or a horse? When we use softmax, the overall probability must equal 1. How will the neural network know to output all 0s?
@deeplizard 5 years ago
Hey Dr. Khan - The network will only predict classes for which it has a corresponding output class. For example, if the network only knows about dogs and cats (2 output nodes), and we pass it a snake, then the network will give corresponding probabilities only for whether it thinks the snake image is more like a dog or a cat. In the snake case, the network may output around 50% probability for dog and 50% for cat. This indicates that the network is not confident at classifying this sample, and the prediction is no better than chance, which makes sense because the image is neither a dog nor a cat.
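The 50/50 case in this reply can be reproduced with a quick softmax computation (toy logits, not actual network outputs):

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    exps = [math.exp(z) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Two output nodes (dog, cat): equal logits give exactly 50/50,
# i.e., the network is no more confident than chance.
print(softmax([0.1, 0.1]))  # [0.5, 0.5]
```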
@thespam8385 4 years ago
{ "question": "In supervised learning, the index of the \"hot one\" corresponds to a(n):", "choices": [ "Label", "Action", "Function", "Neuron" ], "answer": "Label", "creator": "Chris", "creationDate": "2019-12-13T04:14:11.847Z" }
@deeplizard 4 years ago
Thanks, Chris! Just added your question to deeplizard.com
@HoustonPL 4 years ago
Deeplizard, I did a "deep learning" course on Udemy. Your course is incomparably more thought-out and refined, and the graphical side beats the hell out of all other courses. And all this for free! I am glad that there are such people, and I hope that you never run out of motivation to continue sharing your knowledge. All the best to you, and keep up the great job. P.S. Your voice is sexy
@iulianflester 3 years ago
When should we use this encoding? Why not just simply assign dog to number 1, cat to number 2, and so on?
@logangraham2956 9 months ago
Now I just need a vector to contain every single English word, plus any words that don't exist yet and/or typos. It's going to be one really, really long vector... (yeah, not using this method 😅)
@belliumm 4 years ago
What's the benefit of doing one-hot encoding as opposed to label encoding?
@yensteel 4 years ago
Suppose cat is 1, dog is 2, and lizard is 3, and the model predicts 1.5 as a single variable. Would that be reliable? Even more so, if a picture has ambiguity between dogs and lizards (a stupid analogy), this could pose a problem, and the predictor may make an entirely incorrect prediction of a value close to 2. One-hot encodings put independent, separate, equal weights on every categorical prediction. Since the encoding is binary, the prediction could be [0.6, 0, 0.4], and the predictor selects the highest number: the first choice, with the third choice second. This is more robust because it selects the outcome it's most confident in, as the NN is a probabilistic model, and the outcomes don't interfere with each other in an unusual way. The neural network works with numbers, and categorical values are not really numbers; they have to be translated into pure numerical values it can work with. A binary array of "yes it is" or "no it's not" works well.
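The argmax selection described in this comment, as a short sketch (toy probabilities):

```python
# Network output over [cat, dog, lizard]; pick the most confident class.
labels = ["cat", "dog", "lizard"]
probs = [0.6, 0.0, 0.4]

predicted = labels[max(range(len(probs)), key=lambda i: probs[i])]
print(predicted)  # cat
```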
@karanbari5712 5 years ago
What if I have features within the factors (cat, dog, lizard, llama) themselves? What if I want my algorithm to differentiate, like black dog, white cat, small lizard, big llama? @deeplizard
@deeplizard 5 years ago
Hey Karan - For this, you would need labels and output classes for each possible category. For example, rather than the class being cat, you would instead have more descriptive and granular classes like black cat, white cat, orange cat, etc.
@asasdasasdasdasdasdasdasd 6 years ago
What is a case where one-hot encoding isn't the preferred method?
@deeplizard 6 years ago
Hey Anton - Good question. One-hot encoding wouldn't be preferred in a scenario where your labels have a natural order to them, _and_ you want to preserve this order. If your labels were, say, heights of individuals, then you may not want to one-hot encode them, because if you did, your network wouldn't be able to extract/learn anything regarding the order.
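For contrast with one-hot labels, here is an ordinal mapping that does preserve order (hypothetical height buckets, standing in for the heights mentioned above):

```python
# Ordinal encoding keeps the natural order of the labels,
# which one-hot encoding would discard.
heights = ["short", "medium", "tall"]
ordinal = {h: i for i, h in enumerate(heights)}

print(ordinal)                             # {'short': 0, 'medium': 1, 'tall': 2}
print(ordinal["tall"] > ordinal["short"])  # True
```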
@danielparrado3605 6 years ago
great explanation! Thank you
@neilmedina997 6 years ago
I think the better explanation is that one-hot encoding works better for classification than for regression.
@hakuma2527 3 years ago
Why is one-hot encoding used? Why can't the label be 1, 2, 3, and so on?
@rrestoring_faith 1 year ago
This is not specific to neural networks. Other than that, good work.
@takeyoshix 4 years ago
No, this explanation is not complete. I would have expected to hear WHY one should use one-hot (e.g., categorical cross-entropy in TensorFlow) instead of ordinal category indexes (sparse categorical cross-entropy). Following Richard Feynman's statement, "You cannot explain stuff well until you really understand it," I wonder if the channel owner knows what one-hot encoding is, but not really how it helps. One of the first questions of a beginner (who sees the first set of tutorials and examples) will be, "Why one-hot [1, 0, 0] and not just [0, 1, 2] (ordinal)?" (In terms of TensorFlow training, e.g., a CNN on a multi-category image dataset like CIFAR or MNIST, both approaches are possible, but which one would yield better NN performance? Would there be a difference at all?)
@ameynaik2743 2 years ago
Why is one-hot encoding used?
@Paul-qn7qv 3 years ago
Why would you use one-hot encoding when you can use integers? For example, 1 maps to cat, 2 maps to dog, and 3 maps to lizard.
@ashwinv8305 3 years ago
Normal integer encoding can be done for ordinal data. Your example does not have any ordinal relationship between the categories, so one-hot encoding is preferred.
@swaekim9402 3 years ago
Can you please replace my professor
@lerneninverschiedenenforme7513 4 years ago
Hm... I think this is a very complex-looking explanation for a very simple topic, "one-hot encoding". Maybe you could add "deep learning" to the title. That way people would see that this is not (only) a simple explanation of the very old and easy one-hot encoding.
@bhz8947 2 years ago
I’ve never seen something so simple so over-explained.
@deeplizard 2 years ago
Yeah. It is. Big 🧠 💪
@grimonce 5 years ago
(...) Who came up with this name?
@robertc6343 3 years ago
All 6 minutes could have easily been squeezed into 6 seconds by saying that the labels are passed in a vector. OK, 30 seconds max. The term "one-hot" is not explained, by the way.
@blackshirtsquad7883 5 years ago
I love the channel, but I kind of dislike this video :O ... What you're saying isn't complicated. This should be a two-minute video, not drawn out to six.
@Danny-sj6es 4 years ago
Too much time was spent on how categories map to the vectors, and not enough on why one-hot encoding even matters. Like, why don't we just assign a number to each category rather than making it a vector?
@AlexanderTeterkin 6 months ago
Greatest explanation I ever heard! Thank you!