Huffman Codes: An Information Theory Perspective

  237,374 views

Reducible

A day ago

Comments: 318
@SamBrickell 3 years ago
I love the stories where students solve unsolved problems just because the professor neglected to tell them it was difficult.
@CaioMizerkowski 3 years ago
Me too, those are the best stories.
@AA-gl1dr 3 years ago
"Just call it entropy, nobody really knows what it is" Truly an intelligent man.
@stevenneuberger4323 2 years ago
I think John von Neumann was either joking or just being a dickhead.
@busterdafydd3096 10 months ago
10:59 Yeah, what they mean is that entropy seems to be what they are explaining, and if someone disagrees, you can ask them to make a mathematical proof (different from ours) that defines "entropy". What Shannon found, I think, is entropy. But sadly I don't think @Reducible is explaining it quite right. Entropy and prediction should not be in the same sentence. Entropy is looking at data/an environment and asking how much information is needed to describe this information. High entropy is when the information cannot be described with less information; repeating patterns are of course the easiest to describe with less information than the original. Prediction (inferencing) has a time/event element to it. Even if you did approach a repeating pattern, prediction doesn't look at the whole pattern; it is always asking: from what I can see of the pattern so far, what would be the next part? Something is not very predictable if you can't guess what comes next. It sounds the same, but we really should be careful about overlapping these ideas.
@paradoxicallyexcellent5138 3 years ago
The section leading up to 8:00 epitomizes a problem solving technique that sets apart mathematicians: rather than directly search for a formula that describes a thing, instead list what *properties* you expect that thing to have. Make deductions from those properties, and eventually all kinds of insights and formulas fall on your lap.
@sixhundredandfive7123 3 years ago
Would encoding something in Base Three (using either "0, 1, 2" or "0, 1, .5" ) increase the amount of information transmitted because there's an extra number or decrease the amount of information transmitted because you can make more complicated patterns?
@klobiforpresident2254 2 years ago
@@sixhundredandfive7123 Base three (or any alphabet of three symbols) can encode more information *per symbol*, which means fewer symbols are needed to represent some information (a message). That would mean messages are shorter but have the same overall amount of information.
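The point in this reply can be checked with a quick back-of-the-envelope calculation (a sketch for illustration only, not from the video): a symbol from a k-letter alphabet carries log2(k) bits, so ternary messages use fewer symbols while carrying the same total information.

```python
import math

def bits_per_symbol(alphabet_size: int) -> float:
    # Information capacity of one symbol from a uniform alphabet, in bits.
    return math.log2(alphabet_size)

def symbols_needed(message_bits: float, alphabet_size: int) -> int:
    # Symbols required to carry `message_bits` of information.
    return math.ceil(message_bits / bits_per_symbol(alphabet_size))

# A 1000-bit message needs fewer ternary symbols than binary ones,
# but the total information carried is the same.
binary_len = symbols_needed(1000, 2)
ternary_len = symbols_needed(1000, 3)
```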
@user_2793 2 years ago
Yeah, that's also how you actually obtain the determinant, the most striking example to me. You prove that any alternating (swapping two rows makes it negative), multilinear (the determinant breaks up based on the row property) form on matrices (with elements belonging to a commutative ring with identity) that outputs 1 for the identity matrix is a unique function: the determinant. From this definition you can obtain a nicer result: every alternating n-linear form D on n n-tuples which give the matrix A on stacking must satisfy D(A) = det(A) D(I). This is used to prove a lot of properties concerning determinants.
@jimboli9400 3 years ago
The legend is back. I love your work, the production quality, the content, everything! You are the computer science equivalent of 3b1b.
@muskyoxes 3 years ago
Hopefully not the equivalent of the 3b1b who apparently stopped making videos
@Mogwai88 3 years ago
@@muskyoxes looks like he is that too lol
@multiarray2320 2 years ago
@@muskyoxes b...but he still makes videos...
@strictnonconformist7369 2 years ago
The wisest thing Huffman's professor did was not mention it was an unsolved problem. Great, clear presentation; others on YouTube go so fast they require you to pretty much already know how it works before you watch, which is insane.
@TheQueencircleAcademy 6 months ago
👌🏽
@chopper3lw A year ago
Many years ago after studying this algorithm I sent a fan letter to D. Huffman. He wrote back in appreciation because he'd never gotten a fan letter before.
@mementomori7160 3 years ago
20:26, "optimal binary representation must have the two least likely symbols at the bottom of the tree" My first thought: "So can we count it as one node and repeat the process? Nah, that can't be right, that would be too easy" XD turns out it really is easy
@r75shell 3 years ago
You actually need to prove that you may build a node from those two. Also, the author of the video didn't prove why they should have the longest codes; it's not so obvious. You need to apply an exchange argument, though the most basic one.
@OmarChida 3 years ago
At that moment I thought "just apply the same process recursively" which turns out to be the case
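The "count the pair as one node and repeat" idea in these comments can be sketched in a few lines (a minimal illustration, not the video's code; lists of symbols in the heap stand in for subtrees, and each merge adds one bit to every symbol inside the merged pair):

```python
import heapq

def huffman_code_lengths(freqs: dict[str, float]) -> dict[str, int]:
    """Repeatedly merge the two least likely nodes; each merge adds
    one bit to the code length of every symbol in the merged pair."""
    # Heap entries: (probability, tie_breaker, [symbols in this subtree])
    heap = [(p, i, [s]) for i, (s, p) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    lengths = {s: 0 for s in freqs}
    counter = len(heap)
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)
        p2, _, syms2 = heapq.heappop(heap)
        for s in syms1 + syms2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, counter, syms1 + syms2))
        counter += 1
    return lengths
```

For the dyadic distribution {A: 0.5, B: 0.25, C: 0.125, D: 0.125} this yields lengths 1, 2, 3, 3, matching the entropy exactly.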
@alexismandelias 3 years ago
Having also learned the Huffman encoding algorithm as an example of a greedy algorithm (along with its accompanying greedy-style proof), this video provided me with a new perspective which, combined with the interesting history of alternative algorithms, gave me a fuller understanding and appreciation for the topic, which I have to admit is extremely well presented!
@erumabo 3 years ago
I really love these algorithms that look so obvious and simple, yet I know that even with all the time in the world I couldn't have invented them myself.
@wampwamp1458 3 years ago
This video is an amazing compilation of information, in a well presented order and in an engaging way. Information well encoded! Cheers
@aethrya A year ago
Man it's so amazing that any of this works at all. Being an autodidact philomath and self educated amateur mathematician and cryptographer, I am so glad to have all of this information available to learn at my own pace. Thank you.
@emmamovsesyan 2 years ago
The smallest shift in perspective can sometimes make all the difference
@kwinvdv 3 years ago
The Huffman encoding algorithm looks very similar to dynamic programming. So I wondered which came first, and it seems that they were both developed around the same time, namely around the year 1952.
@smolboi9659 3 years ago
Yeah, I googled it, and apparently this is a greedy algorithm that converges to the global optimum. Greedy algos and dynamic programming are similar in that they use subproblems to build up the full solution. In general, greedy algorithms only converge to local optima, since they do not exhaustively check all subproblems, but they are faster. DP is slower, but global optimality is guaranteed. For this problem the greedy algo guarantees the global optimum, so we have the best of both worlds.
@aethrya A year ago
10:50 JvN's second and most important reasoning for the term _entropy_ is truly the genius of a mad mathematician.
@yakov9ify 3 years ago
21:09 S_1 and S_2 don't HAVE to be under the same node in the optimal tree; in fact they can be exchanged with any other nodes on the bottom layer. What is instead true is that you can always construct an optimal tree with them paired, starting from one where they are not, through the aforementioned exchange.
@grahamhenry9368 2 years ago
Huffman encoding is only optimal if each symbol is conditionally independent of all previous symbols, which is almost never the case. That is why Huffman encodings didn't solve compression, though they are typically used as part of a broader compression algorithm. Ray Solomonoff did a lot of important work in this area with his invention of algorithmic complexity theory.
@redpepper74 2 years ago
Ooh, interesting. In the equation-constraints section it was mentioned (almost as just a side note) that the events had to be independent of each other, and for a moment I wondered: what if they weren't independent? Now I want to know :0
@grahamhenry9368 2 years ago
@@redpepper74 You can compress a lot more when the symbols are conditionally dependent on previous symbols, but you need to use other methods, typically ones that rely on conditional probability.
@grahamhenry9368 2 years ago
@@redpepper74 Huffman encodings are based purely on symbol frequencies and don't take into account what symbols came before. So for example, if you see the sequence of symbols "Mary had a little", you can immediately recognize that the next four symbols are very likely to be "lamb", with a high degree of probability. Huffman codes do not see this.
@redpepper74 2 years ago
@@grahamhenry9368 I suppose what you could do then is keep a separate Huffman encoding table for each symbol that tells you how likely the next symbol is. And instead of using single characters you could use the longest, most common groups of characters.
@grahamhenry9368 2 years ago
@@redpepper74 Yep, I actually coded this up once, but only looking back at the last character. The important idea here is that the ability to predict future symbols directly correlates with how well you can compress the data, because if a symbol is very probable then you can encode it with fewer bits via a Huffman encoding. This shows that compression and prediction are two sides of the same coin. Marcus Hutter has done a lot of work in this area with his AI system (AIXI), which uses compression algorithms to make predictions and attempts to identify optimal beliefs.
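The "one table per preceding symbol" idea in this thread can be sketched with a simple order-1 frequency model (names and example text here are made up for illustration; a real coder would build a Huffman table from each row of counts):

```python
from collections import Counter, defaultdict

def order1_model(text: str) -> dict[str, Counter]:
    """Count symbol frequencies conditioned on the previous symbol.
    A coder could build a separate Huffman table from each row."""
    context: dict[str, Counter] = defaultdict(Counter)
    for prev, cur in zip(text, text[1:]):
        context[prev][cur] += 1
    return context

model = order1_model("mary had a little lamb, mary had a little lamb")
# The conditional distributions are much sharper than the global one:
# after 'l' this text only ever continues with 'i', 'e', or 'a'.
```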
@FreestateofOkondor 3 years ago
Your presentation is entertaining, thought-provoking and truly educational. One of the best channels on YouTube in my opinion.
@ShefsofProblemSolving A year ago
I just recently learned about Huffman encoding and this video is absolutely AMAZING. You really motivated the solution incredibly well! Congratulations, and I hope you make more videos.
@niofer7247 3 years ago
Great video! Though I was a bit confused at the end; going further in depth with an example of Huffman encoding and decoding, and comparing it to an uncompressed encoding, would make it a bit clearer. Still an amazing video!
@Mutual_Information 3 years ago
This is very impressive. It takes a lot of hard work to present so much so well. Excellent vid!
@RequiosWoW 3 years ago
I'm taking a numerical analysis course right now, and my chapter was just on Huffman codes. Great timing!
@viveksomani8462 2 months ago
This video does an excellent job giving an intuition. It's still complicated for someone who hasn't done much with information theory, so I had to pause to really understand some parts of it -- but this video helps a lot and makes the topic really interesting. This idea of entropy in information theory is also used in seemingly unrelated areas, like deciding which splits to take with a decision tree (or random forests).
@Some.username.idk.0 3 years ago
Yoooo, your videos are awesome; it's a great day when you upload.
@rccowboys 2 years ago
This has got to be the most amazing video I have ever seen. Seriously! It was absolutely the most intriguing thing I have never pondered. Great explanation skills for slow people like me but yet very entertaining at the same time. Thank you sir for your effort and information!
@NeotenicApe 3 years ago
People be sleeping on this channel. Incredible content.
@anasbekheit5479 3 years ago
I think there's a missing negative sign in the lengths at 16:11. Anyway, one of the best videos on the topic. I subscribed a while ago and I don't regret it.
@pafloxyq 3 years ago
Really loved it! Knowing about Huffman codes is what made me take information theory classes. It is really an elegant piece of mathematics.
@joeeeee8738 3 years ago
This channel is pure gold
@gplgomes 2 years ago
Guy, thank you for this amazing video.
@maxheadrom3088 5 days ago
When I first learned how Huffman made a variable bit length signal get decoded I was, like, really happy!
@alexrhinefield123 3 years ago
How am I just finding this channel? This guy’s awesome
@tru1988 2 years ago
Your voice is great, the visuals are on point, and I finally understand Huffman codes. Great job; subscribed!
@tejaswibiduru5131 3 years ago
Dude, long time no see. I am so happy that you are back. Please make more videos...
@17ashwinipatel72 3 years ago
woahhhhhhhhhh 🔥🔥🔥🔥🔥🔥 I absolutely have no words for this masterpiece. The way you explained it, now I'm going to take my information theory classes seriously 😂 Thanks a lot. Great work! Sharing this with my friends as this deserves lot more appreciation ♥️
@typingcat 2 years ago
23:40 I had seen other videos about Huffman tree construction before watching this one, but this explanation is the clearest. I was not sure how the nodes after the initial two should be added.
@cj96up A year ago
Thank you, it's really helpful!
@Danielle_1234 3 years ago
Best explanation of Huffman Encoding I've seen. Bravo!
@huffmancollins614 2 years ago
Hey, it's me, Huffman. I made these codes.
@antiraedus 3 years ago
Wow. Great video! The way you explained it was so clear and succinct that I guessed "Oh shit it's recursion!!" at 23:12. Good work on the organisation of explaining topics & pacing, love this channel!
@fredg8328 3 years ago
Very good video. It's the first one I've seen on YouTube that puts Shannon-Fano and Huffman coding together and tries to explain the differences between them. Most only explain the Huffman algorithm and discard all the history behind it.
@pellythirteen5654 A year ago
I've watched many of your beautiful presentations and found them very instructive. This one not only clearly demonstrates Huffman's idea, but also the "bottleneck" in any communication channel formalized by Shannon himself.
@buzzyproton5885 3 years ago
Underrated channel
@Darth_Pro_x 3 years ago
Just yesterday I wondered when you'd upload next. Awesome video as usual!
@rexlin28 2 years ago
Holy!! Thanks for the journey and nice video! :D
@mystic3549 9 months ago
I already knew Huffman's algorithm before watching this video... 6:53 gave me goosebumps :) I mean, what?!? The correlation, omg... tysmmm for this _/\_
@abu-bakrmohamed1707 6 months ago
What an exceptional video, that was so fun!
@JayLikesLasers 2 years ago
I'm only just starting to learn about information theory, but this was very accessible. Thanks, subscribed.
@moopoo123 3 years ago
This was so fun to watch. Please keep making more videos.
@leyasep5919 3 years ago
The Huffman algorithm is so brilliant... GIF's LZW is another scheme that sounded obscure until I dared to look, and it was in fact much simpler than I thought. It would be a great subject for your next videos.
@adodge2 3 years ago
3:20 I think these are misrepresented in the video. 1) "single symbol -> unique binary code" doesn't mean you can't map multiple symbols to a single code; it means you can't map the same sequence of symbols to multiple codes. 2) "Lossless" means the decoded message has to be exactly the same as the source, not an approximation of the original. Nothing to do with deleting random bits.
@telemanchos7986 3 years ago
Incredibly educational video; I respect the amount of work you put into this! I enjoyed the overview of the field, and the connection between information and probability theory was splendidly shown! I'm looking forward to your future videos!
@romanpleshkov1125 2 years ago
Man, I love this channel. So well made.
@PeterDebney A year ago
There was a lot of information in that video, and I could decode it all! 🙂 It is definitely not easy to explain something so well, so thank you.
@aaryunik 9 months ago
Lovely explanation and storytelling!
@Ouuiea 3 years ago
As always, amazing video quality!! Loved it. Please keep making them.
@stefanioan7569 3 years ago
Great video. I really appreciate the hard work and explanations you put in your videos.
@dixztube 2 years ago
You broke this down soooo well
@redpepper74 2 years ago
This was a really cool watch! The more I immerse myself in information theory, the more it interests me :D I do disagree with the part about the importance of naming things though; names have power! The roots in a word tell you what it means, the meanings of a word in some contexts hint at its meaning in others, and the connotation a word has affects the way people think about it. (Yes, my passion for writing clean code has turned me into someone who thinks way too much about naming conventions :P )
@jursamaj 2 years ago
On the other hand, words can develop meanings well beyond their origins. Also, even the original meaning doesn't always perfectly correlate with the roots. It just isn't as simple as you claim.
@atonurc 3 years ago
27:05 I think it would be more eye-pleasing if the codes were written in a monospace font.
@madvoice3703 3 years ago
Thank you, sir; because of your video I learned how uncertainty is compressed in our nature.
@ゾカリクゾ 3 years ago
The day Reducible uploads is a good day
@devashishmittal 2 years ago
Brilliant video! Thanks for your wonderful effort.
@emperorpingusmathchannel5365 3 years ago
Thank you for keeping it precise but also intuitive!!!
@ViktorFerenczi 3 years ago
Good visual explanation. But the part actually encoding a message into bits and decoding the message was completely missing. So while the method for obtaining the optimal encoding is clear, how to apply the Huffman encoding is not explained at the end. Also, this compression rate applies only in the limiting case (infinite stream lengths) where sending the distribution (tree) to the decoder is negligible. In real data compression the compressed stream, file or block must include the distribution as well.
@Reducible 3 years ago
Good feedback, thank you. These details were part of the script at one point, but I kind of went back and forth on whether I should include it in the final version of the video. I decided to not include them to maintain focus on the actual motivation of Huffman codes, but thank you for bringing these details up in this comment. They are indeed an essential part of doing this in practice.
@wChris_ 3 years ago
@@Reducible Maybe in a follow-up video you can explain how you would implement this in practice, and maybe how it's done in the real world, like in the Deflate algorithm.
@atrus3823 3 years ago
I completely disagree with this critique. There are already a lot of videos showing how to compute Huffman codes and plenty of tutorials on programming an encoder/decoder. This focuses on what is missing from the usual teaching (you could say it maximizes information entropy): an intuitive, high-level overview of how the code evolved, where it came from, what it means, etc., as well as beautiful visuals. Usually, videos on these kinds of topics are dry, textbook descriptions with nothing but formulas and step-by-step instructions on computation. I also think that when being introduced to new ideas, introducing too many practical, real-world considerations can muddy the waters and make things really hard to understand. Once a good understanding is in place, then as you introduce complexity, it makes sense where it is coming from and why it matters.
@drabart6121 3 years ago
@@atrus3823 It was not the video's premise to show how to compute them, but a brief explanation of decoding would have been nice.
@ItsDrike 3 years ago
I disagree; he did explain how to turn binary trees into bit representations at the start. While this wasn't done in the code example, in my opinion that's fine: it's easy enough to figure out, i.e. assign 0s to left branches and 1s to right branches (or vice versa). Since the Node class keeps track of what's on the right/left, it really wouldn't be hard to get the binary representations, and including this in the video would just be distracting and less important for understanding how the algorithm works.
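The traversal described here (0 for left, 1 for right, collecting the prefix at each leaf) might look like the following; the Node class is a hypothetical stand-in for the one in the video's code example:

```python
class Node:
    # Hypothetical tree node, modeled on the one in the video.
    def __init__(self, symbol=None, left=None, right=None):
        self.symbol = symbol  # set only on leaves
        self.left = left
        self.right = right

def assign_codes(node, prefix="", codes=None):
    """Walk the tree: append '0' going left, '1' going right;
    record the accumulated prefix at each leaf."""
    if codes is None:
        codes = {}
    if node.symbol is not None:
        codes[node.symbol] = prefix or "0"  # single-symbol edge case
        return codes
    assign_codes(node.left, prefix + "0", codes)
    assign_codes(node.right, prefix + "1", codes)
    return codes

# A small tree: A alone on the left, B and C sharing the right subtree.
tree = Node(left=Node(symbol="A"),
            right=Node(left=Node(symbol="B"), right=Node(symbol="C")))
codes = assign_codes(tree)
```

Because every symbol sits at a leaf, no code is a prefix of another, which is what makes the bitstream decodable.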
@carly09et 3 years ago
Nice, very clear on the logic.
@MissPiggyM976 3 months ago
Very well done, many thanks!
@sanjaychhetri3407 3 years ago
Excellent explanation! Thanks!
@vojtechsejkora1554 2 years ago
It is really impressive how he came up with this solution, and how many concepts he had to understand to guarantee an optimal static encoding. I wonder how he can transmit the particular coding for that message. And if we use modification during decoding, is there a possibility of shrinking the coding a little more? Like when we have the message ABDCBBB: for the first letter we encode only A, then AB, then ABD and so on (the last letter is encoded as we would encode ABDCBBB). The decoder then has to start from the end and know how many times each letter should appear.
@saulalonsopalazuelos9594 3 years ago
Congratulations on posting excellent informative content!
@agustindiaz3361 3 years ago
Amazing video. You are very good at this.
@zuggrr 2 years ago
This is really good! Thank you! I learned a lot.
@user-or7ji5hv8y 3 years ago
This is truly informative.
@user-fp6dt1os1l 3 years ago
16:14 Those logarithms need negative signs in front of them: log2(0.5) = -1, not 1.
@kinldk2638 5 months ago
Your video is amazing! Thanks for the interesting presentation.
@leocelente 3 years ago
This is great! Amazing work
@AT-27182 3 years ago
Great explanation. I appreciate your efforts very much.
@prodigy-343 3 years ago
The YouTube algorithm usually sucks; it just puts some shitty Latin music or stupid trending stuff on your feed, but today it feels generous. It's been a long, long time since I've seen content like this. Man, keep going; you have an incredible way of explaining things, simple but clear and clever. Amazing channel, pal.
@etienneboutet7193 3 years ago
Thank you for this video. Very informative!
@WildEngineering 3 years ago
Wow, very well explained. Great job, sir.
@3ractnodi 2 years ago
I wish I could like this twice.
@wunkewldewd 3 years ago
Outstanding, as always.
@01k 3 years ago
Great stuff, thanks for sharing!
@caleberibeiro8673 2 years ago
Awesome! Thank you very much for the explanation.
@Sanjism-y9k A year ago
Thanks for this awesome explanation!!!
@G12GilbertProduction 3 years ago
The data compression algorithm presented in this video reminds me of a Recursive Neural Network paper covered in one of Yannic Kilcher's videos.
@cfalguiere A year ago
Amazing work
@Lucky10279 2 years ago
11:00 Actually, entropy in thermodynamics is defined almost exactly the same way as entropy in information theory. I'm not sure of the historical development or which formal definition came first, but the modern mathematical definition of entropy in thermodynamics differs from information entropy by a factor of Boltzmann's constant, and that's only there to make the SI units work out. Really, thermodynamic entropy should be dimensionless, and then the definitions would be exactly the same. But since, for historical reasons, temperature in the SI system has different units than energy, the SI unit of entropy is Joules/Kelvin, and the Boltzmann constant in the definition of thermodynamic entropy just serves to make that work out.
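In symbols, the comparison this comment makes (a sketch of the standard definitions, not from the video):

```latex
% Shannon entropy, in bits (logarithm base 2):
H = -\sum_i p_i \log_2 p_i
% Gibbs entropy in thermodynamics, over microstate probabilities p_i:
S = -k_B \sum_i p_i \ln p_i
% Hence S = (k_B \ln 2)\, H: the two differ only by the constant
% factor k_B and the change of logarithm base, as the comment says.
```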
@csvegso 2 years ago
What a great and still simple algorithm.
@askemervigbahnson333 3 years ago
Wow, great video man! I forgot multiple times that you aren't 3b1b.
@hamedmirzaei9713 3 years ago
24:53 You need to be consistent in your tree representation; I mean, always put the bigger node on either the right or the left. The final optimal tree is still OK with what you have done, but the representations are not the ones Huffman coding suggests.
@monsieuralexandergulbu3678 3 years ago
Yayyy, new videos, so excited!
@GregBakker 3 years ago
Great video, thanks!
@rajivshukla7892 3 years ago
Great video, man; the content on your channel is awesome!!!
@kiase978 2 months ago
Thanks for sharing the manim scripts.
@DDvargas123 3 years ago
Something I've always wondered about Huffman codes: don't you have to have a way to send the constructed tree to the decoder? I doubt that you can decode any arbitrary text message in exactly the same way. Like, at what point isn't it better to just have a 6-bit encoding for the 52 letters so you don't have to send the tree? Also, don't you need a way to send the exact symbol as well? Like, how does the receiver know that 001 is B without you saying so in the encoding of the tree you're sending (via ASCII or Unicode most likely)? It feels like Huffman only works when you're constrained to exact, specific symbols with exact and known probabilities.
@Reducible 3 years ago
Fantastic questions! Let me address them one by one: You do have to find a way to send the constructed tree to the decoder. Usually this is done in practice through an array representation of the binary tree. So it turns out that to store the compressed version of the text, you need to allocate some extra memory to store this array representation. In cases where the text can't be compressed that well, or the text file is not even that large, it may not make sense to use Huffman encoding, since the overhead of storing the tree may outweigh the gains. But most of the time, in large text files with lots of redundancy, the overhead of storing this array is pretty negligible. You're right, though: in some cases a fixed-length encoding might be the best solution. You can actually use an estimate of the entropy to figure that out; the higher the entropy, the less compressibility there is in the text. And yeah, that second question is a good one too. Usually you have a constrained set of symbols whose size is significantly smaller than the total text in the file, but if the set of symbols were almost equal to the size of the text file, Huffman encoding wouldn't really help much at all. In practice, that's usually not the case, because the number of characters will generally be significantly larger than the set of symbols.
@unrealzocker 3 years ago
You distribute the tree beforehand or in-place; at some point your receiver must know that a given bit pattern corresponds to ASCII character 'B'. In some sense you can think of it as a lookup table.
@DDvargas123 3 years ago
@@Reducible Ah, would the size of the array be the nearest power of 2 above the total number of unique symbols, or is there a way to consistently trim off the end (like getting all the nulls to line up)?
@r75shell 3 years ago
You need to have some "protocol" regarding the tree. It's either encoded in the message as some kind of header, whether in a simple way or with tricky stuff like a left-to-right tree traversal that writes a symbol each time you visit a leaf. Or it might be initialized to some fixed tree and then reconstructed after each occurrence of an event (or just after each symbol output); these are the so-called Adaptive Huffman Coding schemes.
@txorimorea3869 3 years ago
Just send a bit stream: 0 means node, 1 means leaf, and the next N bits identify the symbol. After a node there are two elements, each of which may be a node or a leaf; after a leaf there is nothing more. That makes it easy to reconstruct the tree, and it is optimal in space.
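The scheme this comment describes can be sketched as a pre-order traversal over bit strings (an illustration only: tuples stand in for internal nodes, and 8-bit ASCII is assumed for symbols):

```python
def serialize(node) -> str:
    """Pre-order bitstream per the scheme above:
    '0' for an internal node (its two children follow),
    '1' plus 8 bits of ASCII for a leaf."""
    if isinstance(node, str):          # leaf: a single symbol
        return "1" + format(ord(node), "08b")
    left, right = node                 # internal node: a pair
    return "0" + serialize(left) + serialize(right)

def deserialize(bits: str, i: int = 0):
    """Inverse of serialize; returns (subtree, index after it)."""
    if bits[i] == "1":
        return chr(int(bits[i + 1:i + 9], 2)), i + 9
    left, i = deserialize(bits, i + 1)
    right, i = deserialize(bits, i)
    return (left, right), i

# Round-trip a small tree: A on the left, (B, C) on the right.
tree = ("A", ("B", "C"))
bits = serialize(tree)
```

The decoder needs no length markers: the structure bits themselves say when each subtree ends.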
@catapillie 3 years ago
Great video! Question: how do you then actually decode the encoded information? You'd need the tree, so what's the point of encoding it in the first place? Do you include the tree in the message?
@calvindang7291 3 years ago
Yes, you need to include the tree (or some way for the decoder to generate the tree) in the message. Thus, compression usually isn't effective for very small messages.
@catapillie 3 years ago
@@calvindang7291 Thanks for your answer! I guess long messages can be greatly compressed if they follow a specific structure that doesn't make every character as frequent as the others. Also interested in knowing how you'd send the tree over and have it interpreted by the receiver/decoder.
@calvindang7291 3 years ago
@@catapillie Most actual compression systems do compress sections of multiple characters over as one, which means that a file with the same common patterns of characters (like a document that has words) can have a word that appears multiple times set to some value rather than only individual characters. But doing this requires optimizing sending the tree over, as that basically just expands the total alphabet. (One way I've seen is to have entries for taking [x] characters [y] characters back, with some of the info being in separate entries in the tree and others just given as extra bits in the actual compressed document) The most obvious (but very large) way to send the tree over is to go through each character that the file type supports (potentially with more bits stating that first) in sequence and simply state what it gets encoded as (with a maximum length of like 12 bits or something). This can be done in less bits, though. Every character can be categorized solely by the amount of bits it has, and exactly which character in a certain bit count that goes to each value within the bit count doesn't really matter. So, instead of providing the full encoding, it's possible to only give the bit count for each character in sequence (which is usually around 4 bits/char) and then have the decoder construct the original tree from that. But that's still not really enough to make it small. So to make it smaller, you can treat each of those values in the tree as its own "character" in another Huffman tree, which has an entry for each of the possible lengths of an entry in the actual tree, with the encoding given in the same way as above (but there's only like 15 "characters" for this set, making it much smaller). And that gets things small enough, but there's other small optimizations from there that you can find based on the algorithm.
@gdclemo 3 years ago
You just need to send the tree depth for every symbol, and you can build a tree that is equally efficient, assuming the sender does the same before encoding.
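The depths-only idea in this reply is the basis of canonical Huffman codes (the scheme DEFLATE uses): both sides derive the same codewords from the code lengths alone. A sketch under that convention:

```python
def canonical_codes(lengths: dict[str, int]) -> dict[str, str]:
    """Rebuild a prefix-free code from code lengths alone:
    assign codewords in (length, symbol) order, incrementing a
    counter and left-shifting it whenever the length grows."""
    code = 0
    prev_len = 0
    codes = {}
    for sym, ln in sorted(lengths.items(), key=lambda kv: (kv[1], kv[0])):
        code <<= (ln - prev_len)      # lengthen the codeword if needed
        codes[sym] = format(code, f"0{ln}b")
        code += 1
        prev_len = ln
    return codes

# Sender and receiver both run this on the same lengths and
# agree on the exact same code without transmitting the tree.
codes = canonical_codes({"A": 1, "B": 2, "C": 3, "D": 3})
```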
@PrajwalSingh15 3 years ago
Amazing explanation :)
@ikituriflash A year ago
Thanks again, fantastic video. 17:37 With the Shannon-Fano code I thought, wow, this thing is just guessing; it will far too often give B (0.35) two bits, although 1 bit would be better. Try an even more extreme distribution like D(0.14), E(0.14), A(0.15), C(0.17), B(0.4), and it will still assign two bits to B. So this must be non-optimal, just from first impressions. 24:29 Yeah, this is it; Huffman's construction looks like an optimal recursive function, just intuitively right.
@asdddddaaaaaaaaa 2 years ago
This was so cool. At 18:03 I was like "oooooooooooooooooh".
@Fsoza2008 3 years ago
Great as always!