Pragmatic Unicode, or, How do I stop the pain?

75,962 views

Next Day Video

12 years ago

Ned Batchelder
Python has great Unicode support, but it's still your responsibility to handle it properly. I'll do a quick overview of what Unicode is, but only enough to get your program working properly. I'll describe strategies to make your code

Comments: 59
@AlSweigartDotCom 4 years ago
Nearly a decade later, I'm still recommending this talk as the best introduction to unicode.
@rajdeepdas4517 3 years ago
Are you really Al Sweigart? I love reading your book ('Automate the Boring Stuff with Python').
@dannydj2908 3 years ago
Just finished the 6th chapter of Automate the Boring Stuff with Python :)
@JesseHolt205 3 years ago
I'm here watching it right now on your recommendation!
@ishankarn9523 3 years ago
@dannydj2908 ✌
@philj8205 2 years ago
Just got here from your book, Al! I'm brand new to programming and it's been invaluable so far. Ty! :)
@TommyCarstensen 11 years ago
FOL = Fact of Life
2:20 FOL1 - I/O is always bytes
3:25 FOL2 - The world needs more than 256 symbols
17:08 FOL3 - Need both bytes and Unicode
22:22 FOL4 - You cannot infer the encoding of bytes
23:34 FOL5 - Declared encodings can be wrong
PT = Pro Tip
21:11 PT1 - Unicode on the inside (decode), bytes on the outside (encode)
21:50 PT2 - Know what you have - Bytes or Unicode? If bytes, what encoding?
23:58 PT3 - Test Unicode
9:07 Python 2
17:29 Python 3
24:44 Summary
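A quick illustration of FOL4, sketched in Python 3. The byte values are picked for this example and do not come from the talk. The same two bytes decode without error under two different encodings, so the bytes alone cannot tell you which one was intended.

```python
# Two bytes with no label attached (example data, not from the talk).
data = b"\xc3\xa9"

# Both decodes succeed, but they produce different text.
as_utf8 = data.decode("utf-8")      # one character: U+00E9
as_latin1 = data.decode("latin-1")  # two characters: U+00C3, U+00A9

print(repr(as_utf8), len(as_utf8))
print(repr(as_latin1), len(as_latin1))
```

Neither result raises an exception, which is why a wrong guess usually shows up as garbled text rather than as a crash.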
@oneofmany2573 8 years ago
I've been reading about Unicode for two days, and this talk provided the breakthrough I needed to truly understand and fix the UnicodeEncodeError that my Python 2.7 program was throwing. Thank you for taking the time to record and post this talk!
@qwe123727 7 years ago
Really? Is it really a breakthrough? There is more hiding behind this than what he explains. Very few people who come from a C background explain these things well. For me, this video will add more confusion for newcomers.
@DUANEYAISER 7 years ago
If a newcomer is more confused after watching this (OK, perhaps needing to watch it twice, while running the code themselves too), then they are not ready to deal with the issue yet. I found this very helpful.
@XinhLe 6 years ago
I agree. The "Pain relief" solution is too abstract for me.
@abdoelrahmanhegazy 9 years ago
One of the most useful talks; it exposes and explains the encoding/decoding pain clearly. Thanks Ned!!
@KrishnanNagarajan 8 years ago
An excellent talk indeed! Many thanks for this talk; it opened my eyes.
@rezamostafid8810 1 year ago
A well rounded and concise presentation that points out all the major headings one needs to know about Unicode, Python and encodings. Very valuable...Thank You!
@kotepruidze7361 8 years ago
You are awesome! cool talk, cool presentation & cool explanations!
@omerdagan3083 5 years ago
By far the best explanation on this topic, thanks!
@maloman1989 8 years ago
Nice tips for a common problem, thanks man.
@VigneshSKannan 8 years ago
Thank you so much. This was amazing!
@ranelpadon8834 6 years ago
Excellent presentation!
@J2897Tutorials 7 years ago
10 minutes in and I feel as though I'm heading in the right direction of a complex maze.
@Jasonmadcow666 1 year ago
really cool video! Thanks!
@srinidhiskanda754 7 years ago
awesome explanation thank you sir
@davidliu7314 7 years ago
Very good presentation for learning Python, thank you very much!
@xerxys8710 7 years ago
Thank you so very much!
@aklitv9712 9 years ago
Excellent explanation, better than the official documentation. In 36 minutes he covers what would otherwise take a hundred articles.
@MrKestess 10 years ago
Thank you!
@knpatel86 10 years ago
Really nicely explained ...
@kantakt 9 years ago
Ned Batchelder, this is really cool. My pain is stopping! Cool.
@drygordspellweaver8761 1 year ago
Currently doing a deep dive into Unicode Hell. Not for the faint of heart!
@debuti 9 years ago
OMG THANK YOU
@kostasnikoloutsos5172 7 years ago
Big thanks.
@TommyCarstensen 11 years ago
If anyone ever asks me about Unicode encoding/decoding, I will point them to this presentation. I feel like switching to Python 3 after watching it. 6:05 "Klingon is not in Unicode. I can explain why later." :) 29:26 "Do you know how you get on the committee that decides pile of pooh?" :)
@ultiumlabs4899 5 years ago
Great explanation. I came here because it's referenced in the book 'Fluent Python'.
@djbokoboko 6 years ago
Someone correct me if I am wrong, but Ned said ASCII is the first 96 Unicode code points. I believe he meant the 96 AFTER the first 32, so from 32-128?
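For anyone wanting to check the counts in this question, here is a small Python 3 sketch. It relies only on the fact that ASCII is defined as code points 0 through 127; how the talk itself counted is not verified here.

```python
# ASCII covers code points 0 through 127, so 128 code points in total.
ascii_chars = [chr(cp) for cp in range(128)]

# str.isprintable() treats the ASCII space as printable and the
# control characters (0-31 and 127, DEL) as non-printable.
printable = [c for c in ascii_chars if c.isprintable()]
control = [c for c in ascii_chars if not c.isprintable()]

print(len(ascii_chars), len(printable), len(control))

# Every one of these encodes to a single byte under the ascii codec.
encoded_ok = all(len(c.encode("ascii")) == 1 for c in ascii_chars)

# Code point 128 is outside ASCII, so encoding it fails.
try:
    chr(128).encode("ascii")
    outside_fails = False
except UnicodeEncodeError:
    outside_fails = True
```

So code points 32 through 127 are 96 values, of which 95 are printable and one (127) is the DEL control character.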
@Neceros 9 years ago
This makes sense! strs in Python 3 are Unicode from the beginning. I can see my error. You don't have to write u"string" in Python 3; it's already implied.
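A minimal Python 3 sketch of the point above (the sample word is an arbitrary choice for illustration): a plain string literal is already Unicode text, the u prefix is accepted but changes nothing, and bytes are a separate type that Python 3 refuses to mix with str implicitly.

```python
plain = "café"          # a Python 3 str is Unicode text
prefixed = u"café"      # the u prefix is allowed (3.3+) and has no effect
raw = b"caf\xc3\xa9"    # the same word as UTF-8 encoded bytes

same_text = plain == prefixed
decoded_matches = raw.decode("utf-8") == plain

# Unlike Python 2, there is no implicit decode when mixing the two types.
try:
    plain + raw
    mixing_raises = False
except TypeError:
    mixing_raises = True

print(type(plain).__name__, type(raw).__name__, mixing_raises)
```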
@XinhLe 6 years ago
Wow, does that mean that to solve problems in Python 2, you just put u'' in front of every string? I am looking for a solution. Thanks.
@evgeniyborisov2933 8 years ago
Thank you! P.S.: Your English is very easy to understand.
@evgeniyborisov2933 7 years ago
Well, that comment was mostly for Russians who are learning English :)
@rainerwahnsinn3262 11 months ago
22:22 It's a bit ironic that he says "you can't look at a stream of bytes and know what encoding it is" while simultaneously showing encoding declarations like the HTML meta tag, because that's exactly what reading a declaration does: it tries to read the file with a default encoding like ASCII and hopes to see an encoding declaration at the beginning.
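The sniffing described in this comment can be sketched roughly as follows in Python 3. The function name declared_charset, the 1024-byte window, and the regular expression are all assumptions made for this illustration; real HTML parsers follow a more involved algorithm. The declaration is still only a claim made by the document (FOL5), so the result is a hint, not proof.

```python
import re

def declared_charset(raw: bytes, default: str = "utf-8") -> str:
    """Return the charset named in an HTML meta tag, or a default.

    Hypothetical helper for illustration only.
    """
    # Read the start of the document as ASCII; any non-ASCII byte is
    # replaced so the search itself can never fail.
    head = raw[:1024].decode("ascii", errors="replace")
    match = re.search(r'<meta[^>]*charset=["\']?([\w.-]+)', head, re.IGNORECASE)
    return match.group(1).lower() if match else default

# Example document (made up for this sketch), stored as ISO-8859-1 bytes.
page = '<html><head><meta charset="ISO-8859-1"><title>café</title>'.encode("iso-8859-1")

charset = declared_charset(page)
text = page.decode(charset)
print(charset, "café" in text)
```

This works only because the declaration itself is written in ASCII-compatible bytes, which is the assumption the comment points out.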
@hrvooje 6 years ago
For me, as a beginner in programming and Python, this is an eye-opener. I just don't understand why Notepad on Windows has a "Unicode" encoding option that is actually UTF-16, while "ANSI" means the system's native legacy encoding, e.g. the 8-bit windows-1252 encoding in Western versions of Windows. Why deliberately use the wrong names?!
@muratcan__22 4 years ago
thanks
@blenderpanzi 1 month ago
Pretty sure the default system encoding is defined by the LANG environment variable (under Unix), and these days it's always UTF-8. (Under Windows it might be UTF-16.) Oh, this talk is really old and about Python 2! Welp. It also claims at one point that characters are represented as code points. Well, a character might span multiple code points. To be clearer, people talk about graphemes.
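To make the code point versus grapheme distinction concrete, here is a small Python 3 sketch using only the standard library (the letter chosen is just an example). len() on a str counts code points, so one visible character can have a length of 2.

```python
import unicodedata

# "é" written as a base letter plus a combining accent: two code points
# that render as a single visible character (one grapheme).
decomposed = "e\u0301"
# The same visible character as a single precomposed code point.
composed = unicodedata.normalize("NFC", decomposed)

print(len(decomposed), len(composed))   # code point counts
print(decomposed == composed)           # the two strings are not equal
print([unicodedata.name(c) for c in decomposed])
```

Normalization helps with comparison, but it does not count graphemes in general; that requires the segmentation rules defined by the Unicode standard, which the standard library does not implement.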
@tavo2099 12 years ago
Is that Alex Martelli in the second row?
@XinhLe 6 years ago
Great talk. I'm new to programming. Can someone please give me an example of what he means by "bytes on the outside, Unicode on the inside"? I wish he had given an example; it would be great for a novice like me. Many thanks!
@XinhLe 6 years ago
When he introduced himself as having more than 10 years of Python programming experience, it made me think this would be useful. Yes, it's useful, but it seems that some novices in the room left early, before he finished, even though it's a great talk. This is for advanced users only, I think.
@djbokoboko 6 years ago
Imagine you accept some input (a byte string) with the value "Hello" (ASCII characters only) from "outside". You don't handle it, and your program adds that input to a unicode object (talking Python 2 here), e.g. input + my_variable_that_holds_some_unicode. Python 2 will implicitly decode the input to unicode (using ASCII) so it can do the concatenation, and you would think that everything is OK and that you know 100% what your code is doing. What if the next day the input can NOT be decoded implicitly using ASCII, for example because the value is "Ηελλο" (Greek letters, by the way)? Python 2 will try to decode "Ηελλο" as ASCII and fail, and you don't want your program to crash! That's the reason. So every time, you need to explicitly decode your inputs and deal with any failures. Bytes travel over the wire outside and at some point arrive in our programs; inside our programs we want to work with unicode, not bytes, and when we are done we encode back to bytes to send them out again. I am no expert; I actually came back today to watch it once more and I think I understood it this time. Anyway, this video is gold, so save it and come back when you have more experience to learn something new :)
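The explanation above can be sketched in Python 3 roughly like this. The function name shout and the choice of UTF-8 are assumptions for this illustration; a real program has to know the encoding of its input from some other source. Bytes are decoded once at the input boundary, all work happens on str, and the result is encoded once on the way out.

```python
def shout(raw: bytes, encoding: str = "utf-8") -> bytes:
    """Hypothetical example: uppercase some incoming text and add '!'."""
    text = raw.decode(encoding)     # outside -> inside: bytes become str
    result = text.upper() + "!"     # inside: only Unicode text is handled
    return result.encode(encoding)  # inside -> outside: str becomes bytes

incoming = "Ηελλο".encode("utf-8")  # pretend these bytes came off the wire
outgoing = shout(incoming)
print(outgoing.decode("utf-8"))

# Decoding the same bytes as ASCII fails, which is the crash the comment
# describes Python 2 producing implicitly.
try:
    incoming.decode("ascii")
    ascii_fails = False
except UnicodeDecodeError:
    ascii_fails = True
```

Because the decode is explicit, a failure happens at a known place where it can be handled, instead of somewhere in the middle of the program.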
@MichaelLockhart 11 years ago
and yet, both Klingon and Tengwar have not progressed beyond proposals. Perhaps this is how the PoP snuck in? :D
@MickZeller 2 years ago
👏
@aaaalexanderrrr 12 years ago
Right.
@yaru22 8 years ago
You can find the slides and the text at nedbatchelder.com/text/unipain.html.
@abdelrhman562 7 years ago
1:33 Why do they laugh here? :o
@satainter 7 years ago
Meaning every "picture" you see is actually Unicode.
@ruydorantes 6 years ago
Because all the "images" are in fact Unicode characters.
@Tapajara 11 years ago
The pile of pooh proves that the Unicode team has too much idle time.