Never Use Emojis In Your File Names

23,037 views

Brodie Robertson

1 day ago

595 comments
@GoobsterGooberGoo 1 month ago
4:07 It was already mentioned here, but that ß is a so-called "sharp S", basically a mix between an S and a Z in German. It is often written as a double S, like in the example in the video. The beta symbol is longer at the bottom and not opened up that much.
@jon-partlee-sayne 1 month ago
Good you mentioned it! A disaster like that can't be taken lightly.
@GoobsterGooberGoo 1 month ago
@@jon-partlee-sayne literally unwatchable, germans will never recover from this
@ericbarlow6772 1 month ago
Yeah and it used to exist in English until the 19th century. FYI I also learned that German letter as an ess-zett (sz) as well as a sharp s.
@toraxmalu 1 month ago
And transforming ß=>ss / ẞ=>SS is totally correct. So what is he crying about here?! Also ä=ae, ö=oe and ü=ue... Most importantly: emojis have to be usable in UTF-8 filenames. No ifs or buts.
@sasjadevries 1 month ago
The β does look similar to the ß 🤷‍♂. A disc that can't handle ß is komplete Scheiβe 😂😂.
@HuskyNET 1 month ago
As a software developer for 25 years, I’m regularly using typographic characters and actually even emojis in some of my file names, and you can’t stop me. I paid for the whole Unicode, I’m gonna use the whole Unicode.
@stupidburp 1 month ago
🔥
@Zeawi 1 month ago
You paid for it...?
@shantilkhadatkar1195 1 month ago
​@@ZeawiYour pfp is cool
@__Merchant 1 month ago
I spotted a psychopath.
@jadesprite 1 month ago
Y'all pay for your unicode? I got mine for free.
@szaszm_ 1 month ago
You should absolutely use emojis in a file name, if you're a developer, to test that your software handles them correctly.
@aqua-bery 1 month ago
No better time to test than on your personal machine
@szaszm_ 1 month ago
@@aqua-bery yes
@aelsi2 1 month ago
​@@aqua-bery By adding emojis to the name of your home folder
@AmirHosseinHonardust 1 month ago
Actually I was thinking of ensuring that my application crashes with an explicit message, when those files are supposed to be shared with others.
@OMGclueless 1 month ago
@@AmirHosseinHonardust I think that ship has sailed. Most OSes and IDEs and the like have decided to support them so you shouldn’t try to force people not to use them. “.🔥” is a supported file extension for the Mojo programming language, as one example. For better or worse emojis in file names are here to stay.
@isaacbarahonahidalgo9427 1 month ago
Can't stop me, using emojis on my fstab
@Terra101 1 month ago
xd
@Pandacier 1 month ago
wait no what the actual fu-
@darukutsu 1 month ago
/home/¯\_(ツ)_/¯
@siz1700 1 month ago
💞❤️‍🩹😏
@TheKevinGDX 1 month ago
xd
@jaakkohintsala2597 1 month ago
I have never wanted to rename my home folder to an emoji of a house more than now
@unitrader403 1 month ago
but which House do you want? the blue one or the yellow one? :D
@Megalomaniakaal 1 month ago
@@unitrader403 yes
@inertia_dagger 1 month ago
​@@unitrader403🏚️ this one
@_nishantk_ 1 month ago
@@unitrader403 the blue one
@aurelia_the_jelly 1 month ago
Use Starship and you can have that without all the hassle of an emoji filename ;) I have my home path substituted by the icon too.
@dragonwisard 1 month ago
According to POSIX, any character is acceptable in a Unix filename, except for forward slash (path separator) and the null byte (string terminator).
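A minimal Python sketch of the rule in the comment above, assuming only the two restrictions it names (no forward slash, no null byte). The helper name `is_valid_posix_component` is made up for illustration; length limits and the special entries `.` and `..` are deliberately ignored.

```python
def is_valid_posix_component(name: bytes) -> bool:
    """Return True if `name` could be one path component on a POSIX system.

    Only the two rules from the comment are checked: no forward slash and
    no null byte. Empty names are rejected because they cannot name a file.
    """
    return len(name) > 0 and b"/" not in name and b"\x00" not in name


if __name__ == "__main__":
    candidates = [
        "report.txt".encode("utf-8"),
        "🔥.mojo".encode("utf-8"),  # emoji are just bytes at this level
        b"line one\nline two",      # a newline is allowed
        b"\xff",                    # not valid UTF-8, still allowed
        b"a/b",                     # rejected: path separator
        b"a\x00b",                  # rejected: string terminator
    ]
    for raw in candidates:
        print(raw, is_valid_posix_component(raw))
```

Working on `bytes` rather than `str` is a deliberate choice here: it matches the "names are some bytes" view and avoids assuming any particular encoding.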
@lhpl 1 month ago
Right. The only proper advice to programmers is to handle this correctly, as in "just assume names are some bytes".
@gdclemo 1 month ago
You can even have filenames which correspond to invalid UTF-8 encodings, such as [byte 255] which is not valid UTF-8.
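One way a program can carry such names around without losing information is Python's `surrogateescape` error handler, which is also what `os.fsdecode` relies on for file names on Unix. The sketch below is only an illustration; `decode_name` and `encode_name` are hypothetical helper names.

```python
def decode_name(raw: bytes) -> str:
    """Decode file name bytes, keeping undecodable bytes as lone surrogates."""
    return raw.decode("utf-8", errors="surrogateescape")


def encode_name(name: str) -> bytes:
    """Reverse decode_name, restoring the original bytes exactly."""
    return name.encode("utf-8", errors="surrogateescape")


if __name__ == "__main__":
    raw = b"crash-\xff.log"  # 0xFF never appears in valid UTF-8
    try:
        raw.decode("utf-8")  # strict decoding refuses the name
    except UnicodeDecodeError as exc:
        print("strict decode failed:", exc.reason)
    name = decode_name(raw)
    print(ascii(name))               # the bad byte is kept as \udcff
    print(encode_name(name) == raw)  # the round trip is lossless
```

The decoded string cannot be printed or written as strict UTF-8, which is why the demo uses `ascii()` for display.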
@dragonwisard 1 month ago
Using control characters in filenames is also fun, and can break a lot of scripts and utilities. From the perspective of a C programmer, I can see how this might have made sense initially, but once the Bourne shell came along and people started using line-oriented utilities this straight up breaks things. A filename can contain a newline or carriage return or bell or tab or any other byte. In Bash, the best practice is to use the null byte as the separator between file names instead of relying on the default internal field separator (IFS), but I can tell you with confidence that it's rarely done (correctly) in production scripts, and even a lot of C utilities make false assumptions about which characters are valid. Great way to catch vendors with their pants down. So simple and effective it should be as well known as buffer overflows and race conditions. With a little creativity, you can absolutely use this for privilege escalations.
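To illustrate why line-oriented handling breaks, here is a small Python sketch (the comment talks about Bash; Python is used here only for illustration) that parses the same list of names once with newline separators and once with null-byte separators. The function names and sample file names are made up.

```python
from typing import List


def split_by_newline(listing: bytes) -> List[bytes]:
    """Parse a listing the way a line-oriented tool would."""
    return [entry for entry in listing.split(b"\n") if entry]


def split_by_nul(listing: bytes) -> List[bytes]:
    """Parse a listing whose entries are terminated by the null byte."""
    return [entry for entry in listing.split(b"\x00") if entry]


if __name__ == "__main__":
    names = [b"notes.txt", b"evil\nname.txt", b"tab\there.txt"]
    newline_listing = b"\n".join(names) + b"\n"
    nul_listing = b"\x00".join(names) + b"\x00"

    # The embedded newline splits one real file into two bogus entries.
    print(split_by_newline(newline_listing))
    # The null byte can never be part of a name, so nothing is ambiguous.
    print(split_by_nul(nul_listing))
```

The null byte works as a separator precisely because it is one of the two bytes a name can never contain.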
@animowany111 1 month ago
@@dragonwisard I actually have an extremely cursed file with invalid UTF8 in my home folder. It also starts with a space and includes a newline. It's a crash report log from a severely memory corrupted program. I love seeing what breaks and what doesn't break when it encounters that file.
@kebien6020 1 month ago
My reason: The anime is called "Fate/kaleid liner Prisma☆Illya". So that is what I'll use as my folder name, and you can't stop me.
@RedSntDK 1 month ago
Speaking of anime, that's the one time my little brother messaged me wanting a solution. He had, ahem, acquired an anime with such a long title that the file name the download program produced was too long for regular Windows Explorer to handle. Good thing that copy utilities like Ultracopier and FastCopy exist. Since I moved to Linux in January I haven't really needed to use a copy utility, but I do have Ultracopier installed just in case. Shame it doesn't integrate into Dolphin.
@temari2860 1 month ago
When I was first learning Godot about 2.5 years ago by making a mobile game, I got stuck for like 2 days trying to make a testing build for my game. The error messages were completely useless, docs, forums, google search didn't help either. Finally on the third day I decided to make a new empty project, copy the changes from my game and make a build after every change to see when it fails. This new project succeeded in building at every step from an empty project to a full exact copy of the one that doesn't build. Then I copied the only thing that remained different: project's directory name. The build failed. It was a colon in the project directory name.
@RadikaRules 1 month ago
That is actually painful
@aqua-bery 1 month ago
Oh my god... If that's a problem, Godot definitely should've complained to you when you were making the project first...
@temari2860 1 month ago
@@aqua-bery It wasn't Godot's problem directly to be fair. If you build for Android in Godot it uses the Gradle build tool, which was the failing link in this case.
@Rudxain 1 month ago
I remember being so frustrated trying to run an MKSH script on Android (not Termux, it was Llamalab Automate) and the problem was that my editor used a non UTF-8 encoding and MS-DOS line endings
@bryanpedini 1 month ago
sorry, but after having seen enough projects use it and how it fails in spectacular ways, I can only say: fuck gradle, with all my ❤️
@CEOofGameDev 1 month ago
4:08 >eszett Brodie:"the beta symbol" Germans in shambles.
@Ralzone 1 month ago
I mean... Not a lot of Germans who only speak dutch
@unitrader403 1 month ago
SCHEIẞE
@kreuner11 1 month ago
@@Ralzone a beta symbol is a visually different symbol
@MateuLeGrillepain 1 month ago
Based and not austerity-pilled /j
@alpacamale2909 1 month ago
I'm not a German so I always called it 'the fat B'
@lyranem 1 month ago
Software handling Unicode incorrectly is a software problem, not a user problem. Saying “you shouldn’t do that” is bad practice.
@JonBrase 1 month ago
Software choking on emojis is a software problem. Users using emojis in filenames is a user problem. Two separate problems, but they interact.
@danielrhouck 1 month ago
Software should absolutely handle it. Especially because things that fail on emoji often fail on anything outside the BMP and I donʼt want to tell anyone using non-BMP languages that sorry, they canʼt name files something reasonable in their language. That said, it is still worth telling people that they should be *aware* that something will more likely break if they do stuff thatʼs farther outside the test conditions. As an analogy, if I break into your house and steal your stuff, thatʼs obviously my fault and I should go to jail. You did not “have it coming” or anything like that. But if I was only able to do that because you didnʼt lock your door or only used a MasterLock lock, then you should have been warned that this is less safe.
@JonBrase 1 month ago
@@danielrhouck Indeed it should. But users using emojis in filenames is problematic in ways that have nothing to do with the ability of software to handle it.
@AmirHosseinHonardust 1 month ago
That is a fair point. But also, using Unicode characters that are not easily typeable using your keyboard, or that require third-party keyboards, when that file should be used by others, is an asshole move. Much more than using spaces in the name. So if it is on your own computer, sure, go ahead. But if you are dealing with software that handles shared files, I would want to gate-keep the hell out of these assholisms. If the Unicode character is fairly easy to type on some keyboard layouts, sure.
@lhpl 1 month ago
@@JonBrase Users naming their files however they like is not a problem. If you think it is, then _you_ have a problem. If you develop software for others, that is based on your views, then you _are_ a problem.
@deefdragon 1 month ago
You should be able to use emoji in file names, not for emoji specifically, but because extended Unicode should be supported. Limiting names to just a-zA-Z0-9_-. and space etc. is very anglo-centric, and I think speakers of other languages should be able to name files in their native language. Emoji are simply a subset of Unicode, and so should be as valid as nearly any other Unicode character.
@yuvalne 1 month ago
yup
@angeldude101 1 month ago
The problem really isn't emoji, but rather inconsistent case-folding. Filesystems need file names to be normalised in the same way every time. Case-folding however is ambiguous and can change depending on not just the version, but also the region, which is a _terrible_ thing for compatibility. The solution is to not have filesystem-level case folding and to leave case-insensitive behavior to be handled in userspace. Generally if a piece of software requires a filesystem to be case-insensitive, then either the programmers are idiots, or the software is so old it probably predates unicode, and so casefolding for all of unicode is complete overkill.
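The ambiguity described above can be seen with Python's built-in string methods, which implement two different Unicode mappings: `str.lower()` (simple lowercasing) and `str.casefold()` (full case folding). This is only an illustration of how two "case-insensitive" keys can disagree; it is not how any particular filesystem computes its keys, and the helper names are made up.

```python
def key_lower(name: str) -> str:
    """Case-insensitive key based on plain lowercasing."""
    return name.lower()


def key_casefold(name: str) -> str:
    """Case-insensitive key based on full Unicode case folding."""
    return name.casefold()


if __name__ == "__main__":
    a, b = "Straße.txt", "STRASSE.TXT"

    # Under lowercasing the two names stay different files...
    print(key_lower(a) == key_lower(b))
    # ...under full case folding they become the same file.
    print(key_casefold(a) == key_casefold(b))

    # Uppercasing and lowercasing are not even inverses of each other.
    print("ß".upper(), "ß".upper().lower())
```

Whichever mapping is chosen, every reader and writer of the directory has to use the same one forever, which is the compatibility problem the comment describes.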
@crusaderanimation6967 1 month ago
Thank you
@bleack8701 1 month ago
Exactly
@jonathanbuzzard1376 1 month ago
Users have always done idiotic things with filenames. Let's start with Mac users starting file names with a space to change the sorting order. Then again we have Mac users using \ to put dates in the file name rather than ISO 8601. Of course, notably, \ is illegal in SMB. Then of course there is putting newline characters in file names. I mean, that really takes some effort. Every couple of months I have to take a look at the failed files in the backup and start emailing users to fix their file names if they want their stuff backed up.
@alarii2582 1 month ago
I have a bunch of archived YouTube videos with emoji in their filenames
@remixedcat 1 month ago
Same. Specially lofi mixes
@Zadig 1 month ago
The lower-case beta you mentioned as an example isn't a beta. It's an eszett, a character used only in German that's (more or less, I'm simplifying) interchangeable with a double s.
@somenameidk5278 1 month ago
ß vs β
@DeronJ 1 month ago
@kuhluhOG 1 month ago
Since the last spelling reform, it isn't anymore. Double s and ß have had different purposes since then. That doesn't mean it isn't still sometimes done when the computer system you interact with is outdated (or shit if it's new).
@vincentschult1725 1 month ago
​​@@kuhluhOGAfaik double s is a valid replacement for ß if you cannot type it. In Swiss High German it is even the rule that everywhere where German High German would use an ß, a double s is used. In German High German iirc the only difference between a double s and ß is that the preceding vowel is not shortened when using ß, while both produce (ignoring exceptions ofc) a sharp s sound.
@kuhluhOG 1 month ago
@@vincentschult1725 the ß has one additional thing in German High German: the preceding vowel is lengthened (similar to a silent h)
@SophiaGlencairn 1 month ago
The Fire emoji is a valid file extension for mojo files.
@linusbrendel 1 month ago
Was about to comment about Mojo
@GSBarlev 1 month ago
Another reason to dislike Mojo, lol (srsly, fam, just use *numba)*
@johnpenner5182 1 month ago
mojo is amazing! it allows you to write fast portable GPU-agnostic code. 🔥
@theredtechengineer1480 1 month ago
You can't stop me. I use emojis and accent characters in file names.
@stupidburp 1 month ago
🎩
@unknowntotherestoftheworld 1 month ago
when you name a file with a character that normalizes into a slash and the file name now points to a different file in a different directory
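A hedged sketch of the scenario in the comment above: U+FF0F (FULLWIDTH SOLIDUS) is a real character that turns into an ordinary `/` under the NFKC compatibility normalization. Whether a given filesystem or tool applies that normalization is an assumption here, and `normalization_adds_separator` is a hypothetical helper name.

```python
import unicodedata


def normalization_adds_separator(name: str, form: str = "NFKC") -> bool:
    """Return True if normalizing `name` introduces a '/' it did not have."""
    normalized = unicodedata.normalize(form, name)
    return "/" not in name and "/" in normalized


if __name__ == "__main__":
    sneaky = "etc\uff0fpasswd"  # looks like one file name to the user
    print(unicodedata.normalize("NFKC", sneaky))         # now contains '/'
    print(normalization_adds_separator(sneaky))          # compatibility form
    print(normalization_adds_separator(sneaky, "NFC"))   # canonical form
```

A check like this would have to run before any normalization step, since afterwards the name is indistinguishable from a real path.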
@goaserer 1 month ago
Emoji? I don't even put whitespace in my filenames. Staying DOS compatible, just in case...
@Emayeah 1 month ago
relatable, I hate how Windows has program\ files...
@seto007 1 month ago
Not even just for compatibility with older software and file systems; it just makes specifying the path in a terminal significantly more convenient
@Ganerrr 1 month ago
bro ill use like half of unicode but never anything that forces bash to use quotes lol
@jello3456543 1 month ago
8.3 for ever
@athomashowe 1 month ago
Honestly lack of white space wouldn't be the end of the world but the char limit is brutal
@vsmash2 1 month ago
4:07 My brother in Arch, that's an Eszett, also known as "sharp S". German thing.
@JamesR624 1 month ago
Yeah... if someone doesn't know what something is and just makes something up so they SEEM knowledgeable, I lose respect for them and instantly stop watching since they've shown they can't be trusted to actually be knowledgeable.
@mme725 1 month ago
I don't think he was pretending to know, or that he saw it and consciously thought he had to make something up. I think it was a simple case of him mistaking it for the lowercase beta. ​@@JamesR624
@myhandleiswhat 1 month ago
@@JamesR624 my brother in youtube comments, it isn't that big of a deal. This take literally takes "someone was wrong on the internet" to another level.
@akam9919 1 month ago
@@myhandleiswhat my little brother in youtube comments...you're not wrong
@whohan779 1 month ago
What do you expect? He literally said “Firstly, most people don't really have an emoji selector on their system” while those _most people_ (probably well >70%) literally are on Windows 10/11 where you can just use 🪟+[ . ] or KDE where the shortcut may be different by default but a selector is still a core component that can be searched for. Normally I like this channel but this vid is an L.
@hoi-polloi1863 1 month ago
Emojis are just Unicode characters, so they're fair game for filenames. If nothing else, having an emoji in the name will keep most of the Linux kids out of your files, because they don't know how to type 'em. Side-note... way, *way* back we kids would "protect" our Apple II files by having the filenames have the bell character (ctrl-G?) in them. You couldn't see the bell character, so when you listed the directory you'd see the filenames, hear the bell chime, and you couldn't access the file unless you knew where in the string the bell was.
@TakeApartLab 1 month ago
That's what tab completion is for. ❤ If I need to deal with a weird char I'll just type what I can and let bash type the rest of it for me.
@benkato_ 1 month ago
I do have some cronjobs and scripts that will yt-dlp some unarchived youtube streams that I can't watch when they are live... So sometimes it happens that some files may have emojis and japanese characters in it... But it always worked in VLC and never paid too much attention to it xD Sometimes I remove special characters so I can work with them in bash, but that's about it xD Edit: Oh nyo, the ß my beloved got confused for a beta character ;-;
@danielrhouck 1 month ago
Ugh, yt-dlp is especially bad here because it does name mangling into what it thinks is safer characters but those can sometimes cause their own problems, and itʼs still impossible to turn it off. EDIT: And now after ages that’s finally fixed!
@UlvicanKahya 1 month ago
Oh boy... This brings back all the horrible memories I have about the infamous "Turkish i" problem. I can't even count how many programs out there just crash if you use the Turkish locale because of casefolding a single character that is handled differently in Turkish.
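For readers who have not met the "Turkish i" problem: Turkish has a dotted pair (İ, i) and a dotless pair (I, ı), while locale-independent case mapping assumes I and i belong together. The Python sketch below shows the default behaviour next to a hypothetical `turkish_lower` helper; it is a simplification for illustration, not a complete locale-aware implementation.

```python
def turkish_lower(text: str) -> str:
    """Lowercase `text` using the Turkish rules for the letter i.

    Hypothetical helper: the two special cases are handled first, then the
    default lowercasing takes care of every other letter.
    """
    return text.replace("\u0130", "i").replace("I", "\u0131").lower()


if __name__ == "__main__":
    # Default lowercasing maps capital I to dotted i, which is wrong in Turkish.
    print("ISPARTA".lower(), turkish_lower("ISPARTA"))

    # Capital dotted I becomes 'i' plus a combining dot by default (2 code points).
    print(len("\u0130".lower()), len(turkish_lower("\u0130")))

    # A round trip through upper() and lower() silently changes dotless i.
    print("\u0131".upper().lower() == "\u0131")
```

Code that compares identifiers or file names after calling a locale-sensitive case function can therefore behave differently on a Turkish system than everywhere else.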
@Pandacier 1 month ago
4:07 I believe this is the german Eszett and not "beta"
@markus321xyz 1 month ago
Yes this is a ß, in German (used in Austrian German & Germany German) this is called "scharfes S" (sharp S) or Eszett. If somebody writes sz it has the same meaning. Some systems/programs/databases don't like this letter very much 😅
@voyager-tc9dz 1 month ago
and it has no upper case
@PathinX 1 month ago
​@voyager-tc9dzthat is not true, there indeed is a capital ẞ. Here the lowercase for comparison ß
@TeaMaster420 1 month ago
The weird B!!! Like literally, why is this even needed?
@Veetrill 1 month ago
@@PathinX Capital Eszett got added to Unicode not so long ago, and more like an afterthought. Before then this letter had been considered exclusively lowercase, and every time someone wanted to write a German word in caps, they needed to replace ß with SS.
@nezu_cc 1 month ago
I know Unicode is hard, but that's not a beta.
@pi_ist_toll 1 month ago
Do anything you can think of. You'll learn from anything dumb. Just DON'T DO IT IN PRODUCTION!
@hopelessdecoy 1 month ago
Do it all in prod got it!
@pi_ist_toll 1 month ago
@@hopelessdecoy Actually, learning from prod mistakes is even more effective because you know you actually broke something and never want to repeat that.
@__Brandon__ 1 month ago
If you don't test prod you don't know that prod is working
@blarghblargh 1 month ago
Go ahead and test in prod. Our project will be happy to accept your users
@salazar1554 1 month ago
I kind-of feel adopting Unicode file names makes sense for one and done language compatibility. Emojis are an inevitable result of doing that. Why implement the languages separately? Plus I imagine some systems exist that name files based on the first word or few characters of the file, and those would benefit from Unicode. I personally don't have a use case for emoji file names, but if you are doing multilingual file-names anyway it makes sense to just do Unicode (even if file managers and terminals give up on a unified font style for uncommon Unicode characters, just grab any old svg of the character from a Unicode server or local sqlite database or something).
@2kadrenojunkie 1 month ago
i love linux. using emojis in filenames is an absolutely horrible idea, but you can do it and nothing is in place to tell you no. go ahead, label all your folders with emoji.
@TakeApartLab 1 month ago
Yea, windows has some WILD path stuff, its black magic. linux is boring in comparison lol (in a good way).
@MattiasA-t5l 20 days ago
The only rule there is: Always put a at the end of the filename, if not possible put it at the beginning (you may include additional s).
@tlhIngan 1 month ago
The problem is not emoji. The problem was exposed by an emoji. We can debate having an emoji in a filename (and there probably will be instances where someone may feel it is appropriate as a filename), but the key point is Unicode. Just because it affects emoji today, doesn't mean tomorrow there won't be another codepoint which exhibits the same issue, except it's the equivalent of say, "the" or "a" or other common word in some language and you've just horribly corrupted it. And Linus is right - you should not be casefolding in a filesystem - because you're reducing namespace and that can cause hash collisions. If you want to case fold a filename, that should be done in userspace where the program can figure out how to uniquely identify each item, or to do things like a case-insensitive search. Also, Unicode case folding rules may change - it's an ever-evolving standard
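As a rough sketch of what "do it in userspace" could look like: the filesystem stores names exactly as given, and the program does its own case-insensitive search over a directory listing. `find_case_insensitive` is a hypothetical helper, and using `str.casefold()` as the matching rule is an assumption made for the example.

```python
from typing import Iterable, List


def find_case_insensitive(wanted: str, entries: Iterable[str]) -> List[str]:
    """Return every entry whose case-folded form matches `wanted`.

    Returning all matches (not just the first) lets the caller decide what
    to do when two stored names differ only by case.
    """
    key = wanted.casefold()
    return [entry for entry in entries if entry.casefold() == key]


if __name__ == "__main__":
    listing = ["README.md", "readme.MD", "notes.txt"]
    print(find_case_insensitive("Readme.md", listing))  # two candidates
    print(find_case_insensitive("NOTES.TXT", listing))  # exactly one
    print(find_case_insensitive("missing", listing))    # none
```

Because the stored names are never rewritten, a later change to the folding rules only changes search results, not what is on disk.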
@radswfiihq 1 month ago
1:09 on windows: Win+. on Mac: the globe key on Plasma: Super+. (Can be changed in settings)
@kuhluhOG 1 month ago
also, the newest revision of the german keyboard layout has an interesting key combination it's supposed to open an emoji picker, and if not available, type 😀
@tresf 1 month ago
Yes, the person that says "most people don't have a way to type emojis on their computer" is someone who hasn't seen the default repurposing of the Fn key on all modern Macs.
@theairaccumulator7144 1 month ago
I use win11 but win+. doesn't work for me no matter what I do.
@MarcinKralka 1 month ago
I never even considered putting an emoji in filename.
@chaos.corner 1 month ago
It happens easily when using tools like yt-dlp. It can break things pretty well like some files become invisible over samba (or maybe the vifs implementation in vlc, I forget). Fortunately it can be told to use a restricted character set which makes things ugly but at least they work.
@ababcb3005 1 month ago
It's not just emojis that cause issues, I once ran into a program where spaces in the path caused it to stop working. The issue was fortunately fixed (very recently as a matter of fact), but it definitely got me thinking about keeping my folder names as "variable-like" as possible moving forward.
@lhpl 1 month ago
Developers who are so incompetent that their code can't handle all legal file names should not be fǔcking allowed anywhere near a computer.
@coyo_t 1 month ago
its amazing to me that cmake (or was it make. or both.) to this day still shits itself on paths with spaces like we live in the software stone age still literally told me "paths cant have spaces in them" when i tried rebuilding like mf >:[
@lhpl 1 month ago
@coyo_t so the c in cmake stands for crap, I guess...
@Lampe2020 1 month ago
Interestingly the JDownloader2-installer script always put "/JDownloader 2" at the end of the specified path to install it in and then complained about there being a space in the chosen installation path. And I couldn't ever find the string "Jdownloader 2" in the script…
@ziv132 1 month ago
A use case that I saw that's acceptable is in Obsidian it names the markdown files based on the title, if you have a framework for organising your notes that you use an emoji for different types then you end up with emojis in your filenames
@bloody_albatross 1 month ago
ß is not a lower case beta, it's a lower case German Eszett/sharp S. ẞ is its upper case variant, but that is a recent development. Before that, SS was used as the upper case variant.
@gunnargu 1 month ago
the file name rule on linux is what, no nulls? no slash? that's it? entire paragraphs, even newlines... fine...
@__Brandon__ 1 month ago
255 character max on most filesystems
@SaHaRaSquad 1 month ago
@@__Brandon__ 255 bytes, not characters. Those just happen to be the same size if you only use ASCII symbols.
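The difference between bytes and characters is easy to check in Python, assuming names are stored as UTF-8 and taking the 255-byte limit from the comments above as given. `NAME_MAX_BYTES`, `utf8_length` and `fits_name_limit` are names made up for this sketch.

```python
NAME_MAX_BYTES = 255  # limit quoted in the comments; assumed, not queried


def utf8_length(name: str) -> int:
    """Number of bytes `name` occupies when encoded as UTF-8."""
    return len(name.encode("utf-8"))


def fits_name_limit(name: str, limit: int = NAME_MAX_BYTES) -> bool:
    """Return True if the UTF-8 encoded name fits within `limit` bytes."""
    return utf8_length(name) <= limit


if __name__ == "__main__":
    for sample in ["a", "ß", "あ", "🔥"]:
        print(sample, utf8_length(sample))  # 1, 2, 3 and 4 bytes

    print(fits_name_limit("a" * 255))   # 255 ASCII characters fit
    print(fits_name_limit("🔥" * 64))   # only 64 emoji already do not
```

So the practical limit ranges from 255 characters for pure ASCII down to 63 for names made entirely of four-byte characters.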
@fomxgorl 1 month ago
ive not thought of this before. have considered using emojis in my passwords as a potential usecase that i need to watch out for as i develop my software. very useful with a password manager. will have to test emojis in my software for filenames now
@0xDEADBEEF 1 month ago
To Brodie Robertson: Actually this patch can be removed easily; then people who have those lovely file names MUST run fsck to automatically rename invalid file names to correct ones.
@IAmPattycakes 1 month ago
I have archives of various streams, videos, etc. with the title of those as the file names, with timestamping at the start. Those titles sometimes have emoji and im not gonna change the title because sometimes the emoji are important contextually.
@szirsp 1 month ago
10:55 Date ranges wouldn't even solve the problem. It's not when the file was created, but with which kernel version the file was created. And I don't think that information is stored in the file system...
@ThatTrueCJ201 1 month ago
As someone who uses Japanese, downloads Japanese files and programs, I concur that unicode outside the ASCII specification is a pain.
@stupidburp 1 month ago
It was so bad that Japan created several other text encoding alternatives before unicode standards were approved and continued to support them because unicode was never fully supported universally. They even made their own OS with government support just because the language support was so bad in Windows earlier on.
@SeralyneYT 1 month ago
4:25 - That's not the Beta symbol. β is Beta. ß is a double S. β != ß
@quinten01 1 month ago
Eh. Looks the same to me
@SeralyneYT 1 month ago
@@quinten01 Yeah so do I (uppercase i) and l (lowercase L). That doesn't mean they are the same.
@tutacat 1 month ago
Thank you bug submitters for triaging all the patches, or using the search function properly
@terranbyte2619 1 month ago
I wasn't gonna use emoji in file names before, but after watching this... I might as well start doing it because it's cool.
@jasper265 1 month ago
😎 It's basic, I know. I pretty much use it for anything involving things being cool, whether that's sarcastic, mostly neutral or enthusiastically positive. So it's very ambiguous in a way that I like. The time based approach wouldn't even work. You'd need to know when kernel versions were installed on the system. And even then, you could boot into different installs that mount the same filesystem but have different kernel versions. The best I can come up with is to try one way of case folding and if the file doesn't exist, try the other way. That's very messy though and can impact performance in certain cases.
@jello3456543 1 month ago
It would be somewhat ugly and time consuming, but a third way would be forcing the case folding file systems to perform a fsck that fixes the file names where casefold gives a different result. Of course, that would make kernel downgrades impossible...
@CloudCuckooKing 1 month ago
Pedant here, Japanese has hiragana, katakana, kanji, romaji, man'yougana and "variant kana", which I won't say in Japanese proper because it contains as a substring something YouTube will probably filter. So it really makes things even more of a trainwreck, though in terms of computer transcription, variant kana is the only real problem and I don't know if any of them are even encoded in Unicode.
@TrabberShir 1 month ago
yes: if your OS allows Unicode characters in file names and your app handles files, you need a test which loads every Unicode edge case you can find. That test necessarily includes at least 7 files with emoji in the name. Sadly, very few projects actually have that test. And I cannot really blame them because when it fails, you are usually dealing with a bug in your OS or a bug in a library that is used by a library that is used by a library that you use. But not always.
@gokhanersumer2273 1 month ago
Huh. I dont even use accented characters in my native language when naming. Old habits.
@mactan_sc 1 month ago
ran into a program that couldnt handle unicode copyright symbol in device driver names, that was an interesting one to get around
@czos9239 1 month ago
You certainly see emojis when sailing the high seas. At least that's what a friend told me.
@gdclemo 1 month ago
File names? I'll name my kids with emojis and you can't stop me.
@linuxguy1199 1 month ago
Fun fact, you can also put newlines, spaces, backspace characters, and even ANSI escape codes in your filenames. If you're really skilled you can make files with null characters in the name, but you'll need some hand rolled assembly and a filesystem that supports it.
@FengLengshun 1 month ago
3:24 I think if you want even more pedantic-er, you can double that by counting Full-width and Half-width as separate system. There is also jpn_vertical which I think is separate enough as it warrants its own tesseract file.
@examancer 1 month ago
A compatible solution is possible by doing a fallback approach: use the new hashing algorithm, and if the hash is not found try again with the old hashing. No date logic required and would let us actually fix this bug while giving support for old files for now. Down the road, after file systems have migration strategies in place for long enough, the fallback could be removed
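The fallback idea above could look roughly like this in Python. It is only a toy model: a dictionary stands in for the on-disk hash index, and `str.lower` / `str.casefold` stand in for the "old" and "new" folding rules, which are not the kernel's actual functions. `lookup_with_fallback` is a hypothetical name.

```python
from typing import Callable, Dict, Optional

Fold = Callable[[str], str]


def lookup_with_fallback(
    name: str, index: Dict[str, str], new_fold: Fold, old_fold: Fold
) -> Optional[str]:
    """Look up `name` under the new folding first, then under the old one."""
    hit = index.get(new_fold(name))
    if hit is not None:
        return hit
    return index.get(old_fold(name))  # second probe only on a miss


if __name__ == "__main__":
    old_fold, new_fold = str.lower, str.casefold  # illustrative stand-ins
    index = {
        old_fold("Straße.txt"): "file created before the fix",
        new_fold("Grüße.txt"): "file created after the fix",
    }
    print(lookup_with_fallback("STRAßE.TXT", index, new_fold, old_fold))
    print(lookup_with_fallback("GRÜSSE.TXT", index, new_fold, old_fold))
    print(lookup_with_fallback("missing.txt", index, new_fold, old_fold))
```

The cost is a second probe on every miss, which is the kind of performance impact other commenters in this thread worry about.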
@nisonatic 1 month ago
Databases have been dealing with this for ages. The specific collation is defined as part of the table schema, and if you want to fix a buggy collation, you add a new version and keep the old one around forever. If a user wants the new collation, they have to rebuild those tables. That's most likely how you'd have to do it in a filesystem: the filesystem would know its casefolding version, and you'd need to run a userland utility to rehash everything, as well as figure out how to rename files if any collisions were detected.
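A small sketch of the collision check such a rehash utility would need, again with `str.lower` and `str.casefold` as illustrative stand-ins for an old and a new folding version. `find_collisions` is a hypothetical helper, not part of any real filesystem tool.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, List


def find_collisions(
    names: Iterable[str], fold: Callable[[str], str]
) -> Dict[str, List[str]]:
    """Group names by folded key and keep only keys shared by several names."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in names:
        groups[fold(name)].append(name)
    return {key: group for key, group in groups.items() if len(group) > 1}


if __name__ == "__main__":
    directory = ["Straße", "STRASSE", "notes"]
    # Under the old rule all three names coexist without conflict.
    print(find_collisions(directory, str.lower))
    # Under the new rule two of them map to the same key and need renaming.
    print(find_collisions(directory, str.casefold))
```

Any non-empty result means the migration cannot be done silently: one of the colliding files has to be renamed or the user has to be asked.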
@anon_y_mousse 1 month ago
That's like asking if you should use an 'i' or 'B' or 'k' in a filename. The question itself is pointless and really you should use whatever letters or emojis you want. The only character I really question that is a valid character is that of the newline, and truthfully, it makes sense to have it so you can name your files with a Haiku.
@jan_harald 1 month ago
yes you should use emojis in filenames so other people can't actually open them >;P extra points if you also use zero-width spaces to create several files that look to have the same name, visually >;P
@lhpl 1 month ago
Just use NFC and NFD in the same directory. Strictly speaking, two names can - literally (not binary) - _be exactly_ the same.
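The NFC/NFD trick can be reproduced with Python's `unicodedata` module: the two strings below render identically but consist of different code points and therefore different bytes. The file name is an arbitrary example and `same_name` is a hypothetical helper.

```python
import unicodedata


def same_name(a: str, b: str) -> bool:
    """Compare two names after bringing both into NFC form."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    nfc = "caf\u00e9.txt"   # é as one precomposed code point
    nfd = "cafe\u0301.txt"  # e followed by a combining acute accent

    print(nfc == nfd)                               # different strings
    print(nfc.encode("utf-8"), nfd.encode("utf-8"))  # different bytes
    print(same_name(nfc, nfd))                      # equal once normalized
```

A filesystem that treats names as plain bytes will happily store both in the same directory, which is exactly the prank the comment describes.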
@wagyourtai1 1 month ago
I wish linux had the windows alt+numpad thing for typing special chars.
@muellerhans 1 month ago
You can only access the source code of my git project if you can deal with its emoji name.
@darkwinter7395 1 month ago
The correct answer is to fix the case folding, and build a file system upgrade tool that re-hashes any filesystems that need it when you install the new kernel.
@angeldude101 1 month ago
The only use for case folding in the filesystems that I'm aware of is compatibility with older software designed for case-insensitive filesystems. I highly doubt any software that relies on case insensitivity even knows that non-ASCII characters exist, and as such I'd argue to filesystem-level case folding should only ever apply to ASCII letters, and even then, it's expressly for compatibility. If you want a case-insensitive interface for the user, then that's on the developers of the user-space application being used to perform case-insensitive matching.
@PanduPoluan
@PanduPoluan Ай бұрын
For the casefolding change, I think the right choice is to keep "the right method" in the kernel, but then write a low-level tool to fix those files using the older (wrong) method.
@chaos.corner
@chaos.corner Ай бұрын
I'm not a huge fan of emojis as characters, but characters they are and so they should be handled appropriately. What this indicates to me is that a boneheaded decision was likely made somewhere else.
@sinom
@sinom Ай бұрын
Case folding in the filesystem is something Windows does and it's extremely annoying. E.g. I was trying to compile a C++ program, and it included both a folder name and a library where the only difference was capitalization (so something like include vs include). On a Linux system this compiled with no issues, but when trying to compile it on Windows it kept mixing up what to include where, and it was just a huge mess.
@mirailuv
@mirailuv Ай бұрын
this reminds me of that thing where someone put an emoji in their bank account nickname and the entire system just crashed
@LokiCDK
@LokiCDK Ай бұрын
I am a fan of having a wider character set available for use in file names, only because it could potentially interact weirdly with attempts to read that drive by third parties, for instance by use of autopsy or various file system scalpels. For that same reason, why is there case folding in a file system? Case sensitivity in file naming is a long-held standard. Finally, if you are going to change the way something fundamental like that operates, that is something you do during a full major version number update.
@salazar1554
@salazar1554 Ай бұрын
Custom kernel where syscall functions contain emojis
@SaHaRaSquad
@SaHaRaSquad Ай бұрын
An OS built on Emojicode.
@hubertnnn
@hubertnnn Ай бұрын
It's usually a bad idea to use anything other than lowercase English letters and digits (and maybe "_" (underscore)) for filenames. While most applications can handle other characters in filenames, many don't, and some handle them differently than others. This can cause unexpected behaviors and weird problems, ranging from sorting (is ą before or after a?) up to file corruption.
@chaos.corner
@chaos.corner Ай бұрын
It's generally better to harden code to handle it though because you're going to run across it sooner or later. Not taking care to handle input properly is how SQL injection attacks are born. But yeah, for quicky get-the-job-done scripts, it's easier to keep it simple. Just be careful of scope creep.
@kuhluhOG
@kuhluhOG Ай бұрын
And then you get people who aren't native English speakers using your system. Some may not even be able to understand English, and even if they can, they obviously (and understandably) want to use their native language. How would you behave as a user if it had not been English that became THE IT language but, let's say, Japanese, and the characters commonly used for it were Japanese characters? I would guess you would use English letters nonetheless.
@lhpl
@lhpl Ай бұрын
No it's not. What _is_ a bad idea is using bad software that can't handle all legal file names.
@kuhluhOG
@kuhluhOG Ай бұрын
@@lhpl yep, on Linux there are only two bytes which aren't allowed in filenames: 0x00 (the null byte) and 0x2F (the / used as path separator)
@SaHaRaSquad
@SaHaRaSquad Ай бұрын
@@lhpl You're not wrong, but being right doesn't fix broken software.
@makramc
@makramc Ай бұрын
As a German, I use ß in my filenames all the time. Also ss is actually kinda the right way to do it.
@mirzahadzic8666
@mirzahadzic8666 Ай бұрын
So, people watching this video will now start putting emojis in filenames, and keyboards will be enlarged with stupid images. Thanks Brodie!
@TakeApartLab
@TakeApartLab Ай бұрын
The comments having the idea of having /home be a home emoji is the funniest shit ever. I would die if I ssh'ed into a computer and saw that.
@pikachulovesketchup666
@pikachulovesketchup666 Ай бұрын
Developers should stop assuming that file name (or any input from user and any string displayed to the user) is collection of characters using Latin alphabet. Text is not something rendered from left-to-right and top to bottom character by character. One keypress doesn't mean single character.
@slycordinator
@slycordinator 25 күн бұрын
One difference between Macs vs Linux and Windows for filenames is that on HFS+ (and HFS, I assume), they use a different Unicode normalization form. On Linux and Windows, names normally end up in NFC, where all the codepoints are stored in a fully-composed form. On HFS, they're in a variant of NFD, where some codepoint ranges are in composed form like in NFC and many others are stored in decomposed form. I didn't know about this and was confused when I downloaded a file a Korean guy emailed me. The name looked like normal Korean in the browser, but when I opened it on my PC, the name was weird. Similarly, if you were on an HFS+ Mac and used Python to create a file named "한글.txt", the filename in the file system will be automatically re-encoded. So, if you were to try and open the file again, it would likely fail, because the string will be NFC and its bytes won't match the file name.
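To make the NFC/NFD difference concrete, here is a small sketch using Python's standard `unicodedata` module. It does not touch any file system; it only shows that the two forms of the same visible name are different strings and different bytes. The helper `same_filename` is a hypothetical name introduced for illustration.

```python
import unicodedata


def same_filename(a: str, b: str) -> bool:
    """Compare two names after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


name = "한글.txt"
nfc = unicodedata.normalize("NFC", name)  # precomposed Hangul syllables
nfd = unicodedata.normalize("NFD", name)  # decomposed into individual jamo

print(len(nfc), len(nfd))                          # 6 10
print(nfc == nfd)                                  # False
print(nfc.encode("utf-8") == nfd.encode("utf-8"))  # False
print(same_filename(nfc, nfd))                     # True
```

Normalizing both sides before comparing is one common way for an application to cope with names that came from a file system using a different form.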
@PredatoryQQmber
@PredatoryQQmber Ай бұрын
That's what names are for. If I wanted to use some minimalistic dumbified system then I would have used inode numbers directly.
@Maramowicz
@Maramowicz Ай бұрын
Why not just check for... both? First try the new method, and if the filesystem can't find the file that way, fall back to the old one; if the file is found with the old method, just automatically update it to the new one.
@nullplan01
@nullplan01 Ай бұрын
6:53 I know these symptoms. The directory entry exists, but the kernel cannot find the file when given the name. You can also get the same problem when mounting a FAT partition with files that have names containing '/'.
@rogo7330
@rogo7330 Ай бұрын
Emoji must be a great thing to use when naming files in viruses. Everybody ignores that you can put ANY octet, not even valid UTF-8, into a filename. Always remember the ctrl-v key and how to put any number from 0 to 255 into the terminal, boys. Or use the latest bash and $'...' strings (which are on the way to being included in the POSIX shell, along with '-o pipefail').
@johannes7856
@johannes7856 Ай бұрын
Mojo file extension goes BRRR 😂
@JamaicaWhiteMan
@JamaicaWhiteMan Ай бұрын
I don't know how I'd even do that - my keyboard doesn't have emojis.
@eDoc2020
@eDoc2020 Ай бұрын
Copy and paste, various IMEs, or just download a file with an emoji in the original name. Social media users often use emoji in titles, downloading such media will often copy it as (part of) the filename.
@ws_stelzi79
@ws_stelzi79 Ай бұрын
Just to put my 2 cents to the 'ß' case folding: especially in Switzerland it is very common to case fold it to 'sz' instead of (the more frowned upon) 'ss'! The 'ß' is basically the very best example in the German language when "just" case folding can fail because of some unknown nuances.
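For reference, this is what Unicode's default case mappings do with 'ß', as exposed by Python's built-in string methods. It is only a sketch of the standard, locale-independent rules; it says nothing about regional spelling conventions or about what any particular filesystem implements. The helper `same_name_ignoring_case` is a hypothetical name.

```python
def same_name_ignoring_case(a: str, b: str) -> bool:
    """Case-insensitive comparison using Unicode full case folding."""
    return a.casefold() == b.casefold()


print("ß".casefold())  # ss
print("ß".upper())     # SS
print("ß".lower())     # ß  (already lowercase, unchanged)
print("ẞ".lower())     # ß  (capital sharp s, U+1E9E)

# Simple lowercasing keeps the two spellings apart, full folding merges them.
print("Straße".lower() == "STRASSE".lower())           # False
print(same_name_ignoring_case("Straße", "STRASSE"))    # True
```

The last two lines show why "just" lowercasing and full case folding can disagree about whether two names are the same.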
@moocatmeow
@moocatmeow Ай бұрын
i have found that an emoji in a file or folder name will usually break almost everything it comes into contact with
@howqso2885
@howqso2885 Ай бұрын
Use it. Linus is wise to really enforce his idea of longevity in this enormous project, where so many distros rely on code that was built decades ago; obviously moving a single brick might mess up a lot. But here, that's an evolution rather than just breaking things for refactoring or for providing a random solution. Break any program and distro that relied on this old way.
@tutacat
@tutacat Ай бұрын
Yes but if you change the new casefold function, that breaks the casefold function for _all previous kernel versions_
@Patterner
@Patterner Ай бұрын
2025 windows edition of this: "don't use spaces in file names"
@georgeorwell4752
@georgeorwell4752 Ай бұрын
Google Drive allows slashes in their file names. I absolutely love this idea and I think linux should adopt it.
@p0358
@p0358 Ай бұрын
Yes you should use emojis in filenames, it's Unicode, Linux supports Unicode, go use it. This is Windows behavior to be afraid of file names, where inserting a space or any special character will break everything. Don't tell them we have colons in the filenames too.
@NobodyInPerson
@NobodyInPerson Ай бұрын
Exactly. This bug is literally a problem with casefolding. Why would one ever want a case-insensitive filesystem, I don't get it. I have happily been using spaces, arrows and emojis in folders and filenames. Emojis are a very good way to 'tag' files in a super condensed way.
@mrab4222
@mrab4222 Ай бұрын
You can have spaces in Windows filenames. Can you have slashes in filenames? No, but not in Linux either.
@MasterHigure
@MasterHigure Ай бұрын
A previous workplace had all project files on OneDrive. That OneDrive folder had an æ in its name, in addition to spaces and a dash. It was plain horrible.
@mudi2000a
@mudi2000a Ай бұрын
The problem is a case folding file system. Linux or Unix in general was not designed with this in mind.
@abit_gray
@abit_gray Ай бұрын
Yes, you should be able to use emoji and all other Unicode characters (or composite characters). If you say you support Unicode, you cannot go "but not that part". There should be only a few characters (like '/') that are not valid in filenames.
@c128stuff
@c128stuff Ай бұрын
This problem, in a more general sense, has existed since forever in any system which keeps persistent data, when wanting to make a change to how that data is formatted or located. And the solution is known, but has its own problems. You need to detect what is old, and migrate it to new before using it. The problems of that solution?
- it will have some performance impact, tho if you thought about the need for this, that can be minimal
- you run the risk of ending up with multiple 'generations' of a format, or in this case way of calculating a hash, and that means the number of conversions you need to support will grow with time.
Those however can be addressed by not doing those conversions 'on the fly', but by scanning the persistent storage (filesystem) for all instances which need migration, and doing this 'offline'. That of course comes with downtime... you gain some, you lose some. Keeping a clearly wrong implementation because migrating is too hard... yes, it can be a valid choice, it can also come back to bite you at any random time in the future when 'clearly wrong' turns into 'real problem'.
@tlpthx
@tlpthx Ай бұрын
11:02 deciding based on the creation date would be stupid, to be frank. Whether the patch was installed on the date a file was created cannot be decided based on the merge/revert dates; those are only weakly correlated.
@coryschwartz1570
@coryschwartz1570 Ай бұрын
While working at a company that allowed self-managed (Linux) devices on the network, a couple of friends and I decided to change our hostnames to have Unicode characters. All hell broke loose. DNS worked fine, but it broke the network IDS for hours, and our IT department told me it somehow also caused routing problems because of their network auth system.
@ottolehikoinen6193
@ottolehikoinen6193 Ай бұрын
Whitespace filename characters, emojis, normal foreign characters, the things people do. Basic Latin character set with preferably date in the yymmdd format in the beginning is the only acceptable archivist filename.
@actually_peanuts
@actually_peanuts Ай бұрын
@enemixius
@enemixius Ай бұрын
Never considered emojis in filenames. Emojis in SSIDs are pretty fun though.
@MaxusR
@MaxusR Ай бұрын
They just had to release a simple utility that will find any old incompatible files and rename them. No, they'll just revert the changes and say 'this feature is bad and should never be used'.
@lainalien
@lainalien Ай бұрын
ive used emoji in urls which is cool looking until u get it in the url bar and theyre converted to numbers rip
@catgirlQueer
@catgirlQueer Ай бұрын
punycode!
@5h4ndt
@5h4ndt Ай бұрын
Non-printable characters are also quite nice to use. Like the bell. Makes your PC beep every time you do ls on that particular folder.
@DanielNerd
@DanielNerd Ай бұрын
the fact that people found out this is an issue is worse than the issue itself
@fredericjaquet3729
@fredericjaquet3729 Ай бұрын
EDIT: I saw after having written this that the case was already mentioned in the comments. Sorry. 4:10 Well, it's not the "beta" character, it's "Eszett" or "scharfes S": a German character intended to replace the double 's' in some words (Unicode U+00DF). In the "beta" character (Unicode U+03B2), the tail should go lower than the bottom line of the character 😉 Eszett: ß Beta: β Merry Christmas Brodie!
@stupidburp
@stupidburp Ай бұрын
Mojo language uses emoji file extensions by default, just because they can. No worries, boring old plain text extensions are also an option. If the OS struggles with long standing unicode, that seems like an OS problem.
@interru_io
@interru_io Ай бұрын
Usually every byte except ASCII '/' (and the NUL terminator) is valid in filenames on Linux for most filesystems. Even invalid UTF-8. But most software will choke on bytes that aren't valid UTF-8.
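As a rough illustration of that rule, here is a small checker for a single path component given as raw bytes. It is a sketch under stated assumptions: `is_valid_linux_filename` is a hypothetical helper, the 255-byte `NAME_MAX` is the limit on many common Linux filesystems but not all, and individual filesystems (for example ones with case folding enabled) may apply stricter rules.

```python
NAME_MAX = 255  # assumption: per-component byte limit on many Linux filesystems


def is_valid_linux_filename(name: bytes) -> bool:
    """Check one path component against the generic Linux rules."""
    if not name or len(name) > NAME_MAX:
        return False
    if name in (b".", b".."):
        return False  # reserved directory entries, not usable as new names
    # Only the NUL byte and the path separator are forbidden outright.
    return b"\x00" not in name and b"/" not in name


print(is_valid_linux_filename("🏠".encode("utf-8")))  # True
print(is_valid_linux_filename(b"\xff\xfe"))           # True (not valid UTF-8)
print(is_valid_linux_filename(b"line\nbreak"))        # True
print(is_valid_linux_filename(b"a/b"))                # False
print(is_valid_linux_filename(b"a\x00b"))             # False
```

Note that the check works on bytes rather than on decoded text, since the name does not have to be valid UTF-8 at all.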