Do you see the bug?

  Рет қаралды 12,496

Jacob Sorber

Jacob Sorber

Күн бұрын

Patreon ➤ / jacobsorber
Courses ➤ jacobsorber.th...
Website ➤ www.jacobsorbe...
---
Do you see the bug? // This week, I thought I would give you a little debugging practice. Simple function. That happens to be broken. It has a bug in it. See if you can see it. We'll talk about it next week.
Related Videos:
Don't write clever code: • Don't write clever code.
***
Welcome! I post videos that help you learn to program and become a more confident software developer. I cover beginner-to-advanced systems topics ranging from network programming, threads, processes, operating systems, embedded systems and others. My goal is to help you get under-the-hood and better understand how computers work and how you can use them to become stronger students and more capable professional developers.
About me: I'm a computer scientist, electrical engineer, researcher, and teacher. I specialize in embedded systems, mobile computing, sensor networks, and the Internet of Things. I teach systems and networking courses at Clemson University, where I also lead the PERSIST research lab.
More about me and what I do:
www.jacobsorbe...
people.cs.clem...
persist.cs.clem...
To Support the Channel:
+ like, subscribe, spread the word
+ contribute via Patreon --- [ / jacobsorber ]
Source code is also available to Patreon supporters. --- [jsorber-youtub...]

Пікірлер: 144
@shadmtmtn1603
@shadmtmtn1603 2 жыл бұрын
The first char is never tested due to pre-increment in the while statement, so, in case of an empty string, the position of the next "\0" in memory will be used. Best case scenario, a null char is present in allocated memory, the result is wrong, but worst case scenario, no null char in allocated memory, it is a page fault/memory violation and insured crash !! Late add : in C, statements are evaluated right to left if same priority level
@actually_it_is_rocket_science
@actually_it_is_rocket_science 2 жыл бұрын
I'd say best case it crashes and worst case is it just continues on its day. Imagine giving people millions of dollars because they forgot to fill out how much they were depositing and it passes an empty string. Since it is undefined behavior it might go into a segment of memory with data in it until it hits a zero.
@shadmtmtn1603
@shadmtmtn1603 2 жыл бұрын
@@actually_it_is_rocket_science you speak true !!! Best case if program stops with system knowledge of that, worst case your program is running with incorrect datas !! The second case is of course the worst, unreliable process... an unhandled exception is probably better. However, the real worst case is a handled exception leading to the program continuation with invalid but validated datas !!! Best for 1 function doesn't means best treatment ^^
@billowen3285
@billowen3285 2 жыл бұрын
Yess I got it!
@orisphera
@orisphera 2 жыл бұрын
I thought the order of evaluation (left to right or right to left) of operators with the same priority in C depended on the priority level. I think I've seen a table of priorities for C++ and some operations (e.g. + and -) were evaluated left to right. UPD: I've misread it.
@z08840
@z08840 2 жыл бұрын
"Late add" is wrong - first of all it's not statements - it's operators, and statements always evaluated left-to-right. Second, associativity of operators depends on operators - some of them left-to-right - multiplication and division for example.
@LRFLEW
@LRFLEW 2 жыл бұрын
I noticed the bug, but a ton of other people already commented with the answer. Having worked with some assembly languages, though, I wanted to bring up a way to fix this code that may be not as obvious, but is pretty much the method used at an assembly level. While the "simple" solution would be to replace line 11 with "while (*end) ++end;", it doesn't usually get implemented that way from what I've seen in compiled assemblies. Instead, (when not just using a counter) it's usually done by replacing line 11 with "while (*end++);" and line 12 with "return (end-str-1);". One way this works in assembly is by using the x86 instruction REPNE SCASB, which will leave the register EDI with the same value as the variable "end" in that code. Even if it doesn't use that assembly instruction, it can still be better to use that sort of method because it removes an unconditional jump. Implementing "while (*end) ++end;" would usually be implemented as a read, comparison, conditional jump, increment, and unconditional jump back to the read. Instead, "while (*end++);" would be implemented as a read, increment, comparison, and conditional jump back to the read. This reduction of steps, particularly a reduction in jumps, allow the loop to be executed more efficiently.
@orisphera
@orisphera 2 жыл бұрын
Wouldn't return (end-str-1) return two more than needed because, as mentioned in another comment, operators of the same priority level are evaluated right to left in C so it returns (str-(end-1))? UPD: I've misread that comment; it says about statements. UPD2: According to a reply, the author of that comment was mistaken.
@richardyao9012
@richardyao9012 Жыл бұрын
As a C programmer, I can say that this was the solution that I imagined. The other solution did not even occur to me.
@suncrafterspielt9479
@suncrafterspielt9479 2 жыл бұрын
Empty strings will break since it skips immediately to ++end. But this bug is based on the operation presidency. If * is resolved before ++ it would work. Even though the length would then be off by 1 I did not test it, so I’m not 100 certain
@mathieu-1941
@mathieu-1941 2 жыл бұрын
Prefix increment has the same precedence as the dereferencing operator, so it's the associativity that matters. In this case it's right to left (in "reverse") so the precedence is correct and it's a defined behaviour
@taylorh1883
@taylorh1883 2 жыл бұрын
If you put ++*end it actually won't work. As you are dereferencing the pointer to a character then adding to that character. The solution would be to use the postfix operator so *(end++) where the pointer is incremented but returns the value pre increment. Then you dereference this. In this case for an empty string the null character will evaluate to false and the while will end while the length returned is 1.
@9SMTM6
@9SMTM6 2 жыл бұрын
Jup, what was just said by mathieu. Operator precedence is usually only defined as relevant with neighboring... "symbols", if it would instead be defined to allow what you suggest, that would not only break C but also all kinds of math. Solving it would make things much harder to parse. Atm to find out what happens you need to only remember 3 "symbols", with your suggestion, assuming you don't limit it in depth, that could be infinite symbols.
@clawsie5543
@clawsie5543 2 жыл бұрын
I don't think it makes sense to talk about precedence of unary operators that appear on the same side of expression. Expressions like "+-*x" should always be interpreted as "+(-(*x))", because there's no other way to put parentheses in a way that changes the order of operations. For that you need to change the expression itself. What precedence affect, are cases like "*x + 1", which can be interpreted as "(*x) + 1" or "*(x + 1)", depending on precedence of "*" and "+".
@MrDgf97
@MrDgf97 2 жыл бұрын
Ah yes, the president elected order of operations
@casperes0912
@casperes0912 2 жыл бұрын
To people talking about the return type; When are you testing for string lengths larger than 2^64 and how do you have enough memory to store those strings? Even if it's one of the short 32-bit longs we consider here that's still a 4GB string length
@casperes0912
@casperes0912 2 жыл бұрын
@LastName Almaember For any software I am bound to write for myself, and we don't use C at my professional place,, frankly, I don't care, haha. But sure
@Solarbonite
@Solarbonite 2 жыл бұрын
Agreed. Currently pointers pretty much only use 48 bits on common platforms (x86_64). 64 bits of ram in one address space would be huge. The TLB is so large that's why x86 eschews even thinking about it.
@zxuiji
@zxuiji 2 жыл бұрын
0:34, I see 2 bugs in this one, 1st doesn't check for NULL but I suppose that's something the dev should be checking for anyways, 2nd the pointer is incremented before checking the character so even if the 1st character was empty it will still return a length greater than 0 which is wrong in that case, the function at bare minimum should've looked like: unsigned long stringlen( char const *str ) { char const *end; for ( end = str; *end; ++end ); return (unsigned long)(end - str); }
@tiagocerqueira9459
@tiagocerqueira9459 2 жыл бұрын
Not checking for NULL isnt a bug. Strlen of the C lib doesn't do it as well, it's the caller responsability to check or not to check. If you can guarantee in your program its not NULL it would be an unnecessary check everytime, this way you have control (and responsability ;) )
@zxuiji
@zxuiji 2 жыл бұрын
@@tiagocerqueira9459 I see you only read the 1st word of my post and skipped the rest, if you had read the whole thing you would have realised your post is redundant.
@monochromeart7311
@monochromeart7311 2 жыл бұрын
Alternate solution: size_t stringlen(const char* str) { const char* end = str-1; while(*++end); return (size_t)(end-str); }
@zxuiji
@zxuiji 2 жыл бұрын
@@monochromeart7311 That's less noob friendly, frankly I would never use that kind of code since anyone looking at it has to think in order to understand it, the method I gave is understandable at a glance
@monochromeart7311
@monochromeart7311 2 жыл бұрын
@@zxuiji never was intended to be noob friendly, and a proper C developer would understand it. But in practice you'd use libc strlen anyways.
@georgecop9538
@georgecop9538 2 жыл бұрын
1:23 It should be *end++(btw you can use a for loop to avoid these errors)
@rjgonzalez8108
@rjgonzalez8108 2 жыл бұрын
Condition in the while loop assumes the string is terminated with a null character.
@stewartzayat7526
@stewartzayat7526 2 жыл бұрын
Strlen and a lot of C API's unfortunately assume that as well.
@proloycodes
@proloycodes 2 жыл бұрын
nice video! btw what are your thoughts on rust vs c?
@proloycodes
@proloycodes 2 жыл бұрын
@Bio-Ethics Committee might be.
@proloycodes
@proloycodes 2 жыл бұрын
@Bio-Ethics Committee but what if somebody is "intermediate in trendy", and "basic in old school"?
@Danieloani
@Danieloani 2 жыл бұрын
Greate video! Maybe try focusing on security when explaining the off by one bug.
@rustycherkas8229
@rustycherkas8229 2 жыл бұрын
As long as it would be commented to explain, it'd be easy to use: char *endp = str - 1; // to accommodate use of pre-increment below. Others viewers suggested using post-increment, then subtracting 1, but fail to say "explain subtraction". There's more than one way to skin a cat!!
@azatkhabibulin7514
@azatkhabibulin7514 2 жыл бұрын
Empty string causes pre-increment to go outside of it
@dmail00
@dmail00 2 жыл бұрын
Two issues preincrement is used instead of post increment and it does not handle NULL being passed.
@leokiller123able
@leokiller123able 2 жыл бұрын
If the length is 0 the first character is not checked and we have a buffer overflow because *++ptr increments the pointer before returning the value of *ptr
@Timmyfox
@Timmyfox 2 жыл бұрын
Fails for empty strings due to the pre-increment. An easy fix would probably be to add a check to increment only if the current state is valid, e.g: > while (*end && *++end); or > while (*end) ++end; Changing to pos-increment may also seem like an easy fix, but then we'll go outside the valid memory range and get an OB1 error. Additionally, it segfaults if a NULL string is tested, although the default library does not handle those as well. Easy way to handle those would be to just add an extra check, either through an if-statement at the beginning of the function, such as: > if (str == NULL) return 0; or amend the while loop such as: > while (end != NULL && *end) ++end; ... Or at least that's my best take as a semi-novice having done a couple courses and MCU project at uni.
@Ak4n0
@Ak4n0 2 жыл бұрын
while(*end) ++end;
@bluesillybeard
@bluesillybeard 2 жыл бұрын
I see two problems: 1: no check for null pointer. 2: breaks on empty strings. Generally when I'm testing something, I test what happens of I put null pointers into things, and what happens if things are empty. Those tend to expose bugs. It's not a be all end all solution, It's just the first steps to testing a function to be resilient. If you're using something like Javascript where a String "null" is equivalent to an actual null object, that should be tested as well. Edit: there's also the problem where to any inexperienced programmer it just looks like magical nonsense, but that's not a problem in actual execution so it probably doesn't matter as much, especially in a low-level function like finding the length of a string.
@umpoucosobreconhecimentos
@umpoucosobreconhecimentos 2 жыл бұрын
Post increment starts at position 1. So if you stringlen(""); them it will read memory outside of the string size
@tobb10001
@tobb10001 2 жыл бұрын
Interesting way of thinking about this is that it is (roughly?) equivalent to do {} while (*(end++)); This may make it a little more obvious what is going on.
@BonDeRado
@BonDeRado 2 жыл бұрын
Besides always skipping the first character, thus not working for zero length strings, there is no check for the case str == NULL.
@kwan3217
@kwan3217 2 жыл бұрын
The first test case I tried when I started my mental debugger was the empty string. I actually didn't do it right in my head, and I thought that it would check each character and then increment (I never use preincrement or postincrement in an expression, just as standalone statements). So, I thought it would always return an answer that was one greater than correct. The bug is actually worse. When I tried it on a real machine with the empty string, I got a length of 4, and the correct value for all other strings I tried. As others have said, it doesn't check the first character, because it pre-incrments and then tests. I guess in my case that whatever random memory was after my copy of the empty string was not zeroed out, so it kept looking until it found a zero byte. In my case, it stopped after 4 bytes, but it might have had to traverse 4000 bytes or 4 million bytes. Even worse, it would be traversing memory that might not even be allocated. It is undefined behavior, because pointers are only allowed to point to memory that is allocated.
@didgerihorn
@didgerihorn 2 жыл бұрын
3 Bugs: - handling of empty strings - const-correctness - Pointer diff shall return a signed value
@lucazieserl6993
@lucazieserl6993 2 жыл бұрын
I think empty strings, as well as multibyte / utf-8 characters would break the function.
@billbez7465
@billbez7465 2 жыл бұрын
Learned a lot from analyzing this piece of code!
@rogo7330
@rogo7330 2 жыл бұрын
Zerolength string will overrun buffer, because `*++end` firstly increments pointer, then dereference it. That way you never checking first byte of the string and result will not be 0 as it should be. You can fix that by changing pre-increment to post-increment (from `++end` to `end++`), that way incremention will happen _after_ expression is evaluated, not before.
@MarekKnapek
@MarekKnapek 2 жыл бұрын
I can see at least 4 problems. First bug: It doesn't work correctly for empty string. Calling stringlen(""); will crash in best case, or will report some nonsense number in worst case. Second bug: It doesn't work for null string, it crashes. The function accepts pointer and is missing documentation that users of this function are prohibited to call it with null argument. There is also missing (debug only) check for this pre condition, preferably in form of assert macro. Third: Result of pointer substraction is ptrdiff_t not unsigned long (at least in C++). You have no guarantee that unsigned long is big enough to hold the result. (At least if you want to be pendantic and standard compliant, no problem in practice.) Fourth: The function doesn't modify contents of the string but takes pointer to modifyable string. This prevents use on const strings for no reason (at least in C++, apparently C has no problem with this). And seeing only prototype of this function causes unnecessary confusion for new users of this function (bad documentation, communication of intent).
@MarekKnapek
@MarekKnapek 2 жыл бұрын
Reading other people comments I came with 2 more bugs. Fivth: It works only with zero terminated strings. But I guess this is implicit and doesn't need to be mentioned in documentation. Sixth: It doesn't do what people expect with non ASCII encoded text. For example with Shift-JIS, Big5 or UTF-8. It reports number of bytes or number of code units which is not the same as number of code points nor characters nor grapheme clusters.
@thepawday
@thepawday 2 жыл бұрын
unsigned long is 32 bit integer but addresses can be 64 bit, if addresses are 64 bit and string is long enough (33 bit size) at line 12 address subtraction can return 64 bit integer converted to 32 bit incorectly
@thepawday
@thepawday 2 жыл бұрын
#include #include #include unsigned long stringlen(char *str) { char *end=str; while(*++end); return (end - str); //in that case difference will be greater than sizeof(unsigned long) } int main() { //allocate 33 bit size memory block - !!!(4 GB)!!! char *long_string = malloc(0x10000000f); //fill it with 'a' letter memset(long_string, 'a', 0x10000000e); //putt null terminator at the end long_string[0x10000000e] = 0; //test it //stringlen(long_string) should be 0x10000000e long printf("%llx", stringlen(long_string)); //dont forget to free the memory free(long_string); return 0; }
@D0Samp
@D0Samp 2 жыл бұрын
Depends on your environment. That applies to an 64-bit environment like Windows using the ILP32 data type model (int and long are 32 bits for compatibility, long long and pointers are 64 bits), but not most Unix-like systems with a LP64 model (long is 64 bit). In fact, size_t is a typedef for unsigned long there.
@thepawday
@thepawday 2 жыл бұрын
@@D0Samp so yeah, but i am still thinking that a bug, and in my opinion better to use size_t type instead of long (on Windows size_t is unsigned long long). Buuuuuuuttt.... string 4 GB length is kinda "брах"
@pelajarankimia1503
@pelajarankimia1503 2 жыл бұрын
That function will only work if the string uses ASCII encoding or UTF-8 that is compatible with ASCII, but if the function deals with non-ASCII characters, the function will return erroneus result
@smrtfasizmu6161
@smrtfasizmu6161 2 жыл бұрын
If you pass an empty string it will say the length is 1 because the first iteration is happening no matter what?
@abarocio80
@abarocio80 2 жыл бұрын
Empty strings go crazy. When string starts with NULL ("\0") will cause to read possible non alocated memory!
@sm0kyJoe
@sm0kyJoe 2 жыл бұрын
Problem is the pre increment. It will not handle empty strings correctly. Answer will definitely be wrong and it will most likely segfault.
@HansBezemer
@HansBezemer 2 жыл бұрын
You see this error a lot in stack implementations of people who are not familiar with the concept. The stack pointer is set to the next available position - but when they push a value they seem to think "hey, I got to move to the next position to store my value" - which turns the pointer into a "last used" SP. Often, they either misdefine the "stack overflow" wrong as well - or the stack remains within safe territory during testing. Such errors can go unnoticed for years if users or the application behave reasonably civilized.
@fabmilan_
@fabmilan_ 2 жыл бұрын
Should work fine for just about any case, until throw an empty string at it and you've got yourself a nice buffer overflow. Since the function increments the pointer before dereferencing it, the while statement skips right past the null-termination in the case of a 0 length string.
@lsatenstein
@lsatenstein 2 жыл бұрын
Yes, just test with a zero length string as one case, and your example as the second. As a student exercise I would have programmed it as a do( ... )while (); I leave that do-while to the student. Whereas I would use the strlen() function from the coreutils library
@TranscendentBen
@TranscendentBen 2 жыл бұрын
Perhaps the question should be "at what point in the video do you see it?" I was trying to ignore the talking, and I saw it at about 2:15. My first question was "what does it do and how does it do it?" and next was "what's wrong with it?"
@TranscendentBen
@TranscendentBen 2 жыл бұрын
And what a tease .... lemme run a countdown with that test function: testit ("Five!"); testit ("Four"); testit ("Three!"); testit ("Two!"); testit ("One!"); testit (""); // Rocket asplodes on launchpad
@SnijtraM
@SnijtraM 2 жыл бұрын
Yes it's broken. Interesting is the way I immediately focused on the *place* where the code is broken. The other video referred is called "Don't write clever code" and I immediately spotted a "clever" construct here. "Clever" does not immediately mean broken, but in this case, it is.
@SnijtraM
@SnijtraM 2 жыл бұрын
A correct "clever" construct could be: for (char *end = str; *end; ++end) {} return end - str; The readable, and by all means, my preferred code, would be: char *end = str; while (*end) ++end; return end - str; Keep Occam's razor in mind.
@colinmaharaj
@colinmaharaj 2 жыл бұрын
1:58 I'm saying if it's a zero length string, it's gonna point into areas of RAM it shouldn't. No I didn't read the comments yet. I just wrote this. 2:08 just adding, I want to see a const for the string I'm measuring.
@ganishk3568
@ganishk3568 2 жыл бұрын
I guess it would return → (end-start)/sizeof(char)
@ganishk3568
@ganishk3568 2 жыл бұрын
The reason is the size of char* or char maybe of different sizes in different systems. It is also mentioned in one of your previous video for #stdint_h
@skilz8098
@skilz8098 2 жыл бұрын
Undefined behavior mixing pointer arithmetic on a dereferenced pointer while using a pre-increment. Yet that's not the complete story. Subtracting a char* from a char* and returning it as an unsigned long; Implicit type conversion which could be user or implementation defined by the compiler?
@PythonPlusPlus
@PythonPlusPlus 2 жыл бұрын
Seems to me that an empty string would cause it to enter out of bounds memory. You could fix that by using a post increment and subtracting 1 in the return statement.
@stevencolborne6845
@stevencolborne6845 7 ай бұрын
My complaint is you change the pointer so you are changing constant memory
@cipherxen2
@cipherxen2 2 жыл бұрын
It's missing guard clause if(str==NULL || *str==0) return 0;
@Redman8086
@Redman8086 2 жыл бұрын
Empty strings will cause undefined behavior.
@Qbe_Root
@Qbe_Root 2 жыл бұрын
In cases like this, it helps me to explain out loud what the function is doing, so let's see... It's making another pointer to the same string, then moving it to /the next/ null character... Wait, is "the next" including the character /at/ *str or not? Nope, end is being incremented before any checks occur, so this function can never return 0, and if fed the empty string, it'll run up to the next null byte in memory, which will be at least 1 character too far!
@jonhdoe4119
@jonhdoe4119 2 жыл бұрын
'stringlen("")' will be undefined behavior I believe (not sure if this call is even legal because of const-ness though)
@michalski9141
@michalski9141 2 жыл бұрын
how does that while loop know when to stop?
@PatrickIsbendjian
@PatrickIsbendjian 2 жыл бұрын
In C, strings are terminated by a zero byte. This is a convention that worked well in those days where ASCII was the all dominant character encoding scheme. The while loop is searching for that null byte and then the function returns the difference between the position of that byte and the position of the string's first byte. Not what you should do if handling Unicode but fine with ASCII. The only problem is the use of the pre-increment operator which will cause the end variable to point past the actual string end. The loop will then run until it finds a null byte whithout any warranty this will happen within the user's memory space. If it doesn't you'll get a memory fault and garbage in the other case.
@michalski9141
@michalski9141 2 жыл бұрын
@@PatrickIsbendjian oh yeah i didn’t think about the \0 (false) breaking the loop
@markchristophergemzon1052
@markchristophergemzon1052 2 жыл бұрын
What is the 'return (end-str);' syntax? I only encountered that now.
@SlideRSB
@SlideRSB 2 жыл бұрын
It's pointer arithmetic. Subtracting two pointers will result in a number which can be interpreted as an array index.
@markchristophergemzon1052
@markchristophergemzon1052 2 жыл бұрын
@@SlideRSB Oh yeah. I see it now. That one was tricky.
@israelim125
@israelim125 2 жыл бұрын
Empty strings cause a buffer overflow
@stewartzayat7526
@stewartzayat7526 2 жыл бұрын
It's not a buffer overflow, they cause the CPU to read from memory it shouldn't read from
@mathieu-1941
@mathieu-1941 2 жыл бұрын
At first I thought it was the precedence of the operators, but then I realised it preincrements the pointer so the first (index 0) character is never checked.
@actually_it_is_rocket_science
@actually_it_is_rocket_science 2 жыл бұрын
You are assuming that it's a string object with an null terminator. Strings in c are dangerous if you are not keeping length with the string. Very easy to go out of bounds if there is no null terminator.
@actually_it_is_rocket_science
@actually_it_is_rocket_science 2 жыл бұрын
This is especially dangerous if you have a fixed buffer length and you were copying data in. If you don't copy the null terminator in you get UB due to the fact that depending on what memory comes after your buffer you could overflow.
@actually_it_is_rocket_science
@actually_it_is_rocket_science 2 жыл бұрын
Beyond that it assumes that the length of the string is at least one non-null character.
@ashwinalagiri-rajan1180
@ashwinalagiri-rajan1180 2 жыл бұрын
Hey Jacob, I just watched your video on gdb. I was wondering if lldb was a viable alternative (it's the only option I have since I use a MacBook).
@JacobSorber
@JacobSorber 2 жыл бұрын
Yeah. lldb has worked well for me. Some of the commands are even the same. Sadly, not all. So, there is a bit of a mental translation needed.
@ashwinalagiri-rajan1180
@ashwinalagiri-rajan1180 2 жыл бұрын
@@JacobSorber Alright that's what I was wondering.
@SlideRSB
@SlideRSB 2 жыл бұрын
I see the bug. After about a minute of studying I noticed that the function will have problems with the empty string case.
@SimGunther
@SimGunther 2 жыл бұрын
Something to do with the preincrement and how it might return 1 for an empty string and segfault on NULL string.
@hayfahvytsen
@hayfahvytsen 2 жыл бұрын
Empty string fails!
@aaaowski7048
@aaaowski7048 Жыл бұрын
noob: "hnnngh pre incrementing looks so leet" pro reading noob's code: "so much cognitive overhead for what? a buggy strlen? fml"
@japroz
@japroz 2 жыл бұрын
Great vid!
@sivaramd5637
@sivaramd5637 2 жыл бұрын
I can observe two mistakes: 1. Empty string => the func will return 1 instead of 0 2. the function returns (end-str) which is of type (char *). It will return an incorrect answer if the given string lenght exceeds 8 bytes (in a 64-bit system). I am a little skeptical about mistake 2 since, I haven't tested it yet
@JohnSmith-hf4kh
@JohnSmith-hf4kh 2 жыл бұрын
I think both your points are incorrect. 1. As far as I can tell, the pre-increment is indeed the issue but it is essentially skipping the first character. So for an empty string you are not even guaranteed to get length 1 but (depending on a lot of factors) you are more likely to get a segmentation fault. 2. end-str subtracts two pointers (-> pointer arithmetic) which does not return a pointer but a number (not sure if it's int, long or something else).
@actually_it_is_rocket_science
@actually_it_is_rocket_science 2 жыл бұрын
@@JohnSmith-hf4kh if there isn't a null terminator it's undefined behavior since it'll just keep going in memory that it has access to until it finds a zero. Won't necessarily segfault depending on how memory is laid out especially in embedded systems.
@LRFLEW
@LRFLEW 2 жыл бұрын
"If the given string length exceeds 8 bytes (in a 64-bit system)." This is a non-issue. Any sort of array or buffer in C is guaranteed to have a size representable as an integer of type size_t, which on 64-bit systems will be 8 bytes. Because of this, "(end-str)" cannot meaningfully overflow.
@sivaramd5637
@sivaramd5637 2 жыл бұрын
​@@actually_it_is_rocket_science hmm, interesting. It is as you say. When I supplied ("") to the function, it gave 1 as output since there is '\0' at the end of it.
@sivaramd5637
@sivaramd5637 2 жыл бұрын
@@JohnSmith-hf4kh interesting to know that pointer arithmetic returns an integer type (different from the type of variables used). Is there any resource on this that I could refer to?
@smugtomato5972
@smugtomato5972 2 жыл бұрын
The first character is never tested, so an empty string returns a length of 1. it can be fixed by changing 'while(*++end);' to 'while(*end) end++;' so you actually check the first character
@ShotgunLlama
@ShotgunLlama 2 жыл бұрын
It breaks on the empty string, and I see many others have already noticed this
@md2perpe
@md2perpe 2 жыл бұрын
What value does it give for the empty string?
@SlideRSB
@SlideRSB 2 жыл бұрын
It won't. It will either give a garbage result or a segfault.
@md2perpe
@md2perpe 2 жыл бұрын
@@SlideRSB Correct. The code cannot handle the empty string. It will directly step to the right of the terminating NUL character.
@suncrafterspielt9479
@suncrafterspielt9479 2 жыл бұрын
Can you maybe talk about RiscV?
@agustinranieri
@agustinranieri 2 жыл бұрын
Empty strings don't work, can produce a seg fault or, in most cases, read junk memory Solution: use preincrement and return (end - str - 1)
@the_lenny1
@the_lenny1 2 жыл бұрын
you mean postincrement, preincrement is used in the example, or am I wrong?
@MECHANISMUS
@MECHANISMUS 2 жыл бұрын
Empty string won't work.
@Kamal_Taghlaoui
@Kamal_Taghlaoui 2 жыл бұрын
/* the bug: seg. fault when str is empty */ while(*++end); /* a solution */ while (*(end++)); return end-str-1;
@iuppiterzeus9663
@iuppiterzeus9663 2 жыл бұрын
will break on empty string and on NULL pointer
@rafalmichalski4893
@rafalmichalski4893 2 жыл бұрын
while(*end++); return (end - str - 1);
@lsatenstein
@lsatenstein 2 жыл бұрын
A challenge for your readers/students I wrote the following function. Incorporate an enum for the right most argument. I hope you do not mind the challenge to the reader. I have the enum commented out. comment out the three defines and replace the right argument with an enum. Can the reader improve on it? test with " this is a string " and with " " and with "" #ifndef EOS #define EOS '\0' #endif // enum trimside { trimleft = 1, trimright = 2, trimboth = 3 }; #define trimleft 1 #define trimright 2 #define trimboth 3 //external Linux function rawmemchr() char * rawmemchr(char *,char); char * stringtrim (char * restrict in, unsigned option) { register char * restrict cp; char * restrict fr; if (in != NULL && *in != EOS) { if (option & trimright) { cp = rawmemchr(in, EOS); /*in + strlen (in); */ while (--cp >= in ) if(*cp > ' ') break; *(cp + 1) = EOS; } if(option & trimleft && (*in
@simrego
@simrego 2 жыл бұрын
As many wrote the empty string, but a NULL will also kill it.
@toshibasony9222
@toshibasony9222 2 жыл бұрын
Let me guess - passing a pointer to a NULL breaks it?
@simonmaracine4721
@simonmaracine4721 2 жыл бұрын
My guess without trying anything is that passing an empty string will return wrong result.
@lemonsh
@lemonsh 2 жыл бұрын
theoretically it may segfault if the 'end' pointer escapes the program's memory and tries to dereference protected memory
@ReneWOlsen
@ReneWOlsen 2 жыл бұрын
The problem is empty strings reutrn 1 not 0.
@JannePaalijarvi
@JannePaalijarvi 2 жыл бұрын
Am I being weird for not wanting to see the bug at all, and instead just wanting to refactor everything first?
@stewartzayat7526
@stewartzayat7526 2 жыл бұрын
What is there to refactor though? It's just 3 lines of code
@JannePaalijarvi
@JannePaalijarvi 2 жыл бұрын
Refactoring is not about lines of code. It is about structure, readability and understandability of the code. Sometimes proper refactoring actually adds lines.
@stewartzayat7526
@stewartzayat7526 2 жыл бұрын
@@JannePaalijarvi sure, but this code seems clear enough. The majority of people spotted the bug in a few seconds. Being able to understand what the code does constitutes to me readability and cleanliness. There are obviously many ways you could write the same functionality and every person has their own coding style, and this coding style seems understandable to me. What would you suggest to refactor, if I may ask?
@JannePaalijarvi
@JannePaalijarvi 2 жыл бұрын
@@stewartzayat7526 : I would start with something like this: 1. Declare function before the definition. 2. Modify code and function to use to get EXACTLY known variable lengths 3. Name all variables to have definitions affixed to them, for example uint16_t u16MaxSize, const char* csInputString 4. Modify function definition to return SUCCESS/FAIL via some semantic 5. Modify function definition to have parameter of maximum string length. You might use the same parameter to store the result length or create another parameter just for that. Maybe latter is cleaner. 6. const the parameter string variable if I did not make myself clear yet about that 7. For readability, separate distinctive areas with whitespace bordering: I) Local variables definition II) Loop III) Return statement decisions 8. Remove char *end. Most of the time it does not even mean end. It would be like sStringIterator, but it is actually not needed at all. 9. Use fail-safer for-loop in determining the length, check character at parameter string + loop-variable amount, start at 0, increment 1. Leave no theoretical place for looping forever. Character being 0 of course signifies that string has ended. 10. If u16MaxSize has been reached or - for some reason - exceeded, store 0 as string length via parameter pointer and return FAIL. If u16MaxSize has NOT been reached, store loop-variable as string length via parameter pointer and return SUCCESS. Caller is responsible for defining the maximum size.
@joemacdonnagh6750
@joemacdonnagh6750 2 жыл бұрын
Strcpy().
@thepawday
@thepawday 2 жыл бұрын
stringlen("");
@_Omni
@_Omni 2 жыл бұрын
empy string returns 1
@TopchetoEU
@TopchetoEU 2 жыл бұрын
It may return anything else, as well as segment fault, since you don't really know what is past the string
@_Omni
@_Omni 2 жыл бұрын
@@TopchetoEU yeah
Testing matters. Watch those corner cases.
5:47
Jacob Sorber
Рет қаралды 9 М.
How to Check Your Pointers at Runtime
14:12
Jacob Sorber
Рет қаралды 31 М.
Smart Sigma Kid #funny #sigma #comedy
00:40
CRAZY GREAPA
Рет қаралды 37 МЛН
Алексей Щербаков разнес ВДВшников
00:47
How to make memory read-only in your C programs.
12:57
Jacob Sorber
Рет қаралды 19 М.
The purest coding style, where bugs are near impossible
10:25
Coderized
Рет қаралды 947 М.
Can I Handle Exceptions with Try Catch in C? (setjmp, longjmp)
11:13
How do I access a single bit?
11:07
Jacob Sorber
Рет қаралды 20 М.
TechHistory: The Story of The First Computer Bug
5:35
The TWS Channel
Рет қаралды 10 М.
I made it FASTER // Code Review
38:46
The Cherno
Рет қаралды 534 М.
How principled coders outperform the competition
11:11
Coderized
Рет қаралды 1,6 МЛН
Making variables atomic in C
11:12
Jacob Sorber
Рет қаралды 36 М.
Smart Sigma Kid #funny #sigma #comedy
00:40
CRAZY GREAPA
Рет қаралды 37 МЛН