an additional “bonus” comes with multithreading: v[1] = true; in one thread & v[2] = true; in another thread is a race condition
@TakenTooSeriously2 жыл бұрын
Yeah that's a really good point. I can't think of a way around this with bit sets in any case though.
@raymundhofmann76612 жыл бұрын
Not if the "&= mask" and the "|= mask" in the example gcc library code would be made a atomic operation. I looked in my "stl_bvector.h" and saw " typedef unsigned long _Bit_type;" which is the underlying type used. If this would be made atomic, race conditions should be fixed there. Unfortunately doing "std::vector" doesn't do it, so you have to copy it and make your own vector.
@anon_y_mousse2 жыл бұрын
@@raymundhofmann7661 Exactly, a standard and an implementation that aren't fully thought out inhibit all. A vector of bool's isn't inherently bad, but the way the committees and various implementers have handled it is bad.
@TakenTooSeriously2 жыл бұрын
@@raymundhofmann7661 how would you make bitwise operations atomic? That sounds like it would require a lot of compiler work
@khatharrmalkavian33062 жыл бұрын
If we ever achieve time travel, the decision to specialize vector is our first destination...
@youtubehandlesux2 жыл бұрын
Nah, the first move should certainly be telling Bjarne to call vectors dynamic arrays to stop it from being confused with math vectors Although the best action would probably be checking on the lads at LHC to make sure they don't have any funny ideas, just in case they figured out time travelling first...
@willofirony2 жыл бұрын
Thanks for that, Jason. I have always puzzled about the "weirdness" of vector (though my curiosity didn't inspire the detailed examination you gave us). There has always been a whiff of "here be dragons" about it. You demonstrated that it was well deserved.
@Qazqi2 жыл бұрын
Ranges come with new cases, so it's not as rare as it used to be. For example `zip` on two ranges returns a temporary pair containing both elements. If you write your generic algorithm to not allow temporaries, it won't work if someone passes in `zip`ped ranges. `transform` returns the result of calling the transformation on the element, which is quite likely a temporary. It's no longer an option to ignore this unless you know vector might be involved. We can't treat this as a special case.
@killerbee.132 жыл бұрын
I think most people are using range compositions, and not directly accessing the range iterators themselves. If you're just looping directly over a std::ranges::transform_view, there's not much benefit over just using the old std::transform instead. So this is more common, but I think it's mostly only affected the standard library itself, and other libraries which use the new C++20 iterator concepts, rather than 'normal' application code that is mainly either non-generic, or only generic over a closed set of inputs.
@jhbonarius2 жыл бұрын
Every now in a while I run into this, where I just want a dynamically sized array of booleans, don't care about storage space and want to use RAII and generic algorithms that should work for any other vector. They shouldn't have called it 'vector' because it behaves significantly different from all other vectors. It should have been something like dynamic_bitset or so.
@0xybelis2 жыл бұрын
vector could be a substitute for a usual vector of bool
@jhbonarius2 жыл бұрын
@@0xybelis yeah, i know. But it can be more values than 0 or 1, so it's not a type safe solution. It's more of a hack, required due to a (imho) bad decision in the C++ standard.
@mytech67792 жыл бұрын
@@jhbonarius A bool only requires 0. It is zero or any not-zero so it is typesafe. Unless you are saying the the packbit vector idea is fundamentally good but poorly specified and implemented?
@IasonNikolas2 жыл бұрын
Nice episode but can you explain what might go wrong if I write 'for (auto &&elem : data)` for other vector types like vector? I didn't quite get why I should avoid that...
@verylongtrain2 жыл бұрын
Nothing really... I almost always use auto&&... unless I want a copy or it's a basic type
@alagner882 жыл бұрын
Josuttis actually recommends that over a single &. Matter of need and style.
@benjaminnavarro8652 жыл бұрын
For vector it is fine but be careful with other containers/ranges as they can give copies instead of references and it that case you end up mutating a temporary. I prefer to always be explicit between auto/const auto&/auto& to make my intention clear and trigger a compilation error if what I want is not possible
@cppweekly2 жыл бұрын
I'm going against the grain here. As I said in the video (or meant to say) they each imply different meanings. auto &: I intended to mutate const auto &: I don't intend to mutate auto &&: I intend to move from it, or it is special in some way
@alagner882 жыл бұрын
@@cppweekly please correct me if I'm missing sth, but I generally tend to lean towards the opinion that `auto&&` is superior to `auto&`, as the latter can be moved from as well, but it cannot bind to temporaries (unless that is what one is trying to prevent from, of course). I'm not saying this is the only way, but I think this reasoning makes sense.
@benoitrousseau41372 жыл бұрын
Probably one of those ideas that sounded great until people begin to use it, like exception specifications. In retrospective I think they should have added a bitvector instead. For fixed size bit fields we use bitset not array.
@GroovThe2 жыл бұрын
I doubt the problem came up after people began to use it. It must have been evident during the implementation at the latest that they are unable to implement the recommendation reasonably. The result is not in the spirit of the standard. The only thing it does is that it replaces a good abstraction with a leaky one. :(
@AxWarhawk2 жыл бұрын
I'll never understand why this functionality was accepted as a template specialization for std::vector as you can't simply opt out of a specialization. I wonder what the arguments were against making it a dedicated class.
@anon_y_mousse2 жыл бұрын
Specializations are actually a good thing, they can constrain space usage, as it would here, and/or make things more efficient computationally. The problem here, and it seems to occur a lot with C++, is that they didn't go far enough. They should have standardized a reference iterator type for it. It wouldn't be difficult for someone to extend if themself, but you shouldn't have to. Of course, I would make it hardier than they did with gcc's libstdc++ and use an index into the vector as opposed to a pointer so that subsequent reallocations wouldn't invalidate the iterator (another common annoyance).
@dieSpinnt2 жыл бұрын
But some of this is also due to the platform (and history of f.e. x86, even 6502, etc.). On a machine-code level, please choose the appropriate platform, like micro-controllers, to utilize bit-shuffling which you beg for .. and which you demand?:) (not you. generally speaking) Some are just not able to do that kind of "economic" treatment of bit-mathematics. Because it doesn't matter? Because it doesn't matter! :) Greetings to my friend Intel 4004, Ferranti and Z80, here :P
@dougmair11982 жыл бұрын
@@anon_y_mousse o
@beandon242 жыл бұрын
Real answer is because the language is a mess :)
@TheMrKeksLp2 жыл бұрын
@@anon_y_mousse Except that std::vector actually ONLY optimizes for space effiency and not computational efficiency. Sure it's not a major problem... I mean every major platform can do arbitrary wide shifts but still. A vector should be a vector and a bitset should be a bitset This is actually surprising for a language like C++ that prides itself with "you know what's happening under the hood"
@Sadiinso2 жыл бұрын
At the end of the video you said that std::vector is *not* required be implemented like that. But, wouldn't the const auto& bit: data iterator work if the class is not specialized ? Does this mean that C++ code using std::vector can be incompatible with some stdlib distributions ?
@killerbee.132 жыл бұрын
It's not required to actually use bit packing, but it is required to *act* like it does. Also, there are a few new member functions for it, like flip(). So, it must be specialized, even if the specialization only exists to just make the interface worse and not change the implementation at all.
@Sadiinso2 жыл бұрын
@@killerbee.13 oh ok, it make a lot more sense now, thanks for the explanation
@Bodyja2 жыл бұрын
For many years now I dumped std::vector, everytime I need one y do: using Bool = char; std::vector. It kinda sucks when you want to print it but you can always use the !!v[i] trick
@redcrafterlppa3032 жыл бұрын
I think they should have put it in its own type. It's a special use case and also changes how you can use the vector. A standard container should always behave the same not depending on the type it contains. (my opinion)
@X_Baron2 жыл бұрын
I think std::vector is pretty universally considered a wart in the language. It behaves unexpectedly, has limited functionality, and doesn't really belong in the container library since it doesn't even meet the requirements of a container.
@TheMrKeksLp2 жыл бұрын
They should have just called it BitSet because that's exactly what it is
@gast1282 жыл бұрын
Dilemma though the space optimization is not something to drop easily. Boost's dynamic_bitset plays a similar role if you want to manipulate bits with unknown size. Ideally I would also like to have SBO for dynamic_bitset's since the allocation can be relative heavy.
@TNothingFree2 жыл бұрын
I thought so as well, a struct like "bit_reference" which just holds the data and bit location and performs the calculation. I wonder what are the benchmarks for bvector and just using array of booleans.
@spicynoodle74192 жыл бұрын
Is there a use-case for Boolean arrays? I always just make a uint and bit mask. It's infinitely faster than iterating an array.
@embeddor30232 жыл бұрын
I seriously don't get the hate on vector. It's quite hard to implement properly and you get it basically for free from the STL. And besides that, if anybody cares about performance that much, that they can't afford the bit masking , then they shouldn't be touching dynamic memory allocation using std::vector at all whether it's bool or not.
@modkins2 жыл бұрын
You cannot use them with std algorithm.
@embeddor30232 жыл бұрын
@@modkins I just tested it with std::transform. Seems to be working fine (MSVC2022).
@soveu82372 жыл бұрын
It's not about bit masking, writing a single bit requires at least an instruction to pull a whole word and an instruction to write the updated word back. Bitwise and/or/shifts are pretty cheap, the main cost is clogging the pipeline with two dependent memory operations. "If you're caring about performance you shouldn't use dynamic allocation" isn't a really good argument, because the memory can be allocated once at startup/initialization and then just used without reallocations. Specialization for bools is just inconsistent, other containers like std::array or std::span don't have them For example, if someone wanted to build a hybrid of std::array/boost::static_vector and std::vector, that person would need to specialize for bool just to "un-specialize" the structure itself from std::vector, because you won't be able to get a `bool&` or `span` from it. Joking a bit, PHP is also weird and not that fast, but it's for free! ;)
@embeddor30232 жыл бұрын
@@soveu8237 if you're going to initialize the entire array of bools at the beginning and do no pushes then you are better off using a unique_ptr for ct unknown sizes or std::array for ct known sizes. Performance always depends on the context and arguing about it with no given context and benchmarks is pointless. I could bring in the argument that vector is faster because it's more cache friendly as its information density is eight times higher than an array of bools, but with no actual context/HW everything could be true.
@anon_y_mousse2 жыл бұрын
@@soveu8237 That depends on the processor in use, and multiple updates to bits in the same chunk could be made at the same time which would save some pipeline space. Perhaps another reason why CISC processors are better than RISC as you don't need to Load, Or then Store. The thing that would make bitmaps more complex here is you need to shift the index down to access the chunk that bit is in and shift by a runtime quantity to make a mask. It doesn't require any more than a regular array of bytes would otherwise and inverting a bool in an ordinary array would require the exact same Load, Xor, Store on RISC chips versus merely Xor'ing on CISC. Though, if you were accessing them sequentially, that could be optimized further in specialized code which would more effectively fit the pipeline than wasting a whole byte per bool, and is probably why they wanted it this way, regardless of whether any implementation does it that way, or can in the case of special iterators which should exist in the standard but don't.
@juliean1232 жыл бұрын
std::vector was one of the main reasons I enrolled my own vector-implementation. I have a type-system abstracts vector over a set of predefined types, and having to treat "bool" specially was just so effing tedious. Knowing that all possible vectors are effectively the same layout enables lots of optimization/simplifications.
@noxagonal2 жыл бұрын
I learned about this quite a while ago and I remember thinking that this might cause problems, so I never used it and opted to use something else. Never looked too deep into what kinds of problems it might cause though, I appreciate the explanation.
@vorrnth87342 жыл бұрын
Why have it at all? We already have bitset for bit fiddling.
@cppweekly2 жыл бұрын
Bitset cannot grow
@matthieud.11312 жыл бұрын
I wonder why std::array doesn't have the same type of specialization for bool? And going a step further, we could imagine some other STL containers having specializations for some other types, e.g. std::map could be implemented with just a 256-byte array of optionals...
@LEpigeon8882 жыл бұрын
Because it was a bad idea and they don't want to repeat the mistakes of the past.
@cppweekly2 жыл бұрын
TBF it would also have a strong overlap with std::bitset
@simonfarre49072 жыл бұрын
The space optimization argument is so ridiculous though, that one MUST think that the one who made it, did it as a gag. The size of std::vector itself is 40 bytes. Its got to be one of the most not-thought through "optimizations" ever done.
@Spartan322 Жыл бұрын
Funny enough this also means that even just setting values on the bit reference objects is not atomic, so vector can't support any parallelism whereas technically every other type in vector is so long as its position is not changed in memory after getting the reference, (which generally shouldn't happen if you don't like race conditions), so you can't even use a parallel loop or vectorization on vector.
@cppweekly Жыл бұрын
I think you are technically correct. We know we cannot mutate the vector itself from multiple threads, but we are allowed to mutate different objects in different threads at the same time. However, an interesting way to look at this is that a "bit" is not an object. Bits don't exist in C++. OTOH, we have to address the fact that we asked for a vector (which are objects) and the implementation chose to give us back a proxy into that vector....
@Spartan322 Жыл бұрын
@@cppweekly I'm aware that there are no actual bits in C++, I'm just using "bit" as a shorthand for what is functionally at max efficiency akin to a single bit of data. (where it only takes up the space of one bit in memory so long as every element of the byte is filled with an actual value, its a theoretical proxy of a bit but any actual storage in memory needs to be a byte in size)
@bloodgain2 жыл бұрын
Bit fields cause some other weird behaviors, like breaking the automatic conversions that make fmtlib (and therefore spdlog) work. You need to do an assignment or static_cast to a fully-fledged type before passing it as an argument. Passing a bit field to fmtlib seems to work in the current version of the 3 major compilers, but it's not officially supported by fmtlib. I recently opened and closed an issue discussing it on both fmtlib and spdlog. Although it works, it may not be defined behavior, and Visual Studio's IntelliSense will say that it's wrong even though it compiles. My solution was to provide a very simple template function, named bit_field, that returns an integer type wide enough to contain any size field we will use (32 bits, in our case). This is cleaner than static_cast everywhere, clearly shows the reason for its use, provides only the conversions needed, can be automatically optimized if used where not needed, and is portable. However, trying to restrict the template argument to integer types ran me into some issues. I don't know if this was a compatibility problem with bit fields, a limitation with C++14's available features, or just a lack of knowledge on my part: template inline unsigned int bit_field(T value) { return static_cast(value); } I could/should(?) probably drop the static_cast. I hadn't looked at it recently, and I'm not sure why I decided on this style, which is probably less optimizable by the compiler than it could be. I should also be using a portable fixed-width type like uint32_t. I'm wondering if I dropped an update to this in a merge. (Instinctually, I want to take a ref/universal ref argument, but that results in the same error from spdlog/fmtlib if you pass a non-bit-field unsigned int, even though this claims that the type should be returned as unsigned int. It also results in "more than on instance of overloaded function matches" or unresolved external symbol for reasons I haven't wrapped my head around.)
@negidrums9152 жыл бұрын
Which is faster specialized vector or non-specialized vector?
@GeorgeTsiros2 жыл бұрын
the answer is some combination of "it depends" and "you must measure it"
@killerbee.132 жыл бұрын
A vector will probably be faster for everything except copying, but it will use about 8x as much memory. That is only an issue though if you know you have a really huge bit set to work with, as generally 1 byte per element isn't bad anyway.
@JesseBusman19962 жыл бұрын
With 64-bit addressing we can easily spare 3 bits to have pointers to bits instead of pointers to octets. This would clean up a lot of mess, though I'm not hopeful to see it happen in the near future. Soooo much stuff relies on octets. :(
@iamjadedhobo2 жыл бұрын
Texas Instruments made graphics processors that did that (TMS34010 and TMS34020 around 1988 IIRC) :)
@WouterStudioHD2 жыл бұрын
When was this choice made? Was is like 30 years ago or more recent?
@benjaminnavarro8652 жыл бұрын
Pretty sure this is a C++98 thing
@vladigr12 жыл бұрын
proxy structure sound interesting, do you have episode about it??
@RA-NAF2 жыл бұрын
I'm guessing its just a struct with an uint8_t with 8 values, and another uint8_t for the bit number (from 0 to 7). Then, it overrides every operator with masks that set/get only that specific bit.
@thatsnuts89352 жыл бұрын
Hi Json, you could instead use: vector bools: bools.push_back(json{ {"isTrue", false}, {"id", rand()%100000}, }); bools[0]["isTrue"] = true; you're welcome! Javascript developer.
@spicynoodle74192 жыл бұрын
lmao
@youtubehandlesux2 жыл бұрын
username checks out
@qazarl2 жыл бұрын
vector
@von_nobody2 жыл бұрын
Probably only thing I would be happy to be removed from C++, all this functionality would better fit `std::bitset` than `std::vector`
@MrAlbinopapa2 жыл бұрын
I was thinking the same, then remembered that you have to specify a size for template argument. This basically makes it more of a specialization of std::array ( std::array ). I personally can't imagine needing thousands if not millions of bools, enough to need to dynamically allocate for them, but I'm pretty limited on the scope of what's needed in the community. Other than that, I agree. std::bitset pretty much does the same under the hood as far as bit masking so even the space requirements aren't out of the realm of fitting on the stack in most cases.
@MrAlbinopapa2 жыл бұрын
@@__Brandon__ yeah, it would make more sense. Bitcast isn't a typical STL container so people wouldn't expect it to behave as one like they do with vector.
@PanDiaxik2 жыл бұрын
I think they could make another type, let's say bit and make vector specialization for it. It would allow people to choose between fully functional vector of bools with inefficient memory usage and vector of bits with limited functionality, but efficient memory usage.
@taragnor Жыл бұрын
The problem is that you can't really have a standalone bit in C++ for the exact same reasons the video outlined. Your data types have to all take up at least 1 byte. So you can never really make a bit type that's actually a bit. The thing is that in C++, bits don't really exist. The minimum size of something is always 1 and that size is in bytes. All access to a bit have to essentially be some kind of proxy construct. There are space-wasteful constructs that behave more like any other type (aka boolean) and there's space-saving constructs that work weird because they're dealing with a unit smaller than C++ can normally handle.
@Tyranisaur2 жыл бұрын
Is the proxy object from vector one of the best use cases of decltype(auto)?
@catorlife Жыл бұрын
it's surprising when those big-brain engineers back in the day when they created this "bool vector" concept didn't think about refernce stuff, I then they had to pad the inconvience in so clumsy and awkward way like that
@admercs2 жыл бұрын
Why can’t we operate directly on bits in C++? It is ironically the machine language. Super annoying at times.
@LemonChieff2 жыл бұрын
I’ve never used std::vector. I have used (don’t try this at home) std::basic_string, though. (Not in prod.)
@anon_y_mousse2 жыл бұрын
You my good sir, disgust me, but not in a bad way.
@cppweekly2 жыл бұрын
I can honestly say I have no problem with this at all, as long as you document why you chose it and probably hide it a bit like: using bit_string = std::basic_string
@gnolex862 жыл бұрын
I hate vector of bool with a burning passion, every time I have a template that provides its type to vector, bool as an argument causes so much annoyance due to proxy objects. One time I even resorted to putting bool in a struct just so I can use bool&.
@TheMR-7772 жыл бұрын
“WeIRd” is reminding me of “LOKI”, in the thumbnail
@GeorgeTsiros2 жыл бұрын
4:20 "And you're still pushing back ints!" as the great DeSinc once said: "If you gotta fuck up then you might as well go for gold... just slam it in there, multiple times, just keep slammin' it..."
@PawelKraszewski2 жыл бұрын
"How do I take a reference to a bit. This makes no sense" "Hello there" says any microcontroller with bit-addressing...
@cppweekly2 жыл бұрын
I had no idea there were microcontrollers with bit addressing. I tried googling and cannot find anything like that.
@PawelKraszewski2 жыл бұрын
@@cppweekly 8051 family had "bit addressable RAM". Each of the 128 bits of 0x20..0x2F area can be individually addressed in a special 0x00..0x7F addressing mode.
@treyquattro2 жыл бұрын
I prefer std::bitset which makes the representation explicit. The downside is that bitset is a fixed size.
@SeppyYT2 жыл бұрын
there is dynamic_bitset in boost library
@rallolollo93092 жыл бұрын
never used vector of boolean because i always used a bitmask or some other way to resolve problems needing arrays of booleans and never had to deal with this but this this is madness like wt the actual fk??
@victoreijkhout61462 жыл бұрын
A reference to a bit is only impossible if the standard assumes the now-common hardware. I thought the standard tried to be hardware-agnostic? But yes, I've run into this weirdness when learning C++ and it confused me quite a bit. (Ha!)
@YSoreil2 жыл бұрын
It's hardware agnostic yes but it is not a superset of all possible hardware, it targets a specific "abstract machine". There are quite a lot of hardware constructs which you can't express in C++
@69k_gold Жыл бұрын
This is why Java creators stopped worrying about memory
@DaddyFrosty2 жыл бұрын
You know you’ve done too much low lever stuff when u instantly know what the video is about from the thumbnail alone
@ohwow20742 жыл бұрын
Finally someone cleared this up. Also isn't the C++ standard hardware agnostic? I thought the specification of C++ is very abstract and they don't consider hardware limitations when designing language constructs. But I may be wrong.
@Nobody17072 жыл бұрын
It's true that the C++ standard is hardware agnostic. That's why it describes all operations as being performed on an "abstract machine". The smallest unit of memory that can be addressed by the abstract machine is a `char` (or now `std::byte`). This is a hardware agnostic requirement, because you can always implement byte addressing on top of bit addressing, but you cannot implement bit addressing with byte addressing. So the standard chose byte addressing as the least constraining requirement. Importantly, a byte doesn't have to be an octet. Historically, it could be seven or nine bits, but even on recent hardware it can be the size of a whole machine word (as found in DSPs).
@ohwow20742 жыл бұрын
@@Nobody1707 interesting. Tnx.
@killerbee.132 жыл бұрын
@@ohwow2074 Bytes have been as small as 6 bits on some architectures, but C required CHAR_BIT >= 8, which C++ inherited, so in C++ you never have to worry about bytes being smaller than you expect. They might be larger though. If you did want to implement C/C++ for, say, a 36-bit word-based architecture, you'd probably have to set CHAR_BIT = 9 or 12, depending on how the memory addressing works. Anyways, C++ is hardware-agnostic, but not so hardware agnostic as to not acknowledge that computers use binary and that bytes are comprised of bits. So it's not at all out of scope for the standard to talk about memory optimizing bit storage. However, that doesn't make it a good idea to do it for the most common (and I'd argue most important) container type in the library.
@YSoreil2 жыл бұрын
I have never even considered using this type. It doesn't behave like a standard container so it really doesn't offer me anything I can't get from bitset in most cases.
@GeorgeTsiros2 жыл бұрын
sizeof(bool), much like sizeof(char), equals 1 because that's how many bytes it occupies. struct HasPadding {char acter; long long man;}; sizeof(HasPadding) may be 9, may be 16. _That's how many bytes the type occupies_ but "sizeof" does not _necessarily_ tell you anything about how the "values" are "stored" in an instance of that type. An assumption that Vector occupies N*sizeof(T) bytes would be incorrect, imho. All in all, I find none of this "weird" in the sense that it is "unreasonable". "Weird" as in "huh, would you look at that..." or "hah! clever!", sure. "Weird" as in "leaves a bad taste in mouth", not at all.
@DamianReloaded2 жыл бұрын
The actual weird thing is the 'ranged for' really. If you were to use a classic 'for' the iterator would't be a reference either.
@theGoldenHeel2 жыл бұрын
...but I mean, isn't the byte to be the smallest addressable unit of memory, defined by the architecture? This is just another hack for what the programmer should already know how to deal with.
@GeorgeTsiros2 жыл бұрын
1:18 what the hell? repre- sentation WHAT THE HELL?
@petermuller6082 жыл бұрын
I would have assumed a reference to a bitfield would express a reference to a bit