CppCon 2017: Nir Friedman “What C++ developers should know about globals (and the linker)”

49,005 views

CppCon

6 years ago

Presentation Slides, PDFs, Source Code and other presenter materials are available at: github.com/CppCon/CppCon2017
-
Most C++ developers know that using globals (or singletons) excessively in code is usually a poor decision. But it is also hard to deny their usefulness in more than one situation: for loggers, factories, intrusive performance benchmarking, and other very useful applications, globals can be very helpful.
Unfortunately, usage of globals in C++ ends up exposing details of the language not seen so often otherwise. Instead of dealing with our tried-and-true, well specified compiler, we often need to look more at the behavior of the linker, which can vary more between platforms and between different options such as static vs dynamic linking.
This talk will look at some surprising pitfalls that can occur with globals: valid programs only a dozen lines long that can trigger segfaults when using the gcc linker. It will explore some of these issues using command line tools like nm and objdump. The talk will enumerate several pitfalls to be avoided. It will look at some of the different axes of the problem space such as: globals that are private versus public, eager versus lazy, global inter-dependencies, and quality of emitted assembly. Finally, solutions and idioms will be presented that address the major use cases, and make it easy for developers to use globals (only where necessary!) in a safe way.
-
Nir Friedman: Tower Research Capital, Trading Software Developer
After completing a PhD in physics, Nir started working doing C++ in low latency and high frequency trading. He's interested in the challenges of writing robust code at scale, and highly configurable code that minimizes performance trade-offs.
-
Videos Filmed & Edited by Bash Films: www.BashFilms.com

Comments: 47
@greenUserman 5 years ago
Cool! I really liked this talk. Made me realize I don't know enough about the linker.
@MPDR 3 years ago
I once had a problem where I had a static vector that was filled at "compile time" with information about dynamic types for an ECS implementation. I encountered the bug when I noticed the vector was being filled before main() executed and then, when main() executed, the vector was actually empty. I ended up using the lazy loading function solution shown here, which I had found on a website. Cool to be able to actually understand why that was happening. Great talk!
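A minimal sketch of the lazy-initialization idiom this comment refers to, assuming the symptom came from static initialization order: registrars in other translation units may run before the vector's own constructor, which then resets it to empty. All names here (`type_registry`, `Registrar`, the registered strings) are hypothetical and not taken from the talk or the commenter's code.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical registry of type names, for illustration only.
// The vector is a function-local static, so it is constructed on the
// first call, regardless of which static initializer happens to run first.
inline std::vector<std::string>& type_registry() {
    static std::vector<std::string> registry;
    return registry;
}

// Hypothetical helper whose constructor registers a name.
struct Registrar {
    explicit Registrar(const std::string& name) {
        assert(!name.empty());  // illustrative precondition
        type_registry().push_back(name);
    }
};

// These initializers run before main(). Across translation units their
// relative order is unspecified, but every one of them goes through
// type_registry(), so the vector always exists before it is used.
static Registrar register_position{"Position"};
static Registrar register_velocity{"Velocity"};
```

With a plain namespace-scope `std::vector` instead of the accessor function, nothing would guarantee that the vector is constructed before the registrars touch it.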
@ChristianBrugger 1 year ago
This talk has some really great questions and comments at the end.
6 years ago
In most cases you don't need to have the global variable as part of the shared library export table. You can use a linker script to remove the symbol for the variable from the export table and only leave the functions/classes that are actually used by the host program visible. This way the shared library will always get private instances of its own globals.
@quickNir 6 years ago
Unfortunately I didn't have time to talk about visibility. As you said, you can hide visibility and this does resolve these issues, but then you get multiple copies of the global. I don't know if I agree with "most"; certainly any global that you don't own yourself, it's not really safe to hide. E.g. if you write a library that uses a logging library (very common), you shouldn't fail to export globals from that logging library. Someone might want to use the logging library and your library, and not want there to be two global loggers. In some cases it would be ok, if you say have a factory as an implementation detail. But many factories, part of the whole point is that clients of the library can register in it. So I think it simply depends.
6 years ago
Yes, you are right. My comment was intended to be an observation on your idea that more than one instance of the same global is less "evil" than multiple constructor/destructor calls on the same instance.
@ceigey-au 6 years ago
Off topic: hello fellow aviator-wearing sloth person!
@zhaoli2984 6 years ago
nice talk. that trick is neat
@iddn 6 years ago
AFAICT the C++ standard totally ignores the fact that shared libraries exist. Everything is assumed to be a monolithic application. This leads to problems not only with Boost.Log but also Boost.ASIO, and can even prevent mixing the two in the same application. It's a nightmare.
@ryannicholl8661 2 years ago
Static/dynamic are implementation details.
@llothar68 7 months ago
@ryannicholl8661 Implementation should be part of the language. It's insane that stupid people (abstract guys with high IQ but stupid as shit in real life) try to isolate the language from the world it's living in. That we don't have standardized, language-enforced build tools is part of it. This is not the 1980s anymore, where specifying a language is enough. I hate you people from the depth of my heart.
@sanderjobing 6 years ago
With C++14 using gcc 5.4, I tried declaring a static g_str in the header file as shown in the presentation at 19:57, but this does not link: multiple definition of `detail::g_str[abi:cxx11]()'. Anyone tried this solution also? I was thinking of using it as a best practice.
@quickNir 6 years ago
Do you have the include guard for your header file? Sorry if that's a silly question. You definitely shouldn't have problems declaring (defining, to be precise) a *static* variable in the header.
@ryannicholl8661 2 years ago
Of course not, this is an ODR violation.
@kwkfortythree39 6 years ago
I go to slides url but I cannot find the ones corresponding to this talk, only the general repository. What's the direct link please?
@r-jwolthuis7243 6 years ago
It is CppCon's default message. It might be Nir Friedman hasn't given them the slides (yet).
@quickNir 6 years ago
Apologies about that, I need to submit them. Here is the direct link: www.nirfriedman.com/reveal_globals_linker/.
@zhaohui0923 6 years ago
could you please submit a pdf or slide version to the official git repository? I still can't find it. Thanks a lot.
@shivanshu3 6 years ago
This doesn't seem to happen with MSVC. MSVC creates 2 separate globals - one for the dll and one for the exe. But I was able to repro this with g++ on Linux. I'm using cl.exe version 19.11.25506 and link.exe version 14.11.25506.0
@bloodgain 5 years ago
I would see the creation of 2 separate globals as an error. It's certainly unexpected behavior from the programmer's perspective. Here, where the example is just a value, it may not matter, but it certainly matters in many cases -- e.g. the example given of a global logger.
@andersknatten 2 years ago
I have the same problem. This works fine on Linux, but on Windows with cl.exe 19.16.27045 and link.exe 14.16.27045.0 I get two separate globals.
@pauldubois0 2 years ago
It's because in Unix toolchains, symbols in shared libs default to being visible; but in MS toolchains they default to being hidden. See other comment threads on this video for the implications of symbol hiding.
@andersknatten 1 year ago
@pauldubois0 Did you verify this? I don't have the code I used to test with anymore, but I'm pretty sure I exported the symbols when testing with msvc.
@Manava2012 4 years ago
@25:00 Defining a global in a header file is not a possibility, right? You would have multiple globals with the same name, one per translation unit, and the build would fail. Not sure what is meant here.
@keris3920 4 years ago
Globals can be defined in headers. Consider a header only library where all headers are included in a single header file. You can define a global at the top of your aggregate header and use it in all subsequent headers.
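One way to define a global in a header without a multiple-definition error is a C++17 `inline` variable. The sketch below assumes a C++17 compiler; imagine it as the content of a header included by several translation units. The names (`detail::g_label`, `global_label`, `set_global_label`) are hypothetical and not taken from the talk.

```cpp
#include <cassert>
#include <string>

namespace detail {
// C++17 inline variable: every translation unit that includes this header
// sees the same definition, and the linker keeps a single instance.
inline std::string g_label = "initial value";
}  // namespace detail

// Read access to the single shared instance.
inline const std::string& global_label() { return detail::g_label; }

// Write access; the assert is only an illustrative precondition.
inline void set_global_label(const std::string& value) {
    assert(!value.empty());
    detail::g_label = value;
}
```

Before C++17, a non-`inline`, non-`static` definition like this in a header would indeed fail to link once two translation units include it, which matches the concern raised in the question above.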
@lesto12321 1 year ago
As embedded programmers we have tons of globals that need fast access and need to share data between HW interrupt routines and main tasks/threads.
@andersknatten 2 years ago
One important heads up: Make sure that these global variables (or the functions that contain them) are exported in the dynamic library! Both in the case of the inline global, and in the case of the static local (the only two I checked), you get two different copies used in the running process if the symbols are hidden. The loader / dynamic linker is not able to ensure uniqueness of hidden symbols. When making a shared library, it's not unusual to use `-fvisibility=hidden` to only export the symbols you intend to export.
- For the inline solution you can just annotate the variable with `__attribute__ ((visibility ("default")))` to export it, and everything works.
- For the static local solution, you have to annotate the function *containing* the static local with `__attribute__ ((visibility ("default")))`.
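The two annotations described in this comment could look roughly like the sketch below. It assumes GCC or Clang, C++17, and a library built with `-fvisibility=hidden`; the macro `MYLIB_EXPORT` and the names in namespace `mylib` are hypothetical.

```cpp
#include <cassert>
#include <string>

// Hypothetical export macro: expands to the GCC/Clang visibility attribute
// and to nothing on compilers that do not define __GNUC__.
#if defined(__GNUC__)
#  define MYLIB_EXPORT __attribute__((visibility("default")))
#else
#  define MYLIB_EXPORT
#endif

namespace mylib {

// Inline-global case: the variable itself carries the attribute.
MYLIB_EXPORT inline int g_counter = 0;

// Static-local case: the attribute goes on the function that contains
// the static, not on the static itself.
MYLIB_EXPORT inline std::string& logger_name() {
    static std::string name = "default-logger";
    return name;
}

}  // namespace mylib
```

Whether the symbols actually end up in the dynamic symbol table can be checked with `nm -D` on the built shared library.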
@KobiCohenArazi 4 years ago
Nir - around 22:03 - why std::string& g_str and not auto& ? there was a comment from the audience. thanks.
@puyadaravi3109 3 years ago
@Kobi Cohen-Arazi Because then you would be accessing the global via a pointer (he could have used `auto` or `std::string`). Also I think there is a `static` keyword missing for the `g_str` declaration.
@retrograder3303 2 years ago
so tldr; include headers that your TU depends on in a single header
@ryannicholl8661 2 years ago
ODR violation?
@AbbeyRoad69147 10 months ago
He is tricking the linker into not executing the global constructor. Global constructors run before main() or on load of the .so file. No surprise it segfaults.
@lukeskywalker2116 7 months ago
Ada 95 solved this problem and the C++ people said “oh, that’s too hard.” Ada 95 is C++ 2030
@markramirez3920 1 year ago
"C++ uses C linkers" ...
@TheEVEInspiration 6 years ago
So, don't use dynamic linked libraries. There is no need for them in most applications as they offer no benefits, only downsides. The whole thing is an artificial problem and should not impact development concerns.
@quickNir 6 years ago
This statement is rather hyperbolic. Static and shared linking both have their pros and cons. And if you're writing code for other people, they may have a requirement of having you ship a shared library (or both), so it will not be up to you, so the issues here will still be relevant. For a quick but reasonable summary of the trade-offs: stackoverflow.com/questions/1993390/static-linking-vs-dynamic-linking. Another nice point not discussed here is that shared libraries allow function interposition, a specific and common example of which is the "LD_PRELOAD trick". AFAIK the environment variable MALLOC_CHECK_ also depends on interposition. This allows you to quickly and easily rerun your application with a debug or instrumentation oriented version of a function without recompiling. If you were to statically link (in the MALLOC_CHECK_ example, glibc), you can't do any of these tricks and must recompile/relink each time.
@TheEVEInspiration 6 years ago
First, thanks for the response and I know I can be provocative! I would argue that the case you bring up for dynamic linking can be done without dynamically linked libraries just the same. For example at the expense of an indirection step and a different way of compiling and linking. I remember from the 90s the Microsoft C/C++ compiler had the option to compile per function and link only what is needed. When a compiler does that, every function is essentially on its own or part of a graph of code that gets included only when possibly needed. Linking is very fast and could in theory even be done incrementally / just in time. Having a new binary to run with a different implementation of a function in place can be very rapid and with some effort and tooling even with other code changes. Anyway, DLL hell became a meme for a reason and there is still heavy over-use of them IMO. Back when I still did C++, I hated it when a library had no source code with it. It impeded debugging and learning what truly happened and made having everything compiled the same way unlikely. And as for hidden issues like the ones you demonstrated, I actually ran into these in the 90s and had to link libraries twice (back then I did not know why). The only "valid" reason I have seen for shipping libs in compiled form was the fear of customers and others seeing the code. Today with open source and so many security fears in general (not related), as a customer I would prefer having the sources even more than ever before. And hiding implementations is just a technical hotfix for a non-technical problem and that is always a bad thing for someone, usually the customer. Don't take any of this as a critique on you, your presentation was good. I am just voicing a point of view.
@TheEVEInspiration 6 years ago
And it might be my confirmation bias, but I can see a lot of what I wrote also noted on the page you linked.
@quickNir 6 years ago
Well, that would be an extra indirection. For a function that isn't getting inlined the shared library approach has zero overhead. Not to mention that the shared library approach is non-intrusive: you can use interposition on functions defined in libraries you don't control. That's why it's easy to swap in tcmalloc or jemalloc. Anyway I'm not really sure what the big downsides of shared libraries are. If you stick purely to one or the other, then you don't get these issues, so it seems weird to blame it on shared linking. In fact, a global defined in a shared library is always safe, even if people have other static libraries. The inverse is not true (and how I demonstrate the issue). Of the advantages of static linking listed on the SO page, 2/3 (running in limited environments, and startup time) are 100% irrelevant in what I do (HFT). As for making distribution easier, I'm not sure if I really agree. Copying over a directory isn't substantially harder than copying over a single file. Meanwhile, you have all of the benefits discussed above, as well as potentially using less memory and getting better performance when running different, but code-sharing binaries on the same machine.
@TheEVEInspiration 6 years ago
"Well, that would be an extra indirection. For a function that isn't getting inlined the shared library approach has zero overhead" Indirection that is perfectly predictable is practically free. And where non-inlined code is located in memory matters, just as with data layout. Code gets cached, and thus pre-fetch predictability and information density have a performance impact. One of the jobs of an optimizing compiler+linker system is to figure out either by profiling or static call graph analysis where to put code relative to calling code. Code that is hardly ever called must be moved away from hotspots to keep the hotspots dense and increase code cache effectiveness. This matters when ultimate run-time performance is your primary goal. The less of a library is used the less dense the code is from the perspective of your CPU's cache(s). Using very little of the functionality in a library is common. Static linking alone can remove the dead code and have a benefit (if the library isn't a tangled mess). Your use-case might be different in that your typical hotspots reside fully in your main program or library. Then this effect does not affect you as much as each can get optimized on its own. This is a design element creators of libraries need to keep in mind. Too often I see tiny and frequently needed functions put in APIs that are not part of the main program or even have to be invoked remotely.