Modernizing Compiler Design for Carbon Toolchain - Chandler Carruth - CppNow 2023

  Рет қаралды 51,163

CppNow

CppNow

Күн бұрын

www.cppnow.org​
/ cppnow
---
Modernizing Compiler Design for Carbon Toolchain - Chandler Carruth - CppNow 2023
Slides: github.com/boostcon
---
The algorithms and data structures used for parsing and compiling in most compilers today are rooted in 50 year old computer architectures and language design realities. What would a modernized compiler design, based on modern computer architectures, leveraging data-oriented design, and targeting modern languages look like in order to maximize performance while retaining flexibility?
This talk will provide an overview of the traditional model for designing a compiler in C++ and highlight some of the key limitations of these design patterns. Then it will introduce a new set of design patterns that we are using to build the Carbon toolchain's compiler, and describe why we think they can help us reach unprecedented compile times (for a C++-like programming language). It will also show specific programming techniques that have proven useful and how we have overcome some of the traditional limitations and challenges of implementing a parser or compiler in C++.
While this talk focuses on the design and implementation of the Carbon compiler, it will also touch on the Carbon language because achieving our goals for the compiler interacted with the design of the language grammar.
---
Chandler Carruth
Chandler Carruth is the technical lead for Google's programming languages and software foundations. He has worked extensively on the C++ programming language and the Clang and LLVM compiler infrastructure. Previously, he worked on several pieces of Google's distributed build system and made guest appearances helping maintain a few core C++ libraries across Google's codebase. He received his M.S. and B.S. in Computer Science from Wake Forest University, but disavows all knowledge of the contents of his Master’s thesis. When not hammering away on a weirdly shaped keyboard, he enjoys sushi, fine dining, brown spirits, and everything about wine.
---
Video Sponsors: think-cell and Bloomberg Engineering
Audience Audio Sponsors: Innoplex and Maryland Research Institute
---
Videos Filmed & Edited By Bash Films: bashfilms.com/
KZbin Channel Managed & Optimized By Digital Medium Ltd: events.digital-medium.co.uk
---
CppNow 2024
www.cppnow.org​
/ cppnow
---
#boost #cpp #compiler_design

Пікірлер: 65
@binary_gaming113
@binary_gaming113 8 ай бұрын
Missed opportunity to say "keep the carbon footprint to a minimum" in the beginning
@pmcgee003
@pmcgee003 7 ай бұрын
🎉
@superscatboy
@superscatboy 8 ай бұрын
I don't know much about the inner workings of compilers, but the little I do know I learnt from Chandler Carruth.
@CuriousCauliflowerX
@CuriousCauliflowerX 8 ай бұрын
Always good to hear from Chandler, the one of the few folks on the conference with connection to reality.
@samanthaqiu3416
@samanthaqiu3416 3 ай бұрын
he redpilled me on the insanity of the GNU GCC/G++ tightly coupled design based entirely on political principles
@maxrinehart4177
@maxrinehart4177 13 күн бұрын
​@@samanthaqiu3416 where I can read(or see) more about this topic? I assume he wrote or talk about it
@balajimarisetti4245
@balajimarisetti4245 8 ай бұрын
You'll learn a lot from Chandler's talks. Great one again.
@BoostCon
@BoostCon 8 ай бұрын
Thank you for your positive comments!
@maxrinehart4177
@maxrinehart4177 13 күн бұрын
​​​​ I enjoy your videos but please for the love of God, fix your audio static problem, it's hard to hear when the audio stream is is poor. Also please hand a microphone 🎤 to the people who ask questions, I can barely hear them, thanks for Chandler for repeating their questions I can understand what they asked. Invest in audio quality and extra microphone for the audience.
@Bentley070
@Bentley070 8 ай бұрын
Awesome talk! This idea of "semantic IR" is also used by Zig to do type analysis and compile time execution together in a single pass. It does indeed limit type inference though, as you can't use future usages to determine the type of a variable declaration.
@willw2596
@willw2596 4 ай бұрын
I'm in the middle of the talk and it dawns on me. The token in the lexer is basically a tuple of (id, line-number) where the id is the index into the ECS arrays. ECS being the Entiy-Component-System from the game industry. Nice to adopt ideas from the game development.
@about2mount
@about2mount 3 ай бұрын
Actually you are correct the token is either a single character or a group of characters or words without whitespace or spaces between them. A Lexer token will catch the following as tokens in the loop as single characters: In a Lexer they catch: a, z, 0, 9, (, ), {, }, [, ], +, -, *. etc. In the Parser they catch: KEYWORD, LPAREN, RPAREN, myvar, EQUAL, 100, MINUS, 42, MULT, 5, DIV, 2, ADD, 45.332, NEWLINE, x1, EQUAL, DQUOTE I am a sentence, DQUOTE, etc. State Switches, counters and Temporary Variables which are variables as Integers, incrementing and string storage set precedence's conditionally and catch tokens at each iteration in the loop. Struct's as tuples are used to store what your asking as line number, value and id etc. Temporary Variables pass these values into the structs and do so in the loop when specific conditions are met and if errors occur. These structs as Tuples can easily be stored into Log or data files at the end of the process when flags are set in your program also. And for faster processes can re load these struct files as cache files.
@petermuller608
@petermuller608 8 ай бұрын
I really enjoy Chandler's talks
@gregandark8571
@gregandark8571 Ай бұрын
He's a very bright man!
@atomsRnot4717
@atomsRnot4717 8 ай бұрын
Interesting talk on making compilers faster by adopting data-oriented design instead of the usual way of writing lexers and parsers, replacing the incrementally-allocated token and AST structures with faster array-based structures. Unfortunately the number of questions breaks the flow. It seems some of the audience members wanted everyone to know every thought that popped into their heads.
@MrHaggyy
@MrHaggyy 8 ай бұрын
Thats also a great deal for a compiler that just changes a part in a huge code base and makes your changes linkable.
@andy_lamax
@andy_lamax 6 ай бұрын
this was a very curious audience. Love it
@SamualN
@SamualN 8 ай бұрын
the flashes of audio static made this really hard to watch towards the end
@SimonBuchanNz
@SimonBuchanNz 8 ай бұрын
Funnily enough I took a stab at parsing a baby typescript fast, and ended up with a vaguely similar structure, including a semantic tree as a basic block IR. I stalled out when I realized I'd have to actually figure out analysis 😅 Definitely some neat ideas like postorder parse tree.
@hampus23
@hampus23 8 ай бұрын
Very entertaining 👏
@joaosilveira29
@joaosilveira29 8 ай бұрын
great talk!
@brynyard
@brynyard 8 ай бұрын
Great talk, but kinda painful with all the interruptions from the "shadow" gallery.
@msclrhd
@msclrhd 8 ай бұрын
I'm interested in compiler design that supports both IDE (LSP, IntelliJ, etc.) and compiler/interpreter models as there is a lot of shared infrastructure in the lexing, parsing, and semantic analysis side of things. Advanced IDE features like optimization hints, interactive debugging, code completion, parameter hints, semantic highlighting, etc. are all blurring the line between an IDE tool and a full compiler/interpreter. There are also challenges in defining how the compiler/interpreter models the program and how the IDE (LSP, or custom APIs like in IntelliJ) that make it challenging to define an abstract lexer/parser that works in all 3+ environments (command line, LSP/VSCode, IntelliJ or other IDE-specific API).
@a-bm9ux
@a-bm9ux 8 ай бұрын
repeating the questions would gone a long way here
@abuyoyo31
@abuyoyo31 8 ай бұрын
Interestingly, Chris Lattner himself said in a recent llvm talk that he wished he had put a c++ - specific IR into clang, and that pretty much all other llvm-supported languages do. Also, great talk.
@kalidsherefuddin
@kalidsherefuddin 8 ай бұрын
Thanks
@jackgame8841
@jackgame8841 8 ай бұрын
cool talk
@kuhluhOG
@kuhluhOG 24 күн бұрын
One should also look at how Zig does it since they also have a data-oriented approach for their compiler which lead to a very fast compile time.
@chrraz
@chrraz 8 ай бұрын
Very cool talk but it is a shame that the questions are not repeated as it is very hard to hear the audience. They sound like very interesting questions.
@_____case
@_____case 8 ай бұрын
Wake up, honey! There's a new Chandler talk!
@kallehed6330
@kallehed6330 8 ай бұрын
niiiiiiiiiiiiiiiceeeeeee
@petermuller608
@petermuller608 8 ай бұрын
Interesting ideas for compiler frontend design! I do however wonder how much time is spend in the frontend vs. optimization passes in the backend
@ChandlerCarruth
@ChandlerCarruth 8 ай бұрын
On at least a few big C++ codebases where I have measured, it's easily in the ballpark of 70% - 80% time in the frontend for a development or debug build.
@GeorgeTsiros
@GeorgeTsiros 8 ай бұрын
@@ChandlerCarruth pleaseplease tell me you at least touch the subject of incremental/differential parsing? When the programmer inserts a new character, it's terribly wasteful to parse the entire file from the start
@barakap2372
@barakap2372 6 ай бұрын
@@GeorgeTsiros a file is always recompiled when you touch a new character. different code files which aren't touched tough may be incremental compiled
@perschrijver8884
@perschrijver8884 8 ай бұрын
How does the merging of code locations and token kinds into a single 32 bit value work? Is it x bits for token kind and x bits for character index?
@ChandlerCarruth
@ChandlerCarruth 8 ай бұрын
32-bit index is a handle to the token. Can then use that to look up the details of the token, including a byte offset into the source file and the spelling of the token within the source file.
@szaszm_
@szaszm_ 8 ай бұрын
18:18 Shouldn't switch to full screen slide view, since the speaker is talking about something barely related, in response to a question.
@JacobPeralta21
@JacobPeralta21 8 ай бұрын
What's font style and colour scheme in vs code,??? 48:34
@u9vata
@u9vata 8 ай бұрын
There are many parts in this how i approach my toy-language compiler: But interestingly I came from direction from forth to a similar stack machine for syntax tree parsing instead of kind of inventing it downwards from traditional parsers not being good for data oriented design. But some of the techniques seems very close actually... oh... and also the lexer to do parentheses-like checking and error messages too... makes much more sense than the parser doing this. EDIT> Yes, "semantics as IR" is very possible and my approach too, just I call it differently and do it much more text-based than IR in that sense its not a bytecode. Also makes its much easier to develop your compiler, better understand what the various steps do, what the optimizer step do etc. etc.... and the optimization and semantics as IR can be represented as same representations - they not need to be different at all. Actually did not expect carbon compiler design having this cool ideas so might end up a good language in the end. But I think this talk is beneficial to any kind of compiler design practically...
@-taz-
@-taz- 8 ай бұрын
47:50 What's that typeface? It's exactly what I have always been looking for! Sharp, clean, yet tall like the old VGA screen fonts from when I was a kid.
@cemgecgel4284
@cemgecgel4284 8 ай бұрын
If you are talking about the VS Code font, it is JetBrains Mono.
@mdciotti
@mdciotti 8 ай бұрын
Looks like Iosevka
@jmjmjm439
@jmjmjm439 8 ай бұрын
Looks similar to Iosevka
@ChandlerCarruth
@ChandlerCarruth 7 ай бұрын
It's Iosevka, I'm a big fan.
@cemgecgel4284
@cemgecgel4284 7 ай бұрын
Well looks like I did not know what I was talking about. Now that I looked at it again it definitly does not look like JetBrains Mono.
@SentientNr6
@SentientNr6 2 ай бұрын
Reminds me a lot about what Zig & Jai tries to do.
@sumofat4994
@sumofat4994 24 күн бұрын
Have done. Amazing that now everyone wants to join the party when they realise they are slow af.
@maxrinehart4177
@maxrinehart4177 13 күн бұрын
​@@sumofat4994 are you serious? Check your facts please, because Chandler is known in the compilers industry way even before Andrew worked in zig and Jon on jai. In fact they both watch his talks and started incorporating some of his ideas. Jon mentioned him multiple times on his streams.
@br3nto
@br3nto 8 ай бұрын
1:01:04 what’s the name of the person and code that the guy shouts out this point in the vid? “You do know your reinventing ?? or V-Code>??, right?” And “this is how Nicholas ???? Compilers all work?”
@br3nto
@br3nto 8 ай бұрын
I think it’s “P-code” produced by the Pascal compiler and “Niklaus Wirth”
@sanket143
@sanket143 8 ай бұрын
Can someone explain what's struct of arrays?
@ChristianBrugger
@ChristianBrugger 8 ай бұрын
Instead of struct A {int a; int b;}; std::vector. One does struct B {std::vector a_values; std::vector b_values;}; The benefit is 1) no wasted space due to alignment. And 2) if a function only needs a, only those are loaded into cache. So it is more space efficient and more performant.
@guillaumewenzek4210
@guillaumewenzek4210 8 ай бұрын
Array of Struct: {a: int, b: float}[] Struct of array: {a: int[], b: float[]} The second is more cache friendly if you read 'a' ten times more often than 'b'
@shaurz
@shaurz 7 ай бұрын
Instead of accessing array[index].field you have fields in separate arrays, e.g. field[index]
@CartoType
@CartoType 8 ай бұрын
Interesting and well presented, but questions should have been kept to the end. They are inaudible anyway in this video, as well as breaking up the flow badly. Some if the ideas are quite old, such as using a custom stack actually allocated on the heap to avoid running out of stack space when parsing data with a nested structure. I was doing that in the 90s when implementing a web browser on a PDA.
@ChandlerCarruth
@ChandlerCarruth 8 ай бұрын
There was a special mic that AV folks and organizers thought would capture the audience questions so they were audible. =[
@noxabellus
@noxabellus 8 ай бұрын
i can hear the questions with headphones on, fwiw
@user-py9cy1sy9u
@user-py9cy1sy9u 8 ай бұрын
1:01:11 whos compiler?
@shilangyu
@shilangyu 8 ай бұрын
niklaus wirth
@tomw4984
@tomw4984 8 ай бұрын
Shut up and let him present!
@gim913
@gim913 8 ай бұрын
I see it's getting discussed later on, but when I started watching, my first thought was that numbers ~25:00 would have more sense when shown as 0.1 of sec, so 1m, 100k and 10k respectively, or even better, I think many people are used to 60 fps rendering, so it'd be even better posed as: how much code can you cover in 16ms and that would be respectively: 167k loc, 16k loc, 1.6k loc
@AK-vx4dy
@AK-vx4dy 8 ай бұрын
Chandler looks humbled after trying to make own langugage ;)
@rocknroooollllll
@rocknroooollllll 3 ай бұрын
Questions, questions more questions. A sign of an ill-prepared talk if ever there was one. I will be happy when c++ community focuses on the actual language instead of these 'safe' extensions. Yes, Herb, looking at you, too.
@jordixboy
@jordixboy 7 ай бұрын
pretty annoying to assume that everybody comes from university. Im self taught and I can take a book and learn about compiler theory
Rust Features that I Want in C++ - David Sankel - CppNow 2022
1:14:39
КАКАЯ ХИТРАЯ КОШКА! #cat #funny #pets
00:50
SOFIADELMONSTRO
Рет қаралды 17 МЛН
Let's Create a Compiler (Pt.1)
1:11:03
Pixeled
Рет қаралды 431 М.
Zig for Impatient Devs
9:48
Isaac Harris-Holt
Рет қаралды 54 М.
Faster than Rust and C++: the PERFECT hash table
33:52
strager
Рет қаралды 487 М.
Робот зарядка на 65W🤖
0:36
serg1us
Рет қаралды 1,6 МЛН
Почему сканер ставят так не удобно?
0:47
Не шарю!
Рет қаралды 132 М.
The PA042 SAMSUNG S24 Ultra phone cage turns your phone into a pro camera!
0:24