Compute shader 101

Views: 26,740

Raph Levien

A day ago

Comments: 36

@glenneric1 2 years ago

Thank you. I've been looking for a formal treatment of this subject rather than people diving right in.

@sombrastudios 3 years ago

What a TIME to learn about Compute Shaders, what a time! You're a real inspiration for me and the more I see of your work, the more I'm impressed by it

@marschrr 3 years ago

Yay! Awesome timing! Now I certainly have something cool to do on the weekend! Go Raph!!

@IanMcKellar 3 years ago

Thanks Raph! This was great. The YouTube format actually worked pretty well because I was able to pause and watch this over a few days when I had time and brain to pay attention.

@AlanMelling 3 years ago

Thanks for creating and sharing this presentation. I’ve done quite a bit of Metal compute shader programming, but haven’t branched out to the others. This was a good foothold into the space of other options. Thanks again.

@FinaISpartan 3 years ago

An hour of content, time to grab some snacks. Thanks for the upload! Edit 1: at 9:00, as of Vulkan 1.2, some OpenCL functionality is being merged into Vulkan. As a result, there are some projects that have taken to compiling OpenCL source to a Vulkan target with decent results (see clspv and clvk). It's another option; maybe it'll get more support than OpenCL did.

@RaphLevien 3 years ago

Indeed, and I probably should have mentioned that, as it is a viable path going forward. My sense is that clspv is pretty clunky (see linuxreviews.org/The_State_Of_OpenCL_To_Vulkan_Compute_Layers_On_Linux_In_April_2021 for reports of frustration) and that if you're building new stuff, it probably makes more sense to target compute shaders directly rather than through the OpenCL layer.

@Tuasmanque 3 years ago

I feel like one of the biggest bottlenecks is that it's just so hard to target SPIR-V right now as opposed to, say, LLVM IR. Rust-GPU having to write their own backend is an unfortunate example of this. Of course MLIR etc. do exist and there were even rumblings of getting a SPIR-V backend into LLVM itself, but thus far nothing is near the level of maturity for folks without a decent amount of compiler know-how to use.

@pauek 3 years ago

Thanks Raph, such a good presentation. It helped me put on the map many concepts that I had floating around in my mind (the computation model, and workgroups in particular). I feel like I can start thinking about translating certain computations to the GPU model now, the door is open... ;)

@scottlott3794 3 years ago

Thank you so much for this video! I've been digging around compute shaders for weeks and this answered SO many questions I had!

@alskidan 3 years ago

Thanks for sharing this on YouTube!

@Turbanawesomeness 3 years ago

Super excited to try this out!

@frederickmueller7916 3 years ago

Very nice general overview. I got interested in using Vulkan for compute a year ago, and having this would have saved me a lot of pain^^. In the end the amount of boilerplate put me off, and I just gave up and used CUDA. Will definitely check out your piet-gpu-hal though.

@avirads 3 years ago

Excellent talk on such a new topic. Thanks for sharing

@FinaISpartan 2 years ago

Update for 2022: Kompute, a GPGPU lib fostered by the Linux Foundation for Vulkan compute, is looking like a potential contender. Exciting times.

@Turbanawesomeness 3 years ago

A couple of comments/questions: 1. Your comment on scan over a monoid is reminiscent of recursion schemes. Perhaps there is room for a programming model that is a hybrid between the APL and functional styles, centered around catamorphisms, anamorphisms, and the other morphisms as primitives? 2. Do you think it makes sense to use PTX as the target representation when working with NVIDIA GPUs?

@RaphLevien 3 years ago

1. There are languages such as Fortress (research, not currently active) and Futhark that seem to be exploring this space. I'm hopeful for that work to continue. 2. I think PTX is too Nvidia-specific, and that SPIR-V has a similar role but is much more portable. I've been looking at ISA output from shader compilers and feel like they generally do a good job, i.e. I don't think a lower-level representation would be that helpful.

@OvergrowthMash 3 years ago

fantastic intro! thanks

@ArtavazdBalayan A year ago

Thanks for explaining quite a complicated topic in such a structured and easy-to-understand way. Could you please stream some practical examples of creating compute shaders in Rust?

@squirrelcarla 2 years ago

Thank you, I learned so much!

@essamaly5233 3 years ago

Where would you place OpenCL on slide #11 of your deck? Is it the same as CUDA? What about AMD's new HIP? Do you see HIP as equivalent to CUDA? Is there something in the HW that makes HIP not CUDA, or are they pretty much the same?

@klejdiks 3 years ago

Thanks for the awesome video

@jerryzhang7124 2 years ago

this is gold!

@meowzerus 3 years ago

Great talk, very good background overview of the situation! The big concern I have today is the composability of GPU code, given all the fragmentation we have, with libraries/tooling primarily targeting big projects like Servo. This is a notable goal of RustGPU, but it felt like you spent 30-40% of this talk on the red tape, ceremony, limitations, tooling, and quirks of the compute environment alone. For example, Pathfinder is brilliant in many ways but completely out of the question for integration in projects that could benefit from it. Do you think progress is being made in the GPGPU scene to support the average project in Rust in terms of usability and integration feasibility? I work on Fontdue, which has a lot of embedded and web users that just can't use Piet/Druid/Font-kit/etc. or support big C++ dependencies in their build. I'd really like for these people to have a more browser-grade option in the future since Fontdue kinda sucks.

@RaphLevien 3 years ago

These are some deep questions and I have spent some time thinking about them. There's probably a better place than YouTube comments to have this discussion, but here goes. You are absolutely right that composability of GPU code is a huge problem. You can't just link a .so like you would for CPU code; conceptually there's nothing blocking that kind of capability, it just hasn't been done. One hope (that I've expressed) is that the Rust ecosystem rallies around a single runtime, of which wgpu is by far the most hopeful candidate. Then each component can manage its own pipelines, shaders, etc. (for example, font rendering, a topic dear to both of us), and stitch that with the rest of the system. That may or may not happen, as Bevy is experiencing some friction with wgpu, and rafx appears to be gaining momentum. So a fragmented future is definitely a possibility. It's also possible that some engineering effort gets spent on interop tools, things like common traits for accessing resource allocation and the like. I think of it as similar to raw-window-handle, but scaled up to a good chunk of the GPU surface area rather than just getting access to a window. I've given some thought to what that might look like, but it's a lot of work and I'm not likely to take it on myself. All this is worth thinking about, and I certainly hope to spark some work towards those goals.

@meowzerus 3 years ago

@@RaphLevien WGPU does look like the future, but I'm a bit disappointed in it until the GL backend gets some more love. I just skimmed rafx which seems like it allocated more bandwidth to its GL backends, so it looks interesting. The raw-gpu-handle idea also sounds like it would be interesting to see as well. Do you have an idea of the topics you're going to explore and talk about more in the future?

@GrayOlson 3 years ago

@@RaphLevien On the topic of "composability of GPU code", I think that perhaps just as important or even more important than interoperability of the host-side API code is interoperability of the *shader code itself*. This is one of our biggest medium-to-long-term goals with the rust-gpu project at Embark... we absolutely love the way that cargo and crates.io have allowed Rust to embrace a culture of sharing code and we take advantage of this to generally great success on our internal projects. We really want to be able to take advantage of those same things when writing shaders. We're not there yet, and for now are focusing more on getting rust-gpu to the point that it can support all of our game's renderer's use cases as that is essential for us, but we're really looking forward to the (hopefully) near future where we really want to focus on being able to publish "gpu crates" which you can bring in and use just like you would with the current Rust CPU crates.... and through that, sharing as much of our own shader code as we can :)

@HrBDev 3 years ago

Watch it on 1.25x speed, you are welcome!

@pauek 3 years ago

That totally depends on the listener!

@AhmadAli-kv2ho 29 days ago

36.50

@Jkauppa 2 years ago

The thing is that you did not show any simple example; the languages are ON PURPOSE made super complex to implement.

@Jkauppa 2 years ago

the idea of parallelism does not require any explanation itself

@DimiterStanev 3 years ago

Thanks for the great talk. A long time ago, I remember reading this - www2.eecs.berkeley.edu/Pubs/TechRpts/2006/EECS-2006-183.pdf - and it was categorizing 13 distinct "compute problems" as "dwarves" and how well they parallelize. Haven't read it recently, but I remember that FSMs (finite state machines) were the hard problem when it comes to parallelism. To quote from the article (2006): "Although 12 of the 13 Dwarfs possess some form of parallelism, finite state machines (FSMs) look to be a challenge, which is why we made them the last dwarf. Perhaps FSMs will prove to be embarrassingly sequential just as MapReduce is embarrassingly parallel. If it is still important and does not yield to innovation in parallelism, that will be disappointing, but perhaps the right long-term solution is to change the algorithmic approach. In the era of multicore and manycore, popular algorithms from the sequential computing era may fade in popularity. For example, if Huffman decoding proves to be embarrassingly sequential, perhaps we should use a different compression algorithm that is amenable to parallelism."
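
As a rough illustration of why a finite state machine need not be strictly sequential (a sketch only, not taken from the talk or the cited report): each input symbol can be lifted to a table that maps every state to its next state, and composing such tables is associative, so chunks of input can be reduced independently and then combined, in the spirit of the "scan over a monoid" idea mentioned earlier in the thread. The two-state quote-tracking machine and every name below are hypothetical, chosen just for the example.

```python
from functools import reduce

# Hypothetical two-state machine: 0 = outside double quotes, 1 = inside.
STATES = (0, 1)
IDENTITY = tuple(STATES)  # table that maps every state to itself


def transition(state, ch):
    """Ordinary sequential step: a double quote toggles the state."""
    return 1 - state if ch == '"' else state


def as_table(ch):
    """Lift one symbol to a table: table[s] is the next state from s."""
    return tuple(transition(s, ch) for s in STATES)


def compose(f, g):
    """Associative combine: apply table f first, then table g."""
    return tuple(g[f[s]] for s in STATES)


def run_sequential(text, start=0):
    """Reference implementation: step through the input one symbol at a time."""
    state = start
    for ch in text:
        state = transition(state, ch)
    return state


def run_by_chunks(text, start=0, chunk=4):
    """Reduce each chunk on its own, then combine the per-chunk tables.

    The per-chunk reductions do not depend on each other, so they could
    run in parallel; here they simply run in a list comprehension.
    """
    chunks = [text[i:i + chunk] for i in range(0, len(text), chunk)]
    partials = [reduce(compose, map(as_table, c), IDENTITY) for c in chunks]
    return reduce(compose, partials, IDENTITY)[start]
```

Calling `run_by_chunks(text)` should agree with `run_sequential(text)` for any chunk size. The cost is that each element carries a table with one entry per state, so this is only attractive for machines with few states; it says nothing about whether Huffman decoding in particular parallelizes well.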