FAST Random in 3 LINES OF CODE // Ray Tracing series

45,936 views

1 day ago

Comments

@TheCherno 1 year ago

Thanks for watching! ❤ Check out Brilliant for FREE for a full 30 days, visit brilliant.org/TheCherno. The first 200 of you will get 20% off Brilliant’s annual premium subscription!

@ConfusedNDazed 1 year ago

All I want for Christmas this year is another video in this series 🎄

@ng.h9315 1 year ago

That's wonderful 👍 Please continue this course (Ray Tracing) until the end...❤️ I love learning Ray Tracing, and your teaching is awesome 👍

@mateuszgaecki5949 1 year ago

You forgot to add it to the playlist

@PawelStoleckiTechArt 1 year ago

Thanks for the awesome series, looking forward to the next episodes.

@sumikomei 1 year ago

I've said it before but I really like hearing your thoughts behind things. It's very interesting and helps me a lot, especially as a newer programmer. I'd also really like to see the switch to compute shaders before moving to dedicated raytracing hardware.

@fischi9129 1 year ago

Ngl, this series is so great, it takes something that looks insane, and to be honest, if you start from scratch and need to find all the material on your own, it would probably take a long long time. And it just preps it up nicely, step by step, and makes learning about graphics really simple and approachable :) Also, the videos being 20 mins long makes it really easily digestible :)

@mastershooter64 1 year ago

Woahhh new Cherno video let's goooo, thanks for your C++ and OpenGL series Cherno, they've been super helpful!!

@ben256dev 1 year ago

Dealing with random devices and stuff deterministically has been a bit frustrating for me lately as I've been building a particle system. The random method you defined here seems like it could be really useful to me in the future! Even though I'm not following along with my own Ray Tracer I really do appreciate the videos in this series. :)

@simonfrohlich7766 1 year ago

The way you implemented it, the random unit vector is not distributed uniformly on the unit sphere but is more likely to give values further from the axis. Just something to look out for...

@JakobWesthoff 1 year ago

Thanks for all the videos. Really enjoying them. The middle step you are talking about, moving the code to compute shaders at some point in the future, is something I would be really interested in, as it would be a perfect introduction to the idea of using the GPU for this before things become more of a black box when moving to the raytracing cores.

@Jurkox26 1 year ago

You can move it on the GPU with Vulkan without having RTX cores. The only requirement is 6GB of VRAM.

@alexsemm816 1 year ago

Really enjoying the series so far. I am actually already somewhat hyped for when we start moving to the GPU. This series is really great!

@mjthebest7294 1 year ago

probably will never happen, looks like a dead series

@xXbenjman3000Xx 1 year ago

@@mjthebest7294 Yeah, that's sad. It was a really good source for learning raytracing.

@olavonstad7297 1 year ago

13:15, please continue to include those sort of things🙏Gold nuggets, not a waste of time imo😄

@mackerel987 8 months ago

Please continue this series😭

@osakaandrew 1 year ago

I've been studying floating point and assembly optimizations recently and got thinking about your `divide by 0xffffffff`. I know the compiler will find a way to avoid that expensive division in the final assembly, but I'm now hyper-sensitive about using division in code and it got me thinking. You can retain all the integer binary data you get from your `PCG_Hash()` function, then overwrite the top 9 bits (what will become the sign and exponent portions of the float) with 0b001111111 (so sign is 0 and exponent is 127), then change the datatype to a float. This will give you a number between 1.000... and 1.999..., so finally subtract 1 to get your random float. You could also get a value between -1 and 1 by changing that bitmask above to not overwrite the sign bit. Again, I know it's stupid to try to outperform the compiler, but I'm pretty sure this would be faster.
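A minimal C++ sketch of the bit trick described above, assuming IEEE-754 single precision. `BitsToUnitFloat` is a hypothetical name, not something from the video, and the sketch keeps only the low 23 bits of the hash as the mantissa.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: map 32 random bits to a float in [0, 1).
// Assumes float is an IEEE-754 binary32 value.
inline float BitsToUnitFloat(uint32_t bits)
{
    // Keep the low 23 bits as the mantissa; force sign = 0 and exponent = 127.
    uint32_t pattern = (bits & 0x007FFFFFu) | 0x3F800000u;

    float f;
    std::memcpy(&f, &pattern, sizeof f); // reinterpret the bits without aliasing UB
    return f - 1.0f;                     // [1, 2) becomes [0, 1)
}
```

Leaving the sign bit random would give values in (-2, -1] and [1, 2) rather than a clean [-1, 1] range, so a signed variant is probably easier to get by applying `* 2.0f - 1.0f` to this result. Whether any of this beats the plain division would need profiling.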

@Bobbias 1 year ago

So, I've been reverse engineering/modernizing a game from 1996 (they released the source, but some parts are missing), and the number of times I've actually seen a division instruction is quite small. Even compilers back then avoided divides quite well. You shouldn't be scared to write something in terms of division. If you're concerned, then look at the generated assembly and ensure there's no division used there (you can quite easily check using a debugger, or even something like Ghidra). Unless you can prove that the compiler is generating an unnecessary division, you should just write things out plainly.

@osakaandrew 1 year ago

@@Bobbias I agree. I've been writing a lot of WebAssembly recently, and it doesn't have an optimizing compiler (unless you're compiling from C/C++ or Rust), so dividing by a constant here leads to performance penalties identical to what you'd expect from FDIV vs FMUL instructions. Games from 1996 probably would target the original Pentium? So maybe you're seeing FP instructions? Or maybe still using fixed point. But definitely, developers back then would have tried to avoid both DIV and FDIV instructions. Looking it up in Agner Fog's tables for the Pentium, DIV takes twice as many clock cycles as MUL, and FDIV takes at least 7x longer than FMUL. I definitely agree that it's a good idea to stick to readable code. Not only is it easier for you and others, it usually makes it more likely your optimizing compiler will find ways to squeeze maximum possible performance out of it. Writing "pre-optimized source code" usually breaks both of these. In this one particular case though, I think it's worth investigating going with this floating-point trick. I *think* you get the same degree of randomness with the same precision but higher performance. But it would take profiling to know for sure.

@ABaumstumpf 1 year ago

@@germanassasin1046 "it smells like ub to me." Depends on the exact implementation. And you do not want to look at any network-code implementation, because they rely on a lot of UB there :P Accessing a float as a 'char' is explicitly allowed, accessing it as an 'int' is not. Well, there are other ways that are allowed by the standard: memcpy is one of them. It lets you copy the bit patterns between incompatible types and will most of the time compile to the exact same assembly as just casting (zero instructions). With C++20 there is also 'bit_cast'.
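The memcpy route mentioned above could look roughly like this. `FloatToBits` and `BitsToFloat` are hypothetical helper names, and the sketch assumes `float` is 32 bits wide.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

// Hypothetical helper: read the bit pattern of a float without aliasing violations.
inline uint32_t FloatToBits(float value)
{
    uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// Hypothetical helper: build a float from a raw bit pattern.
inline float BitsToFloat(uint32_t bits)
{
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}
```

Compilers typically reduce these small fixed-size copies to a register move, which is why the approach usually costs nothing compared with a pointer cast.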

@vladmunteanu5864 8 months ago

WHERE IS THE NEXT VIDEO! I love this series please continue

@arnedebeer 1 year ago

Eeyo, keep them coming! Really looking forward to see where the series is heading 🔥

@luquest1848 5 months ago

Deserves a like just for the straight face on the "quick one" gag - even if I am 12 months late to the party

@szirsp 1 year ago

12:45 I feel like using numeric_limits<uint32_t>::max() is just using UINT32_MAX with extra steps :) (Technically it's not, because UINT32_MAX is declared in stdint.h and numeric_limits<uint32_t>::max() is using UINT_MAX, which is in limits.h, when int is 32 bit, but still it's mostly just more work for the compiler and the developer to write/read.) To make it more C++ like (and actually more useful) I would suggest writing: numeric_limits<decltype(seed)>::max() ...which would allow changing the seed type without touching the function body.
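A sketch of how a type-following maximum could be written, assuming the function parameter is called `seed` and using `decltype` as one plausible reading of the suggestion. `ToUnitFloat` is a made-up name for illustration.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical function: scale a seed into [0, 1] using the maximum of its own type.
// Changing the parameter type (say, to uint16_t) needs no edit to the body.
inline float ToUnitFloat(uint32_t seed)
{
    return static_cast<float>(seed) /
           static_cast<float>(std::numeric_limits<decltype(seed)>::max());
}
```

For a `uint32_t` parameter this divides by the same value as `UINT32_MAX`, so the behaviour is unchanged; the benefit is only in maintainability.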

@oleit3 2 months ago

Please continue this series!

@miroaja1951 1 year ago

I hate to be mathematically pedantic about GPU-optimized code, but when it comes to generating numbers on spheres, the approach shown at 14:15 is not all that good. Because the procedure produces numbers in a cube and then normalizes them, it causes a really uneven distribution of points on the sphere, clumped along the lines where the edges and corners of the cube project to (imagine there being more space to put points there than along the centers of faces, thus less even). To fix this clumping you should be using a normal distribution, which is radially symmetric, thus eliminating this whole square issue. There are some GPU implementations of normal distributions, and after some quick SO skimming it seems they're generally speaking pretty fast too!
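One way to sketch the normal-distribution approach in C++ on the CPU side. `Vec3` and `RandomUnitVector` are hypothetical names, and `std::normal_distribution` stands in for whatever Gaussian source a shader would actually use.

```cpp
#include <cmath>
#include <random>

struct Vec3 { float x, y, z; };

// Hypothetical helper: direction distributed uniformly on the unit sphere.
// Three independent Gaussian samples form a radially symmetric vector,
// so normalizing it introduces no directional bias.
template <class Rng>
Vec3 RandomUnitVector(Rng& rng)
{
    std::normal_distribution<float> gauss(0.0f, 1.0f);

    float x, y, z, lengthSquared;
    do {
        x = gauss(rng);
        y = gauss(rng);
        z = gauss(rng);
        lengthSquared = x * x + y * y + z * z;
    } while (lengthSquared < 1e-12f); // retry in the unlikely case of a near-zero vector

    const float inverseLength = 1.0f / std::sqrt(lengthSquared);
    return { x * inverseLength, y * inverseLength, z * inverseLength };
}
```

It can be called with any standard engine, for example a `std::mt19937` constructed once per thread.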

@Alkanen 1 year ago

We've pointed this out since the first RT video where he implemented InUnitSphere about a year ago :D

@miroaja1951 1 year ago

@@Alkanen :D and yet the cube remains

@Alkanen 1 year ago

@@miroaja1951 yeah :) I think @TheCherno said that it doesn't matter though, this whole thing will be replaced with something more appropriate (cosine probability?) later in the tutorial.

@miroaja1951 1 year ago

@@Alkanen that'd be pretty fun

@MichaelPohoreski 1 year ago

@@Alkanen 14:46 Cherno mentions replacing it with a cosine-weighted distribution in the future

@davidcfrogley 3 months ago

Phew, more than a year after the video originally dropped and I've finally caught up on the ray tracing playlist. It's kind of like reading the latest book from your favorite author and then having nothing to read until the next book is finished, which can take years in some cases. So, looking forward to the next ray tracing video, whenever that appears.

@vacuumdiagrams652 1 year ago

I think the biggest problem with the "hacky" implementation of InUnitSphere is that it's rather strongly biased towards generating points in the corners of an axis-aligned cube, as opposed to uniformly on the sphere (intuitively, you're less likely to generate x and y near 0 with z nonzero than to generate all three coordinates far from 0). The 'correct' way to generate uniform points on the sphere is to pluck the numbers from a normal distribution and normalize. Box-Muller is popular, but a quick and dirty way that would probably work just fine is to add up a few calls to your uniform generator (3-5 should be enough), so the central limit theorem saves you.
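The quick-and-dirty central limit theorem variant could be sketched like this. It assumes the widely published PCG hash (which may differ in detail from the video's `PCG_Hash()`), sums four uniform samples per coordinate, and is only approximately uniform on the sphere. All function names here are hypothetical.

```cpp
#include <cmath>
#include <cstdint>

struct Vec3 { float x, y, z; };

// Widely published PCG-style hash; assumed, not copied from the video.
inline uint32_t PcgHash(uint32_t input)
{
    uint32_t state = input * 747796405u + 2891336453u;
    uint32_t word = ((state >> ((state >> 28u) + 4u)) ^ state) * 277803737u;
    return (word >> 22u) ^ word;
}

// Advance the seed and return a uniform float in [-1, 1].
inline float RandomSigned(uint32_t& seed)
{
    seed = PcgHash(seed);
    return static_cast<float>(seed) / static_cast<float>(UINT32_MAX) * 2.0f - 1.0f;
}

// Sum of four uniform samples: roughly bell-shaped by the central limit theorem.
inline float ApproxGaussian(uint32_t& seed)
{
    return RandomSigned(seed) + RandomSigned(seed) +
           RandomSigned(seed) + RandomSigned(seed);
}

// Approximately uniform direction on the unit sphere.
inline Vec3 ApproxUniformDirection(uint32_t& seed)
{
    float x, y, z, lengthSquared;
    do {
        x = ApproxGaussian(seed);
        y = ApproxGaussian(seed);
        z = ApproxGaussian(seed);
        lengthSquared = x * x + y * y + z * z;
    } while (lengthSquared < 1e-12f); // avoid dividing by a near-zero length

    const float inverseLength = 1.0f / std::sqrt(lengthSquared);
    return { x * inverseLength, y * inverseLength, z * inverseLength };
}
```

This costs twelve hash calls per direction instead of three, so the trade-off between bias and speed would need measuring in the actual renderer.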

@vacuumdiagrams652 1 year ago

For those who might be curious, here's a quick Python script illustrating the problem in two dimensions:

    import numpy as np
    import matplotlib.pyplot as plt

    rng = np.random.default_rng()

    def generate(n):
        # n points uniform in the square [-1, 1]^2, one column per point
        pts = rng.uniform(-1, 1, (2, n))
        # Euclidean length of each point (square root needed to normalize)
        norms = np.sqrt(np.sum(pts ** 2, axis=0))
        return pts / norms

    plt.scatter(*generate(10000), alpha=0.002)
    plt.show()

Run this a few times and you'll see every time there's more points near the corners of the square.

@TheCherno 1 year ago

But it doesn't matter, and we shouldn't even be using a uniform distribution here in the first place.

@kirikiri-san5324 1 year ago

@@vacuumdiagrams652 Not a bad idea, although for a more involved path tracer (for example, Monte Carlo integration using MIS), drawing samples from a uniform distribution and transforming them to another distribution using the inversion method might give great results. Of course, I guess this is more important if we have realism rather than interactivity in mind.

@pierreollivier1 1 year ago

Another way would be to build an array of random numbers at launch, depending on your needs, and use that pool either as seeds for further generation or directly as values. I'm not sure how many random numbers you need, but caching is a pretty useful technique for optimisation purposes.

@pierreollivier1 1 year ago

Another approach could be to use a sort of grid system, where each grid cell has a "true" ray surrounded by fake rays derived from the true ray, although the overhead might not be worth it.

@asteriskman 1 year ago

love that sweater. also, for my money I like the little side quest diversions into like the C macro vs the C++ std::

@Alkanen 1 year ago

Yay, more RayTracing series!

@enveraaa8414 11 months ago

Thanks for the series ❤

@haps3000 3 months ago

When is the next video coming out for this series!?

@u9vata 1 year ago

Your video came right in time, because I might use this for randomization in one of the algorithms I'm working on. It is totally not for graphics, but the algorithm needs speed and doesn't really need cryptographically good properties at all!

@nickred7701 1 year ago

Would love to see another video in the series!

@juhotuho10 7 months ago

Just caught up in my Rust version! would be super cool to get more episodes for the series some day

@raith-san 1 year ago

Very cool, when's the next one?

@masheroz 1 year ago

You could also add in the system time to get some extra change.

@Volian0 1 year ago

good luck using that in a gpu shader

@boggless2771 1 year ago

@@Volian0 You can pass it in from the CPU to the GPU on a per-frame basis.

@lucass8119 1 year ago

Sorta defeats circumventing the system Random generator if you're just gonna get system time. I would imagine a syscall is going to be MUCH more expensive than a few shifts and multiplies.

@aethreas 3 months ago

Honestly how do you get 14 episodes into a raytracing series without discussing ray-triangle intersections or BVH construction? Casting rays and accumulating light is the easy part

@VoidloniXaarii 1 year ago

Very interesting! Thank you

@jfriday_13 1 year ago

Definitely interested in seeing this coded as a compute shader!

@DragoniteGod-qb8xj 1 year ago

What about importance sampling. For better quality and also maybe a way to shoot more rays in places that matter and less rays in places that don’t

@teksatan4699 1 year ago

When I was a young lad, I needed a truly random number generator. "pseudo-random" didn't sound good to me lol, so I ended up having my app query a URL that contained a quantum random number generator (some college has it up still) that used quantum physics to generate true random numbers. I would save a massive string of numbers, and then use the random values from that as needed. If I used up the list, I would query a new one in a thread. (I needed something that could not be predictable in any way) What could be less predictable than quantum physics lol. That URL is still active BTW and you can still use it. EDIT: Just so you guys know this is a valid and rather fast method of getting random numbers. Before you actually run out of your stored list of random numbers, query the URL in a thread to repopulate the list, and continue. It's as fast as can be, since you're just looking up a memory location, rather than generating the number on the fly. (Not saying this is useful for your situation, it's just another way of acquiring random numbers that are truly random, which is particularly useful for hashing and encryption algorithms)

@ABaumstumpf 1 year ago

Nowadays CPUs have hardware-sources for random numbers - a lot faster than querying something over the network.

@teksatan4699 1 year ago

@@ABaumstumpf You misunderstand exactly how this would function. You clearly aren't very experienced. This process is widely used (storing a large string of random numbers, and using it throughout the lifetime of your application runtime). You would for example store a randomly generated string like: "ae45tba3hta4hvgt4gvthagv4thgv4tha3gv4thag4v3jbtjk4htba4jhtbja4hbtjkah4btjhb45tsj4hbtsja4hbtsje4hbtjskehb4ts4hbtebzw783r8ow73bt8ow47va87hnt984ntv8947bt8947ngtva894gtnv8a7o98aw394v8ao5ogtno8sa3" (but MUCH larger) and anytime you need a random string you would take a chunk from the random string, like the first 8 characters for example: "ae45tba3". You would do this thousands of times, until your string runs empty, at which point you query the internet for a new random string. *You ARE NOT querying the internet each time you need a random number* You only query the internet right before your global string of random numbers/chars runs empty. (every couple million calls to your random string/number function)

@teksatan4699 1 year ago

@@ABaumstumpf CPUs *CAN NOT* generate "truly random numbers". There are a couple of ways to do this: *you build a custom quantum random number generator for your computer* (to the tune of several million dollars), or *you use someone else's hardware over the internet*. Most applications do not need "truly random numbers/strings", but some do, and in those cases, such as data encryption, there is only one way to acquire TRULY random numbers that cannot be reverse engineered or decrypted, depending on the use case.

@teksatan4699 1 year ago

@@ABaumstumpf There are hardware methods such as using fan noise, mouse movements, and other similar things, but these can still technically be "deciphered" if you will. Especially if the system is compromised, because a bad application can also record mouse movement, and fan noise, etc. that can then be used to find out the random strings you may have generated with fair accuracy. If you want a truly random number that stands almost no chance of someone figuring it out, you must use an off-system generator (A generator not directly connected to the system that is using the random strings), like I described.

@teksatan4699 1 year ago

@@ABaumstumpf I should also add, you can't store this as a normal block of allocated memory like a single "std::string or char*", because a memory dump of the system would reveal your global random string. The memory would have to be broken up, itself encrypted, or handled in a few other ways, but you would only want to use methods that would be faster than simply making a random() system call, for example.

@JayNakum_ 1 year ago

more videos please!

@npip99 1 year ago

8:15 Having a new seed per frame seems useful; I'm actually confused because if accumulate is used doesn't it need to be new seeds every frame for that accumulate?
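For what it's worth, accumulation only reduces noise if each frame contributes different samples, so the seed does need to vary per frame. A minimal sketch of one way to derive it, assuming the widely published PCG hash and hypothetical names (`MakeSeed`, `frameIndex`); the video may combine these values differently.

```cpp
#include <cstdint>

// Widely published PCG-style hash; assumed, not copied from the video.
inline uint32_t PcgHash(uint32_t input)
{
    uint32_t state = input * 747796405u + 2891336453u;
    uint32_t word = ((state >> ((state >> 28u) + 4u)) ^ state) * 277803737u;
    return (word >> 22u) ^ word;
}

// Hypothetical helper: a seed that differs per pixel and per accumulated frame.
inline uint32_t MakeSeed(uint32_t x, uint32_t y, uint32_t width, uint32_t frameIndex)
{
    const uint32_t pixelIndex = x + y * width;
    // Hash the frame index first so that neighbouring frames do not simply
    // shift the per-pixel pattern by one.
    return PcgHash(pixelIndex ^ PcgHash(frameIndex));
}
```

The frame index is plain data, so it can be handed to a shader as a uniform or push constant once the code moves to the GPU.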

@TheMark147 9 months ago

great series I cannot wait when we will move math to shaders :))

@wuspoppin6564 1 year ago

Crazy how just changing the way our random numbers are generated can shave a few ms off the frame time

@wuspoppin6564 1 year ago

well, "random" but still true

@charababis6321 1 year ago

@@wuspoppin6564 random number generation is quite an expensive task

@Saturate0806 1 year ago

11:30 Jesus what a naughty little hack

@REktSigMa 1 year ago

What about LUT materials? Can you create your own LUT for your scene quality? In most engines, like Radiant, you are able to edit any LUT material in Photoshop, then copy and paste the edited LUT over a duplicate of a default LUT that is already installed. That makes for your own dynamic.

@aloluk 1 year ago

At 14:30, why don't you take the *2 - 1 outside the creation of the vector?

@satheyaseelansugieethan6602 1 year ago

Hi, can you add to the C++ playlist with Data Structures and Algorithms targeted with the aim of building a base for coding assessments please.

@JATmatic 1 year ago

This is a nice series, as it allows me to tinker with someone else's code. The math can maybe be sped up by `const auto SCALE = 1.0 / UINT32_MAX` and multiplying the random uint by that. This avoids division. There is also some funny code, like doing non-SIMD code between some glm operations.😉
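A sketch of the multiply-by-reciprocal idea. It uses a `float` constant of 2^-32 rather than the `double` expression in the comment, on the assumption that the original code divides by `(float)UINT32_MAX`, which rounds to 2^32. `RandomFloatMul` is a hypothetical name.

```cpp
#include <cstdint>

// Hypothetical helper: convert a 32-bit hash to a float in [0, 1] by multiplying.
inline float RandomFloatMul(uint32_t value)
{
    // 2^-32 is a power of two, so it is exactly representable as a float
    // and the multiplication only adjusts the exponent.
    constexpr float kScale = 1.0f / 4294967296.0f;
    return static_cast<float>(value) * kScale;
}
```

Because the scale is a power of two, this yields the same values as dividing by `(float)UINT32_MAX`; whether it is measurably faster depends on what the compiler already does with the division.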

@itsukishuun 1 year ago

This series has been very informative. Any chance you'll be adding objects other than spheres? Or did I miss it?

@ScibbieGames 1 year ago

It's spheres because they're very easy to do. Other shapes aren't particularly hard, however: you only need to calculate the ray collision with a flat surface made of three points (a triangle). Every other shape in computer graphics is made up of triangles, so once you've got a single triangle working you're good.

@Scotty-vs4lf 1 year ago

ray and triangle intersections are very simple. check where/if the ray intersects the plane that the triangle makes up and then check if that point is within the triangle
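The two steps described above (intersect the plane, then check that the hit point lies inside the triangle) can be sketched as follows. `Vec3` and `RayTriangle` are hypothetical; a real renderer would more likely use glm types and a tuned routine such as Möller-Trumbore.

```cpp
#include <cmath>

struct Vec3 { float x, y, z; };

inline Vec3 operator-(Vec3 a, Vec3 b) { return { a.x - b.x, a.y - b.y, a.z - b.z }; }
inline Vec3 operator+(Vec3 a, Vec3 b) { return { a.x + b.x, a.y + b.y, a.z + b.z }; }
inline Vec3 operator*(Vec3 a, float s) { return { a.x * s, a.y * s, a.z * s }; }
inline float Dot(Vec3 a, Vec3 b) { return a.x * b.x + a.y * b.y + a.z * b.z; }
inline Vec3 Cross(Vec3 a, Vec3 b)
{
    return { a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x };
}

// Returns true and writes the hit distance t if origin + t * dir hits triangle (a, b, c).
inline bool RayTriangle(Vec3 origin, Vec3 dir, Vec3 a, Vec3 b, Vec3 c, float& t)
{
    // Step 1: intersect the ray with the triangle's plane.
    const Vec3 normal = Cross(b - a, c - a); // unnormalized plane normal
    const float denom = Dot(normal, dir);
    if (std::fabs(denom) < 1e-8f)
        return false;                        // ray is parallel to the plane

    t = Dot(normal, a - origin) / denom;
    if (t < 0.0f)
        return false;                        // plane is behind the ray origin

    // Step 2: the hit point must lie on the inner side of all three edges.
    const Vec3 p = origin + dir * t;
    if (Dot(normal, Cross(b - a, p - a)) < 0.0f) return false;
    if (Dot(normal, Cross(c - b, p - b)) < 0.0f) return false;
    if (Dot(normal, Cross(a - c, p - c)) < 0.0f) return false;
    return true;
}
```

Doing this for every triangle of a mesh is what makes an acceleration structure such as a BVH necessary once scenes grow beyond a handful of primitives.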

@visionary_3_d 1 year ago

Loved it

@ale-lp 1 year ago

I've been toying with my own CPU raytracer for some time and was excited to see a faster random number generation as it is the most expensive computation I have at the moment, but it turns out it's not really faster than the good old C function rand(). At least in my environment.

@bmericd 1 year ago

Can we implement Intel's open image denoiser later along the path? I think it would be quite fun to experiment with denoisers.

@johmarjac 1 year ago

Bro, please tell me where you bought that awesome pullover. I want that tooo @The Cherno

@user-ib3gl6pj6h 1 year ago

Hi, when will you switch to Vulkan?

@Fed981981 1 year ago

You could make the float result even faster if you would multiply by the inverse of max int.

@journeyman9162 1 year ago

So you mean 0?

@ietsization 1 year ago

@@journeyman9162 in that context inverse means the reciprocal so 1/max int. However that number is pretty small and you might lose precision in the float representing it.

@journeyman9162 1 year ago

@@ietsization That's what I meant. Precision will be lost and it will just be 0.

@Fed981981 1 year ago

Obviously my comment assumed the number is a float before dividing, just like in the video. However, given the way floats are represented in binary, the precision is maintained (in the mantissa); only the order of magnitude changes, which gives a highly negative exponent.

@journeyman9162 1 year ago

@@Fed981981 Interesting. I didn't know that.

@xenopheliac7202 1 year ago

Hey - I've been loving this series, and started on the Physically Based Rendering book, but I've been worried that AI might invalidate all the time spent learning this, thoughts?

@Infinity-js5vk 1 year ago

AI isn't a cheat tool that can do everything, you must be specific and still be knowledgeable. It may seem that AI opposes learning, but quite the opposite actually. Learning benefits your usage of the AI.

@asandax6 1 year ago

@@Infinity-js5vk You're correct. AI is just an accelerator. It accelerates the ability for people to test their ideas. Coming up with those ideas still requires knowledge. Give AI to someone who knows nothing about programming and they wouldn't suddenly be able to make programs at the level of even amateur programmers who have 2 weeks of experience.

@AlefCS 3 months ago

Why cast both operands of the division? Wouldn't it be enough to cast only one of them?

@MrSandshadow 1 year ago

can you do the episode on the game engine architecture ?

@PoorlyMadeSweater 1 year ago

I had an issue where I had to spin up a ton of threads that all needed to instantiate their own Random generating object, which destroyed performance (OOP amiright?). I ended up making a global generator that iterated its seed with every use, but that wouldn't work in shader code. Nice little generator.

@binoice1301 1 year ago

You could have passed delta time every frame so it changes the seed gen function, maybe?! 😅

@vladyslav007 1 year ago

I wonder about the impact of such if-else branching on performance. Or does the branch predictor relearn after N iterations so that everything is okay? Missed branches are a pain sometimes.

@SaHaRaSquad 1 year ago

Branch prediction is very accurate with many loop iterations, it's unlikely to go wrong aside from a few frames after each toggle.

@anon_y_mousse 1 year ago

I still remember trying to write some code on a 533 MHz Celery back in the day and finding the branch predictor screwing up so often that I tried to write as much code as I could without branching. Computers have gotten better, but it's still not bad advice to avoid branches if you can, just not to the detriment of clarity in your code, because if you bugger that up, you'll be in for a world of hurt later on when trying to modify your own code.

@lifeisoof 1 year ago

00:47 do it

@masheroz 1 year ago

Random!

@orelronorelron 1 year ago

The dotnet core GC code has something like that (:

@tabletopjam4894 1 year ago

I thought this series was dead again, I really like watching and learning but this series is very sparse in release schedule

@ABaumstumpf 1 year ago

GLSL had a noise function... Nvidia made a useless (but somewhat compliant) implementation returning 0, and AMD never even followed the specs. There were many required functions they didn't implement at all, leading to the software not even running on ATI/AMD cards. That was so frustrating when trying to figure out why sometimes the rendering was mangled, sometimes it worked, and sometimes it just crashed. Turns out running it on AMD, Intel or Nvidia was the reason.

@SillyOrb 1 year ago

Just make the serialisation a standalone "side quest" to the series instead of a regular video. Take the scene, treat it as just an example dataset and only focus on the serialisation. If I am not mistaken, you have already done that with some other feature and it worked out.

@andrewsavastysnov6891 1 year ago

Why not set the seed from the current time? That would make it change from frame to frame.

@SaHaRaSquad 1 year ago

Because it's unnecessary, slower and probably not available on the GPU.

@decky1990 1 year ago

Why Mersenne Twister? There are other distributions, and the members weren’t declared as static? Will it not be restarting the pseudo-random sequence every time, just on different threads?

@jdudeofficial4263 1 year ago

HELLO

@wonder3583 1 year ago

Your function for generating random floats is going to return a lot more 1.0s because UINT32_MAX cannot fit in a float. You'd need to do the division in double precision and then convert to a float.

@feschber 1 year ago

UINT32_MAX fits in a float very easily
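Both points can be checked directly. UINT32_MAX is well within the range of a float, but a float has only 24 significant bits, so under the default rounding mode the conversion rounds it up to 2^32. A small sketch, assuming IEEE-754 floats and a hypothetical `ToUnitFloat` that mirrors the division from the video:

```cpp
#include <cstdint>

// Hypothetical stand-in for the video's conversion: hash value to float in [0, 1].
inline float ToUnitFloat(uint32_t value)
{
    // Both operands are rounded to the nearest float before dividing;
    // (float)UINT32_MAX becomes 4294967296.0f, i.e. 2^32.
    return static_cast<float>(value) / static_cast<float>(UINT32_MAX);
}
```

With this function, the 128 largest inputs (0xFFFFFF80 and up) come out as exactly 1.0f, while 0xFFFFFF7F still maps to the largest float below 1.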