If you want any language to have good performance numbers, compare it to Python.
@ThePrimeTimeagen Жыл бұрын
classic w
@anon-fz2bo Жыл бұрын
Lol facts
@JakobKenda Жыл бұрын
MY language (slj) is 4x faster than python (really) (it's 18x slower than C tho)
@suarezlifestyle Жыл бұрын
Python is not fast lol, do better research
@rjandonirahmana4326 Жыл бұрын
@@suarezlifestyle exactly, that's why you gotta compare it with python
@catcatcatcatcatcatcatcatcatca Жыл бұрын
Can’t wait for python programmers to evolve into Mojo programmers who just never use any of the new stuff, but now can say the language they write in is using modern process optimisations and cache-efficient data structures. Kinda like C++
@aleph0540 Жыл бұрын
LMAO why violate them like that!
@aoeu256 Жыл бұрын
Did people forget about Julia & LISP? You could easily use macros (compilers) to turn one high-level S-expression format into an intermediate S-expression format and then into C S-expressions. Even in Python today you have @numba.jit(nopython=True) to turn static-looking, C-like Python into compiled code.
@aoeu256 Жыл бұрын
Why don't people write their code in terms of relational constraints? Like: I need this and this and this, then use chatbots and solvers to generate code and custom hardware for the application. A good way of transforming constraints into code is Lisp's s-expressions and quasiquoting, which sounds like Julia.
@aleph0540 Жыл бұрын
@@aoeu256 Too complex; high-level scripting is the closest you'll get to that. You're talking about something that exceeds semantic and symbolic language — conceptual language. Doable, but it will likely involve probability; I don't think LLMs are it, though.
@vectoralphaSec Жыл бұрын
From Pythonistas -> Mojicians.
@catcatcatcatcatcatcatcatcatca Жыл бұрын
Cache-line-sized vectors being their own type is a pretty brilliant idea. It probably allows even better performance than doing that manually, and it also just reduces typing.
@ChamplooMusashi Жыл бұрын
Aligning types to cache-line size is a great optimization in general. People most often think of optimizations in terms of big-O improvements, but the reality is that making something 8x faster is a substantial speedup. Even making something 10 or 30% faster shaves several minutes off a job that takes an hour.
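The back-of-the-envelope numbers in this comment check out. A toy calculation, reading "X% faster" as "takes X% less time" (one of two common readings):

```python
def time_saved(baseline_minutes: float, speedup_pct: float) -> float:
    """Minutes saved if a job takes speedup_pct percent less time."""
    return baseline_minutes * speedup_pct / 100

# An hour-long job:
print(time_saved(60, 10))  # 10% faster -> 6.0 minutes saved
print(time_saved(60, 30))  # 30% faster -> 18.0 minutes saved
# An 8x speedup leaves only 1/8 of the runtime:
print(60 / 8)              # 7.5 minutes instead of 60
```

Constant factors like these dwarf many asymptotic wins at practical input sizes.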
@farqueueman Жыл бұрын
I'm looking forward to mojo. Anything Lattner touches turns to gold.
@kebman Жыл бұрын
Mhm how does it shake you?
@yeahmanitsmurph Жыл бұрын
It’s not about the size of your SIMD it’s what you do with it
@ThePrimeTimeagen Жыл бұрын
facts
@SpaceChicken Жыл бұрын
As long as it can measure dicts, I’m happy.
@RickGladwin Жыл бұрын
@@SpaceChicken Look let’s not turn this comparison video into a dict measuring contest.
@MrR8686 Жыл бұрын
If it performs better in the worst case, like for loops, isn't that a benefit?
This new watchmojo language is looking really cool, wish I could use it to compile rust
@copper280z Жыл бұрын
Numba, a JIT compiler package for python, seems to do a good portion of what Mojo promises. I regularly get big speedups over numpy using it, particularly because it can auto-parallelize both native python loops and many numpy function calls.
@ckmichael8 Жыл бұрын
That is basically Cython on vectorization steroids, which could be implemented in Cython given the engineering resources.
@yeetdeets Жыл бұрын
@@ckmichael8 But Numba doesn't require as high an IQ. If you can use numpy you can get C-ish performance in a single function with just a decorator. It's finicky with any argument that isn't a boolean, a number, or a numpy vector of those, though.
@ckmichael8 Жыл бұрын
@@yeetdeets Yes, you're right. I think the use case for Cython and Mojo alike is things Numpy doesn't support yet, like new algorithms that can't be efficiently expressed with existing Numpy functions. If there is a numpy way of doing it, then Numba is certainly the better way. But for research things like new ML algorithms there may be no existing implementation available at all, so a Cython/Mojo implementation would be required.
@Idlecodex Жыл бұрын
Hey, on tiling: this is necessary to keep the processor cache hot. The classic example is swapping the two loop indices in a matrix-vector multiplication. The parallel algorithms for the same operation can be tuned by sizing the chunk of the matrix you are operating on. This becomes even more critical when you add another level of locality by using an accelerator like a GPU, or when working in an MPI cluster.
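The tiling idea can be sketched in plain Python (function names are mine; the cache win only shows up in compiled code, but the blocked traversal pattern is the same):

```python
def matvec_naive(A, x):
    """Plain row-at-a-time matrix-vector product."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in A]

def matvec_tiled(A, x, tile=64):
    """Process the matrix in column blocks so the active slice of x
    (and the matching columns of A) stays hot in cache across rows."""
    n, m = len(A), len(x)
    y = [0.0] * n
    for j0 in range(0, m, tile):
        j1 = min(j0 + tile, m)
        for i in range(n):
            row = A[i]
            y[i] += sum(row[j] * x[j] for j in range(j0, j1))
    return y
```

Both orders compute identical results; the tile size is the tunable chunk the comment describes.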
@spicybaguette7706 Жыл бұрын
You gotta feed the beast, especially when you're extracting all the juice out of your CPU with SIMD
@spazioLVGA Жыл бұрын
Amazing stuff. Still I wonder...how is it fair to compare Mojo with plain python when numpy is basically a part of python itself at this point? Numpy often outperforms even Julia (for large arrays).
@Navhkrin11 ай бұрын
They also have comparisons to optimized numpy implementations and still achieve 2.5x over numpy. Also note that Mojo is being built as a heterogeneous language, meaning it should be pretty straightforward to utilize GPUs or other accelerators. Having all of this in one coherent package is a very big deal.
@ruanpingshan Жыл бұрын
Anyone else notice that their Python performance benchmarks are for Python 3.10? Python 3.11 is supposed to have some major speed improvements.
@aoeu256 Жыл бұрын
For Python 2 you could have used Psyco; now there's Numba, and Julia, and the like.
@jaysistar2711 Жыл бұрын
f32 is directly supported in almost all SIMD ISAs. f64 halves the number of lanes (in 128 bits you can fit 4 32-bit floats, but only 2 64-bit floats).
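The lane arithmetic the comment describes is just integer division (a trivial sketch, using common register widths):

```python
def lanes(register_bits: int, element_bits: int) -> int:
    """How many elements of a given width fit in one SIMD register."""
    return register_bits // element_bits

print(lanes(128, 32))  # 4 x f32 per SSE/NEON register
print(lanes(128, 64))  # only 2 x f64
print(lanes(256, 32))  # AVX2 doubles the f32 lane count to 8
```

Halving element width doubles the work done per instruction, which is the whole f32-over-f64 argument.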
@ThePrimeTimeagen Жыл бұрын
ah, very interesting
@fakenameforgoogle9168 Жыл бұрын
@@ThePrimeTimeagen a lot of ML programs use F16 as well but that might be more related to memory savings than speed
@jaysistar2711 Жыл бұрын
@@fakenameforgoogle9168 While it's obvious that it's smaller, the real savings is in terms of speed. GPUs commonly use F16 for both reasons.
@isodoubIet Жыл бұрын
@@fakenameforgoogle9168 Even f8 recently
@djixi98 Жыл бұрын
did some 5min optimizations by using numpy and got it to be 1400-1800x faster than the example he provided. Still, if i can continue to code in python and make it faster, and have strong types, then i see this as an absolute win lol
@samhughes1747 Жыл бұрын
In Rust, you can pin a shared buffer, and dispatch slices from it to each core. That’s basically what I’m expecting that Mojo code to actually be doing.
@ThePrimeTimeagen Жыл бұрын
interesting
@rybavlouzi Жыл бұрын
Good stuff dude, I find your content in the land of devs on KZbin very unique. Keep it up!
@Chalisque Жыл бұрын
Basically, if you exclude a chunk of what Python can express, what remains can be made very efficient. So add a little syntax to allow you to ring-fence stuff that you want optimised. Makes a lot of sense.
@VivBrodock8 ай бұрын
ngl as someone learning python as part of my topology degree mojo looks really tempting especially once they busted out the Mandelbrot
@oeerturk Жыл бұрын
i think mojo is basically a proof of concept/best showcase for MLIR — and what more accessible language to superset, to be honest. very exciting project, and I'm also very curious and excited about what MLIR can accomplish for other languages.
@ryanfav Жыл бұрын
If it lowers the difficulty of writing code that can run on GPUs and mixed use cases, I'm all for it. Still, it being signup-only feels very weird right now.
@SpaceChicken Жыл бұрын
You sign up for the playground right now if I read the site correctly, not the language. It made more sense to me after trying to jump in myself
@zeyadkenawi8268 Жыл бұрын
hopefully not gonna be proprietary
@u9vata Жыл бұрын
No, you are wrong about f32 and f64. On the CPU side of things (which the example is on), float is always faster, and all games, all CAD, all high-perf code use float instead of double unless precision errors crop up. This is true even without SIMD, but with SIMD it's even worse, because you can do 2x as many operations on floats as on doubles — using double literally halves the throughput.

In the past there were cases where narrow loads were slower than wide ones, because certain (mostly RISC) CPUs could not address smaller memory, but even there this usually stopped at 32 bits, so you can manipulate 32-bit integers just like 64-bit ones, even on ARM usually. So even for integers it is worthwhile to use 32 bits instead of 64: for example, there are 64-bit Linux kernel builds that enable 32-bit pointers, and those are much better if you have no more than 4 GB of memory but otherwise want 64-bit ops. In many of my high-perf codebases I also store indices instead of pointers, because indices fit in 32 bits, so the "pointer-ish" part of the data eats literally half as much data cache — and yes, memory is plentiful, but caches are very limited.

What you say had some merit in the past, but in current state-of-the-art optimized codebases it is actually bad advice to use double — unless, of course, float errors kill your algorithm. The story also differs on GPUs, but traditionally GPUs also massively favor float over double, so in most cases it's faster there too. I don't follow every GPU architecture for CUDA and the rest, though, and there are cases where the bigger type is better — like on GPUs that do not support float16, where it gets emulated with float32, which is bad.

If the GPU does support float16 and such formats, however, those can be immensely faster for machine learning, if float errors allow it — so code usually asks the API whether 16-bit floats are supported and uses them if so. It's good to have this language, because Python is extraordinarily slow... Extremely... Only good for glue-code-style fast hacking, but sometimes even the glue part is slow, so this is a good development. I don't know what the results would look like, though, compared to, say, Cython or something compiled ahead of time...
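The 32-bit-indices trick from the comment above can be demonstrated with the stdlib `array` module (the itemsizes shown are the platform-typical values, not guaranteed by the language spec):

```python
from array import array

n = 100_000
idx32 = array('I', range(n))  # unsigned int: 4 bytes on mainstream platforms
idx64 = array('Q', range(n))  # unsigned long long: 8 bytes, pointer-sized

# Same indices, half the bytes — so twice as many entries
# fit in the same amount of data cache.
print(idx32.itemsize, idx64.itemsize)
assert n * idx32.itemsize * 2 == n * idx64.itemsize
```

Storing indices into one base array instead of raw pointers is what makes the narrower type safe to use.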
@Ty4ons Жыл бұрын
From what I know about GPUs, FP32 is the main focus in gaming while FP64 is for compute applications. Gaming cards often lock down FP64 performance so you need to buy a workstation card to get full speed. Sometimes different architectures are used: right now AMD's RDNA is optimized for gaming with slow FP64, while CDNA is optimized for compute with very fast FP64. I think some consumer GPUs, like Intel's, have no hardware FP64 at all, so it isn't used much in client applications. My understanding is that lower precision is becoming more and more important thanks to machine learning, with architectures getting improved performance for FP16, INT8, and even FP8 on Nvidia Hopper.
@nekomakhea9440 Жыл бұрын
25:00 "will work on exciting projects like Excel spreadsheets, data entry, and *building hyper-intelligent armed robots* "
@Imaltont Жыл бұрын
Kind of reminds me of Common Lisp with one of its several approaches to integrating Python — for embedding Python in a faster environment, of course, not the syntax. SBCL even has a lot of nice SIMD stuff, native threads and green threads, and nice interactive development and debugging tools. You can also optionally declare types, which does impact performance. It could use a better package manager and better project tools.
@MohamedElzahed89 Жыл бұрын
He's a well-known person in the deep learning community, but I would say that for a fair comparison you could benchmark numpy vs Mojo on matrix multiplications, dot products, etc.
@EwanMarshall Жыл бұрын
And try comparing to a cython compile too.
@MohamedElzahed89 Жыл бұрын
@@EwanMarshall yea its always easier said than done, but lets hope that works
@aoeu256 Жыл бұрын
Also @numba.jit(nopython=True), @rpython, @julia (if it exists)...
@fanshaw Жыл бұрын
Python is chosen because of its ease of use and libraries which take care of things for us. If we add all these specialist language constructs back into it, have we just undone that ease of use; is it still easily understandable; or does it provide a reasonable pathway from noob to expert?
@Loanshark75310 ай бұрын
Probably the idea is that it can be used by those creating the language and libraries. Currently many Python libraries are implemented in C, C++, and Fortran. If it's possible to write fast Mojo code, a library could just be written in that, reducing the hurdle of linking different programming languages.
@zuma206 Жыл бұрын
Great content as always, keep up the good work man!
@ThePrimeTimeagen Жыл бұрын
ty
@SteveMoore-n1v Жыл бұрын
My team works on Tammy AI. Does Mojo have an API we can test?
@mario7501 Жыл бұрын
This is awesome and I'm really looking forward to when it gets released. But it is a marketing stunt. You should compare it to something like numpy with multithreading. Probably still 10-50x faster, but no one who has the slightest idea about numerical calculations in python uses for loops.
@RipazX Жыл бұрын
I can't wait for the next stage of new programming languages, like Writescript, a superset of Typescript; C+++, a superset of C++; and let's not forget GoGo, a superset of Go.
@NathanaCentauri6 ай бұрын
@13:00 I felt like there was a code that just entered your being. And you had that revelation of macro expanding life altering script that subtly changes ones life. Not all in one call but gently nudgingly like a hash or crc or plane like CUDA where the code of the Jedi master is shared with everyone everywhere with everything .. EVERYTIME
@magfal Жыл бұрын
7:20 I have used it a lot, through a custom Postgres extension written in Rust using the awesome PGRX framework. When you've got a good fit for SIMD and the tools to easily apply it the performance improvement is like going from Python to C#
@grantwilliams630 Жыл бұрын
I also wrote a ton of heuristic optimization algorithms like 8 years ago, but mine were in Matlab...
@samhughes1747 Жыл бұрын
I’m sure it’s already been pointed out, but SIMD instructions are sized specific to the registers they can handle, and some architectures aren’t actually flexible-if you don’t have data that fills the register when issuing on a GPU, then you pad with 0’s.
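The zero-padding described here is easy to sketch (the helper name is mine; real SIMD codegen pads registers, not Python lists, but the arithmetic is identical):

```python
def pad_to_width(values, lane_count, fill=0.0):
    """Pad a vector so its length is a multiple of the SIMD lane count,
    mirroring the zero-padding some architectures require."""
    remainder = len(values) % lane_count
    if remainder:
        return list(values) + [fill] * (lane_count - remainder)
    return list(values)

print(pad_to_width([1.0, 2.0, 3.0], 4))       # -> [1.0, 2.0, 3.0, 0.0]
print(pad_to_width([1.0, 2.0, 3.0, 4.0], 4))  # already full, unchanged
```

Padding with the identity element (0 for adds) keeps the extra lanes from corrupting the result.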
@olafbaeyens8955 Жыл бұрын
It is only very fast if you have very fast hardware for it. Auto-tune may work in bootstrap code that measures which settings give you the fastest result. And maybe this could even be changed dynamically while your project runs.
@mvargasmoran Жыл бұрын
"What kind of BS measurement are they doing?" best question ever.
@0xhenrique Жыл бұрын
as a web developer, this video seemed like Egyptian hieroglyphs to me ngl
@some1and297 Жыл бұрын
5:45 Most operations you do with plain floats (if you are actually writing low-level fast code) are probably going to be memory-bottlenecked, so even if in theory an operation takes a few extra machine cycles, from my understanding it could still be faster to use f32s, because they take half the memory bandwidth of f64s.
@Dev-Siri Жыл бұрын
since mojo is just python. Python devs can safely put 10+ years of Mojo experience on their resume.
@jesse9999999 Жыл бұрын
at the beginning of this video i'm really hoping this makes me want to use my mojo playground access, but there is fear in my heart
@ThePrimeTimeagen Жыл бұрын
yeah, it seems neat
@jesse9999999 Жыл бұрын
i am pleasantly surprised
@vikingthedude Жыл бұрын
U friends with the president or smthing?
@playea123 Жыл бұрын
I don’t get the criticism that they picked a bad feature of python to compare against (i.e. the for loop). In my mind, it’s fantastic that it improves on what python does badly. I don’t see why anyone would use it if it only improved on libraries that already use C or C++ under the hood, like numpy or pandas. The whole idea for me is that, by using Mojo, you get a better version of python in all cases, and especially the most basic ones (e.g. a simple for loop), without having to learn any new syntax. I think Mojo will be great for already-good python developers who already use type hints. Although I am a bit salty that Mojo doesn’t use the same case for types as Python (e.g. Int vs int). I don’t think Mojo is trying to replace Rust or C++. The jobs current python users mostly do simply aren’t the same as what Rust and C++ users do (unless for some reason you work at a company that uses python for backend or game-engine development). Mojo is supposed to make data engineering, data analysis, data science and ML work better. No one was really using Rust for that.
@fueledbycoffee583 Жыл бұрын
I actually work on our backend in python! Our web server uses Flask and is pretty good! The DX of building a backend with python is amazing, and IMO a lot better than javascript. Also, the production web servers, if I'm correct, are written in C++, so you don't get as much of a performance penalty.
@playea123 Жыл бұрын
@@fueledbycoffee583 I would prefer a backend in python than JS too, but I know that doesn’t scale well and the dynamic typing is problematic. Not saying your company is bad or anything. It’s just that massive companies aren’t likely to extensively use Python on the backend.
@theairaccumulator7144 Жыл бұрын
It's closed source
@fueledbycoffee583 Жыл бұрын
@@playea123 We do have a rule: all python must be written in a typed way. We extensively use dataclasses, enums, and validators so we don't shoot ourselves in the foot as much as possible. Since our backend is a big thing we must do it that way — without it, it would be a tangled mess.
@fueledbycoffee583 Жыл бұрын
Ironically we don't use typescript, because we arrived at the conclusion that typescript is not a good type system for js. It's highly subjective, but we don't enjoy the type system of TS. We get along pretty OK with vanilla js.
@quachhengtony7651 Жыл бұрын
You know the saying, "if it's too good to be true..."
@Kknewkles Жыл бұрын
Hey Prime, serious question(forgive me for asking before watching the video): I've heard Mojo's aiming to replace Python as the AI "vehicle" language - but what's the point if the heavy lifting is done by the CUDA/GPU stuff? How much realistically(5, maybe 10%) can you speed up by replacing the non-GPU related things?
@TCH534 Жыл бұрын
CUDA is going to be for the AI work in python.
@Kknewkles Жыл бұрын
@@TCH534 yes, the "heavy lifting" I referenced. All the big bulky matrix multiplication stuff is done on GPUs, and is the vast majority of any workload. Python is there just as a high-level script for ease of use.
@jereziah Жыл бұрын
it's not free to call other languages, esp from python, nor are the type conversions (which is why the 'serious' python libraries force you to commit to types). The beauty of mojo will be for researchers to set fire to fewer trees, with less effort.
@Kknewkles Жыл бұрын
@@jereziah that's not gonna make the whole thing thousands or even hundreds of times faster. It's gonna make 5-10% tens, maybe hundreds of times faster.
@Kknewkles Жыл бұрын
@@markusmachel397 I feel like if anyone cared about that sort of margins(and I don't think it's 10% always... it's like... 1-5% if I think really hard about it, and it's skewed heavily towards... 1%, maybe 1.5% being at the mean), we'd see a lot more implementations being... C/C++/Rust with pure compute shaders. If we're talking squeezing every little drop of performance. I don't have hard data for this, but looks like convenience and just "roll it and shove it at hardware" is the approach. ...like with so many things these days... Seems to work not that all bad though. (though who can tell how it would work at the peak perf, def. not me)
@TankorSmash Жыл бұрын
The Fireship video just came out of few days ago, and we're already reuploading it :(
@fg786 Жыл бұрын
I'm a simple one I just wonder will we ever run out of names for new programming languages?
Just recycle them, as these junk languages will never catch on.
@di4352 Жыл бұрын
Did I miss something, where in the code examples did it imply a +35,000X speedup, I'm only seeing +4,000x at most, not a dig, but just where is it? Also, does the +14x speedup imply that the differences in hardware between Fireship and the other dev's computers at compilation affected the outcome of their code tests?
@ParanoicoBR Жыл бұрын
it's kinda fishy but the 35,000x speedup was for the mandelbrot set algorithm which they glossed over
@crides0 Жыл бұрын
If it has a good type system, is reasonably fast (compared to C), and doesn't have a bunch of features crammed in, then it should be a fine language
@BlueCodesYep Жыл бұрын
Can't wait for this sounds awesome, and preciate your videos dude always a fun watch.
@draakisback Жыл бұрын
This is so funny. There are languages like Rust and Julia which have bindings to all sorts of really fast neural network frameworks. I don't know why these people think it's a great idea to reinvent the wheel without trying those languages first. Julia is almost as fast as C in some cases, and it has all sorts of really cool symbolic math features.
@yyy5523 Жыл бұрын
We can use numpy and do that matrix multiplication without any for loops. Also, numpy arrays are faster than normal lists.
@DataPastor Жыл бұрын
True, but numpy is actually written in C, C++, Cython and Fortran, and that's the point: how can you author fast Python libraries/code without resorting to those languages?
@yyy5523 Жыл бұрын
@@DataPastor yes
@PurpleDaemon_ Жыл бұрын
22:53 Just to note, Python has __slots__ for classes with a static attribute layout.
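For reference, a minimal sketch of the slots mechanism mentioned above — a slotted class fixes its attribute set and drops the per-instance `__dict__`:

```python
class Point:
    __slots__ = ("x", "y")  # fixed attribute layout, no per-instance __dict__

    def __init__(self, x, y):
        self.x = x
        self.y = y

p = Point(1.0, 2.0)
try:
    p.z = 3.0  # not declared in __slots__
except AttributeError:
    print("no dynamic attributes on slotted instances")

assert not hasattr(p, "__dict__")  # the dict really is gone
```

This saves memory per instance and speeds up attribute access, but it's still far from the static structs Mojo adds.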
@jackalph.d801 Жыл бұрын
This is the first I'm hearing about mojo. I wonder how it compares in performance to julia, which is really supposed to target the same audience (in most ways) and has been around for a little bit longer. I have used julia for a while and it is very easy to write incredibly performant code, often the compiler is good enough to do some basic simd on the arrays you pass in. I would love to see how they go toe to toe.
@Rin-qj7zt Жыл бұрын
The autotune feature was low-hanging fruit. I'm actually stunned it's a new concept to people, because it's such an old one for me. I just assumed the compiler was already doing it.
@Chalisque Жыл бұрын
I'm not a CPU expert either. But with regards to float32, the FP is done in the SIMD registers. Most likely Mojo will convert things to use packed SIMD where possible, and you can fit twice as many FP32s in a SIMD register as FP64s. Loading a single FP32 or FP64 is likely memory bound, so using FP32s means you can have more in cache at the same time. I guess an expert (i.e. not me) doing proper benchmarks will give a better picture.
@JackDespero5 ай бұрын
The only thing I dislike about Mojo is the name. That being said, the fact that it is a superset of Python means that even if I only use it 1% of the time, it is still worth learning the extra bit of syntax. It does what Julia promised, but it actually understood the assignment: you are competing for the Python folks, who are often scientists rather than programmers, so make the transition as easy as possible. Here the transition is instantaneous, since you can keep using all your tools. That actually makes me a bit sad, because I like the name Julia much more than Mojo, but oh well.
@nexovec Жыл бұрын
That's a VERY sexy product they got there. I really need my manual memory management and the syntax is still stupid, but it's a step in the right direction I think.
@FirstNameLastName-eo2pq Жыл бұрын
I think for the 32/64-bit operations it all depends what the compiler does... though on a 64-bit machine the compiler may simply use a 64-bit float underneath and be done with it. I remember being surprised at work that going from 16 to 32 bits was little or no problem, but when moving to 64-bit machines *EVERYTHING* had to be on a 64-bit word boundary — and when it wasn't... BAM! EXCEPTION!!!! So the easiest solution was to tell the compiler to automatically align everything internally on 64-bit boundaries. Most everything then worked, though if I'm remembering correctly the auto-align didn't cover some cases, more often when doing bit operations on structs/unions/classes. I was working in C/C++ at the time — I'd think Mojo and other modern tools might just handle this detail under the covers and fix it up for you automagically.
@Chalisque Жыл бұрын
The other thing to consider is that by restricting this stuff to a subset of Python (in terms of what can be optimised), and not allowing precise low-level stuff like e.g. C, there are potentially more optimisations available since Mojo doesn't need to worry about pointers and other stuff. (Roughly: The more your language can do, the less the compiler can assume about a program's behaviour.) The average data scientist using this could quite likely end up with something faster than what your average C programmer could do in C (as said average C programmer likely knows less about optimising numerical code than the authors of Mojo). I look forward to seeing where this ends up.
@l0ad1 Жыл бұрын
float32 and float64 are computed on amd64 in the same FPU with 80 bits of precision, so the time (latency) to compute is the same. But, if aligned properly, the CPU can parallelize internally, so you can always do twice as many float32s as float64s — float32 thus generally gives higher throughput, while both have the same latency.
@ոakedsquirtle Жыл бұрын
plus the point of SIMD is to do identical operations on multiple values, and if you are using a f64 rather than f32 that takes up twice the amount of slots for the SIMD operation
@christianpaulus5311 Жыл бұрын
15:33 "The servo mechanisms in my neck are designed to approximate Human movements. I did not realize the effect was so distracting." Data
@JonitoFischer Жыл бұрын
float 32 and float 64 are handled by the FPU, no masking operations required
@savorsauce Жыл бұрын
If Mojo can deliver these speedups, even just the base 8x gains, it could save a lot of money on heavy computation when training big models.
@kennethbeal Жыл бұрын
You are fun. Thank you for bringing a smile to my face.
@AsbestosSoup Жыл бұрын
Has anybody found a mojo vs rust performance benchmark testing rust's worst-case to a mojo implementation and vice versa?
@theyashbhutoria8 ай бұрын
16:15 Yeah robots can just relay the text wirelessly from their "brain" to any other system.
@SimGunther Жыл бұрын
The techniques in Mojo will proliferate other more popular languages, but I don't buy this Apple hype train one bit. If they changed how we think about how we can more easily reason about programs and prototype, I'd buy into it.
@theinsane102 Жыл бұрын
fifty years and no one can beat C still 😂
@SimGunther Жыл бұрын
@@theinsane102 LOL so true!
@9SMTM6 Жыл бұрын
Maybe "MLIR" is something else, but I would guess this is already happening with most languages, and it's ESPECIALLY already happening — and unavoidable — on GPUs. Rust, like clang and most other modern compiled languages I know of, has its own intermediate representation; it does some optimizations on that, then hands off to LLVM, another intermediate representation, which compiles to machine code. That's two intermediate representations already. And when you want to run stuff on a GPU, modern APIs like Vulkan, Metal or DX12 are targets of some default shader language with a first-party compiler (often C dialects), but there are other frontends too — e.g. Rust, as an experimental community project. That then gets compiled to an intermediate language that's part of the API (for Vulkan, that's SPIR-V), which usually gets compiled again to a vendor-specific representation that actually gets executed.
@pif5023 Жыл бұрын
Is there a book that explains these higher concepts like vectorization, simd, unrolling, … or are they just the result of experience and mathematical reasoning?
@chrisroberts1773 Жыл бұрын
Loved 'Mojo programmer, must have 10 years experience'.
@goreosmartins2335 Жыл бұрын
I have done this parallelized matmul in C and this looks nicer but also not as hardcore, and I'm all about that hardcore. I don't want to call 'simd', i want __m512d _mm512_mask_mul_round_pd(__m512d s, __mmask8 k, __m512d a, __m512d b, int r) and nothing less
@SmirkInvestigator Жыл бұрын
O wow, I remember looking at neural-js 10 or 8 years ago. It was beyond me so I put it back.
@ThePrimeTimeagen Жыл бұрын
hah!!!
@wld-ph11 ай бұрын
Mojo Python is compiled, at the moment WSL if you have Windows. Waiting for a Windoze version, but RW read rights twins... now have X as a buddy (what a Twatter MuskRat?)
@wld-ph11 ай бұрын
(Scrolling) HelloWorld runs so much faster... lol...
@mohammedalmahdiasad6832 Жыл бұрын
yes, Austin Powers has kept coming back into my head over the last week or so
@pieter5466 Жыл бұрын
13:144 Would love to know why you don't use the term "developer experience" - as well as perhaps certain other terms.
@vncstudio2 ай бұрын
Mojo, someday but for now Cython 3.1 or C are very fast to write all the computational extension modules for Python.
@alxjones Жыл бұрын
The only semi-useful part of the demo is where they did the same exact code in Python vs. Mojo, and even that was biased against Python. Let's see the most performant Python vs. the most performant Mojo. After all, Mojo isn't competing with Python, it's competing with things like Cython. Still, I'm curious about this language and what it brings to the table in terms of drag-and-drop performance improvements to Python code and an excuse to write Rust code while tricking my employers into thinking I'm writing Python.
@JoaoVitorBRgomes Жыл бұрын
6:22 machine learning works better at float32 instead of 16 or 64 because it balances precision and memory usage.
@Septumsempra8818 Жыл бұрын
"Why are he robots looking at the keyboard?"
@djupstaten2328 Жыл бұрын
A faster python with structs, var and let -- I'm ready. This is killer.
@thekwoka4707 Жыл бұрын
I think you're wrong on the float32 thing, at least generally. A compiler can actually recognize that the numbers are 32-bit and pack 2 of them into a single 64-bit register, which can improve performance.
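The packing idea can be illustrated with plain integer bit operations (this is just the bit arithmetic; an actual compiler does it inside SIMD or general-purpose registers):

```python
MASK32 = (1 << 32) - 1  # low 32 bits

def pack2(lo: int, hi: int) -> int:
    """Pack two 32-bit values into one 64-bit word."""
    return (hi << 32) | (lo & MASK32)

def unpack2(word: int) -> tuple:
    """Recover the two 32-bit halves."""
    return word & MASK32, word >> 32

word = pack2(123, 456)
print(unpack2(word))  # -> (123, 456)
```

Two values move through one register-width load/store, which is half the memory traffic of handling them as 64-bit values.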
@shekishral6570 Жыл бұрын
"...I always hated rust anyway..." I laughed so hard at that🤣
@ThePrimeTimeagen Жыл бұрын
gotem
@mihalious Жыл бұрын
4:11 I've heard that new fancy algorithms, in spite of having better big-O complexity, aren't really cache-friendly, so in practice the standard algorithm won't be slower, at least. Not 100% sure that's true, though.
@admercs11 ай бұрын
It’s as if Ron Burgundy had a programming channel and I love it.
@adambickford8720 Жыл бұрын
Autotune feels like it's essentially JIT? Also, once you understand the superset, is it really still python? Type systems, structs, etc. aren't just syntactic sugar and take a bit of learning to truly understand.
@johnwu5908 Жыл бұрын
If it provides an 8x speed boost without modifying the original code, as they said, then there's not much of a barrier to transitioning to Mojo, imo
@RickGladwin Жыл бұрын
It seems like the main difference between JIT and autotune, as I understand them, is this: JIT does extra compiling work at runtime and caches the results, based on which parts of the code the interpreter runs most often (and which thereby burn redundant processing by being interpreted over and over). Autotune instead compiles a given section of code a few different ways at compile time, measures the performance of each on that particular system, and includes the winner in the rest of the compiled code. I'm not an expert in either feature though, and JIT implementations probably vary across languages.
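The selection step described above can be sketched in plain Python (a toy illustration of "compile several variants, measure, keep the fastest" — not Mojo's actual autotune API; all names here are made up):

```python
import timeit

def sum_loop(xs):
    total = 0.0
    for v in xs:
        total += v
    return total

def sum_builtin(xs):
    return sum(xs)

def autotune(variants, sample, number=50):
    """Time every candidate on a sample input and keep the fastest,
    the way autotune picks a variant at build time."""
    return min(variants,
               key=lambda f: timeit.timeit(lambda: f(sample), number=number))

data = [float(i) for i in range(1000)]
fastest = autotune([sum_loop, sum_builtin], data)
# Whichever variant wins on this machine, all candidates agree on the answer.
assert fastest(data) == sum_builtin(data)
```

The key property is that selection happens once, against the actual hardware, instead of being hardcoded by the programmer.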
@nadiaezzarhouni300 Жыл бұрын
Imagine if you did that level of optimization in assembly 😂 the processor will be chilling at cold temperatures in the corner and you brain and fingers will catch on fire 😂 , imagine doing it in native binary instead 💀 but it will be rewarding in terms of performance
@daysofgrace2934 Жыл бұрын
Python is versatile. If you need speed you have Cython; if you need a full-stack web-app platform, Anvil runs Python in the browser and server-side; you have NumPy, TF, PyTorch, Pandas, Plotly. I hardly use C/C++/VB. SQL has stood the test of time; I've been using it since the 90s...
@anon-fz2bo Жыл бұрын
it's pretty cool tbh, both Swift and LLVM are huge & cool in their own right too.
@sorry4all Жыл бұрын
I've never craved a language this much. Python with structs, but still without ; and {.
@MyAmazingUsername Жыл бұрын
It's a good thing. Give better tools to the noobs who only understand Python. At least they can learn about types and see that you get vastly improved performance by actually understanding data types. It is all machine code in the end. This is a good stopgap or learning language before systems programming.
@kebman Жыл бұрын
This got my mojo on. I've got 10 years experience with this language, and I'm looking for a worthy project.
@Leto2ndAtreides Жыл бұрын
Key Question: Does this mean that we can train our neural nets much faster? Is GPT-5 training about to get faster?!!!
@thekwoka4707 Жыл бұрын
A major part of making Python 35,000x faster had to do with just how awfully slow it is to begin with. It seems like without the integrated multithreading/parallelization stuff in Mojo, it was still only getting Python up into the realm of JS speed.
@maximusorca4146 Жыл бұрын
If it really does work with all of Python, it makes Python capable of anything
@maximusorca4146 Жыл бұрын
Well, according to the playground and documentation, it works with all Python libraries and isn't only for machine learning, so we will see what the future holds.
@itermercator114 Жыл бұрын
(I'm at 14:09) the one benefit I see is that it's Python but with all the quality-of-life features of C. Even if it isn't for AI, it's basically Python that isn't as gimped, if you've used lower-level languages.
@LukeDickerson1993 Жыл бұрын
would it be possible for an AI to track your eye movements to see which live-comment a streamer is reading, and hold it in view as unread comments scroll by?
@spell105 Жыл бұрын
You don't need AI for something like that. Eye tracking has been around forever.
@woolfel Жыл бұрын
Mojo is using MLIR and LLVM under the hood. The authors of Mojo are compiler experts, so it's not surprising Mojo is so much faster. The real reason Mojo has autotune is that it makes it trivial to target different hardware. Nvidia 4-series cards have thousands of CUDA cores, so the speedup is going to be even higher than on CPU cores. Then there are TPUs and the other AI accelerators.
@MrR8686 Жыл бұрын
I think the main struggle is that it's not open source
@chrisochs7112 Жыл бұрын
Unity does something similar with their Burst compiler. In both cases these solutions work within the constraint of being a superset/subset of another language. The approach has pros and cons. You are carrying all the baggage of the language you are extending with you, and you lose some breadth in random places, vs. just building something like this on top of a good general-purpose language that is already performant, like Rust/C++. But it's not like they had a lot of choices either. Build your own vs. what? It's really C++ or Rust, and from there you run out of good options really fast.
@ոakedsquirtle Жыл бұрын
Most Python jobs and C++/Rust jobs don't have much overlap. The data scientists and such who are using Python will benefit greatly by integrating Mojo into their systems.
@luciusbest7155 Жыл бұрын
how does it compare to Julia?
@davidramos7955 Жыл бұрын
Yeah knowing about Julia this doesn't look that remarkable
@DuarteMolha Жыл бұрын
It is not a question of which language is best; it is about the user base. Being a superset of Python makes this much more valuable than Julia
@excelfan85 Жыл бұрын
Thanks @ThePrimeagen, I just burned all my Rust lang books and am eagerly awaiting your Mojo merch and a future where everything becomes a Mojo convo. Also, can we agree to measure dicts with the same unit we use for horses? Hands. How many hands is your dict?
@vitluk Жыл бұрын
This is really cool! I might finally have a reason to use "python" again
@bharatDarshaByDrone Жыл бұрын
Recruiters are hiring Mojo developers with 10 years of experience 😅
@codewithhakem Жыл бұрын
ThePrimeTime: "Classic.. EVERYONE KNOWS ABOUT THAT"