Check out HubSpot's FREE AI Prompt Library Now! clickhubspot.com/h14h
@mmmm768 26 days ago
no
@Happ1ness 26 days ago
Meta doing "Open"AI's job is still kinda surprising to me, lol
@user-cg7gd5pw5b 26 days ago
Are you insinuating that Saint Zuckerberg is otherwise untrustworthy?!
@Gamatoto2038 25 days ago
Shouldn’t OpenAI be renamed to closedai 😂
@peterjackson4530 24 days ago
@@Gamatoto2038 Bang!
@JTient 19 days ago
Yeah, I still don't trust Zuck, but good on him. I would have rather paid $12 to keep my privacy, so he isn't going to fool me again. Also, he was handed the DARPA LifeLog program. Even Elon scares me. The outcome doesn't look good; we have to flip the tables before it's too late.
@ichbin1984 26 days ago
The reason the technical reports are the most cited is that every time you use the models in your own research, you reference the technical report. So with 23k published papers, of course the technical reports end up at the top.
@aditya8anil 24 days ago
That’s something new I learned today
@I77AGIC 26 days ago
You need to either divide citations by the time each paper has been out, or make a graph of citations over time where each paper's release day is shifted to the same point on the x-axis. Then you would be able to see which papers grew the fastest.
@mrj774 26 days ago
Came here to make this comment 👏
@npc4416 25 days ago
Yeah, so rank by growth rate of citations over time rather than by absolute citation count.
@heywrandom8924 24 days ago
An example of shifted curves is available on the GitHub Star History website, which allows comparing repositories.
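A minimal sketch of the normalization suggested in this thread, assuming you already have each paper's release date and total citation count (the column names and numbers here are hypothetical, not the video's data):

```python
# Rank papers by citations per day instead of raw citation counts,
# so papers released later in the year aren't penalized for being young.
import pandas as pd

papers = pd.DataFrame({
    "title": ["Early paper", "Recent paper"],  # hypothetical examples
    "released": pd.to_datetime(["2024-01-15", "2024-10-01"]),
    "citations": [4000, 900],
})

today = pd.Timestamp("2024-12-15")
papers["age_days"] = (today - papers["released"]).dt.days
papers["citations_per_day"] = papers["citations"] / papers["age_days"]

# Sort by growth rate; with a per-day citation series you could also plot
# each curve against "days since release" to get the shifted-x-axis view.
print(papers.sort_values("citations_per_day", ascending=False))
```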
@LeBeautiful 26 days ago
ByCloud with the amazing AI analysis videos... can't wait to see what's in store for your channel and AI as a whole in 2025
@minecraftermad 26 days ago
Just barely missed Meta's new paper (Byte Latent Transformer), which seems like it'll change things a lot in the next year. Also, I'm very surprised nGPT isn't here.
@XenoCrimson-uv8uz 26 days ago
Can you give me a summary of it?
@npc4416 25 days ago
Released 1 day ago: Meta's new Byte Latent Transformer (BLT) model outperforms tokenization-based models up to their tested 8B parameter size. The previous canon was that it wouldn't be possible to make byte-level models stable, or make them converge in training. Their main claim: "For fixed inference costs, BLT shows significantly better scaling than tokenization-based models."

Traditionally, LLMs use tokenization: breaking text into predefined chunks (tokens) from a fixed vocabulary. BLT instead works directly with bytes via dynamic patching: instead of rigid, fixed-size tokens, it dynamically segments text into patches based on byte entropy. Byte entropy is a measure of information complexity that determines how much compute should be allocated to different text segments (higher entropy indicates a more unpredictable or complex segment), rather than treating all text the same way.

So the change is from tokenization (predefined chunks, fixed vocabulary) to the Byte Latent Transformer (raw bytes, dynamic patching), and what they got is improved performance on reasoning tasks, enhanced long-tail generalization, and superior character-level understanding.

Quote: "BLT architecture trends between Llama 2 and 3 when using significantly larger patch sizes. The bpe tokenizers of Llama 2 and 3 have an average token size of 3.7 and 4.4 bytes. In contrast, BLT can achieve similar scaling trends with an average patch size of 6 and even 8 bytes. Inference flop are inversely proportional to the average patch size, so using a patch size of 8 bytes would lead to nearly 50% inference flop savings. Models with larger patch sizes also seem to perform better as we scale model and data size. BLT with patch size of 8 starts at a significantly worse point compared to bpe Llama 2 at 1B but ends up better than bpe at 7B scale. This suggests that such patch sizes might perform better at even larger scales and possibly that even larger ones could be feasible as model size and training compute grow."
@CantoTheDegenerate666 24 days ago
@@XenoCrimson-uv8uz Basically it gets rid of tokenizers and interprets the input's bytes directly
@erkinalp 24 days ago
@@CantoTheDegenerate666 Except that even while processing byte by byte, the model tends to invent some kind of morphemes by itself
@firecat6666 24 days ago
@@CantoTheDegenerate666 So it's like a tokenizer but with a token for each individual character?
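To make the entropy-based "dynamic patching" idea described a few comments up concrete, here is a toy sketch only, not the paper's implementation: the real BLT derives next-byte entropy from a small byte-level language model and feeds the resulting patches into a latent transformer, whereas this sketch uses a hypothetical bigram-frequency table as the entropy estimate and only shows the boundary-finding step. The `threshold` value is arbitrary.

```python
# Toy illustration of entropy-driven byte patching (not the BLT paper's code).
import math
from collections import Counter

def next_byte_entropy(prev: int, bigrams: Counter) -> float:
    """Shannon entropy (in bits) of the next byte given the previous byte,
    estimated from bigram counts collected over a small corpus."""
    following = [c for (a, _), c in bigrams.items() if a == prev]
    total = sum(following)
    if total == 0:
        return 8.0  # no statistics for this byte: treat as maximally uncertain
    return -sum((c / total) * math.log2(c / total) for c in following)

def dynamic_patches(data: bytes, bigrams: Counter, threshold: float = 1.5) -> list[bytes]:
    """Split a byte string into patches, starting a new patch whenever the
    estimated next-byte entropy crosses the threshold (high uncertainty)."""
    patches, current = [], bytearray([data[0]])
    for prev, nxt in zip(data, data[1:]):
        if next_byte_entropy(prev, bigrams) > threshold:
            patches.append(bytes(current))
            current = bytearray()
        current.append(nxt)
    patches.append(bytes(current))
    return patches

# Build toy bigram statistics, then patch a sentence.
corpus = b"the cat sat on the mat. the dog sat on the log. the cow sat on the hat."
bigrams = Counter(zip(corpus, corpus[1:]))
print(dynamic_patches(b"the cat sat on the mat.", bigrams))
```

Predictable stretches of text end up in long patches (cheap), while high-entropy regions get split more finely (more compute), which is the intuition behind the claimed inference-FLOP savings.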
@김인영-q5x 26 days ago
Thank you. I have been learning about LLMs in general. This video helped me a lot!
@moomanchicken6466 26 days ago
9:55 These are distribution graphs, so they're showing that there is variance in the accuracy rather than showing that the accuracy is deteriorating
@hydrargyrumm 25 days ago
thanks, I'll get started now.
@noctarin1516 26 days ago
Can you cover Meta's Byte Latent Transformer and Coconut (Training Models to Reason in a Continuous Latent Space)?
@CantoTheDegenerate666 24 days ago
I hope you make a video on Byte Latent Transformers and Large Concept Models, both from Meta (THE GOAT). These two imo are complete gamechangers!
@RedOneM 26 days ago
I wonder in how many papers ChatGPT is a ghostwriter...
@npc4416 26 days ago
Very interesting... I wish I knew what the future of the AI/LLM space is going to be. We know that scaling transformers is giving diminishing returns, as seen at top AI labs like OpenAI, Meta, Google, etc. So I wonder which of these techniques will be the next big thing that we scale to go further... Will it be Mamba, or KAN, or maybe diffusion LMs? Who knows, only time will tell...
@2034-SWE 25 days ago
Diminishing returns? OpenAI?
@yannickm5429 23 days ago
@@2034-SWE If we consider scaling transformers only, then yes, diminishing returns. The latest advancement is based on reasoning capabilities, not on even more compute. The transformer architecture has almost reached its limit with regard to scaling and performance benefits. Not saying it won't be overcome or that we won't switch architectures, but this is the current state.
@npc4416 22 days ago
@@yannickm5429 Yes, exactly. The transformer architecture plateaued, so now everyone is looking for the next big thing, like OpenAI did with o1. They claim that large reasoning models are the next big thing, but if we look at the results of the latest o1 paper, these reasoning models don't seem to scale well; for example, in some cases o1-preview gives better results than full o1, so maybe this architecture is not all about scale... We will see. We also have to see whether these reasoning models are actually as good as OpenAI claims. Yes, they are better, but they are still sometimes only as good as other LLMs; Claude 3.5 Sonnet (new), for example, is just an LLM, yet it's on the same level as o1. So maybe LRMs are not that big of a deal and we need a truly novel architecture from the ground up... Like Ilya Sutskever said, the age of scaling transformers is over; now we need to find a replacement for pretraining itself. Let's see...
@npc4416 5 days ago
Yes, so now we scale test-time compute instead @@2034-SWE
@XiangyuChen-t1q 22 days ago
How do you sort these papers by citation numbers?
@yagzhandag3803 15 days ago
How did you sort the papers by citation on arXiv?
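In case it helps answer both questions above: arXiv's own API doesn't expose citation counts, so a common approach is to pull papers from a citation index and sort them yourself. This is a minimal sketch assuming the Semantic Scholar Graph API and its `citationCount` / `externalIds` fields; the query string and limit are just examples, and the video may well have used a different source.

```python
# Sketch: rank 2024 papers by citation count via the Semantic Scholar Graph API.
import requests

resp = requests.get(
    "https://api.semanticscholar.org/graph/v1/paper/search",
    params={
        "query": "large language models",   # example query, not the video's
        "year": "2024",
        "fields": "title,citationCount,externalIds",
        "limit": 100,
    },
    timeout=30,
)
resp.raise_for_status()
papers = resp.json().get("data", [])

# Keep only papers that have an arXiv ID, then sort by citations.
arxiv_papers = [p for p in papers if (p.get("externalIds") or {}).get("ArXiv")]
for p in sorted(arxiv_papers, key=lambda p: p["citationCount"] or 0, reverse=True)[:10]:
    print(p["citationCount"], p["title"])
```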
@Steamrick 24 days ago
Do any papers from November (or December at this point) even have any citations yet? I mean, someone has to read the paper and then write and publish a paper of their own for a citation to exist... how much can a paper be worth if it was farted out in less than a month?
@BertVerhelst 25 days ago
Do you think a Llama 3.3 7B model will be released?
@geckoo9190 17 days ago
Hey, that website is great, it has a lot of scientific papers, although it seems to be geared toward engineering and technology; I can't find much about microbiology.
@shadowoftokyo 26 days ago
Where are the weekly banger research posts in the community tab though? I miss them
@badizzl 22 days ago
I just found a paper from Meta AI about Large Concept Models. I'm still a layman, but it sounded very promising for coherence and energy consumption. So far it works with text-to-concept and speech-to-concept encoders and a concept-to-text decoder, but I think it could work with other modalities (e.g. video) too, if you make encoders/decoders for that. I can't explain it; just read it for yourself.
@callmebiz 26 days ago
Have improvements in pure CV models plateaued? Or are we just not noticing because LLMs are all anyone's been talking about for the past 2 years?
@ddoice 26 days ago
Noice video, but you should normalize the citations by citations per day.
@myliu6 26 days ago
Pretty clear that transformers dominated this year. I'm curious to see the most cited papers in other fields like diffusion or RL. After all, the biggest breakthroughs usually come where not everyone is looking.
@Neuroszima 26 days ago
"AI and ML"? Bro, it's only NLP in there, or NLP-related paper analysis, maybe with some twist of image generation Xd
@versaknight 26 days ago
Yeah lol.
@jmoney4695 26 days ago
Well, LLMs dominated the conversation, so when ranking by citations, it makes sense.
@Neuroszima 26 days ago
@@jmoney4695 Yeah, I know, it's understandable, but it still made me laugh when he said "and that's it in the news of *AI and ML*", like, bro XD...
@human_shaped 25 days ago
They should have just been weighted by days since publication.
@alkeryn1700 26 days ago
So close to 32768!
@khanghoutan4706 26 days ago
This is such a nerdy comment
@npc4416 26 days ago
pls explain
@smohanta9016 25 days ago
@@npc4416 2^15, one past the max value of a 16-bit signed integer (32767)
@alkeryn1700 25 days ago
@@npc4416 32768 is a power of 2 (2^15); programmers deal with those pretty often. The number of AI papers published in 2024 is close to that number.
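For reference, the arithmetic behind this exchange (a trivial Python check):

```python
print(2 ** 15)      # 32768: the power of two being referenced
print(2 ** 15 - 1)  # 32767: the actual maximum value of a signed 16-bit integer
```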
@Delmaler1 26 days ago
This list is biased toward early papers, because they have had more time to be cited.
@surajsamal4161 26 days ago
Bro, why don't you put out a lot of videos? Love your videos btw
@mfpears 22 days ago
2024 is far from over.
@rickyrickster1303 26 days ago
6:54
@KAZVorpal 25 days ago
It's a shame that the Apple paper demonstrating what we experts knew, that LLMs don't reason, isn't on the list. People don't like the truth. Ah, I see that you did give a monthly...but that you don't understand its impact. LLMs don't reason. They just look up answers, one token at a time.
@LiebsterFeind 26 days ago
I am horribly disappointed that you did not cover all 34,276 papers in this video. Shame! 🤣
@locusruizlopez5997 24 days ago
So much information 😅... This is so fast
@panzerofthelake4460 26 days ago
Plz look into Meta's AI papers, the ones about BLT (Byte Latent Transformer, or something along those lines) and COCONUT (chain of continuous thought). Please.
@Ari-pq4db 17 days ago
Awesome
@DmitriZaitsev 24 days ago
Please remove the disturbing background music; it's not possible to concentrate on the video.
@joemonteithcinematography7477 26 days ago
Wow I am early!
@fionnanobaoighill 26 days ago
gee pee tee
@StefanDeleanu 26 days ago
GG
@renanmonteirobarbosa8129 26 days ago
How about the top 10 worst papers?
@aron2922 20 days ago
This is a really bad way to find interesting papers
@Ahmed.Shaikh 11 days ago
I'm trying to find interesting papers and would love to know a better way to gauge interest in a given research paper...