Categories for AI talk: Category Theory Inspired by LLMs - by Tai-Danae Bradley

5,410 views

Pim de Haan

A year ago

Motivated by the recent emergence of category theory in machine learning, we teach a course on its philosophy, applications and outlook from the perspective of machine learning!
See for more information: cats.for.ai/
In this third invited talk, Tai-Danae Bradley will discuss:
The success of today's large language models (LLMs) is striking, especially given that the training data consists of raw, unstructured text. In this talk, we'll see that category theory can provide a natural framework for investigating this passage from texts (and probability distributions on them) to a more semantically meaningful space. To motivate the mathematics involved, we will open with a basic, yet curious, analogy between linear algebra and category theory. We will then define a category of expressions in language enriched over the unit interval, and afterwards pass to enriched copresheaves on that category. We will see that the latter setting has rich mathematical structure and comes with ready-made tools to begin exploring that structure.
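A minimal sketch of the construction described in the abstract, assuming a toy corpus (the corpus, the substring convention, and the function names here are illustrative, not from the talk): the hom-object between two expressions in a category enriched over the unit interval [0,1] can be taken to be the conditional probability that one expression extends to the other, and composition corresponds to multiplying probabilities along chains of extensions.

```python
from collections import Counter

# Toy corpus of expressions (in the talk, these come from raw text).
corpus = ["small red rose", "small red rose bush", "red rose",
          "red rose bush", "red idea"]
counts = Counter(corpus)

def p(expr):
    """Empirical probability that a corpus sample contains expr as a substring."""
    total = sum(counts.values())
    return sum(c for s, c in counts.items() if expr in s) / total

def hom(x, y):
    """Hom-object in [0,1]: conditional probability of y given x,
    nonzero only when x occurs inside y."""
    return p(y) / p(x) if x in y and p(x) > 0 else 0.0

# Enriched composition: probabilities multiply along composable
# extensions, so hom(x, y) * hom(y, z) <= hom(x, z).
lhs = hom("red", "red rose") * hom("red rose", "red rose bush")
rhs = hom("red", "red rose bush")
assert lhs <= rhs + 1e-9
```

The inequality above is exactly the composition law for a category enriched over ([0,1], *): composing two extensions can never be more probable than the overall extension.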

Comments: 15
@knowledgereact
@knowledgereact 28 days ago
Great talk. It addresses questions that I have been wondering about. You've made a lot of interesting progress.
@madlarch
@madlarch 6 months ago
Fascinating talk. First time I've really started to see why it is useful to work with enriched categories. And now I've got a whole bunch of references to look up for further info!
@NoNTr1v1aL
@NoNTr1v1aL A year ago
Absolutely amazing video!
@aug_st
@aug_st A year ago
Amazing talk, fundamentally important to understanding all machine learning. Language model pretraining is like a first pass at learning the structure of language by understanding word contexts. Knowledge graph learning and link prediction using contrastive learning is eerily similar to the use of the unit-interval-enriched category between sets. So many interesting insights and connections here.
@borntobemild-
@borntobemild- A year ago
Sounds like Douglas Hofstadter's ideas on cognition. Interesting.
@araldjean-charles3924
@araldjean-charles3924 A year ago
This is really inspiring work. Alongside the "blue" and "red" functors, a general "color" functor might be instructive. What are all the relationships that have to do with a color? Very Yoneda, maybe?
@annaclarafenyo8185
@annaclarafenyo8185 A year ago
It would be interesting to see how models of complex sentence structure, such as Chomsky grammars, embed into this category of inclusions with probabilities. Current AIs seem to learn this property of language without it being explicitly programmed in, and the categorical interpretation would explain how, provided that embedding in the categorical sense includes the syntactic structure of embedding in the Chomsky sense.
@efraimcardona8452
@efraimcardona8452 A year ago
They seem to learn a functor, in the Lawvere sense, in which they represent the semantics.
@MDNQ-ud1ty
@MDNQ-ud1ty 7 months ago
Language is categorical; it is topological. Koopman showed that nonlinear dynamics can be linearized by lifting them into higher-dimensional spaces, so in that sense the world is "linear": vector spaces can in principle be used to describe everything. NNs lift models into higher dimensions and then linearize them, and the structure in the domain is still mapped (completeness). Linear algebra is well developed enough to solve any such problem in theory; everything else is an optimization problem. I think saying "X has no structure" is a bit misleading. If the domain truly had no structure, then representing it in R would be meaningless, unless one is trying to impose arbitrary structure on it (or rather, the structure of R). The entire point of representing something in something else is to expose the underlying structure in terms of something one understands. Category theory doesn't solve this, since category theory is simply an abstract language; what it provides is a nice way to represent those ideas without getting bogged down in specifics.
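The Koopman-style claim above can be illustrated with a tiny example of my own (not from the comment, and deliberately chosen so a single lifted coordinate suffices): the nonlinear map x_{k+1} = x_k^2 on the positive reals becomes linear in the observable y = log(x), where it is simply y_{k+1} = 2*y_k.

```python
import math

# Nonlinear dynamics: x_{k+1} = x_k ** 2 on the positive reals.
def step(x):
    return x * x

# Koopman-style lift: in the observable y = log(x), the same dynamics
# act linearly, y_{k+1} = 2 * y_k. In general one needs many (possibly
# infinitely many) observables; this toy system needs only one.
x = 1.5
y = math.log(x)
for _ in range(5):
    x = step(x)   # nonlinear evolution
    y = 2.0 * y   # linear evolution in the lifted coordinate

# The two descriptions agree: log of the nonlinear trajectory equals
# the linearly evolved observable.
assert abs(math.log(x) - y) < 1e-9
```

After five steps the nonlinear trajectory reaches 1.5^(2^5), while the lifted coordinate reaches 32 * log(1.5): the same point described linearly.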
@annaclarafenyo8185
@annaclarafenyo8185 7 months ago
@@MDNQ-ud1ty This is nonsense, possibly AI generated.
@MDNQ-ud1ty
@MDNQ-ud1ty 7 months ago
The only reason this works is because language has a topological/categorical structure that effectively mimics the reals. The mapping into R works because we know how to deal with R and because the mapped structure does have structure in it. It's not just an arbitrary set (if it were, the mapping would be arbitrary and all derived structures based on R would be meaningless). The idea is really just representation theory: represent a structure in terms of another, more familiar one, to try to understand the former.
@shiva_kondapalli
@shiva_kondapalli A year ago
In the functor-of-points approach, schemes are representable functors: contravariant functors from the category of schemes to Set. Just thinking aloud, is it possible to model language using this algebro-geometric perspective?
@MDNQ-ud1ty
@MDNQ-ud1ty 7 months ago
You can model anything; it is only a question of effectiveness. Generally speaking, the more data you attach to your models and retain through structural disambiguation, the more informative your model will be.