Hella New AI Papers - Aug 9, 2024
Comments
@The9thDoctor · an hour ago
This just seems like they're calling what's normally referred to as an in-distribution test set (so not in train, but from the same general distribution) "inside OOD". Otherwise, where do you draw the line between inside-OOD and in-distribution validation?
@thomasmartin9599 · 3 hours ago
I immediately knew listening to this that there were going to be comments about you not knowing what you're talking about, while they don't know anything about your background, and of course the comment section delivered.
@huytruonguic · 4 hours ago
so I suppose inside-OOD is modeling data that lies within the span of the training datapoints, and outside-OOD is modeling data that lies in a dimension orthogonal to the space spanned by the training datapoints
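That reading of inside- vs outside-OOD can be sketched numerically: a point is "inside" if it (nearly) lies in the subspace spanned by the training datapoints, and "outside" if it has a large component orthogonal to that subspace. A minimal sketch with made-up vectors (not anything from the paper):

```python
import numpy as np

def in_span(train, x, tol=1e-8):
    # Solve min ||train.T @ w - x|| and check the residual: a point
    # "inside OOD" in this reading lies (nearly) in the span of the
    # training vectors; "outside OOD" has a large orthogonal component.
    w, *_ = np.linalg.lstsq(train.T, x, rcond=None)
    residual = np.linalg.norm(train.T @ w - x)
    return residual < tol

train = np.array([[1.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0]])  # two datapoints spanning the xy-plane
print(in_span(train, np.array([2.0, -3.0, 0.0])))  # in-plane -> True
print(in_span(train, np.array([0.0, 0.0, 1.0])))   # orthogonal -> False
```

Of course, real training sets of reasonable size span most of the ambient space, which is exactly why the line between "inside OOD" and plain in-distribution validation is hard to draw.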
@wwkk4964 · 7 hours ago
Great 😊
@TiagoTiagoT · 11 hours ago
I'm not sure if this is just going over my head or if I'm onto something. If the issue is that the steps are moving away from the surface of the unit hypersphere, what if, instead of doing the steps in a Cartesian-like space, you handled it like a generalization of quaternion-like rotations in whatever dimensionality you're dealing with?
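A minimal sketch of that idea, assuming the goal is simply to keep iterates on the unit hypersphere: rather than taking a raw Cartesian step (which drifts off the sphere), rotate the point within the plane spanned by it and the step's tangent component. The `geodesic_step` below is a hypothetical illustration of such a generalized rotation, not any paper's actual update rule:

```python
import numpy as np

def cartesian_step(x, step):
    return x + step  # drifts off the unit sphere

def geodesic_step(x, step, lr=0.1):
    # Rotate x within the plane spanned by x and the tangent component
    # of the step: a generalization of a 2D rotation to any
    # dimensionality, so the iterate stays exactly on the sphere.
    t = step - np.dot(step, x) * x        # tangent component of the step
    nt = np.linalg.norm(t)
    if nt < 1e-12:
        return x                          # step is radial; nothing to rotate
    u = t / nt
    theta = lr * nt                       # rotation angle from step size
    return np.cos(theta) * x + np.sin(theta) * u

x = np.array([1.0, 0.0, 0.0])             # on the unit sphere
g = np.array([0.0, 1.0, 0.5])             # some update direction
print(round(np.linalg.norm(cartesian_step(x, g)), 3))  # 1.5, off the sphere
print(round(np.linalg.norm(geodesic_step(x, g)), 3))   # 1.0, still on it
```

Simply renormalizing after each Cartesian step gives a similar effect; the rotation form just makes the "stay on the manifold" constraint explicit.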
@robertphipps1430 · 11 hours ago
this might solve the 'frame problem' which early, more procedural approaches to AI found difficult. Context is all about working out what is important, and an expanding context window would effectively be a solution to the basic problem of working out what IS relevant information in a given situation.
@deltamico · 12 hours ago
It's odd they didn't try to accumulate the tokens in an episode and chose a single one instead
@Tunadorable · 10 hours ago
right! and i didn't look heavily into how they chose said single one; i think it had to do with its "representativeness" according to some metric used in the graph-grouping stuff they did to tune the memories. i'd be interested to see ablations and compute comparisons between this single token, a sum & norm, and some more sophisticated pooling mechanism (attention-based?)
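The three pooling options being compared here can be sketched as follows. The shapes and scoring are toy assumptions for illustration, not the paper's actual representativeness metric:

```python
import numpy as np

def single_token(tokens, scores):
    # pick the single most "representative" token by some scoring metric
    return tokens[np.argmax(scores)]

def sum_and_norm(tokens):
    # sum the episode's tokens and renormalize to unit length
    s = tokens.sum(axis=0)
    return s / np.linalg.norm(s)

def attention_pool(tokens, query):
    # softmax over query-token similarities, then a weighted sum:
    # a simple attention-based pooling mechanism
    logits = tokens @ query
    w = np.exp(logits - logits.max())
    w /= w.sum()
    return w @ tokens

rng = np.random.default_rng(0)
tokens = rng.normal(size=(5, 4))   # 5 token vectors in one "episode"
query = rng.normal(size=4)
for pooled in (single_token(tokens, tokens @ query),
               sum_and_norm(tokens),
               attention_pool(tokens, query)):
    print(pooled.shape)            # each pools the episode to one vector
```

All three reduce an episode to one vector per memory; the interesting ablation would be retrieval quality versus the extra compute of the attention-based variant.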
@kimcosmos · 12 hours ago
So it can recommend paragraph, section and chapter breaks? And from that build an index? Finally, an AI boredom graph.
@jmarkinman · 18 hours ago
They should change the title from “infinite” context to “unbounded” context, as “infinite context” implies something physically impossible.
@Tunadorable · 15 hours ago
ppl do love that word haha
@peteyman949 · 18 hours ago
Oi!
@RoulDukeGonzo · 19 hours ago
Plz link induction head vid
@RoulDukeGonzo · 19 hours ago
Finally! Hierarchical attention.
@deltamico · 12 hours ago
Not really if there is only one level of selection
@RoulDukeGonzo · 10 hours ago
@deltamico I know, but it's a start
@jonnylukejs · 21 hours ago
bro who stole my work
@jonnylukejs · 21 hours ago
I literally own the copyright to the code for this.
@Tunadorable · 20 hours ago
lmao relatable
@Tunadorable · 20 hours ago
haha i assume simultaneous creation is a real jerk, but if you're being literal about them ripping code from your github i would love to see said code. can't remember if i checked whether they open-sourced theirs for comparison to be able to make that claim
@81618j · 22 hours ago
In practice, the output of such a thing looks something like this. Notice how I used a very short poem, but I've tried it on essays and it works similarly:

Title: Tidal Emotions: The Turbulent Rhythms of Love

Keywords: love poetry, emotional waves, metaphorical imagery, romantic turmoil, oceanic symbolism, intense feelings, poetic expression

Abstract: This poignant piece of love poetry explores the tumultuous nature of romantic emotions through vivid oceanic imagery. The author employs a powerful metaphor, comparing feelings to waves, to convey the cyclical and often overwhelming nature of love. Unlike traditional romantic poetry that might focus on gentle, pleasant aspects of affection, this work delves into the more challenging and intense experiences of emotional attachment. The poem's brevity belies its depth, offering a concise yet impactful exploration of the speaker's complex emotional landscape. By subverting the expected imagery of "pretty whitecaps," the poet presents a raw and honest portrayal of love's less idyllic aspects, resonating with readers who have experienced the unpredictable nature of deep emotional connections.

Author and Affiliation: Anonymous

Surprise Factor: The poem's unexpected turn in the last two lines challenges conventional romantic imagery, presenting a more realistic and potentially unsettling view of love. By rejecting the notion of gentle, playful waves, the author implies a deeper, more turbulent emotional experience, possibly hinting at the difficulties and intensity of the relationship. This subversion of expectations creates a powerful impact, encouraging readers to reflect on the complex and sometimes uncomfortable realities of romantic attachments.

Table of Contents: N/A (Single short poem)

Reference: Amy O'Connor - Drowning

"They come in waves, my feelings for you. And not pretty whitecaps dancing at my feet."
@rhaedas9085 · a day ago
I tried something that I think is similar to this (without the math part). My idea was to convert conversations into tokens for storage, and when a new prompt was entered it would look up past events and pull things that matched closely, in theory being memories of related topics based on the token vectors. It didn't work because I don't know enough about the intricacies of tokenization and the math (basically it wasn't as plug-and-play as I was hoping for), so I did the next best thing and stored these past conversations as text logs, which I would then look through with each prompt to find similar topics. In the end I actually used the LLM to do this analysis search first, then pulled the first few random good matches and incorporated them into the prompting. Even with the much less effective method, it does seem to remember things. I think it only worked because I used an uncensored model that had no limit on input. I was hoping for a different approach to try, but as you went through the paper a lot of it felt familiar in the general approach. I do think the token direction would work a lot better and faster, since it's a much better way to compare concepts than textual search.
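The memory-lookup loop described above can be sketched with a toy bag-of-words stand-in for real embeddings. Everything here (`embed`, the sample memories) is an illustrative assumption; a real system would use a learned embedding model rather than word counts:

```python
import numpy as np

memories = [
    "we talked about tokenization and vocab size",
    "recipe for sourdough bread starter",
    "debugging a CUDA out-of-memory error",
]

# stand-in embedding: bag-of-words over a shared vocabulary
# (a real system would use a learned embedding model instead)
vocab = sorted({w for m in memories for w in m.lower().split()})

def embed(text):
    words = text.lower().split()
    v = np.array([words.count(w) for w in vocab], dtype=float)
    n = np.linalg.norm(v)
    return v / n if n else v

index = np.stack([embed(m) for m in memories])      # one vector per memory

query = embed("any tokenization questions from last week?")
best = int(np.argmax(index @ query))                # cosine-similarity lookup
print(memories[best])  # -> "we talked about tokenization and vocab size"
```

The "token direction" version the comment hopes for is exactly this pattern, just with vectors that actually capture meaning, so "vocab" would also match "dictionary" rather than only exact word overlap.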
@Tunadorable · 21 hours ago
ah, sounds like you did a prompt-engineering/automation version of this same general intuition/pattern/structure/idea/methodology, very cool
@StephenRayner · a day ago
Inaccuracy is fine, inconsistency however would be a big issue.
@omarnug · a day ago
Yeah, I never understood this deflation-risk thing. Why eat today when I can eat next month for a lower price? xd Why live in a house now when I'll be able to rent/buy one cheaper in ten years? Anyway, technology gets cheaper and cheaper, and I don't see anyone waiting years to buy a new smartphone.
@Tunadorable · 20 hours ago
absolutely beautiful distinction you’ve made between needs and wants
@davidhart1674 · a day ago
Thanks!
@Tunadorable · a day ago
no thank you☺️
@davidhart1674 · a day ago
Thanks!
@Tunadorable · a day ago
thank you☺️
@acestapp1884 · a day ago
About being able to influence brain processes: the assembly of the narrative is likely hierarchical, with feedbacks of varying strength between different levels and different brain areas inside the same brain.
@Tunadorable · a day ago
YES! unfortunately in these videos and with topics like these it rarely makes sense or is possible for me to touch on every single related thought, but this is one I was tempted to get into and you said it more concisely than I would've
@johnkintree763 · a day ago
Long term storage of EM-LLM memory segments can probably be managed in a graph structure, similar to vector storage within neo4j graph databases. A related development is the release of Falcon Mamba 7B. Apparently, increasing the amount of context included in the prompt does not increase the requirement for RAM.
@Tunadorable · a day ago
Not an expert or even competent with graphs & vector storage, but my impression is that that should work and be a great opportunity to expand the scalability of this technique. Pretty sure the authors vaguely mentioned the potential for significant improvements in that area.

Yes, that's true for any Mamba model, but the problem with those is that the model not only has to decide what info is important (which can be loosely described as a problem of having to predict what info will turn out to be relevant/useful later), but any info deemed not important gets permanently lost. In contrast, here the memories not currently in context can be brought back up again if they end up being useful at some future time, meaning the model does not have to guess at what is going to be useful in the future. To be fair, the specifics of this method involve only keeping the surprising memories and actually losing anything outside of the context buffer around said memories, but that could easily be changed by anyone looking to make their own version: they'd just adjust the hyperparameters until practically everything is put into a memory (which would mean a more expensive qk memory lookup, probably not worth it).
@technokicksyourass · 18 hours ago
Mamba is a state space model; it doesn't work based on attention. Rather, it does something more like learning a differential function that maps the context to a latent space, then integrating that function over long sequence lengths. That's how it scales linearly in context rather than polynomially. Attention compares every token in the sequence to every other token in a giant table.
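The scaling contrast can be sketched with a toy one-dimensional recurrence versus a full score table. This is only an illustration of the linear-vs-quadratic cost shape, not Mamba's actual parameterization (which uses selective, input-dependent state matrices):

```python
import numpy as np

def ssm_scan(x, A=0.9, B=1.0, C=1.0):
    # toy 1-D state-space recurrence: h_t = A*h_{t-1} + B*x_t, y_t = C*h_t.
    # One constant-size state update per token, so cost grows linearly
    # with sequence length and memory stays constant.
    h, ys = 0.0, []
    for xt in x:
        h = A * h + B * xt
        ys.append(C * h)
    return np.array(ys)

def attention_scores(x):
    # attention compares every token with every other token:
    # an L x L table, so cost grows quadratically with length L
    q = k = x[:, None]
    return q @ k.T

x = np.arange(4, dtype=float)
print(ssm_scan(x).shape)          # (4,)  -- one output per token
print(attention_scores(x).shape)  # (4, 4) -- the full pairwise table
```

The trade-off discussed above falls out of this shape: the recurrence must compress everything into its fixed state (lossy, no going back), while the attention table keeps every pairwise comparison available at quadratic cost.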
@johnkintree763 · a day ago
The ShareLM paper does sound interesting. A variation would allow people to select parts of conversations they want to contribute to collective intelligence. The LM extracts entities and relationships to be merged into a global shared graph representation. Keeping the human in the loop is important to catch and correct mistakes made by the LM before merging. Scale up the system to merge the knowledge and sentiment expressed in millions of simultaneous conversations with people around the world, with real-time fact checking. Imagine having a conversation with such a collective human and digital intelligence.
@corgirun7892 · a day ago
I personally think that the mechanism behind human episodic memory is far more complicated than this. When humans return to a specific situation, they can instantly recall things that happened decades ago. Does the human brain really store kv caches for decades? I don't believe it.
@markonfilms · a day ago
I think we encode some kinda sparse representation. Also our dreams etc seem to really help us form long term memories, so maybe it requires a kinda dreaming so to speak and a lot of reflection and updating our connections and weights dynamically. Plus there are many aspects we don't understand at all yet.
@thehipponugget3287 · a day ago
Aren't we also sort of remembering memories of memories? Like, the memory gets rewritten every time it comes up. Off the top of my head I can't remember anything I haven't remembered in years, unless specifically triggered by something like a smell or music.
@attilaszekeres7435 · a day ago
Indeed, memories are not stored "in the physical brain" as we understand locality, physicality and the brain, but instead in the so-called biofield, the magnetosphere and other internally coherent, nested, interpenetrating domains of a spatiotemporally distributed holarchy which comprises the physical correlates of our past, present and future selves. The intricacies of these concepts are well beyond your command, in stark contrast to the straightforward matter of your car's extended warranty, which we'd like to discuss with you now.
@BasiC7786 · 17 hours ago
@attilaszekeres7435 Human magnetosphere? Biofield? Bullshit. Try science, not some magic.
@TiagoTiagoT · 15 hours ago
Humans "in-paint" memories from bits and pieces, many studies have demonstrated eye-witnesses are not reliable.
@alexanderbrown-dg3sy · a day ago
In theory it works, but not practically. Systems like these need to be coupled with thinking tokens, so the most semantically contextual segments are retrieved based on attention (more specifically, model reasoning, like humans) instead of relative segment similarity. BUT there are a lot of ideas I took from this paper, like NLL for novel observations and event-boundary detection. FYI this is what I used to actually make Quiet-STaR useful, explicitly but autonomously allowing the model to generate useful thoughts, not to mention I use it as the basis for this new style of meta self-supervision I created for the offline token re-weighting phase. So, all in all, pretty amazing ideas in this paper; the value of some of the underlying principles is vastly understated. Great vid bro. No paper is safe lol. I see you meant that ha. Keep them coming bro.
@81618j · 22 hours ago
nah, bruh.. it works. Vector systems use semantic search to identify relevant content. What this paper says is: create a boatload of stuff that will trigger a !found from the vector when typing in a query.
@girlmoment669 · 19 hours ago
ponderocity rambanctious reciprocity segmentation
@GNARGNARHEAD · a day ago
neat
@be1tube · a day ago
This sounds pretty simple to implement (at least as far as this type of paper goes). It would be really useful when writing narrative text simulations. E.g. ... (history of simulation for all characters up to 10:00) ... "What happens between 10:00 and 10:05 from the perspective of <character 1>?", ... "What happens between 10:00 and 10:05 from the perspective of <character 2>?" ... "Eliminate contradictions" ...
@cinnamonroll5615 · a day ago
So, technically, the next GPT could have ADHD? And if so, did we just solve the mathematical form of ADHD?
@dadsonworldwide3238 · a day ago
Lol, apparently it's only historically misaligned companies involved, as if tuning is more for censorship and less about letting a congruent line of measure go
@Tunadorable · a day ago
Maybe I just forgot something I said in the video, but I'm curious as to why you drew a relation between this and ADHD. Could you elaborate?
@dadsonworldwide3238 · a day ago
@Tunadorable I don't recall you mentioning chemistry. Lol, misaligned measure does exist in this topic & understanding in our public domain; even experts struggle with it. A lot of misalignment: 1) teaching methodology, 2) poor diagnosis, 3) weak understanding of intelligence, intellectual IQ testing, etc. etc. etc.
@dadsonworldwide3238 · a day ago
@Tunadorable If anything, you're strengthening the evidence that memory is less chemical and more thermodynamical. If a human doesn't see value in entering a memory in the first place, it won't ever be remembered or encoded, no matter what the reasons.
@cinnamonroll5615 · 3 hours ago
@Tunadorable So like, in 3.2 and 3.3 they said something about the positional embedding that could improve the robustness of the model, since it's a fixed positional embedding. Now, instead of a fixed positional embedding, you've just got static noise, and sometimes the static repeats in some pattern that can move the tokens in their latent space (similar to how ADHD can relate seemingly random topics). And in 3.3 they theorised that the events that can be recalled most efficiently are the events that correlate in some fields, but since our embedding is noise, sometimes the noise can move tokens so that they become related (like how "apple" being moved close to "phone" due to random noise gives us "iPhone", or how episodic memory got moved to AI and somehow we got ADHD?). I'm aware that the dimension of the latent space is so large that we can't just deploy numpy.random to move tokens around; we'd need something random but still predictable to some degree to maybe mimic an ADHD brain?
@c0ffe3caf3 · 2 days ago
"Patch-Level Training for LLMs" reminds me of the "smeared key" trick from the Anthropic "In-context Learning and Induction Heads" work
@lefthookouchmcarm4520 · 2 days ago
Did this disprove it?
@WmJames-rx8go · 2 days ago
Thanks!
@Tunadorable · 2 days ago
no thank you ☺️
@WmJames-rx8go · 2 days ago
This is an excellent podcast. You are raising profound concepts to ponder. I am sure I will watch this many times and reflect differently on the things said here each time I do. Thank you very much. I send out a paper thanks to you also. PS: One of the many ideas you brought up was the concept of a pause token. I thought that was a very valuable insight for making LLMs function better.
@dianes6245 · 3 days ago
The problem with this explanation is that it's not adequate. Some people become very aware of their internal mind. Super aware. This is actually a goal of meditation. Sure, for those who are not aware of their mental activity and mind, one is kind of a robot who acts without agency. But some become masters of mental states with high levels of awareness.

The tendency is to study typical people. Instead, study the outliers. Usually this is not a good idea, but it is the exceptions that will challenge your hypothesis. It is likely that you, as researchers, are exceptions too. High achievers are people who have to master their inner states, because those states will derail much of what they try to do if not dealt with. Look at people who don't get fat, those who graduate, who get careers and succeed, who stay fit their entire lives, those who don't commit crimes, who don't get divorced, who stay healthy for 90 years, the ones who do not become addicts or alcoholics. These are people who get control of their minds and make those subconscious processes work for them. And maybe they are not really subconscious after all.

But to take the position that agency is important, one has to give up the idea of a drifting, meaningless mind that is ruled by unknown neural processes invisible to the person. You have to ask if Abraham Lincoln was such a person, or Leonardo, or Einstein. If ANY of them had true conscious agency, then a theory based on almost-random neural processes must be wrong. It's tough to explain the ethical compass Lincoln had as a by-product.

It's also very early in the neurological game. VERY. What's going on is guesswork, not well-founded science. Science has to be reductionist, or it's not science. That's a truism. Nevertheless, reductionism usually results in wrong ideas. Actually, always. We've never had a purely reductionist idea work very long.
@zerotwo7319 · 3 days ago
This is basic stuff. Lots of philosophers, and even games like SOMA, already knew this. The guy who wrote the paper is just taking advantage of the fact that computer science guys generally don't know philosophers like Nietzsche, and probably many other names and theories that already explain this. This probably has many names in different fields. Our cells specialize to stimuli, and specialization means lack of diversity. A computer might see the world just as sums; there won't be any other type of operation to that 'perception of logic gates'. Some insects can see the infrared spectrum; isn't that part of reality? That means our perceptions aren't truthful.
@descai10 · 3 days ago
I highly doubt consciousness doesn't have a function. I think that study was flawed, but even if the conclusion is true, it doesn't prove that consciousness has no causal effects. Even if the body decides before the consciousness realizes, it doesn't mean your previous conscious experience didn't influence that decision. Regardless of whether the chicken or the egg came first, they both feed into each other.
@nathanhelmburger · 3 days ago
Seems like this applies more to a simple creature with a simple behavioral repertoire. For a complex animal, perceiving the world accurately (where possible) is generically advantageous, because the exact moment-to-moment details of a high-fitness percept are constantly changing throughout life. A safe sleeping location doesn't look like food, and neither of those looks like an opportunity to mate.
@ruroruro · 4 days ago
How cute. They wanted to speed up the transformer, so they added a teeny-tiny LSTM on top of the output head.
@moji2363 · 4 days ago
Reality appears to be infinitely complex, with countless emergent properties, each having numerous values. Regardless of how sophisticated our interface with reality becomes, it can never fully capture this complexity. This leads us to a crucial point: at some stage, we must translate infinite complexity into a finite model within our brains. This is where the concept of interface becomes significant. Our senses are inherently limited, necessitating the translation of phenomena into sensory inputs or feelings we can process. From an evolutionary standpoint, there's little benefit in perceiving quantum properties, the intricacies of the standard model, or covalent bonds. Instead, evolution has equipped us with more practical responses, such as feeling nauseous when oxygen levels are inappropriate. While it's true that accommodating more objectives may lead to a more truthful representation, this doesn't negate the fundamental limitation of our interface with reality. No matter how many objectives we incorporate, we're still dealing with a simplified model of an infinitely complex universe.
@jyjjy7 · 4 days ago
@moji2363 That any aspect of the universe is infinite is pure speculation. That black hole entropy has a limit seems to indicate there are not infinite degrees of freedom in a given area of spacetime, while the apparent flatness of spacetime may be just that, apparent rather than actual, just like the apparent flatness of the Earth. But anyway, yes, our conscious perceptions are a sparse, coarse-grained symbolic model of local physics, as derivable from patterns in sensory impulses. That said, these patterns certainly do contain real information about the statistical properties of large collections of quantum mechanical interactions. Emergent properties are real; in fact, properties/dynamics that aren't emergent are pure philosophical speculation. Even if such a thing were to exist, it may be completely inaccessible to observers operating at our scale of size and energy.
@Tunadorable · 4 days ago
“at least effectively infinite relative to the amount of complexity our brains can actually feasibly model” might’ve been better
@jyjjy7 · 4 days ago
@Tunadorable Yeah, physical infinities are permanently in the category of speculation. It's impossible to distinguish between something that's larger than your ability to measure and something actually infinite.
@spiraldude · 4 days ago
Hey buddy, I really enjoy your videos. You give a very good breakdown of papers and, more importantly, explain what's going on underneath. It's helped me a lot in fostering a deeper understanding of what I am doing for a living haha. That being said, you have a really annoying speech tendency, where you raise your voice at the end of a sentence? It makes every other sentence sound like a question? It makes listening to you create internal friction in me? You're a really smart guy, I'm sure you can debug yourself if you put some attention to it. Cheers and keep going :)
@Tunadorable · 4 days ago
hahaha question: how long have you been watching the videos? i used to do one-take, absolutely-zero-editing, practically-livestream videos, whereas now i've got this program that automatically removes silent portions, meaning i literally will pause in the middle of a sentence to think, and sometimes i probably get distracted and switch to a new sentence or something. if you haven't already seen them, could you do me a favor and watch a video from like two months ago, and tell me if it was also an issue back then, before this new editing thing i'm doing? if it's a figment of this new editing process then i think i know how to go about fixing it, but if it was the case even earlier then, unfortunately for you, i'm completely tone-deaf in general, so that wouldn't be solvable
@adommoore7805 · 4 days ago
I know I don't have the technical vocabulary to explain things in a standardized way 😅 so I use terms like "sensory fidelity". But I find it interesting how certain perceptions are more or less evolved based on this complex dynamic of necessity. Like how we often have high degrees of fidelity with certain smells, tied strongly to memory, because those smells signify important survival-related data. Yet other smells which would also give us important survival data, we simply cannot detect at all. CO2, for example: most biology cannot detect it. And pockets of it under lakes, or sands, can leak, creating traps where thousands of creatures wander into these areas unknowingly, since they cannot smell this gas, and then die by breathing in this invisible substance. But it makes sense, because biology doesn't encounter this phenomenon regularly enough to have devoted evolution toward detecting it.

So it's ultimately interesting how, in the struggle for systemic equilibrium, life cannot just endlessly add new features, but must make the most fit possible combination within the parameters of the likely energy consumption, which must be met. Adding new things to a sense, or increasing the fidelity of a sense, is naturally an issue of available nutrition, and of a system's capacity for converting food into usable energy fast enough. The limitations make for the diversity.

As was mentioned though, the brain is able to change faster than the actual structure of the sensors. Just like in computing and robotics, the hardware is a harder problem than the software. So it's as if nature has made it possible to tweak and update the deep hardware structure to compensate for dynamic change, similar to software patches, only cumulative in how the change occurs, yet far more rapid than evolutionary time for the hardware of the species.

I often wonder how the new dynamic between necessity and intent will play out in this technological era, as we take the reins of our own evolution. I would imagine that as solutions are innovated, not only will our fidelity increase, but also our sensory toolset, leading to an incredible ability to accurately interpret the vibrational substrate.
@OpenSourceAnarchist · 4 days ago
I'm a Buddhist and agree with most of the critiques from the paper. Your highlight of Western/Eastern thought is really apt; ontological egoism is just assumed as the only model by so many Western philosophers and everyday people. I did want to point you towards the REBUS model for self-organizing brain dynamics, essentially a more robust version of Friston's free energy principle. It seems to suggest bidirectional organization, from bottom-up neuronal circuits to top-down higher-order circuits and back, globally attempting to minimize the "surprise" of stimuli, i.e. variational free energy. QRI, David Pearce, and Friston all point towards the same conclusions... it wouldn't be efficient to only have bottom-up control flows!
@cinnamonroll5615 · 4 days ago
if brain activity causes consciousness, then GPT would have some sort of weird pop-in pop-out each time we prompt, and in some rare cases it could see and dream
@huytruonguic · 4 days ago
the paper gives a beautiful analogy to something very intuitive, being that natural selection only selects for survival
@jyjjy7 · 4 days ago
@huytruonguic *and reproduction. Sometimes these goals are counter to each other, as every non-virgin male praying mantis knows... for the few seconds its head can know things after being severed, at least
@pokerandphilosophy8328 · 5 days ago
The authors of this paper defend an epiphenomenalist conception of consciousness that aligns closely with reductionist and hard-determinist stances in the philosophy of free will and determinism. They seem to follow rather closely in the footsteps of Daniel Wegner (The Illusion of Conscious Will), whom they quote and cite. Hence, they can't contemplate the possibility of genuine mental causation. Mental causation is a case of downward causation, whereby emergent phenomena that are apparent at a high level of organization of a complex system can causally influence the low-level processes that merely enable them (but don't direct them). The authors seem to be blind to the possibility of genuine emergence and non-reductive explanations of the causal structure of complex systems.

The recent (emergent) achievements of LLMs in high-level cognitive abilities seem to strikingly undermine the authors' thesis, since those abilities emerge from a process of training on a massive corpus of human-generated text without any explicit guidance from low-level "unconscious" or pre-programmed processes. The abilities that LLMs have to understand the user query and provide meaningful responses are striking examples of top-down causation: meanings and intentions being causally responsible for the selective activation of the low-level features that are represented in the underlying neural network.
@Tunadorable · 4 days ago
i think this is why i have a comment section
@be1tube · 5 days ago
I see two errors: treating the environment as having a fixed utility, and not considering that one has a veridical perception of utility. Different situations have different utilities depending on the response: hot needs sweat and cold needs clothes. With those responses, utility can be equalized, and you need to distinguish them to respond appropriately. Second, the "utility-based" perception at the bottom (10:14) is a perception of a true characteristic of the environment (how good it is for the agent); it's just not the aspect of the environment Hoffman was interested in. Choosing a good basis for perception that filters out noise is what agents should do. When one creates a noisy environment, one should not call the signal plus the noise "the truth" and call the agent incorrect for ignoring the noise.
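The kind of "fitness beats truth" simulation being critiqued here can be sketched as follows. The Gaussian payoff and three-option setup are illustrative assumptions, not Hoffman's exact game: a truth-seeing agent ranks options by raw resource quantity, while a utility-seeing agent ranks them by fitness payoff, which is non-monotone in the resource:

```python
import numpy as np

rng = np.random.default_rng(1)

def payoff(resource):
    # non-monotone utility: middling resource levels are best
    # (too little water and you dry out, too much and you drown)
    return np.exp(-((resource - 0.5) ** 2) / 0.02)

def choose_truth(options):
    return options[np.argmax(options)]          # perceives raw quantity

def choose_fitness(options):
    return options[np.argmax(payoff(options))]  # perceives utility only

truth_score = fitness_score = 0.0
for _ in range(1000):
    options = rng.uniform(0, 1, size=3)
    truth_score += payoff(choose_truth(options))
    fitness_score += payoff(choose_fitness(options))
print(fitness_score > truth_score)  # True: the utility-tuned agent wins
```

The comment's objection lands on exactly this construction: the "fitness" agent is veridically perceiving something real (the payoff), so calling it non-veridical depends on privileging raw quantity as "the truth".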
@angelorf · 5 days ago
Didn't expect the part about your own experiments to be more interesting than the research you're describing in the first part!
@simonstrandgaard5503 · 5 days ago
What about cutting a sphere?
@Ginto_O · 5 days ago
you look stoopid
@BjornHeijligers · 5 days ago
@tunadorable Awesome video on the evolutionary consequences of reality sensing. I'm impressed you actually did your own work in this field. Making the distinction between sensation systems and perception systems is genius in my book. Do you have any papers or publications on that work?

May I introduce you to the concept of "sense-making"? In my career as a data scientist I used to work with a pyramid model with "data" at the bottom: adding metadata provides "information", recognizing patterns provides "knowledge" and the ability to solve problems, and being able to use the knowledge to predict the consequences of actions results in "wisdom". Only in the last 10 years have I come to appreciate that the choice of language and concepts at the lower levels fundamentally impacts what can be achieved at the higher levels, and that a process of "sense-making" required to inject new words and/or prune old words is an essential part of sustainable adaptive planning. Of course, maybe this is already trivial for you; in that case, keep up the great work!
@Tunadorable · 5 days ago
No actual published paper, but here's a GitHub repo with the code, results, and a pdf of the original pre-rough-draft that I was working on in 2022. Please forgive my ugly inefficient code, I was actually using this project to learn python for the first time. And thanks! github.com/evintunador/SenPer
@infuriatinglyopaque57 · 5 days ago
Great video! Hoffman's ideas are fun to think about, and he's been really prolific at advocating them on podcasts, which has resulted in many lay people taking his ideas for granted despite the existence of sensible counterarguments. It would be interesting to see how these sorts of evolutionary simulations play out with more complex environments and more flexible agents, e.g. mid-sized ANN agents with the potential to acquire a wide range of compressed representations. Here are some relevant papers at the intersection of cogsci & deep learning, which touch on these issues to some extent and which might overlap more with your current interests than those using the more simplistic toy simulations:

Unsupervised learning predicts human perception and misperception of gloss. Nature Human Behaviour (2021)

Sensory perception relies on fitness-maximizing codes. Nature Human Behaviour (2023)

Neural representation in active inference: Using generative models to interact with, and understand, the lived world. Annals of the New York Academy of Sciences (2024)