Dmitry Krotov | Modern Hopfield Networks for Novel Transformer Architectures

9,807 views

Harvard CMSA

1 day ago

Comments: 8
@Anikung17 · 1 year ago
Excellent talk; very interesting developments with the Energy Transformer.
@Lippman-y9z · 1 year ago
Thanks for sharing.
@maxkho00 · 1 year ago
Ngl, this was pretty confusing. For one, the two energy formulae at 12:32 are only equivalent if i = j, i.e. if the contribution of each feature neuron is evaluated independently. The second formula can be intuitively understood as representing the extent to which the state vector's shape in the latent space matches the shape of each of the memories, but the first formula is harder to conceptualise, and it's never explained how the first formula can practically be reduced to the second (i.e. why ignoring the interdependencies between the feature neurons in the energy formula doesn't make a practical difference).

Secondly, without an update rule or at least a labelled high-level architecture diagram, it was really hard to visualise the mechanics of the network; I had to pause the video and google the update rule to understand how dense Hopfield networks are even supposed to work. Dmitry did make the very vague statement that "the evolution of the state vector" is described, in some way, by the attention function, but he didn't explain in what way (is it the update rule? Is it a change vector? Is it something else? What does "V" correspond to? etc.), which was pretty frustrating.

For anyone watching: the attention function is the update rule, where V is a linear transform of K; the output of the attention function is substituted back in for Q, and the formula can be applied recursively. In general, I think more high-level explanations, especially within a consistent framework, would've been very helpful.
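A minimal sketch of that update rule in NumPy, assuming the standard modern Hopfield retrieval dynamics ξ ← Xᵀ softmax(β X ξ) (as in Ramsauer et al.), with V taken to be the identity transform of K for simplicity; the function and variable names are illustrative, not from the talk:

import numpy as np

def softmax(z):
    # numerically stable softmax
    e = np.exp(z - z.max())
    return e / e.sum()

def hopfield_retrieve(memories, query, beta=8.0, steps=10):
    # memories: (num_patterns, dim) stored patterns; rows play the role of K
    # and, with the identity value map, also of V
    # query: (dim,) initial state, playing the role of Q
    state = query
    for _ in range(steps):
        weights = softmax(beta * memories @ state)  # attention over memories
        state = memories.T @ weights                # one attention step = one update
    return state

# toy usage: retrieve a stored pattern from a noisy cue
rng = np.random.default_rng(0)
memories = rng.standard_normal((5, 16))
noisy_cue = memories[2] + 0.3 * rng.standard_normal(16)
retrieved = hopfield_retrieve(memories, noisy_cue)
print(np.allclose(retrieved, memories[2], atol=1e-3))  # converges to the stored pattern

Substituting the output back in for Q and iterating is what "the evolution of the state vector" refers to; with a large β and well-separated patterns, the softmax is nearly one-hot and a single step already lands close to the stored memory.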
@joeysmoey3004 · 1 year ago
For your first point, this is not true because the square of the sum is not the sum of the squares. There are cross terms which give you the non-independence.
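To spell the cross terms out with a worked identity (using a quadratic interaction as an illustration; the exact energies shown at 12:32 aren't reproduced here): for feature neurons σ_i and a memory ξ^μ,

\[
\Bigl(\sum_i \xi^{\mu}_i \sigma_i\Bigr)^{2}
= \sum_i \bigl(\xi^{\mu}_i \sigma_i\bigr)^{2}
+ \sum_{i \neq j} \xi^{\mu}_i \xi^{\mu}_j \,\sigma_i \sigma_j ,
\]

so the square of the sum equals the sum of the squares only if all the cross terms vanish; it is the second sum that couples the feature neurons to one another.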
@revimfadli4666 · 4 months ago
I wonder if this can somehow link with state space models like Mamba, or with liquid networks.
@xynonners · 4 months ago
There's a paper proving that diffusion models and modern Hopfield networks are identical.
@michaelcharlesthearchangel · 9 months ago
Only geniuses realize the interconnectedness between Hopfield networks and neural-network Transformer models, and later neural-network cognitive transmission models.