Thanks Yannic! I found it odd that they used L2 instead of cross-entropy with label smoothing; it should have the same regularizing effect of not pushing the logits to infinity.
@YannicKilcher 4 years ago
Good point!
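A minimal sketch of the comparison in the comment above (PyTorch is assumed here, and the variable names are placeholders, not the paper's code): both an L2 loss on the logits and cross-entropy with label smoothing give bounded targets, so neither rewards pushing the logits toward infinity.

```python
import torch
import torch.nn.functional as F

# Toy setup: 8 cells voting over 10 digit classes (placeholder shapes).
logits = torch.randn(8, 10, requires_grad=True)
targets = torch.randint(0, 10, (8,))

# L2 loss against one-hot targets (the choice discussed in the video):
# the targets are bounded, so there is no incentive to grow the logits.
one_hot = F.one_hot(targets, num_classes=10).float()
l2_loss = ((logits - one_hot) ** 2).mean()

# Cross-entropy with label smoothing (the alternative raised in the comment):
# the smoothed target distribution also makes the optimal logits finite,
# giving a similar regularizing effect. Requires PyTorch >= 1.10.
ce_smooth = F.cross_entropy(logits, targets, label_smoothing=0.1)
```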
@edoardoguerriero2464 4 years ago
It seems to be really sensitive to the image size. If you draw a digit as big as the window, the cells have a hard time finding an equilibrium.
@albertwang5974 4 years ago
Great, it's a mini implementation of the Thousand Brains theory.
@JuanColonna 4 years ago
Great video and explanations. Thanks
@etiennetiennetienne 4 years ago
Isn't this just a convolutional RNN? If so, can't they pass a hidden state bounded between -1 and 1 instead of the decoded logits?
@YannicKilcher 4 years ago
Yes, there's certainly a lot in common.
@etiennetiennetienne 4 years ago
If it is an RNN, I don't think this L2 stuff is needed, otherwise people would do the same for standard RNN training. Either detach the prediction from the state or feed back the probabilities after a softmax?
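A minimal sketch of what this comment suggests, assuming a PyTorch-style convolutional RNN cell (illustrative only, not the paper's architecture): the recurrent state is bounded with tanh, and the class logits are decoded from it separately rather than fed back in raw.

```python
import torch
import torch.nn as nn

class ConvRNNCell(nn.Module):
    """Minimal convolutional RNN cell sketching the suggestion above:
    keep the hidden state bounded in (-1, 1) with tanh and decode logits
    from it separately, instead of feeding raw logits back into the state."""
    def __init__(self, channels=16, n_classes=10):
        super().__init__()
        self.update = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.readout = nn.Conv2d(channels, n_classes, kernel_size=1)

    def forward(self, state):
        # Bounded state update: tanh keeps every channel in (-1, 1),
        # so nothing in the recurrence can blow up over many steps.
        state = torch.tanh(state + self.update(state))
        # The prediction is read out from the state but never fed back,
        # so no extra regularization of the logits is needed.
        logits = self.readout(state)
        return state, logits
```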
@patf9770 3 years ago
There's something about this that's very reminiscent of Geoffrey Hinton's GLOM.
@shairuno 4 years ago
Wow, I realize that a transformer in NLP is also similar to this message-passing mechanism. But for a transformer, the number of message-passing steps is limited.
@potatooflife8603 4 years ago
It's so cute.
@rickandelon9374 4 years ago
I like your name :~)
@RickeyBowers 4 years ago
It definitely needs a "this is not a number" option.
@sayakpaul3152 4 years ago
I see many grounds to connect this to implicit neural representations.
@herp_derpingson 4 years ago
9:30 It's like the children's game called Chinese whispers?
10:30 In non-trivial problems, we might run into some bandwidth issues.
17:30 Cells deteriorating over time. Is this analogous to ageing?
26:40 Is that what I think it is? :)
@YannicKilcher 4 years ago
It's like Chinese whispers with backprop, whatever that translates to :D The analogy to ageing might be pretty weak, but maybe :) something like being more and more set in your ways over time ;)
@qw4316 4 years ago
Hi, your explanation is good. Is there any paper about CNNs?
@АлексейТучак-м4ч 4 years ago
That recurrent residual convolution resembles the Neural ODE paper.
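A small sketch of the resemblance, under the usual reading (illustrative, not from either paper): iterating a residual convolution x ← x + f(x) is a fixed-step Euler discretization of dx/dt = f(x), which is the ODE that the Neural ODE paper integrates with an adaptive solver instead of fixed steps.

```python
import torch
import torch.nn as nn

# f plays the role of the residual update; shapes are placeholders.
f = nn.Conv2d(16, 16, kernel_size=3, padding=1)

x = torch.randn(1, 16, 28, 28)
dt = 1.0  # fixed step size of the recurrent/residual formulation
for _ in range(10):
    # Euler step of dx/dt = f(x); with dt = 1 this is exactly x + f(x).
    x = x + dt * f(x)
```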
@004307ec 4 years ago
What if the signal-processing part used localized convolutions (with some constraint to prevent the kernels from being too different from each other)?
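A rough sketch of that idea (hypothetical, PyTorch assumed, not from the paper): a locally connected layer with one kernel per spatial position, plus a penalty that keeps the per-position kernels close to their mean.

```python
import torch
import torch.nn as nn

class LocalConv2d(nn.Module):
    """Locally connected layer: every spatial position gets its own 3x3 kernel,
    with a similarity penalty that constrains the kernels toward their mean."""
    def __init__(self, channels=16, height=28, width=28, k=3):
        super().__init__()
        self.h, self.w, self.k = height, width, k
        # One kernel per output location: (H*W, out_channels, in_channels*k*k)
        self.weight = nn.Parameter(
            torch.randn(height * width, channels, channels * k * k) * 0.01)

    def forward(self, x):
        # Extract k x k patches around every position: (B, C*k*k, H*W)
        patches = nn.functional.unfold(x, self.k, padding=self.k // 2)
        # Per-location matrix multiply: (B, H*W, out_channels)
        out = torch.einsum('bcl,loc->blo', patches, self.weight)
        return out.permute(0, 2, 1).reshape(x.shape[0], -1, self.h, self.w)

    def similarity_penalty(self):
        # The constraint from the comment: discourage kernels from drifting
        # too far apart by penalizing deviation from the mean kernel.
        mean_kernel = self.weight.mean(dim=0, keepdim=True)
        return ((self.weight - mean_kernel) ** 2).mean()
```

This penalty would be added to the training loss with some weight; setting that weight very high recovers an ordinary weight-tied convolution.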
@1998sini 4 years ago
How and where do you find these interesting papers?