Neural Networks for beginners: How to write general notation!

252 views

jen foxbot

A day ago

We're learning more about the math behind neural networks, the foundation of Large Language Models (LLMs), the type of AI model that powers ChatGPT!
This is part 3, where we learn how to write the general notation for a neural network of any size!
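For anyone following along at the keyboard, here is a minimal sketch in Python/NumPy of the standard feedforward notation for a network of any size: a^(1) = x, then a^(l+1) = g(Θ^(l) a^(l)) for each layer l, where g is the sigmoid activation. The layer sizes, weights, and variable names below are made-up examples for illustration, not necessarily the exact ones used in the video.

```python
import numpy as np

def sigmoid(z):
    """Logistic activation: squashes any real number into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, thetas):
    """Forward propagation for a network of any size.

    x      : input vector, shape (n,)
    thetas : list of weight matrices; thetas[l] maps layer l to layer l+1
             and has shape (units in layer l+1, units in layer l + 1),
             where the extra column handles the bias unit.
    Returns h_theta(x), the hypothesis (prediction).
    """
    a = x
    for theta in thetas:
        a = np.concatenate(([1.0], a))   # prepend the bias unit a_0 = 1
        a = sigmoid(theta @ a)           # a^(l+1) = g(Theta^(l) a^(l))
    return a

# Example: a 3-4-1 network (3 inputs, one hidden layer of 4 units, 1 output)
rng = np.random.default_rng(0)
thetas = [rng.standard_normal((4, 4)),   # layer 1 -> 2: 4 units, 3 inputs + bias
          rng.standard_normal((1, 5))]   # layer 2 -> 3: 1 unit, 4 units + bias
print(forward(np.array([0.5, -1.2, 2.0]), thetas))
```

Because the loop just repeats the same rule for every weight matrix in the list, the same function covers a network with 2 layers or 200: that's the point of writing the notation generally.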

Comments: 5
@dc1049
@dc1049 3 months ago
Really enjoying the content and I appreciate your approach in making it conceptually accessible! I'm going to have follow-up questions soon, but first I'd like to chew on it.
@mininao
@mininao 4 months ago
wow, super clear and enlightening! thank you so much for making this video :)!
@JenFoxBot
@JenFoxBot 4 months ago
yayyy so glad to hear!! thank you for sharing :D
@andresj5512
@andresj5512 4 months ago
I'm not going to lie, since I'm a potato when it comes to maths, everything after the addition of layer 4 is a blurry hieroglyph hahaha. My question is: we have a number of inputs on layer 1, with a number of layers doing calculations on how influential the output of the previous layer is to the result (hope this is correct). How do we avoid getting all the results at once? What is the function that "filters" undesired results? From what I got, we are only saying, from 0 to 1, how strong the influence from the previous layer is, i.e. the result should be closer to "this". But if we are not discarding any inputs, it's hard for me to see how the result could be anything other than the neural network's training +/- some difference according to the input. I know I'm missing much of the "real work" that is done inside the artificial neuron, but do we always take its result into the next layer, or not necessarily? Thanks for the video!
@JenFoxBot
@JenFoxBot 4 months ago
If I understand your question correctly, you're wondering how we get to a single prediction from all those big layers? The output of a neural network is that h_theta(x) function --> this is the prediction of our neural network, the hypothesis function. For example, for LLMs, the prediction is "what is the next most likely word?". During training, we train the hypothesis function over MANY rounds, or epochs (typically >>1000). It starts out very, very inaccurate, and then the full training program does gradient descent (i.e., is the next prediction better or worse than the last? If better, keep going in that direction; if worse, try a different direction). After training, we test the hypothesis function on a new dataset to check accuracy. If it's good, we may release it into the world! But yes, a lot of complicated layers go into the hypothesis function, which then outputs a single number between 0 and 1. Simplifying a bit, but hopefully that helps. Plz LMK if I did not capture your question!
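To make the "many rounds of gradient descent" idea concrete, here is a minimal sketch for the simplest single-output case: a sigmoid output trained on a made-up toy dataset. The dataset, learning rate, and epoch count are all illustrative assumptions, not anything from the video.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical toy dataset: learn the rule "is x1 + x2 > 1?"
rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(200, 2))
y = (X[:, 0] + X[:, 1] > 1.0).astype(float)

Xb = np.hstack([np.ones((200, 1)), X])   # prepend a bias column of 1s
theta = np.zeros(3)                      # weights, including the bias term

alpha = 0.5                              # learning rate
for epoch in range(2000):                # many rounds (epochs) of training
    h = sigmoid(Xb @ theta)              # h_theta(x): predictions in (0, 1)
    grad = Xb.T @ (h - y) / len(y)       # gradient of the logistic cost
    theta -= alpha * grad                # step "downhill" along the gradient

# After training, h_theta(x) outputs a single number between 0 and 1
print(sigmoid(np.array([1.0, 0.9, 0.8]) @ theta))  # should be close to 1
```

The loop is the whole story in miniature: the predictions start out poor (theta is all zeros), and each epoch nudges the weights in whichever direction reduces the error, exactly the "better or worse than last time" intuition above.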
Neural Networks for beginners: cost function!
18:39
jen foxbot
626 views
Neural Networks for beginners: intro to back propagation!
16:45
Shocking Graduation Speech 😳📽️@CarrolltonTexas
00:43
Глеб Рандалайнен
11M views
Can you beat this impossible game?
00:13
LOL
56M views
Want some tea? #чайбудешь
00:14
ПАРОДИИ НА ИЗВЕСТНЫЕ ТРЕКИ
2M views
neural networks for beginners: generalized backprop!
15:51
jen foxbot
967 views
Transistors in 100 Seconds
1:40
V Electronics
3.7K views
the most valuable skill you can learn!
1:54
jen foxbot
494 views
Quantum Connectedness!
3:33
jen foxbot
949 views
yayy for radical self acceptance!!
2:39
jen foxbot
165 views
Physical constants and the miracle of life!
3:03
jen foxbot
126 views
Google Data Center 360° Tour
8:29
Google Cloud Tech
5M views