Awesome as always! Worth noting that the added columns ("Sex_male", "Sex_female", etc.) are now bools rather than ints, so you need to explicitly coerce df[indep_cols] at around 25:42 -- t_indep = tensor(df[indep_cols].astype(float).values, dtype=torch.float)
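A quick sketch of the fix on a tiny made-up frame (my own toy data, assuming pandas >= 2.0, where get_dummies returns bool columns):

```python
import pandas as pd
import torch

# Toy stand-in for the lesson's Titanic frame
df = pd.DataFrame({"Sex": ["male", "female"], "Age": [22.0, 38.0]})
df = pd.get_dummies(df, columns=["Sex"])

# Recent pandas makes Sex_female / Sex_male bool columns, so coerce to
# float before building the tensor:
indep_cols = ["Age", "Sex_female", "Sex_male"]
t_indep = torch.tensor(df[indep_cols].astype(float).values, dtype=torch.float)
```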
@mohamedahmednaji5544 (4 months ago)
❤
@senditco (6 months ago)
I might be cheating a lil because I've already done a deep learning subject at Uni, but this course so far is fantastic. It's really helping me flesh out what I didn't fully understand before.
@zzznavarrete (a year ago)
Amazing as always Jeremy
@howardjeremyp (a year ago)
Glad you think so!
@blenderpanzi (10 months ago)
Semi off topic: what I really dislike about Python is the lack of types (or that type hints are optional). It makes things difficult to understand when you're learning complicated new material like this. Is that argument a float or a tensor? What is the shape of the tensor? If that were in the type of the function argument, the code would be much easier to read while learning this stuff.
@Kevin-mw6kc (9 months ago)
If Python had a strong type system it would be misaligned with its purpose.
@noahchristie5267 (7 months ago)
You can enforce more typing and type restrictions with external tools and checkers.
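For example, optional hints already go a long way. A hedged sketch (my own, with a function name styled after the lesson's, not copied from it):

```python
import torch
from torch import Tensor

def calc_preds(coeffs: Tensor, indeps: Tensor) -> Tensor:
    # The hints document "these are tensors" and tools like mypy can check
    # callers; shapes still aren't captured, so a comment has to carry that:
    # coeffs is (n_coeff,), indeps is (n_rows, n_coeff).
    return torch.sigmoid((indeps * coeffs).sum(axis=1))

preds = calc_preds(torch.rand(3) - 0.5, torch.rand(5, 3))
```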
@minkijung3 (a year ago)
Thank you so much for this lecture, Jeremy🙏
@420_gunna (10 months ago)
Does anyone have an alternate way of explaining what he's trying to get across at 1:05:00?
@mdidactics (7 months ago)
If the coefficients are too large or too small, they create gradients that are either too steep or too gentle. When the gradient is too gentle, a small horizontal step won't take you down very far, and gradient descent will take a long time. If the gradient is too steep, a small horizontal step corresponds to a big vertical drop and a big swoop up the other side of the valley, so you might even end up further from the minimum. What you want is something in between.
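You can see the same effect with step size on a toy function (my own illustration, not from the lesson): gradient descent on f(x) = x**2, whose gradient is 2*x.

```python
def descend(lr, steps=20, x=1.0):
    # Plain gradient descent on f(x) = x**2; the minimum is at x = 0.
    for _ in range(steps):
        x -= lr * 2 * x
    return x

too_small = descend(0.01)   # crawls toward 0, still far away after 20 steps
good = descend(0.1)         # converges quickly
too_big = descend(1.1)      # each step overshoots the valley and diverges
```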
@jimshtepa5423 (a year ago)
why use sigmoid and not just round the absolute values of predictions to either 1 or 0?
@hausdorffspace (11 months ago)
Clipping the values might not be too bad if they are mostly in the range of 0 to 1, but if they were evenly spread out between 0 and 1000 (for example) then most of the values would get clipped to 1. You would have to scale them down first, and that means you would need to know how much to scale them by. With the sigmoid, it doesn't matter how big the numbers get, they will be squashed to the range of 0 to 1. Also, the sigmoid is differentiable, which makes it easy to calculate a gradient.
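A small demonstration of both points (my own toy values): the sigmoid squashes arbitrarily large inputs into (0, 1), and PyTorch can differentiate through it.

```python
import torch

xs = torch.tensor([-1000.0, -3.0, 0.0, 3.0, 1000.0])
probs = torch.sigmoid(xs)   # every real input lands in (0, 1), no rescaling needed

x = torch.tensor(0.0, requires_grad=True)
torch.sigmoid(x).backward() # differentiable: d/dx sigmoid at 0 is 0.25
```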
@tegsvid8677 (2 years ago)
Why do we divide layer1 by n_hidden?
@rizakhan2938 (11 months ago)
54:43 Looks like it's tough to control here. Good question!
@ansopa (a year ago)
coeffs = torch.rand(n_coeff) - 0.5 — what is the point of subtracting 0.5 from the coefficients? Is there a problem with the values being just between 0 and 1? Thanks a lot.
@m-gopichand (10 months ago)
torch.rand() generates random numbers in the range 0 to 1; subtracting 0.5 is a simple way to center the random values around zero. I believe that helps gradient descent optimize.
@emirkanmaz6059 (10 months ago)
Shifting the range to [-0.5, 0.5) means the weights can take positive and negative values. There are different strategies; you can google "weight initialization strategy". Libraries do this automatically for ReLU, tanh, etc.
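A quick check of what the shift does (my own sanity-check snippet): the values end up uniform over [-0.5, 0.5) and roughly centered on zero.

```python
import torch

torch.manual_seed(0)
# torch.rand samples uniformly from [0, 1); subtracting 0.5 recenters it.
coeffs = torch.rand(10_000) - 0.5
```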
@thomasdeniffel2122 (4 months ago)
In `one_epoch` at 44:09, there is a `coeffs.grad.zero_()` missing :-)
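A hedged sketch of where the call belongs (my own stand-in data and function bodies, styled after the lesson's names rather than copied from the notebook):

```python
import torch

torch.manual_seed(1)
t_indep = torch.rand(10, 3)   # hypothetical independent variables
t_dep = torch.rand(10)        # hypothetical dependent variable
coeffs = (torch.rand(3) - 0.5).requires_grad_()

def calc_loss(coeffs, indep, dep):
    preds = (indep * coeffs).sum(axis=1)
    return torch.abs(preds - dep).mean()

def one_epoch(coeffs, lr=0.1):
    loss = calc_loss(coeffs, t_indep, t_dep)
    loss.backward()
    with torch.no_grad():
        coeffs.sub_(coeffs.grad * lr)
        coeffs.grad.zero_()   # without this, gradients accumulate across epochs
    return loss.item()

losses = [one_epoch(coeffs) for _ in range(3)]
```

If `zero_()` is omitted, `backward()` keeps adding each epoch's gradient onto the last one, so later steps use a stale, ever-growing gradient.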
@iceiceisaac (8 months ago)
I get different results after the matrix section.
@thomasdeniffel2122 (4 months ago)
Adding a dimension at kzbin.info/www/bejne/laO7q5iNppl2bNk is very important, as otherwise the minus in the loss function would do incorrect broadcasting, leading to a model that achieves at most 0.55 accuracy. The error is silent, because the mean in the loss function hides it.
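Here is the silent-broadcasting trap in isolation (my own toy shapes): a (5,) vector minus a (5, 1) column broadcasts to a full (5, 5) matrix instead of erroring, and taking the mean afterwards hides it.

```python
import torch

preds = torch.rand(5)           # shape (5,), e.g. predictions as a flat vector
targ = torch.rand(5, 1)         # shape (5, 1), e.g. targets kept as a column
wrong = preds - targ            # silently broadcasts to (5, 5)!
right = preds[:, None] - targ   # shapes match: result is (5, 1)
```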
@navrajsharma2425 (8 months ago)
27:18 Why don't we have a constant in our model? How can we know that there's not going to be a constant in the equation? Can someone explain this to me?
@Deco354 (8 months ago)
I think the dummy variables effectively act as a constant because they’re either 1 or 0
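You can check this directly (my own toy frame): the two sex dummies sum to 1 on every row, so a constant is always expressible as a combination of their coefficients.

```python
import pandas as pd

df = pd.DataFrame({"Sex": ["male", "female", "female"]})
d = pd.get_dummies(df, columns=["Sex"], dtype=float)

# Sex_female + Sex_male == 1 for every row, so these columns can jointly
# play the role of an intercept term.
row_sums = d[["Sex_female", "Sex_male"]].sum(axis=1)
```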
@mustafaaytugkaya3020 (a year ago)
@howardjeremyp I haven't examined the best-performing Gender Surname Model for the Titanic dataset in detail, but something seems rather strange to me. Isn't using the survival status of other family members a data leak? After all, at inference time, which would be before the Titanic incident, I would not have this information.
@twisted_cpp (a year ago)
Depends on how you look at it. If you're trying to predict whether a person survived, and you already have a list of confirmed survivors and casualties, then it's probably a good way to make the prediction: if Mrs X died, then it's safe to assume Mr X died as well. Or if their children died, it's safe to assume both parents died, considering that women and children boarded the lifeboats first.
@hausdorffspace (11 months ago)
This is going to sound very pedantic, but you use the word "rank" where I think "order" would be more correct. Rank usually means the number of independent columns in a matrix. At about 1:02:00, you say that the coefficients vector is a rank-2 matrix, but I would say its rank is 1 and its order is 2.
@leee5487 (7 months ago)
torch.rand(n_coeff, n_hidden) — how does one set of coeffs output 20 (n_hidden) values? I mean, mathematically, a single set of coefficients multiplied by a specific set of values will always equal the same thing, right?
@bobuilder4444 (6 months ago)
I'm assuming you are in the section about neural nets (before deep learning). The name n_hidden is a bit misleading: there's only one hidden layer, but that hidden layer is the linear combination of n_hidden ReLUs. Each ReLU has its own coefficients to learn, which we store in a matrix of size n_coeff by n_hidden.
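In other words, it's not one set of coefficients but n_hidden of them, one per matrix column. A small shape check (my own toy sizes):

```python
import torch

n_coeff, n_hidden = 12, 20
layer1 = torch.rand(n_coeff, n_hidden) - 0.5  # one column of coeffs per ReLU
x = torch.rand(2, n_coeff)                    # two rows of data

# Each of the 20 columns produces a different linear combination of the
# same inputs; the ReLU then zeroes out the negative ones.
hidden = torch.relu(x @ layer1)
```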
@mattambrogi8004 (5 months ago)
Great lesson! I found myself a bit confused by the predictions and loss.backward() at ~37:00. I did some digging to clear up my confusion, which might be helpful for others:
- At 37:00, when we're creating the predictions, Jeremy says we're going to add up (each independent variable * coef) over the columns. There's nothing wrong with how he said this, it just didn't click for my brain: we're creating a prediction for each row by adding up the indep_vars*coeffs, so at the end we have a predictions vector with the same number of predictions as we have rows of data.
- This is what we then calculate the loss on. Using the loss, we do gradient descent to see how much changing each coef would have changed the loss (backprop). Then we apply those changes to update the coefs, and that's one epoch.
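The first bullet in one line of code (my own toy shapes): broadcasting multiplies the same coefficient vector against every row, and summing over axis 1 collapses each row to a single number.

```python
import torch

t_indep = torch.rand(8, 3)      # 8 rows of data, 3 independent variables
coeffs = torch.rand(3) - 0.5    # one coefficient per column

# Row-wise weighted sum: one prediction per row of data.
preds = (t_indep * coeffs).sum(axis=1)
```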
@blenderpanzi (10 months ago)
1:02:38 Does trn_dep[:, None] do the same as trn_dep.reshape(-1, 1)? For me reshape seems a tiny bit less cryptic (though the -1 is still cryptic).
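For a contiguous 1-D tensor the two are indeed equivalent (my own quick check): `[:, None]` inserts a new axis of size 1, and `reshape(-1, 1)` asks for one column with the row count inferred.

```python
import torch

t = torch.arange(6)
a = t[:, None]          # insert a new trailing axis of size 1
b = t.reshape(-1, 1)    # -1 means "infer this dimension from the others"
```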
@michaelphines1 (a year ago)
If the gradients are updated inline, don't we have to reset the gradients after each epoch?
@LordMichaelRahl (a year ago)
Just to note, this has been fixed in the actual downloadable "Linear model and neural net from scratch" notebook.
@garfieldnate (2 years ago)
It drives me absolutely batty to do matrix work in Python because it's so difficult to get the dimension stuff right. I always end up adding asserts and tests everywhere, which is sort of fine but I would rather not need them. I really want to have dependent types, meaning that the tensor dimensions would be part of the type checker and invalid operations would fail at compile time instead of run time. Then you could add smart completion, etc. to help get everything right quickly.
@howardjeremyp (2 years ago)
You might be interested in hasktorch, which does exactly that!
@garfieldnate (2 years ago)
@@howardjeremyp Hey that's pretty neat! Wish it worked in Python, though :D
@c.c.s.1102 (2 years ago)
What helped me was reading the PyTorch source code with the `??` operator and thinking about the operations in terms of linear algebra. It's hard to keep all of the ranks in mind. At the end of the day I just have to keep hacking through the errors.
@blenderpanzi (10 months ago)
What does Categorify do? I looked it up and didn't understand. Is it converting names ("male", "female") to numbers (1, 2) or something?
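I believe that's roughly it: as far as I understand, fastai's Categorify replaces each category string with a small integer code (keeping a mapping so it can be undone). You can see the same idea with pandas' categorical dtype (my own analogy, not fastai's actual implementation):

```python
import pandas as pd

s = pd.Series(["male", "female", "female", "male"], dtype="category")
codes = s.cat.codes             # each category string becomes a small integer
cats = list(s.cat.categories)   # the mapping back from code to string
```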
@VolodymyrBilyachat (5 months ago)
Instead of splitting code into cells, I like to run the notebook in VS Code so I can debug as normal.
@СтаниславКолчин-н7ы (5 months ago)
It's awesome! Thanks a lot.
@DevashishJose (a year ago)
Thank you for this lecture, Jeremy.
@alyaaka82 (7 months ago)
I was wondering why you replaced the NaNs in the data frame with the mode rather than the mean?
@AzhaanNazim (3 months ago)
What would be the mean of names?
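Right: the mean only exists for numeric columns, while the mode is defined for every column type, so one fill rule covers the whole frame. A toy demonstration (my own made-up frame, not the lesson's data):

```python
import pandas as pd

df = pd.DataFrame({"Embarked": ["S", "C", None, "S"],
                   "Age": [22.0, None, 30.0, 26.0]})

# df.mode() works for strings and numbers alike; take the first row in
# case some column has several modes.
modes = df.mode().iloc[0]
filled = df.fillna(modes)
```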
@anthonypercy1770 (8 months ago)
Simply brilliant workshop... I had to change/add dtype=float, e.g. pd.get_dummies(tst_df, columns=["Sex","Pclass","Embarked"], dtype=float), to get it to work, maybe due to a later version of pandas?
@ukaszgandecki9106 (a month ago)
Thanks! This should be pinned; I had the same problem. For people googling, the error was: "can't convert np.ndarray of type numpy.object_"
@tumadrep00 (a year ago)
What a great lesson given by the one and only Mr. Random Forests