@15:41 "with great complexity comes....great power" with great power comes great responsibility. with great responsibility comes great expectations. with great expectations comes great sacrifice. with great sacrifice comes great reward. And thus... the objective function was maximized
@nishkarshtripathi6123 4 years ago
But we have to minimize it here.
@RahulMadhavan 4 years ago
@@nishkarshtripathi6123 Thank you for the correction! Minimising f(x) is the same as maximising -f(x), and thus the great sacrifices were not in vain :-)
@lakshman587 3 years ago
Awesome!!!
@rahulpramanick2001 1 year ago
@@RahulMadhavan This is only true if max f(x) is the global maximum.
@RahulMadhavan 1 year ago
@@rahulpramanick2001But alas we only seek from the function great reward, and not the greatest reward. For achieving such greatness, you need a dash of convexity apart from the aforementioned complexity!
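Spelling out the identity the thread is riffing on (this holds for any real-valued f; no convexity is needed for the sign flip itself):

\[
\arg\min_{x} f(x) \;=\; \arg\max_{x} \bigl(-f(x)\bigr), \qquad \min_{x} f(x) \;=\; -\max_{x} \bigl(-f(x)\bigr)
\]

Convexity is what lets gradient-based methods guarantee that a local optimum they reach is also the global one.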
@syedhasany1809 5 years ago
Shouldn't W_L at 6:31 be 'kxn' and not the other way around?
@jagadeeshkumarm3333 6 years ago
In the a = b + W*h formula, either W should be transposed or W's size should be (number of outputs × number of inputs); only then does the matrix multiplication W*h work out as expected.
@mratanusarkar 4 years ago
Yes, it completely depends on how you represent the x vectors. If you make x a column vector or a row vector, the matrix will be written accordingly. Get the idea and you can do the math yourself. With so many courses out there, different people do it differently, but the idea remains the same: while writing the formula, write down the vector/matrix dimensions and proceed accordingly; in the end, the summation formula should hold.
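A quick way to sanity-check whichever convention you pick is to write down the shapes and multiply them out. A minimal sketch (layer sizes here are made up; column-vector activations are assumed, as in the lecture):

```python
import numpy as np

n_in, n_out = 4, 3                   # hypothetical layer sizes

h_prev = np.random.randn(n_in, 1)    # h_(i-1): column vector of previous-layer activations
W = np.random.randn(n_out, n_in)     # W_i: (no. of outputs x no. of inputs), as the comments suggest
b = np.random.randn(n_out, 1)        # b_i: one bias per output neuron

a = b + W @ h_prev                   # pre-activation a_i = b_i + W_i * h_(i-1)
print(a.shape)                       # -> (3, 1)

# Row-vector convention: store activations as (1, n_in) rows and multiply by W transposed.
a_row = b.T + h_prev.T @ W.T
print(np.allclose(a, a_row.T))       # -> True: both conventions give the same numbers
```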
@BhuShu972 4 years ago
I think the objective loss function (yi_hat - yi)^2 is correct. It minimizes the error over all the training samples, i = 1 to N. What you did was write the error function granularly; both forms are needed.
@rijuphilipjames2860 2 years ago
y hat and y are both of dimension k; they are column vectors. Had the same doubt, thanks 👍
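Putting the two points together: the quantity actually minimised is the squared error summed over the k components of each output vector and averaged over the N training examples (dividing additionally by k only rescales by a constant and does not change the minimiser). In LaTeX, with \hat{y}_{ij} the j-th component of the prediction for example i:

\[
\min_{\theta} \; \frac{1}{N} \sum_{i=1}^{N} \sum_{j=1}^{k} \left( \hat{y}_{ij} - y_{ij} \right)^{2}
\]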
@ashiqhussainkumar1391 3 years ago
There is a slight mistake in the formula; it should be a_i = b_i + W_i^T * h_(i-1). It makes sense when you check which weight w_i is multiplied by which x_i.
@vaibhavthalanki6317 2 years ago
@7:38, b11 = b12 = b13?
@coolarun283 2 years ago
Not necessarily. Each bias is a separate learnable parameter, so after training they will generally take different values.
@mlofficial9175 5 years ago
Can anyone please explain the last error term? What does the summation over i instances mean?
@TheDelcin 5 years ago
We are trying to fit the model to 'N' training examples, so we minimise the error over the training data as a collection. And since the output is a vector, he also sums the error over each element of the vector. The gradient descent algorithm works only if f(x) is a real number (a scalar).
@vin_jha 4 years ago
So the actual y_i corresponding to each training example i will be a k-dimensional vector, with a 1 at the coordinate for the class it belongs to and 0 for the rest. That is, if the example lies in class 'p', then the p-th coordinate of the vector y_i will be 1 and the rest of the dimensions 0. Now our NN can spit out an arbitrary k-dimensional vector, so our loss function is the sample mean of the element-wise squared difference of the two vectors.
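A small sketch of that computation, with made-up sizes, one-hot targets built from class indices, and the element-wise squared differences averaged over the examples:

```python
import numpy as np

N, k = 5, 3                                  # hypothetical: 5 training examples, 3 classes

labels = np.array([0, 2, 1, 2, 0])           # true class index p for each example
y = np.eye(k)[labels]                        # one-hot targets y_i, shape (N, k)

y_hat = np.random.rand(N, k)                 # arbitrary k-dimensional outputs from the network

# squared error summed over the k components, averaged over the N examples
loss = np.mean(np.sum((y_hat - y) ** 2, axis=1))
print(loss)                                  # a single real number, so gradient descent applies
```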
@anshrai6 2 months ago
It will be min (1/k)(fun), not min (1/N)(fun).
@MuhammadWaseem-vb3qe 4 years ago
Find whether the following is a linearly separable problem or not: ((¬A OR B) AND 1) OR 0. Also create a neural network for the given equation with a suitable set of weights.
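For anyone attempting this: (X AND 1) = X and (X OR 0) = X, so the expression reduces to ¬A OR B, which is linearly separable (only the input A=1, B=0 maps to 0), so a single threshold neuron is enough. The weights below are one hypothetical choice, not the only valid one:

```python
def neuron(A, B, w_a=-1.0, w_b=1.0, bias=0.5):
    """Single threshold unit; with these weights it computes (NOT A) OR B."""
    return int(w_a * A + w_b * B + bias > 0)

# verify against the truth table of (NOT A) OR B
for A in (0, 1):
    for B in (0, 1):
        expected = int((not A) or B)
        print(A, B, neuron(A, B), expected, neuron(A, B) == expected)
```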