Principal Component Analysis (PCA) 2 [Python]

26,842 views

Steve Brunton

4 years ago

This video describes how the singular value decomposition (SVD) can be used for principal component analysis (PCA) in Python (part 2).
Book Website: databookuw.com
Book PDF: databookuw.com/databook.pdf
These lectures follow Chapter 1 from: "Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control" by Brunton and Kutz
Amazon: www.amazon.com/Data-Driven-Science-Engineering-Learning-Dynamical/dp/1108422098/
Brunton Website: eigensteve.com
This video was produced at the University of Washington
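A minimal sketch of the workflow the video walks through, PCA via the economy SVD in NumPy. The real ovarian cancer data comes from the book website; a random matrix of the same size (216 patients x 4000 genes) stands in here so the snippet is self-contained, and all variable names are illustrative.

```python
import numpy as np

# Synthetic stand-in for the ovarian cancer data: 216 patients x 4000 genes.
# (The real dataset is available from the book website, databookuw.com.)
rng = np.random.default_rng(0)
obs = rng.standard_normal((216, 4000))

# Economy SVD: with full_matrices=False, U is 216x216, S has 216 entries,
# and VT is 216x4000 (rather than the full 4000x4000 V).
U, S, VT = np.linalg.svd(obs, full_matrices=False)
print(U.shape, S.shape, VT.shape)   # (216, 216) (216,) (216, 4000)

# Project each patient onto the first three principal directions,
# giving the 3D coordinates plotted in the video.
PC = obs @ VT[:3].T                 # 216x3
```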

Comments: 25
@christinag3732 3 years ago
Thank you! Your explanations are amazing!! Keep posting more videos like this!!!
@Eigensteve 3 years ago
Thank you! Will do!
@prashantsharmastunning 3 years ago
I have a doubt: why didn't we center the data by subtracting the mean matrix?
@armada1e 3 years ago
Hey Steve! Love your videos. Two questions: why didn't you mean-center the data before computing the SVD, and why did you project the data onto 3 columns of V when previously you said the principal components were US or BV? Thanks!
@subramaniannk3364 3 years ago
Great lecture! Just a question: are the data points centered around the origin in this program?
@nicok3345 4 years ago
First of all, thanks for this outstanding series on SVD. I still have a question, though: why are we truncating the rows, which represent the patients, and not truncating the columns, which represent the genes? From an intuitive view, I would rather create an eigengene by applying PCA to the columns than create an eigenpatient by applying PCA to the rows.
@saitaro 4 years ago
Thank you, your explanations are really great! ps. 'eigensteve' lol
@MrBvidal 4 years ago
Thanks a lot for this wonderful series. I can't seem to find the ovarian cancer dataset anywhere. You mention in the video that it can be downloaded from the book website. Could you post a link to it? Thank you very much in advance.
@sumanvadla9762 1 year ago
Sir, thank you for your excellent lectures. We focused on the U matrix when dealing with the last example of Gaussian-distributed data; can you explain why we are now looking at the U matrix in this ovarian cancer example?
@bindusankhala7980 1 year ago
Thank you for the wonderful session. I just have one doubt: in the previous lecture, where you explained PCA, you used mean-centered data to calculate the principal components, but in this example you have not subtracted the mean.
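Several comments in this thread ask about the missing mean-centering step. A minimal sketch of the centering described in the previous lecture, with synthetic data and illustrative variable names:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((216, 4000)) + 5.0   # data with a nonzero mean

# Subtract the column-wise mean (the "average patient" across all genes)
# before the SVD, as in the mean-centered PCA formulation.
Xbar = X.mean(axis=0)
B = X - Xbar

U, S, VT = np.linalg.svd(B, full_matrices=False)

# After centering, every column of B averages to (numerically) zero.
print(np.abs(B.mean(axis=0)).max())
```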
@apoorvshrivastava3544 4 years ago
Great intuition, sir.
@HawkinsOkeyoEng 3 years ago
Hi Dr. Steve, there seems to be some confusion about the size of the VT matrix (I guess it was just misspoken). In the MATLAB videos, you said that it is a 4000x4000 matrix with U being 216x216, meaning that in that case U is truncated column-wise. But in the Python videos, VT is given as 216x4000, which is indeed the case. It is not a big deal, but it can lead to some confusion. Now to my question: why is U truncated?
@Jathothveeranna 3 years ago
Excellent video. Which software is used for making the videos? Please suggest it; it would help me a lot in teaching students. I'm an Assistant Professor at a university. Thanks.
@VivekGangwar02 9 months ago
People are confused about whether U should be a square matrix. I'm confused because if you take only 3 principal components and there are 216 patients, then the information shown should be 3x216, not 3x4000. Can anyone clear my doubt, in case someone in the future has the same question? Thank you.
@mordehaym3239 3 years ago
Thanks for the videos!! I think there is a mistake at 3:09: it's said that V is a 216x4000 matrix, but on Wikipedia V is a 4000x4000 matrix.
@pavansatya2749 2 years ago
I got the same doubt, but when I checked the dimensions of VT it is 216x4000; I don't know why it is 216x4000 instead of 4000x4000.
@pavansatya2749 2 years ago
Oops, I think it's because it is an economy SVD, and since the matrix is fat/wide, the rows of VT are reduced to 216.
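The shape question debated in the comments above can be checked directly: `numpy.linalg.svd`'s `full_matrices` flag toggles between the full factorization (square U and V) and the economy one. A sketch with a random matrix of the same 216x4000 size:

```python
import numpy as np

A = np.random.default_rng(2).standard_normal((216, 4000))

# Full SVD: VT is 4000x4000, as on Wikipedia.
U_full, S_full, VT_full = np.linalg.svd(A, full_matrices=True)
print(VT_full.shape)    # (4000, 4000)

# Economy SVD (NumPy's full_matrices=False, MATLAB's svd(A,'econ')):
# the 3784 rows of VT that would multiply zero singular values are
# dropped, leaving VT as 216x4000.
U_eco, S_eco, VT_eco = np.linalg.svd(A, full_matrices=False)
print(VT_eco.shape)     # (216, 4000)
```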
@PegasusDesigns 2 years ago
Hey Steve, I believe you say something incorrect when explaining the dimensions of the V matrix. At 3:12 you mention that V has dimensions 216x4000, which is not correct. V and U are always square matrices in the full SVD; it is only the Sigma (S) matrix that is rectangular diagonal, and in this case it has dimensions 216x4000. I am sure that you are aware of this and that it was a mistake, but I thought I would point it out so as not to confuse people watching this video.
@MDLHoff 2 years ago
In the previous videos he was using the features x patients (samples) configuration. Now he is using the transposed version of it (patients x features). If you transpose a matrix, then in the decomposition you get the factors in reversed order. So, in other words, U plays the role of the eigenmixtures and VT plays the role of holding the eigengenes in each row.
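The swap described above can be verified numerically: transposing the data matrix exchanges the roles of U and V (columns may differ by a sign flip when singular values are distinct). A small sketch with a random matrix:

```python
import numpy as np

X = np.random.default_rng(3).standard_normal((10, 20))

U1, S1, VT1 = np.linalg.svd(X, full_matrices=False)
U2, S2, VT2 = np.linalg.svd(X.T, full_matrices=False)

# Singular values are identical, and the left singular vectors of X.T
# match the right singular vectors of X up to a per-column sign.
print(np.allclose(S1, S2))
mismatch = np.abs(np.abs(np.diag(U2.T @ VT1.T)) - 1).max()
print(mismatch < 1e-8)
```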
@user-ym8rz6mw5r 2 years ago
You are not finishing the example. You need to say, "Now let's look at a new patient," show how to calculate where he lands in those 3 dimensions, and then show his dot on the 3D plot among the original data.
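The missing step this comment asks for could be sketched as follows. The new patient is synthetic here; real code would load an actual held-out observation, and the variable names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)
obs = rng.standard_normal((216, 4000))            # training data
U, S, VT = np.linalg.svd(obs, full_matrices=False)

# A new, unseen patient: a single 4000-gene measurement vector.
new_patient = rng.standard_normal(4000)

# Project onto the same first three principal directions used for the
# training data; coords gives the (x, y, z) of the new dot on the plot.
coords = new_patient @ VT[:3].T
print(coords.shape)      # (3,)
```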
@bobwatkins1271 11 months ago
projected_obs = obs @ VT[:3].T
for (x, y, z), group in zip(projected_obs, grp):
    ...
@augustye3489 3 years ago
I only like you and 3Blue1Brown.
@HuadongWu 3 years ago
This does not make sense: without relating the classification results (cancer or not) to the observation matrix, the PCA of the observation matrix is by itself meaningless; it only describes the distribution of the sampled data.
@alexistremblay1076 3 years ago
You can train a model using only the first few principal components. PCA can be a preprocessing step for your input data.
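A minimal NumPy-only illustration of that preprocessing idea, using synthetic two-class data and a nearest-centroid classifier as a stand-in for a real model:

```python
import numpy as np

rng = np.random.default_rng(5)

# Two well-separated synthetic classes in 100 dimensions.
n = 100
X = np.vstack([rng.standard_normal((n, 100)) + 2.0,
               rng.standard_normal((n, 100)) - 2.0])
y = np.array([0] * n + [1] * n)

# PCA preprocessing: mean-center, then keep the first 3 PC scores.
B = X - X.mean(axis=0)
U, S, VT = np.linalg.svd(B, full_matrices=False)
Z = B @ VT[:3].T                     # 200x3 reduced feature matrix

# Nearest-centroid classifier in the reduced 3D space.
c0, c1 = Z[y == 0].mean(axis=0), Z[y == 1].mean(axis=0)
pred = (np.linalg.norm(Z - c1, axis=1)
        < np.linalg.norm(Z - c0, axis=1)).astype(int)
print((pred == y).mean())
```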
@hubstrangers3450 4 years ago
First of all, thank you for the content. However, most of this could have been published back in early 2013 or so; it's observable that the technology space and the educational space move in cohorts. The tools (Jupyter notebooks, pandas, NumPy, Matplotlib, cloud accounts, ML tooling) were all available around that time, but not content at this level. Is this a modern-day masking effect, or something similar? I would like to hear some reasons from both spaces and parties. Also, lately there is this "Polynote" notebook and "VMS" to complement these too. Thanks.