Amazon Data Science Interview: Linear Regression

17,103 views

Exponent

1 day ago

Don't leave your data science career to chance. Sign up for Exponent's Data Science interview course today: bit.ly/3IO5o2I
Watch our mock Amazon data science interview, in which Angie asks Jimmy (Amazon ML engineer) a question about linear regression.
👉 Subscribe to our channel: bit.ly/exponentyt
📧 Sign up for our email newsletter with PM interview lessons: bit.ly/exponentpm
🕊️ Follow us on Twitter: bit.ly/exptweet
💙 Like us on Facebook for special discounts: bit.ly/exponentfb
📷 Check us out on Instagram: bit.ly/exponentig
ABOUT US:
Did you enjoy this interview question and answer? Want to land your dream career? Exponent is an online community, course, and coaching platform to help you ace your upcoming interview. Exponent has helped people land their dream careers at companies like Google, Microsoft, Amazon, and high-growth startups. Exponent is currently licensed by Stanford, Yale, UW, and others.
Our courses include interview lessons, questions, and complete answers with video walkthroughs. Get access to hours of real interview videos, where we analyze what went right or wrong, as well as our community of 1,000+ expert coaches and industry professionals, to help you get your dream job and more!
#datascience #amazon #tech #regression #entrepreneurship #exponent
Chapters:
00:00:00 - Introduction
00:00:59 - Question
00:01:09 - Answer
00:01:50 - Follow-up questions
00:18:35 - Interview analysis

Comments: 27
@tryexponent • 2 years ago
Don't leave your data science career to chance. Sign up for Exponent's Data Science interview course today: bit.ly/3Eg26E7
@Isaias-wq7wu • 2 years ago
Really good job!
@tusharbedse9523 • 2 years ago
Wow, awesome! We want some more like this.
@claybar3998 • 1 year ago
There are other motivations for squared-error loss. For one, it arises as the maximum-likelihood estimator under a model that assumes independent Gaussian errors. For another, squared-error loss admits an exact algebraic (closed-form) solution.
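To sketch the first point, assume the standard model y_i = x_iᵀβ + ε_i with the ε_i i.i.d. N(0, σ²). The log-likelihood is

```latex
\log L(\beta)
  = \sum_{i=1}^{n} \log\!\left[ \frac{1}{\sqrt{2\pi\sigma^2}}
      \exp\!\left( -\frac{(y_i - x_i^\top \beta)^2}{2\sigma^2} \right) \right]
  = -\frac{n}{2}\log(2\pi\sigma^2)
    - \frac{1}{2\sigma^2} \sum_{i=1}^{n} (y_i - x_i^\top \beta)^2 ,
```

so maximizing the likelihood over β is exactly minimizing the sum of squared errors.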
@ahmadnomanwaziri4320 • 9 months ago
That is not quite true. Whether an exact closed-form solution exists depends on the hypothesis/prediction function, not just the cost function. E.g., MSE for linear regression has a closed-form solution, whereas MSE for logistic regression gives a non-convex objective.
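The non-convexity claim is easy to check numerically on a toy case. For a one-parameter logistic model fit with squared error on a single illustrative point (x = 1, y = 0), the loss is L(w) = σ(w)², which has negative curvature for large w, so it cannot be convex. A minimal sketch (the point and probe value w = 2 are chosen purely for illustration):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def loss(w):
    # Squared-error loss of a logistic model on the single point (x=1, y=0).
    return sigmoid(w) ** 2

# Central second difference approximates the second derivative.
h = 1e-4
w = 2.0  # sigmoid(2) ≈ 0.88 > 2/3, the region where curvature turns negative
second_diff = (loss(w + h) - 2 * loss(w) + loss(w - h)) / h**2
print(second_diff < 0)  # True: negative curvature, so the loss is non-convex
```

A convex function has non-negative curvature everywhere, so one point of negative curvature settles the question.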
@leomiao5959 • 2 years ago
Very good~ Thank you!!
@natecartwright2319 • 1 year ago
Wow, this is amazing. It doesn't seem like an interview at all, more like two colleagues having an intellectual conversation on a topic I don't know shit about lol!
@edwardyudolevich3352 • 1 year ago
*Squared* loss is used because quadratic functions are differentiable everywhere and the absolute value is not (at zero), not because it penalizes outliers more. Penalizing outliers is usually considered a downside of OLS, not a benefit or a "reason" to use it.
@edwardyudolevich3352 • 1 year ago
Also, did he just say that linear regression parameters are estimated using gradient descent?
@bassimeledath2224 • 11 months ago
Agreed. MSE penalizes outliers much more than MAE; it is much more sensitive to them. I believe the nuance here is that MSE penalizes large errors (which is very different from saying it penalizes outliers).
@bassimeledath2224 • 11 months ago
@@edwardyudolevich3352 The parameters of linear regression can be estimated in several ways; gradient descent is one of them. Gradient descent is nice because it generalizes well and can be used to estimate the parameters of many other ML algorithms.
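Both routes are easy to compare on toy data (the data, learning rate, and iteration count below are purely illustrative): the normal-equations solution and plain gradient descent on the MSE agree to high precision.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: intercept column plus one feature, with small Gaussian noise.
X = np.column_stack([np.ones(100), rng.normal(size=100)])
true_beta = np.array([2.0, -3.0])
y = X @ true_beta + 0.1 * rng.normal(size=100)

# Closed form (normal equations): beta = (X^T X)^{-1} X^T y,
# solved via np.linalg.solve for numerical stability.
beta_closed = np.linalg.solve(X.T @ X, X.T @ y)

# Gradient descent on the mean squared error.
beta = np.zeros(2)
lr = 0.1
for _ in range(2000):
    grad = 2 / len(y) * X.T @ (X @ beta - y)  # gradient of the MSE
    beta -= lr * grad

print(np.allclose(beta, beta_closed, atol=1e-6))  # True
```

For a well-conditioned design matrix both give the same answer; gradient descent only pays off when the closed form is too expensive or the model has no closed form at all.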
@joelwillis2043 • 3 months ago
@@edwardyudolevich3352 typical silly computer science answer -- they never actually learn math, cs degrees gave up on that 30+ years ago
@timothybaker2822 • 2 months ago
@edwardyudolevich3352 Whether penalizing outliers is good or bad depends on the application; it's not always one or the other. It is just one feature of MSE that differentiates it from MAE.
@sergigalve1902 • 5 months ago
At 16:10, it is said that independence of a collection of random variables is equivalent to pairwise independence, but pairwise independence does not, in general, imply joint independence. Joint independence implies pairwise independence; the converse is not true.
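The standard counterexample can be checked by brute force: take two fair coins X and Y and set Z = X XOR Y. Every pair is independent, yet the triple is not, since Z is determined by X and Y. A small sketch:

```python
from itertools import product

# The 4 equally likely outcomes (x, y, x XOR y).
outcomes = [(x, y, x ^ y) for x, y in product([0, 1], repeat=2)]

def prob(event):
    # Probability of an event over the 4 equally likely outcomes.
    return sum(event(o) for o in outcomes) / len(outcomes)

# Pairwise independence: P(A=a, B=b) == P(A=a) * P(B=b) for every pair.
pairwise = all(
    prob(lambda o: o[i] == a and o[j] == b)
    == prob(lambda o: o[i] == a) * prob(lambda o: o[j] == b)
    for i, j in [(0, 1), (0, 2), (1, 2)]
    for a in (0, 1)
    for b in (0, 1)
)

# Joint independence fails: P(X=0, Y=0, Z=0) = 1/4, but the product
# of the three marginals is (1/2)^3 = 1/8.
joint = prob(lambda o: o == (0, 0, 0)) == prob(lambda o: o[0] == 0) ** 3

print(pairwise, joint)  # True False
```

So checking all pairs is strictly weaker than checking the full joint factorization.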
@DaewonTech • 2 years ago
Nice!
@jungirliez • 1 year ago
Fantastic interview, learned so much! More of this, please.
@tryexponent • 1 year ago
Glad you enjoyed it!
@Squeed79 • 2 months ago
I would not call Y a "label", especially because it is continuous (real-valued Y). Otherwise, it gets confusing alongside logistic regression.
@josephwong2832 • 2 years ago
nice
@ChowzorLoves2Tap • 1 year ago
FWIW the closed-form solution is m = (X^T X)^(-1) X^T y, but you're close.
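With the inverse included, the normal-equations formula m = (XᵀX)⁻¹Xᵀy matches a numerically stable least-squares solver; a minimal sketch on random illustrative data:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))  # 50 samples, 3 features
y = rng.normal(size=50)

# Normal-equations solution: m = (X^T X)^{-1} X^T y.
m_normal = np.linalg.inv(X.T @ X) @ X.T @ y

# Reference solution from NumPy's least-squares solver (SVD-based).
m_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(np.allclose(m_normal, m_lstsq))  # True
```

In practice, lstsq (or QR/Cholesky solvers) is preferred over forming the explicit inverse, which is slower and less stable for ill-conditioned X.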
@santhosh20071993 • 1 year ago
Andrew Ng
@allenliu107 • 29 days ago
My only question is: how close is this type of mock interview to the real one?
@tryexponent • 25 days ago
Hey allenliu107! While there might be variations for different companies and seniorities, this is generally accurate for a data science / machine learning technical question interview. For this type of interview, you'll need to know your algorithms and models well because it'll be a deep dive rather than a touch-and-go on the concepts e.g. "How do you select the value of 'k' in the k-means algorithm?"
@maxron8788 • 1 year ago
A real question: what is the use, even for an ML engineer, of knowing in such mathematical detail how the algorithms work, as long as you know the intuition, the assumptions, and the cases where they can or cannot be used? You will never have to build these algorithms from scratch; there will always be a library for it. (The question does not apply, of course, if you are a researcher developing new AI algorithms.)
@pranav7471 • 1 year ago
An SDE need not really understand OS, DBMS, etc., because it doesn't help in 99% of day-to-day use cases. But in the 1% of cases where shit hits the fan, a 10x developer who understands the minute intricacies of the whole tech stack is absolutely essential to save the day. I think the same principle applies here.
@joelwillis2043 • 3 months ago
this is not detail, this is a gross overview
@jasdeepsinghgrover2470 • 27 days ago
Background clarification: I am an MS CS candidate preparing for such interviews.

The main reason to know the math is that ML is VERY, VERY different from development. I have 2 years of development experience at a reputed bank, and I can tell you that the steps in a normal development process change about 10-20% in most cases. If you know how to create one type of report, you know how to create most reports; if you know how to solve certain "patterns" of problems, you know how to solve most problems belonging to them. I am not only talking about Leetcode but also about many applications. Unless I need special optimizations in performance or cost, I can use the same API or library irrespective of the use case. That is the very reason they exist.

Coming to ML and stats: you only know sample data points; you never know if the underlying distributional assumptions were true. For example, if the residuals in the video's example are normal, we get the MSE loss; but if they were assumed uniformly distributed, we would instead have to take the smallest parallelogram containing all the points (forget gradient descent; think convex hulls from DP and CLRS!). If they were exponentially distributed (earthquakes, say) or binomially distributed (maybe student scores), a different MLE algorithm would be needed again, along with different MAP estimators and regularization. The fact that outliers screw up linear regression is essentially because they break or weaken the normal-error assumption (the normal distribution dies off quickly in the tails). Besides this, imagine someone says, "Oh, this data has some time-series features, like the price of a good or the likes on a video in the previous month." Bang! You broke the i.i.d. assumption; now correct for it. Finally, if you have too few data points you can use the closed-form equations; with too many, L-BFGS (faster than gradient descent); and if you have too many features and the Hessians are too large, gradient descent is the only savior.

(Oh, I forgot: a solution might not even exist! Try linear regression with two columns having the same or very close values.) And remember, all of this comes after you have enough features to make the relationship between y and X linear. The main problem is that libraries like sklearn don't even address these issues (statsmodels does for many cases, but not all). Even after that, you need to test for multicollinearity (otherwise you won't know which features add nothing extra), check the p-values of the coefficients, and verify the normality assumption. For many CS students, MSE loss and linear regression are like a hammer, and we have been seeing everything as if it were a nail. Bang! Bang! Bang! That mindset contributed to the Zillow crash and other issues. Did you ever see anything like this while making your second website? At least I never did 😢
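The near-duplicate-columns failure mode is easy to reproduce on toy data (all values below are illustrative): the Gram matrix XᵀX becomes numerically singular, so the normal equations break down.

```python
import numpy as np

rng = np.random.default_rng(2)
a = rng.normal(size=100)
# Two nearly identical columns: a textbook multicollinearity case.
X = np.column_stack([a, a + 1e-12 * rng.normal(size=100)])

# X^T X is numerically singular, so the normal-equations solution
# (X^T X)^{-1} X^T y is hopeless: the condition number is astronomical.
cond = np.linalg.cond(X.T @ X)
print(cond > 1e12)  # True
```

This is why practical solvers use SVD or QR (or regularization like ridge), and why checking for multicollinearity matters before trusting the coefficients.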