Amazon Data Scientist Mock Interview - Fraud Model

25,973 views

DataInterview

1 day ago

🚀 Land your dream data job using datainterview.com/.
====== ✅ Details ======
🤔 Try these machine learning questions asked in Amazon's data science interviews:
"Q1 - What is the variance and bias trade-off?"
"Q2 - What's the difference between boosting and bagging?"
"Q3 - How would you detect seller fraud on Amazon?"
This is a mock interview session covering machine learning questions asked in Amazon's data science interviews. The interviewer was a data scientist at Google and PayPal. The interviewee is a candidate preparing for data science interviews at FAANG companies.
👍 Make sure to hit the like button, and check out datainterview.com/
====== ⏱️ Timestamps ======
0:00 Intro
00:55 Variance & Bias Trade-Off
03:47 Boosting vs Bagging
06:33 Seller Fraud Modeling
26:24 Assessment
====== 📚 Other Useful Contents ======
1. Principles and Frameworks of Product Metrics | YouTube Case Study
Link: / principles-and-framewo...
2. How to Crack the Data Scientist Case Interview
Link: / crack-the-data-scienti...
3. How to Crack the Amazon Data Scientist Interview
Link: / crack-the-amazon-data-...
====== Connect ======
📗 LinkedIn - / danleedata
📘 Medium - / datainterview

Comments: 22
@hsoley 2 years ago
Great video Dan, it was eye-opening! Thank you so much from NYC! Just one note: boosting and bagging are not limited to tree-based models and can be used with any ML method. However, they are much more popular with tree-based methods because trees train quickly and are relatively straightforward to apply.
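To illustrate the point above that bagging is not tied to trees, here is a minimal sketch that bags ordinary least-squares lines instead. The data, the helper names (`fit_line`, `bagged_lines`, `predict`), and the choice of 50 models are all assumptions made for the example, not anything shown in the video.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Closed-form ordinary least squares for y = a + b * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def bagged_lines(xs, ys, n_models=50, seed=0):
    """Fit one line per bootstrap sample (sampling rows with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    while len(models) < n_models:
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # skip a degenerate resample with a single x value
        models.append(fit_line(bx, by))
    return models


def predict(models, x):
    """Bagging aggregates by averaging the base learners' predictions."""
    return mean(a + b * x for a, b in models)


# Hypothetical data: y = 1 + 2x plus Gaussian noise.
data_rng = random.Random(42)
xs = [i / 10 for i in range(40)]
ys = [1.0 + 2.0 * x + data_rng.gauss(0, 0.5) for x in xs]

models = bagged_lines(xs, ys)
print(round(predict(models, 2.0), 2))
```

A stable, low-variance learner such as a linear fit gains little from bagging, which is one reason the technique is more often paired with high-variance learners such as deep trees.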
@ashokjaiswal7384 2 years ago
Hmm interesting
@mikekwabs5131 2 years ago
Thanks very much! Learned a lot 🤗
@gpprudhvi 1 year ago
PCA is a feature extraction technique. Feature selection techniques choose from the existing list of features, whereas extraction techniques create new features that capture the majority of the variance. I feel that whatever the interviewee chose for feature selection was good.
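The distinction drawn above can be sketched in a few lines. This is only an illustration under assumed data: selection is shown as a simple variance filter that returns indices of original columns, and extraction as the first principal component found by power iteration. The function names and the synthetic three-column dataset are hypothetical.

```python
import random


def center(rows):
    """Subtract each column's mean."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def select_by_variance(cov, k):
    """Feature selection: keep the indices of k ORIGINAL columns."""
    order = sorted(range(len(cov)), key=lambda j: cov[j][j], reverse=True)
    return sorted(order[:k])


def first_component(cov, iters=200):
    """Feature extraction: leading eigenvector via power iteration."""
    d = len(cov)
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v


# Hypothetical data: columns 0 and 1 share a signal t, column 2 is noise.
rng = random.Random(7)
rows = []
for _ in range(200):
    t = rng.gauss(0, 1)
    rows.append([t + rng.gauss(0, 0.1), t + rng.gauss(0, 0.1), rng.gauss(0, 0.1)])

cov = covariance(center(rows))
selected = select_by_variance(cov, 2)   # indices of existing columns
pc1 = first_component(cov)              # weights defining a NEW feature
pc1_variance = sum(pc1[i] * cov[i][j] * pc1[j]
                   for i in range(3) for j in range(3))
print(selected, [round(w, 2) for w in pc1])
```

Selection returns column indices, so the kept features stay interpretable; the extracted component is a weighted mix of columns and captures at least as much variance as any single original column.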
@corymaklin7864 2 years ago
Good stuff!
@shilashm5691 2 years ago
In classification, we usually have the precision-recall trade-off, right?
@danielxing1034 2 years ago
Great mock interview and I believe it is pretty representative! Thanks for providing this!
@ozziejin 2 years ago
Excellent mock
@user-bm6pb9sr8t 2 years ago
Interesting, thank you
@rr00676 1 year ago
Concerning the '# of positive reviews' feature: I have to assume that there exists a subset of fraudulent sellers using bots or review farms to boost the number or ratio of positive reviews. If positive reviews are locally important for non-fraudulent true positives, I imagine that this could potentially lead to a recall problem in our model. Thoughts?
@aaronrasquinha 10 months ago
Is this a typical interview for an L4 or L5 role?
@shilashm5691 2 years ago
In bagging, we don't call the models weak learners. We use the term weak learners only in boosting, and specifically in AdaBoost, because it uses only stumps for prediction rather than full trees. That is why we call AdaBoost models weak learners.
@zakiyahfathimam.7786 2 years ago
HI SIR, I AM ZAKIYAH FATHIMA M. I AM 12 YEARS OLD. I USED TO WATCH YOUR VIDEOS AND SUNDAS MAM'S CHANNEL. MY DREAM IS TO BECOME A DATA SCIENTIST. I KNOW THE PROGRAMMING LANGUAGE PYTHON.
@tuanseattle 1 year ago
Isn't the term "variance" in the first question better phrased as "precision"? I think that's the term we use in econometrics class. I would hate to be unable to answer a question well because of terminology.
@Drewbie_T 1 year ago
Higher variance means more flexibility? In general, can't you look at variance in the same way you look at overfitting? I.e., a model with very high variance will capture outliers and tend to overfit data that doesn't accurately represent the underlying phenomena that produced the data. In this case, wouldn't it make sense to say it does NOT correspond to more flexibility, since the higher variance means it is better suited for ONLY the training data? Just curious where my logic is straying from the interviewer's. Thank you for posting this, it has been very informative!
@tuanseattle 1 year ago
I thought more flexibility (like a neural net model being more flexible than linear regression) means more precision (aka lower variance) but risks overfitting.
@Drewbie_T 1 year ago
@tuanseattle So the part we are disagreeing on is the definition of variance. In my head, I was using variance as in the separation from the mean (in which case, increasing precision captures strays from the mean, thus increasing variance), whereas you are using variance as the opposite of precision, i.e., separation from the true data set, it seems. Nonetheless, what you say makes sense as well when looking at it that way.
@bhujithmadav1481 3 months ago
@Drewbie_T By flexible, Dan means the complexity of the model. The more complex the model is, i.e., the more closely the decision boundaries have been fit so that the model performs exceedingly well on the training data, the higher the chance that the model will not perform well on testing data. This is the case of high variance and low bias.
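One way to make the terminology in this thread concrete: in the bias-variance trade-off, "variance" refers to how much a model's prediction changes when it is retrained on a different sample from the same process. The sketch below estimates squared bias and variance at a single point for a rigid predictor (the training mean) and a flexible one (1-nearest-neighbour). The sine target, noise level, sample size, and function names are all assumptions chosen for illustration.

```python
import math
import random
from statistics import mean, pvariance


def true_f(x):
    """Assumed ground-truth function for the simulation."""
    return math.sin(2 * math.pi * x)


def sample_training_set(rng, n=30, noise=0.3):
    """Draw a fresh noisy training set from the same process."""
    xs = [rng.random() for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0, noise) for x in xs]
    return xs, ys


def rigid_predict(xs, ys, x0):
    """Low-flexibility model: ignores x0, always predicts the mean label."""
    return mean(ys)


def flexible_predict(xs, ys, x0):
    """High-flexibility model: 1-nearest-neighbour, follows every point."""
    nearest = min(range(len(xs)), key=lambda i: abs(xs[i] - x0))
    return ys[nearest]


def bias2_and_variance(predict, x0=0.25, trials=500, seed=0):
    """Retrain on many datasets; measure the spread of predictions at x0."""
    rng = random.Random(seed)
    preds = []
    for _ in range(trials):
        xs, ys = sample_training_set(rng)
        preds.append(predict(xs, ys, x0))
    return (mean(preds) - true_f(x0)) ** 2, pvariance(preds)


rigid_b2, rigid_var = bias2_and_variance(rigid_predict)
flex_b2, flex_var = bias2_and_variance(flexible_predict)
print(f"rigid:    bias^2={rigid_b2:.3f} variance={rigid_var:.3f}")
print(f"flexible: bias^2={flex_b2:.3f} variance={flex_var:.3f}")
```

Under these assumptions the flexible model shows the lower bias and the higher variance, which matches the "high variance, low bias" reading given in the reply above.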
@xEl_ence 1 year ago
Is it just me, or would you rather do clustering to find labels, then classify?
@MrMandarpriya 9 months ago
Where do hyperparameters come into the decision boundary? What kind of intangible things are they making up on their own? God, please save us.
@yoyo-ue5pf 1 month ago
I feel like the dude got lost in the sauce with the seller-based, listing-based type shit.