Top Data Science Interview Question And Answer Mistakes 2021 [Asked By Amazon]

  Рет қаралды 6,345

StrataScratch

StrataScratch

Күн бұрын

Пікірлер: 29
@dominic2446
@dominic2446 3 жыл бұрын
7:14 people make this mistake because of the phrasing of the question. "customer" should be changed to "customer(s)" and "first name" should be changed to "first name(s)". this way, the question suggests the possibility that the solution can contain more than one person. 8:09 what is "data at scale issues"?
@stratascratch
@stratascratch 3 жыл бұрын
I agree with you. It's important to talk to the interviewer to identify potential issues like this. You won't always get a perfectly phrased question in real life, especially on the job, so the purpose of this question is really to think of all possible scenarios and build a solution that solves for it. Doing it without someone prompting you to look for these scenarios is even better. Data at scale issues are data issues that pop-up only when you're dealing with a massive amount of data (ie, the amount of data that Facebook or Google might encounter). Because of this you might have to handle the data differently.
@culpgrant21
@culpgrant21 4 жыл бұрын
What do you think about using the RANK function to handle the case of ties?
@tripy75
@tripy75 4 жыл бұрын
Or even simpler: a max() on the computed total without sorting and limit. This could reduce the I/O a lot depending the table size.
@stratascratch
@stratascratch 4 жыл бұрын
max() as mentioned would be great to optimize the code. And then match the max value with others in the table.
@BirSozben
@BirSozben 3 жыл бұрын
thank you, Nate! Luv the t-shirt!
@stratascratch
@stratascratch 3 жыл бұрын
haha I'm a big fan of Boiler Room!
@anuragsingh4766
@anuragsingh4766 4 жыл бұрын
Great video.
@stratascratch
@stratascratch 4 жыл бұрын
Thanks for watching! Let me know if you have any feedback or topic suggestions.
@NeelanshuniSingh
@NeelanshuniSingh 7 ай бұрын
Hi , just one thing if we could aggregate the total order value at customer level between the dates then we can accurately get the results. As in solution, it might be case that one customer has various order value and after aggregating we get total order value for each customer and then decide which is the top ones.
@MyRevoltec
@MyRevoltec 4 жыл бұрын
Awesome video as always!
@stratascratch
@stratascratch 4 жыл бұрын
Thanks for watching! Let me know if you have any feedback or topic suggestions.
@IanWitherow
@IanWitherow 4 жыл бұрын
Trying out the StrataScratch question - I'm returning two records, Mia and Farida, both with 400 for the total and their respective order dates, but it's not accepting it. The Expected Output button is showing 430 for the order totals, but I'm not seeing that order total in the data. Am I missing something? Edit: Ah, multiple orders on the same date, got it. Was thrown off since you had 400 in your video there.
@stratascratch
@stratascratch 4 жыл бұрын
Hi, yes, there's actually a change in the solution because I realized that the 400 corresponded to 400 for 1 item that was purchased on a specific day, not all purchases on a specific day. When you count all purchases made by 1 customer on a specific day, the max is 430. So the code below will get you there. The concept is the same though. In the video, I really wanted to highlight the fact that LIMIT 1s are common mistakes people make. To properly find the customers with the highest number, you have to consider that many people meet that max, and write your solution accordingly. Thanks for trying the code out on the platform. Sorry for the confusion. SELECT first_name, order_date, sum(total_order_cost) FROM (SELECT first_name, order_cost * order_quantity AS total_order_cost, order_date FROM orders o LEFT JOIN customers c ON o.cust_id = c.id WHERE order_date BETWEEN '2019-02-1' AND '2019-05-1' ) a GROUP BY first_name, order_date HAVING sum(total_order_cost) = (SELECT max(total_order_cost) FROM (SELECT sum(order_cost * order_quantity) AS total_order_cost FROM orders WHERE order_date BETWEEN '2019-02-1' AND '2019-05-1' GROUP BY cust_id, order_date) b )
@dominic2446
@dominic2446 3 жыл бұрын
@@stratascratch the question is vague, because it could be interpreted as the total cost for the week or the month too.
@mdabulkalamazad6775
@mdabulkalamazad6775 4 жыл бұрын
great video as always, Thanks
@stratascratch
@stratascratch 4 жыл бұрын
Thanks for watching as always
@hta-bi249
@hta-bi249 3 жыл бұрын
Hello, Thank you for the great video. I have a question! Since we typed in the subquery limit 1, why the overall output still returning 2 rows? cheerz!
@stratascratch
@stratascratch 3 жыл бұрын
The limit one was just to get the max total order cost. Once I got that, I used it to get all the customers that matched the max. And that's why you see 2 rows in the output. Hope that helps and thanks for watching!
@niveditakumari701
@niveditakumari701 3 жыл бұрын
Hey, we can also do this using DENSE_RANK() ?
@vikrant.9039
@vikrant.9039 4 жыл бұрын
Your github is not updated since past one year Nate?
@stratascratch
@stratascratch 4 жыл бұрын
My public one is not other than the one that I linked to in the description. I have one at work and private ones for StrataScratch. But I'll be posting more projects throughout the year and putting them on youtube.
@huseyindansman7415
@huseyindansman7415 3 жыл бұрын
select top 1 with ties (ms sql server) ,I was trying to do it exactly the way I do in ms sql server then found out it doesnt work out in postgreSQL
@abhisheksinghchouhan8506
@abhisheksinghchouhan8506 3 жыл бұрын
Your explanation very and easy to understand..but your site name is very difficult to remember
@stratascratch
@stratascratch 3 жыл бұрын
haha I know right? If I had to do it all over again, I would have picked an easier name =)
@classkori5507
@classkori5507 4 жыл бұрын
Excellent
@stratascratch
@stratascratch 4 жыл бұрын
Thanks for watching. Let me know if you have any feedback or ideas for videos!
@rakibraihanrimon8784
@rakibraihanrimon8784 4 жыл бұрын
Thank You
@stratascratch
@stratascratch 4 жыл бұрын
Thanks for watching!
SQL Case Statements For Data Science Interviews in 2021
14:44
StrataScratch
Рет қаралды 45 М.
Caleb Pressley Shows TSA How It’s Done
0:28
Barstool Sports
Рет қаралды 60 МЛН
ССЫЛКА НА ИГРУ В КОММЕНТАХ #shorts
0:36
Паша Осадчий
Рет қаралды 8 МЛН
Andro, ELMAN, TONI, MONA - Зари (Official Music Video)
2:50
RAAVA MUSIC
Рет қаралды 2 МЛН
Amazon Coding Sample (SIP)
16:16
Inside Amazon
Рет қаралды 379 М.
The Amazon Data Science Interview
13:01
Jay Feng
Рет қаралды 34 М.
Why I Fail Candidates During Google Interviews.
5:40
Alex Nguyen
Рет қаралды 161 М.
SQL Aggregate Window Functions | COUNT, AVG, SUM, MAX, MIN | #SQL Course #4
1:07:34
Tricky Data Science Interview Question [By Facebook]
12:26
StrataScratch
Рет қаралды 5 М.
Caleb Pressley Shows TSA How It’s Done
0:28
Barstool Sports
Рет қаралды 60 МЛН