7:14 people make this mistake because of the phrasing of the question. "customer" should be changed to "customer(s)" and "first name" should be changed to "first name(s)". this way, the question suggests the possibility that the solution can contain more than one person. 8:09 what is "data at scale issues"?
@stratascratch3 жыл бұрын
I agree with you. It's important to talk to the interviewer to identify potential issues like this. You won't always get a perfectly phrased question in real life, especially on the job, so the purpose of this question is really to think of all possible scenarios and build a solution that solves for it. Doing it without someone prompting you to look for these scenarios is even better. Data at scale issues are data issues that pop-up only when you're dealing with a massive amount of data (ie, the amount of data that Facebook or Google might encounter). Because of this you might have to handle the data differently.
@culpgrant214 жыл бұрын
What do you think about using the RANK function to handle the case of ties?
@tripy754 жыл бұрын
Or even simpler: a max() on the computed total without sorting and limit. This could reduce the I/O a lot depending the table size.
@stratascratch4 жыл бұрын
max() as mentioned would be great to optimize the code. And then match the max value with others in the table.
@BirSozben3 жыл бұрын
thank you, Nate! Luv the t-shirt!
@stratascratch3 жыл бұрын
haha I'm a big fan of Boiler Room!
@anuragsingh47664 жыл бұрын
Great video.
@stratascratch4 жыл бұрын
Thanks for watching! Let me know if you have any feedback or topic suggestions.
@NeelanshuniSingh7 ай бұрын
Hi , just one thing if we could aggregate the total order value at customer level between the dates then we can accurately get the results. As in solution, it might be case that one customer has various order value and after aggregating we get total order value for each customer and then decide which is the top ones.
@MyRevoltec4 жыл бұрын
Awesome video as always!
@stratascratch4 жыл бұрын
Thanks for watching! Let me know if you have any feedback or topic suggestions.
@IanWitherow4 жыл бұрын
Trying out the StrataScratch question - I'm returning two records, Mia and Farida, both with 400 for the total and their respective order dates, but it's not accepting it. The Expected Output button is showing 430 for the order totals, but I'm not seeing that order total in the data. Am I missing something? Edit: Ah, multiple orders on the same date, got it. Was thrown off since you had 400 in your video there.
@stratascratch4 жыл бұрын
Hi, yes, there's actually a change in the solution because I realized that the 400 corresponded to 400 for 1 item that was purchased on a specific day, not all purchases on a specific day. When you count all purchases made by 1 customer on a specific day, the max is 430. So the code below will get you there. The concept is the same though. In the video, I really wanted to highlight the fact that LIMIT 1s are common mistakes people make. To properly find the customers with the highest number, you have to consider that many people meet that max, and write your solution accordingly. Thanks for trying the code out on the platform. Sorry for the confusion. SELECT first_name, order_date, sum(total_order_cost) FROM (SELECT first_name, order_cost * order_quantity AS total_order_cost, order_date FROM orders o LEFT JOIN customers c ON o.cust_id = c.id WHERE order_date BETWEEN '2019-02-1' AND '2019-05-1' ) a GROUP BY first_name, order_date HAVING sum(total_order_cost) = (SELECT max(total_order_cost) FROM (SELECT sum(order_cost * order_quantity) AS total_order_cost FROM orders WHERE order_date BETWEEN '2019-02-1' AND '2019-05-1' GROUP BY cust_id, order_date) b )
@dominic24463 жыл бұрын
@@stratascratch the question is vague, because it could be interpreted as the total cost for the week or the month too.
@mdabulkalamazad67754 жыл бұрын
great video as always, Thanks
@stratascratch4 жыл бұрын
Thanks for watching as always
@hta-bi2493 жыл бұрын
Hello, Thank you for the great video. I have a question! Since we typed in the subquery limit 1, why the overall output still returning 2 rows? cheerz!
@stratascratch3 жыл бұрын
The limit one was just to get the max total order cost. Once I got that, I used it to get all the customers that matched the max. And that's why you see 2 rows in the output. Hope that helps and thanks for watching!
@niveditakumari7013 жыл бұрын
Hey, we can also do this using DENSE_RANK() ?
@vikrant.90394 жыл бұрын
Your github is not updated since past one year Nate?
@stratascratch4 жыл бұрын
My public one is not other than the one that I linked to in the description. I have one at work and private ones for StrataScratch. But I'll be posting more projects throughout the year and putting them on youtube.
@huseyindansman74153 жыл бұрын
select top 1 with ties (ms sql server) ,I was trying to do it exactly the way I do in ms sql server then found out it doesnt work out in postgreSQL
@abhisheksinghchouhan85063 жыл бұрын
Your explanation very and easy to understand..but your site name is very difficult to remember
@stratascratch3 жыл бұрын
haha I know right? If I had to do it all over again, I would have picked an easier name =)
@classkori55074 жыл бұрын
Excellent
@stratascratch4 жыл бұрын
Thanks for watching. Let me know if you have any feedback or ideas for videos!