Creating a short course on this was such a great idea! Great, valuable content! 👏🏻👏🏻👏🏻
@karolinasowinska 4 years ago
Thank you Cla Udia! :)
@karolinasowinska 3 years ago
My free course is just an introduction to the data engineering world. If you'd like to learn data engineering and be ready to apply for jobs, I recommend following these 4 simple steps:
1. Learn Python. I recommend the Python for Everybody specialization on Coursera, which is one of the most popular courses there: imp.i384100.net/x9gVO3
2. Learn SQL. SQL is still the lingua franca of data. I recommend the Learn SQL Basics for Data Science course, because it contains some chapters that are very relevant to data engineering in particular, e.g. distributed computing with Spark: imp.i384100.net/QOMZ09
3. Learn Bash scripting/Linux. I wouldn't take a full course on it, but at least read a good article. If you do prefer to take a course/guided project, I think this one is short and good: www.coursera.org/projects/command-line-linux
4. Learn how to develop on the cloud, e.g. on AWS. There are a few good courses out there, but I think the Coursera one is the most comprehensive: imp.i384100.net/P0MJBM
Disclosure: The above contains affiliate links, meaning when you click the links and make a purchase, I receive a commission!
@soumyamoharana3673 2 years ago
You are a gem @Karolina. Prior to this I was unable to visualize the ETL process but this short video explains everything to give a kickstart. 👌
@slothpete7773issleeping 3 years ago
I was wondering how the ETL process works, and a Medium article didn't give me much of a clear picture, but this mini-course gave me a very CLEAR picture of how it is. Thank you a lotttttt.
@karolinasowinska 3 years ago
I'm super happy to hear that, thank you! :)
@michaelshoemaker5635 3 years ago
Thanks for the video. One suggestion I have would be to change except: print("Data already exists in the database") to except Exception as e: print("There was an issue loading the data. " + str(e)). The way the code currently is, if there is no dataframe at all or any other error occurs, it will still print "Data already exists in the database".
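To make the suggestion above concrete, here is a minimal sketch of the same idea. It uses the standard-library sqlite3 module directly rather than the pandas call from the course, and the table layout and the `load_track` helper are assumptions made up for this illustration. The point is only that a duplicate row and "any other error" can be reported separately:

```python
import sqlite3


def load_track(conn, track):
    """Insert one played track; report duplicates separately from other errors.

    `track` is a (song_name, artist_name, played_at) tuple. The table layout
    below is an assumption made for this sketch.
    """
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS my_played_tracks (
            song_name TEXT,
            artist_name TEXT,
            played_at TEXT PRIMARY KEY
        )
        """
    )
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO my_played_tracks VALUES (?, ?, ?)", track
            )
        return "loaded"
    except sqlite3.IntegrityError:
        # Primary-key clash: this row really is already in the database.
        return "Data already exists in the database"
    except Exception as e:
        # Anything else (bad input, locked file, ...) is reported as-is.
        return "There was an issue loading the data. " + str(e)


conn = sqlite3.connect(":memory:")
row = ("Song", "Artist", "2021-01-01T10:00:00Z")
first = load_track(conn, row)
second = load_track(conn, row)  # same primary key again
third = load_track(conn, ("only", "two"))  # wrong number of values
print(first, second, third, sep="\n")
```

With pandas the exception types raised on a failed load may differ, so the generic `except Exception as e` from the comment is the safer starting point there.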
@melbbb5673 3 years ago
I appreciate and enjoy this small, short course! All three parts are easy to understand and easy to follow. I think it hits on just enough that I need to know while also getting me interested in the possibilities. Thank you!
@karolinasowinska 3 years ago
I'm super glad that this is your experience! :) Thanks!
@adeyinkaadekanmi6586 2 years ago
Your explanation is so simple and very precise
@benniahsalami9247 3 years ago
This is the explanation i have been looking for ! very simple and straight to the point 😃
@sprintwithcarlos 3 years ago
At first the code didn't work for me, but after changing conn = sqlite3.connect('my_played_tracks.sqlite') to conn = sqlite3.connect('my_played_tracks.db') it worked. Thanks!
@ander3024 2 years ago
Thank you! I was having the same problem :)
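A note on the connect call discussed above: SQLite itself does not care about the file extension, and a relative filename is resolved against the current working directory of the Python process, which is not necessarily the folder the script lives in. The sketch below (the `database_location` helper is hypothetical, and a temporary directory stands in for a project folder) shows one way to check where the file actually ends up:

```python
import os
import sqlite3
import tempfile


def database_location(filename):
    """Return the absolute path that a relative SQLite filename resolves to."""
    return os.path.abspath(filename)


# Demonstrate in a throwaway directory so no stray files are left behind.
with tempfile.TemporaryDirectory() as workdir:
    previous_dir = os.getcwd()
    os.chdir(workdir)
    try:
        conn = sqlite3.connect("my_played_tracks.sqlite")
        conn.execute("CREATE TABLE IF NOT EXISTS t (x INTEGER)")
        conn.commit()
        conn.close()
        db_path = database_location("my_played_tracks.sqlite")
        created = os.path.exists(db_path)
    finally:
        os.chdir(previous_dir)

print(db_path, created)
```

In a real project, printing `os.path.abspath("my_played_tracks.sqlite")` right after the connect call should give the path to open in a database client.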
@sabrinae590 3 years ago
I have a data engineering internship interview later today and just watched this tutorial series and I’m so glad I did! It was soooo helpful!!! Thank you!
@karolinasowinska 3 years ago
I'm so glad! :) I hope it went well!
@sabrinae590 3 years ago
@karolinasowinska it did, thank you!
@imdeepu7855 5 months ago
Narrative style is appreciated
@HansOnProduction1984 2 years ago
This series has been extremely useful - currently working on a certificate for data engineering and we are doing a project on ETL and Databases. Everything you have demonstrated has been fantastic. Thanks for this series.
@kalvincelcius7624 4 years ago
Can I get a heart? I like data engineering and your channel. You are amazingly smart and beautiful.
@karolinasowinska 4 years ago
How flattering! ;) And sure, I've got no shortage of hearts ;)
@domenicalozada3237 4 years ago
I loved this project! Can you please make the other video to learn the automation?
@karolinasowinska 4 years ago
I'm glad that you've gone through the course and enjoyed it! :) Sure, I've got that video in my plans!
@higormonteiro7636 3 years ago
Very nice, Karolina! In less than 30 minutes you showed a very succinct and clear introduction to the field. Many thanks from Brazil! And if you are willing to do it, I'd appreciate the databases video; there is so much software out there that I'm doing careful research into which one I should choose. Keep going :)
@karolinasowinska 3 years ago
I'm super glad that my videos brought some value! Thanks Higor! :)
@azemoning 4 years ago
Great video as always, looking forward to the relational and non-relational database video
@karolinasowinska 4 years ago
Thanks, that one is definitely on my list! :)
@xBaphometHx 3 years ago
Extremely good and simple explanation of what ETL is. I understood everything. Good job!
@imranjustimran 3 years ago
Karolina these videos were fantastic - so clear and useful! really really appreciated thanks so much
@karolinasowinska 3 years ago
I'm so glad! Thanks! :)
@vivekkeshri536 4 years ago
Perfectly explained in sequence like a pro, thanks for the small and quick session. Waiting for future learning videos
@karolinasowinska 4 years ago
My pleasure, more videos are coming soon! ;)
@mr_braimah8966 3 years ago
You are officially my mentor thank you soo much for such wonderful videos.
@karolinasowinska 3 years ago
My pleasure!! :)
@salimz.8587 2 years ago
You bring so much value
@jamesross5120 3 years ago
I NEED MORE OF THESE !!!! i learnt so much and it was EASY!
@davidaosei-appiah5001 3 years ago
Hi Karolina. Where can we find where "my_played_tracks" is saved to? Trying to find it so I can load it into DBeaver, but I'm having trouble with it. Thanks!
@alisasnag7097 2 years ago
Many-many thanks to you for all your videos! You are really great!
@sagarthasal347 3 years ago
This short course was really helpful especially for beginners 🙌 thank you.
@matttamer5096 3 years ago
This was excellent, would love more.
@karolinasowinska 3 years ago
Thanks! Stay tuned for future videos, and feel free to explore the channel;)
@lonestarstatechris 2 years ago
Thanks! Very helpful. Yesterday I was learning how to do sql window functions with pyspark. Would love to see a video on other advanced sql with pyspark. Just throwing out an idea. I read a helpful article, but didn’t see any helpful yt videos.
@ngee4925 3 years ago
Great series!
@alejandragutierrez1709 4 years ago
I am with you so far!! When will the next video come?? Thanks so much for all the great tutorials.
@karolinasowinska 4 years ago
I'm super glad that you've gone through the course and enjoyed it! The next video might be coming as soon as next week - I've actually just published a poll to see what people would like to see most, feel free to give your vote! :)
@iamchesco 4 years ago
As always, very informative!! Keep the videos coming.
@karolinasowinska 4 years ago
I'm glad you found the video useful - I'll carry on uploading! :)
@BillyDTourist 4 years ago
I liked the video. It's either too early or the videos are getting way better. I liked the twist of how it did not load. I liked the extra zoom in certain moments and the "product placement" part for DBeaver. Mozart sounds scary! Keep going Karo. PS: any idea why I didn't get a notification from YouTube?
@karolinasowinska 4 years ago
Thanks so much! I'm sure Mozart was a scary guy! ;) I have no idea why YouTube didn't send you a notification! Is your bell still switched on? If it is, then I'll investigate what's happening!
@BillyDTourist 4 years ago
@karolinasowinska it was on the default, personalized bell, perhaps that was it. Changed it to all, will make sure to notice what happens next week 🙂.
@michellebastossilva7032 2 years ago
I really loved it!! Thank you so much for sharing it!!
@MUSKAN0896 2 years ago
Just hopped onto your channel and absolutely loving the content!! Keep on doing good stuff, you're amazinggg💜
@evertonlins4439 3 years ago
"I can’t explain, but I wanna try, there’s this ETL feeding a Pipeline and it goes running by in the morning and in the night time... I’m not sure if I should show you what I’ve found, is that data good? Or is it just outliers around?" - AM (Data version) Very nice content, keep going!
@karolinasowinska 3 years ago
Haha perfect, this will be my new lullaby :D
@crimsonseattle 4 years ago
Great video! I'm finding the loading part very useful. Thanks for sharing.
@karolinasowinska 4 years ago
My pleasure Nelson! ;)
@crimsonseattle 4 years ago
@karolinasowinska Would it be possible for you to do a video about appending new data to the current table? E.g. an upsert function in Pandas. I am currently struggling with an issue where new data conflicts with old data. My goal is to drop the old data and overwrite it with new data. Let me know what you think and any insights.
@karolinasowinska 4 years ago
@crimsonseattle if you want to overwrite the existing data, just swap if_exists="append" with if_exists="replace" :) More info here: pandas.pydata.org/pandas-docs/version/0.23.4/generated/pandas.DataFrame.to_sql.html
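One thing worth knowing about the answer above: if_exists="replace" drops the whole table before writing, so it overwrites everything rather than only the conflicting rows. If the goal is a row-level upsert, one option is SQLite's own INSERT OR REPLACE keyed on a primary key. The sketch below uses the standard-library sqlite3 module instead of pandas; the table layout and the `upsert_tracks` helper are assumptions made for this illustration, not code from the course:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE my_played_tracks (
        played_at TEXT PRIMARY KEY,
        song_name TEXT,
        artist_name TEXT
    )
    """
)


def upsert_tracks(conn, rows):
    """Insert rows; a row whose played_at already exists overwrites the old one."""
    with conn:
        conn.executemany(
            "INSERT OR REPLACE INTO my_played_tracks VALUES (?, ?, ?)", rows
        )


upsert_tracks(conn, [("2021-01-01T10:00:00Z", "Old title", "Artist A")])
upsert_tracks(
    conn,
    [
        ("2021-01-01T10:00:00Z", "New title", "Artist A"),  # conflicts, overwrites
        ("2021-01-01T10:05:00Z", "Another song", "Artist B"),  # new row
    ],
)
rows = conn.execute(
    "SELECT played_at, song_name FROM my_played_tracks ORDER BY played_at"
).fetchall()
print(rows)
```

This only works if the table has a primary key or unique constraint to detect the conflict on; without one, every row is simply appended.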
@yuvarajmalawat6477 4 years ago
Hi Karolina, great video. Will wait for the automation part of it. Meanwhile, if you are thinking of doing another course, can you cover all the tools and technologies that companies like Uber, Walmart etc. are using, to give us more insight into how it all happens in the real world?
@karolinasowinska 4 years ago
Awesome, thank you for the suggestion! I'll definitely consider making more courses in the future! :)
@Joe-w 3 years ago
Thank you so much for this series it was exactly what I was looking for! 😄😄
@karolinasowinska 3 years ago
Glad it was helpful! :)
@aadhithkrishna5682 3 years ago
Hola! Thank you Karolina. Learned a lot as a newbie. Loved it
@rawcodes 2 years ago
Thank you so much for preparing this course, it's really helpful. Please I've been looking for the last video on the spotify ETL workflow automation. Can you please make the video?
@airamfuentes5216 4 years ago
Thanks!! It was really clear to understand and very well written. Still waiting for the job scheduling part :P
@karolinasowinska 4 years ago
Glad to hear that! The job scheduling bit will come in the not-so-far future! :)
@vnpikachu4627 4 years ago
Thank you for making this valuable knowledge free 🤗
@karolinasowinska 4 years ago
My pleasure, thanks for stepping by! :)
@gabrielveloz3296 3 years ago
awesome three part series ! really helpful.
@karolinasowinska 3 years ago
Thanks! I'm glad you enjoyed the series!:)
@Vikermajit 1 year ago
Awesome video series. Tq Karolina.
@theailady4901 3 years ago
Excellent content
@nathankomer8699 3 years ago
Thanks, this was a great and very entertaining introduction to the topic.
@karolinasowinska 3 years ago
Glad you enjoyed it! :)
@mezzofemme 4 years ago
Excellent course! It's been very useful so far. Will you push these changes to your github repo? Thanks!!!
@karolinasowinska 4 years ago
Thanks Juan! It's pushed now! ;)
@gonzaloplazag 2 years ago
Really helpful!! Thanks a lot!
@pereirarupal 3 years ago
Found your channel by accident. Amazing videos. Thank you so much. I am eagerly waiting for next video on automation.
@karolinasowinska 3 years ago
Hey, the video on automation is already out, you can find it on the Data engineering playlist:)
@pereirarupal 3 years ago
@karolinasowinska Found it. Thank you very much. Will let you know how it goes.
@kevinmenesesgonzalez6112 2 years ago
Very good content, easy to understand and fun to watch too :)
@abhishekchakravorty6250 4 years ago
Great content as ever, it helped me. It's best to get taught by a cool teacher like you ;)
@karolinasowinska 4 years ago
Haha thanks, I'm glad you enjoy the course! ;)
@abdulrahmanmohamed2824 4 years ago
Awesome tutorial, I watched all 3 videos and wanted to ask if you can make another one on automation? Thanks a lot Karolina
@karolinasowinska 4 years ago
Thanks so much, I see a lot of people asking about it, so it's definitely on my list :)
@slh543 3 years ago
Hi Karolina, thanks a lot for the content! Could you please help with my issue? I am changing days=30 but am still only getting tracks from yesterday. I restarted my kernel and removed the timestamp check. Where could I be going wrong?
@karolinasowinska 3 years ago
Hey, it's a bit hard to tell without seeing the code. My first thought - did you listen to any songs in the last 30 days except for yesterday? If you didn't, then that would be the reason.
@rubenalejandro7763 4 years ago
You should do the video on relational vs non-relational databases! I would definitely watch it since I'm just starting to learn Data Science
@karolinasowinska 4 years ago
Cool, thanks for letting me know - that's helpful! The video will be on my list then :)
@rubenalejandro7763 4 years ago
@karolinasowinska Great! :)
@smiles4675 4 years ago
Can you please talk about certifications for data engineers? The best one to do, its value, how it would help/enhance people experienced in ETL. It would be helpful, thanks!
@karolinasowinska 4 years ago
I myself didn't study for any data engineering certifications, but I might ask my friends around and gather some opinions! I'll make a video if I get some interesting insights! :)
@codingevelynmiami4230 4 years ago
Thank you for the series, I appreciate it.
@karolinasowinska 4 years ago
My pleasure!
@victornzemeke1763 2 years ago
Thank you for this video!
@TaylorNelson1 4 years ago
Nice stuff, can you get into best practice for scheduling though? For me, that's what I know less about, since I am a DS and not a DE. I am really really interested in best practices, perhaps using airflow or spark.
@TaylorNelson1 4 years ago
btw, I know about CRON, really interested in better practices.
@karolinasowinska 4 years ago
That's the plan! I'm going to release the last part of this series which is going to be about job scheduling in Airflow :)
@TaylorNelson1 4 years ago
@karolinasowinska Thanks a lot!! I think scheduling and monitoring and optimizing the code are what I need help with.
@pereirarupal 3 years ago
Hi Karolina, I am getting this error when I run the code. for song in data["items"]: KeyError: 'items'
@karolinasowinska 3 years ago
Regenerate your Spotify token on the website :)
@pereirarupal 3 years ago
@karolinasowinska Thanks, that worked. Cheers
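The KeyError in this thread appears when the response body has no "items" key, which is what happens when the API returns an error object instead of data (for example after the token has expired). A small guard before the loop can make that failure easier to read. The `extract_items` helper and the exact error layout it checks are assumptions for this sketch, not part of the course script:

```python
def extract_items(data):
    """Return the list of played tracks, or raise a readable error.

    `data` is the decoded JSON body of the API response. The error layout
    checked here ({"error": {"status": ..., "message": ...}}) is an assumption.
    """
    if "items" not in data:
        error = data.get("error", {})
        raise RuntimeError(
            "No 'items' in the response (status {}): {}. "
            "The token may have expired; try generating a new one.".format(
                error.get("status", "unknown"),
                error.get("message", "no message"),
            )
        )
    return data["items"]


ok = extract_items({"items": [{"track": {"name": "Song"}}]})
try:
    extract_items({"error": {"status": 401, "message": "The access token expired"}})
    message = ""
except RuntimeError as exc:
    message = str(exc)
print(len(ok), message)
```

The loop from the comment would then read `for song in extract_items(data):`.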
@aianIII 3 years ago
Hello. I'm new to data engineering and I'm thankful for this video. I just want to know if ETL is different from a data pipeline. I searched for a lot of answers on the net and it's still vague for me. Some say no, some say yes. Thank you.
@karolinasowinska 3 years ago
I'd say they pretty much mean the same. :) You can think of 'ETL' as a name for the whole process, whereas a 'data pipeline' is a set of scripts that you've written (for the ETL process). I guess sometimes data pipelines might take different forms and shapes, and perhaps skip one of the steps from the ETL. Let's say they don't save data to a database, but do some in-memory processing (so no load stage). But that's probably extremely rare. I hope that helps :)
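The distinction in the answer above can be sketched in a few lines. Everything here is made up for illustration (the sample records, the `plays` table, and the three function names); it only shows a pipeline that runs all three ETL stages next to one that stops before the load stage:

```python
import sqlite3


def extract():
    """Stand-in for an API call; returns raw records (made-up sample data)."""
    return [
        {"song": " Song A ", "played_at": "2021-03-01T10:00:00Z"},
        {"song": "Song B", "played_at": None},
    ]


def transform(records):
    """Drop records without a timestamp and tidy up the song names."""
    return [
        (r["played_at"], r["song"].strip())
        for r in records
        if r["played_at"] is not None
    ]


def load(rows, conn):
    """Write the cleaned rows to a database table."""
    conn.execute("CREATE TABLE IF NOT EXISTS plays (played_at TEXT, song TEXT)")
    with conn:
        conn.executemany("INSERT INTO plays VALUES (?, ?)", rows)


# Full ETL: extract -> transform -> load.
conn = sqlite3.connect(":memory:")
load(transform(extract()), conn)
stored = conn.execute("SELECT * FROM plays").fetchall()

# A pipeline without a load stage: process in memory and stop there.
in_memory = transform(extract())
print(stored, in_memory)
```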
@sudippandit7051 2 years ago
Hi Karolina, thank you for your nice ETL series! I'm wondering how I could generate a dynamic row key for HBase (a NoSQL database) so that I can load the data from Hive to HBase through their connection?
@codingevelynmiami4230 4 years ago
Please make a separate video about relational and non-relational databases
@karolinasowinska 4 years ago
Will do at one point! Thanks for letting me know that you'd like to see it!
@chadstrachan9696 3 years ago
Please make a video on relational vs non-relational databases!!!!
@karolinasowinska 3 years ago
Putting that on my backlog! :)
@nachynoorzhak957 4 years ago
Hi Karolina! I have just started watching your videos, they are amazing and so interesting! I am from Russia and I study international economic relations. Recently I have been looking for a job, but almost everywhere it was required to know python and etc. so that’s why I started to code and I can say that it is pretty fun) I have been thinking of working as a machine learning engineer in the future as I like maths and python. In one of your videos you said that having a mentor is very important. It sounds kind of insolent and strange but could you please be my mentor 😂 as I have questions and doubts. Thank you in advance!
@karolinasowinska 4 years ago
Hey Nachyn, I'm really glad that you're thinking of moving into tech! I started this channel in order to help out as many people as possible, let's call it group mentorship! Always feel free to ask me any questions, I'll either answer directly or I'll make a video with an answer;) Thanks for stepping by, and I hope to see you around here! :)
@ranniex1054 4 years ago
I'll be expecting the automation process video, thank you for this course :) Also, I keep having the error about the last 24 hours. Spotify time is different than my PC, so it gives an error :( Or that's what I think is happening
@karolinasowinska 4 years ago
I'll hopefully get down to making that video soon! Oh, interesting. I wonder if the error has something to do with your validation check - perhaps try removing the validation function for now and see if the error disappears :)
@domenicalozada3237 4 years ago
I just changed this on the code "yesterday = today - datetime.timedelta(days=2)" It was also showing data from the same day
@Emily-is3cz 3 years ago
I also experienced the same problem and this is the approach I took:
1. Spotify data is in UTC time, so you will need to do a time conversion in Python, both before you do the data request and after, when you have the data pulled. This will correct your issue with wrong dates for the columns played_at and timestamp.
2. The second thing you want to fix is removing rows with played_at/timestamp values for the current day and keeping only yesterday's data using Python. The reason we have current-date data is that when you request the data, the HTTP request is specified as "after={time}". My thoughts on why it's important to check that we only keep yesterday's data: once we have the automation video, we don't want to double count the data going into the database.
Once you fix these 2 things you will be error free. Also, I'm new to Python and data engineering, so if someone more experienced has also dealt with this issue, feel free to comment or correct.
@weshouse8558 3 years ago
@Emily-is3cz Thanks, Emily! Good notes. I'm curious if you came up with the code to do this as I'm still getting the exception error raised.
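The two steps described in this thread could look roughly like the sketch below. It is not the commenter's code: the helper names are hypothetical, and it assumes that played_at values arrive as ISO 8601 UTC strings ending in "Z" and that the "after" parameter expects Unix time in milliseconds. All date arithmetic is done in UTC so the local timezone of the PC no longer matters:

```python
import datetime


def yesterday_window_utc(now=None):
    """Return (start, end) of yesterday as timezone-aware UTC datetimes."""
    if now is None:
        now = datetime.datetime.now(datetime.timezone.utc)
    today_start = now.astimezone(datetime.timezone.utc).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return today_start - datetime.timedelta(days=1), today_start


def parse_played_at(value):
    """Parse an ISO 8601 UTC timestamp such as '2021-03-01T12:34:56.789Z'."""
    # Older Python versions of fromisoformat() do not accept a trailing 'Z'.
    return datetime.datetime.fromisoformat(value.replace("Z", "+00:00"))


def keep_yesterday(played_at_values, now=None):
    """Keep only the timestamps that fall inside yesterday (UTC)."""
    start, end = yesterday_window_utc(now)
    return [v for v in played_at_values if start <= parse_played_at(v) < end]


now = datetime.datetime(2021, 3, 2, 9, 30, tzinfo=datetime.timezone.utc)
start, end = yesterday_window_utc(now)
after_ms = int(start.timestamp() * 1000)  # Unix time in milliseconds
kept = keep_yesterday(
    [
        "2021-02-28T23:59:59.000Z",  # two days ago: dropped
        "2021-03-01T12:34:56.789Z",  # yesterday: kept
        "2021-03-02T08:00:00.000Z",  # today: dropped
    ],
    now,
)
print(after_ms, kept)
```

Filtering to a closed window like this, instead of raising an exception when a row from today shows up, also avoids double counting once the job runs daily.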
@ankushojha5089 3 years ago
The silent clips are funny 🙂. In interviews I'm getting questions like what kind of complex problems we faced in daily work in data engineering. If possible, can you please suggest a few good points? Thanks
@karolinasowinska 3 years ago
It's best to talk about your own experience! But make sure to highlight successes and areas where you made the difference!
@emesestudy6287 4 years ago
Can I get a heart? I love how you explain that. My mommy would understand as well❤️❤️❤️
@karolinasowinska 4 years ago
Thank you, I'm really happy that you enjoyed watching the video! ;)
@cutepets7597 3 years ago
support your course ! come on
@mohsenbenromdhane7643 3 years ago
I noticed you never talked about Microsoft Products even when naming the main cloud providers
@karolinasowinska 3 years ago
True, they are big players too!
@RakeshG619 3 years ago
did u write a query in python...? share pdf of ETL.... pls share notes rply to this...
@karolinasowinska 3 years ago
Link to the whole script is in the description
@RakeshG619 3 years ago
@@karolinasowinska Some more deep please with extracting... Data explain with more understanding... To us.... Where you written query in python or sql didn't understand... Say some linux commands...
@codingevelynmiami4230 4 years ago
Where is the next video on how to automate the jobs/queries?
@karolinasowinska 4 years ago
It's coming soon :)
@mustaphaafaynou5118 3 years ago
Do this with S3 and Spark!
@karolinasowinska 3 years ago
Maybe in the future!
@mustaphaafaynou5118 3 years ago
A data engineer's job is all about writing Python scripts for the ETL process!! Collect, Process, Save
@resap.9128 3 years ago
I love your videos!
@karolinasowinska 3 years ago
I'm super glad to hear that! The next part of this course is coming on Thursday! ;)
@resap.9128 3 years ago
@karolinasowinska yay! :) thanks for the efforts Karolina :)
@mryderoc 3 years ago
Great course, but not using sqlalchemy broke my heart :D
@thealys 2 years ago
I need mentorship....
@ismahenelarbi5403 2 years ago
You are an INTP lol. Awesome!
@karolinasowinska 2 years ago
Haha hmm I've never seen myself as an INTP...!
@candyfloss184 4 years ago
Why last video?
@karolinasowinska 4 years ago
The last one for now. I'm still planning to release the job scheduling video, but I can't say when exactly. However, the general weekly Thursday uploads will go as planned :)
@candyfloss184 4 years ago
Karolina Sowinska I will miss you 😘 friend
@karolinasowinska 4 years ago
Hey, I'm not going anywhere;) I'm uploading my next video next Thursday, as always! :)
@candyfloss184 4 years ago
Karolina Sowinska in my childhood I once read a book called Heidi. There was a chapter in there called “Promises to meet again.” Suddenly remembered it. 😊
@Robi2009 4 years ago
Ooh, I recognize the guitar ;)
@karolinasowinska 4 years ago
It's no match for yours! ;)
@Robi2009 4 years ago
@karolinasowinska Oh, come on :P
@nikos9082 3 years ago
Nice, I would like to see relational vs. non-relational .....
@MikeSWu 4 years ago
are you russian or polish? i normally dont trust anything someone attractive says, but man you have been spewing truths
@karolinasowinska 4 years ago
Haha how flattering! I'm Polish :)
@stevie3427 2 years ago
Greetings from Soledad, pretty lady!
@alexandrunknown14563 жыл бұрын
so agressive!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
@karolinasowinska 3 years ago
Me? :D
@alexandrunknown1456 3 years ago
@karolinasowinska sorry, I was drunk)))) your videos are great for lonely solo alcoholic evenings =D
@karolinasowinska 3 years ago
@alexandrunknown1456 Hahahah ok
@vrinderdoe467 1 year ago
Hey, the database "my_played _tracks.py.sqlite" never appears in my project folder. Why?