Beginner Kaggle Data Science Project Walk-Through (Titanic)

406,303 views

Ken Jee

Days ago

Comments: 452
@KenJee_ds 4 years ago
Thanks for watching! Feel free to upvote the kaggle notebook if you found it helpful! Kaggle notebook: www.kaggle.com/kenjee/titanic-project-example My Kaggle Profile: www.kaggle.com/kenjee Try watching my kaggle project from scratch series next! kzbin.info/www/bejne/qF6ng2x-d8acprc&ab_channel=KenJee
@olabisioremade4784 4 years ago
Hi Ken, how long does the whole 365datascience data science course take, please?
@alexmyers3716 1 year ago
I'm here because of GPT4. Before GPT4 was released, I had a decent basic understanding of data science applications, but did not have the time to learn all of the Python syntax. Now, with GPT4, all I have to do is understand how to explain what I want to do, and GPT4 takes care of all the coding. It wouldn't be hard to create this entire notebook in 2-3 hours. Wild times we live in!
@larryhatcher8927 10 months ago
I took several days and went through this. It was a good starting point. You simply can't learn it all in a few days. As Ken said, this is to be used as a framework. Learning the various models and their evaluations is extremely important.
@hornfan722 4 years ago
Thanks ken- never used Kaggle or even done any data science projects. The detailed analysis (including the nuances MOST IMPORTANTLY) is really making this digestible- not to mention applicable
@KenJee_ds 4 years ago
Will do my best to include even more nuances going forward!
@vedantbhardwaj4058 4 years ago
I gotta be honest here, started learning Data Science on my own but every now and then I become lazy AF and I just stop for a period of 2-3 weeks. It's difficult to be consistently committed to the program and learning. Although I hope I slowly complete the training.
@KenJee_ds 4 years ago
It happens to me as well! I use youtube to have people hold me accountable to continued learning. Maybe find a friend or someone to keep you on top of your learning journey!
@omotayoonike9825 2 years ago
Please don't bring this bad energy here. Every data scientist feels the same; even I don't want to do it sometimes, and God knows it's difficult. But if you stick around the barber shop long enough, you will get your hair cut. Whether you become somebody through data science or by other means, it is all difficult.
@davidologunoba4703 2 years ago
Same sort of situation with me. But you know what, let's keep moving, we can do it!
@onii-chan2811 7 months ago
We all go through this bro.. the field is demanding
@MaiNguyen-nl3pp 4 years ago
You have saved us hours of self-exploration! Thank you, Ken :D Hope you can make more videos like this!
@KenJee_ds 4 years ago
You should still definitely self explore as well! Thank you for watching, more to come!
@sandrafield9813 4 years ago
Thank you so much for your videos, I watch them all the time. I'm in a masters DS program, and I feel like I'm actually on the Titanic right now, going down down down. Here you are handing me a raft, a dinghy, and also giving me a map to a huge, lush nearby island where there's an escape airport.
@KenJee_ds 4 years ago
Thanks for watching them Sandra!! I also love the analogy haha. Hopefully, one day I will provide you with a more stable yacht so you can enjoy the data science journey in style!
@struanclark5971 4 years ago
Your content is always top class Ken! As a beginner in this field you’ve taught me so much through your videos. Please keep them coming
@KenJee_ds 4 years ago
Thanks for the kind words Struan! I will definitely keep them coming!
@omjeeshukla5758 4 years ago
I don't understand who these people are who dislike. If you can't support him, at least stop disliking him. Someone is putting in effort to make knowledge and the learning process easy; what is your problem, dislikers?
@KenJee_ds 4 years ago
Thanks for looking out for me omjee! All work has its detractors though. I am always looking to improve, so constructive feedback is welcomed!
@sarthaksharma070 4 years ago
Great video dude, exactly what I was looking for. It's really great to see creators actually listening to the audience and working on it. Keep it up pal
@KenJee_ds 4 years ago
Glad it was what you were looking for!! Thanks for watching!
@paigec5017 4 years ago
This video came at such good timing! I just taught myself python and started the titanic project today but was feeling so unsure about everything! Thank you for your videos!!
@KenJee_ds 4 years ago
Great stuff! Thank you for watching them!
@emmanuelagyemang3738 2 years ago
How did you teach yourself how to code?
@mehsaniphysicsmathmatics2147 3 years ago
Thank you Ken, just now I finished one course that focused on Titanic survival, your attitude makes more sense for me.
@KenJee_ds 3 years ago
Awesome!! Thank you for following along!
@DataProfessor 4 years ago
Ken, Great video and great initiative! Sounds like fun, I also haven't done a Kaggle submission yet, will follow your path and do one soon.
@KenJee_ds 4 years ago
Let's definitely partner on one!
@salikmalik7631 4 years ago
@@KenJee_ds Yes. It'll be great to watch.
@DataProfessor 4 years ago
@@KenJee_ds Yes, let's definitely do that 😃
@lucrieffel5018 4 years ago
This video was extremely helpful, I have been searching the internet for a video that would walk me through this exact project! Your videos are the best, keep up the good work!
@KenJee_ds 4 years ago
Excellent! Glad it was what you were looking for!
@nikhilatluri1569 4 years ago
Thanks, Ken Jee, for spending your time during this lockdown educating youngsters like us.
@KenJee_ds 4 years ago
Glad I could help!
@2ash94 1 year ago
Wow this is a gold mine! Can't believe you went through all that work! Looking through all this, it seems like to become a great data scientist, it's not just about the skill. It is about intelligence and your ability to understand and see things that aren't clear to the normal human being. I have a fairly normal IQ and i am currently wondering if i should continue building my skills in order to become a data scientist.
@KenJee_ds 1 year ago
I don't think you have to have a high IQ. You can learn to ask the right questions and create frameworks for yourself. I could not have done the analysis in the same way when I started. I am certain you can learn to approach the problem in the same way I did!
@Om-id1qr 2 years ago
I'd like to say that I discovered a gem of a channel today.
@KenJee_ds 2 years ago
Makes me really happy to hear!
@adamploof3528 4 years ago
Looking forward to more videos like this . It's incredibly helpful to get an experienced viewpoint on how to think about and dissect these sorts of problems.
@KenJee_ds 4 years ago
Glad you found it helpful! Thanks for watching Adam!
@communicationvast9949 2 years ago
fantastic video, my friend. I started this project in R studio, ran into some walls, and got extremely frustrated. Listening to your process is extremely helpful. Thanks for the upload.
@KenJee_ds 2 years ago
Thanks for watching!! Really glad to hear it was helpful
@moghegaurav 4 years ago
Love your videos, Ken. They are no-nonsense and stick to just DS. Your content is well made up and your voice is clear. Thanks for sharing your knowledge. I am sure with such quality content you will soon hit 100k subscribers and more.
@KenJee_ds 4 years ago
Thanks for the kind words and for watching my videos!
@kbillotta 4 years ago
Thanks Ken... I just got my physics degree and i want to become a data scientist..Your videos are helping a lot! Thanks
@KenJee_ds 4 years ago
That's what I like to hear! Thanks for watching!
@denizbalkaya8356 4 years ago
Hi Ken....Deniz is speaking from Turkey! Your videos are helping me a lot! You force me to keep up :)
@KenJee_ds 4 years ago
Glad to hear they are helping! Thank you for watching!
@fablab21 4 years ago
Since you made a confession at the beginning, Imma hit you with one myself: I've been trying to study DS consistently for a year and a half and bruh... I find it incredibly frustrating. I do not feel particularly smart enough to do projects on my own but I really like your content, so I will stick around. 😬
@KenJee_ds 4 years ago
Confessions are important! I am confident you can do it. I think you would actually be quite surprised at the progress you've made. I've come a long way myself, and even now I still have impostor syndrome or feel like I don't know as much as I should. I would watch my "the data scientist's secret video", I think it may give you a little boost! kzbin.info/www/bejne/jnilmpqgi95le9E
@ramonsantiago4573 4 years ago
IMO it's unlikely that you're not smart enough to learn this stuff; it's probably the way you go about learning it. You need to spend a lot of time on the basics and have a really good understanding of Python. It's hard... I personally kept trying to jump ahead and go through concepts as fast as possible but it didn't really work. However, now that I've been studying at a slower pace everything is starting to make sense, and I managed to complete a few ML projects completely by myself. A really good slow-paced course that teaches the majority of the basics was "Python for Data Science and Machine Learning Bootcamp" by Jose Portilla. Good luck!
@SimonYung-g2g 7 months ago
Thanks! It was insightful and to the point. A lecturer pointed to this website. I'm just starting off but haven't put my learnings together the way this video shows. Now I'm aware I have more self-teaching to do.
@anurekha137 4 years ago
I am glad that I came across your channel. Always wanted to try the Titanic dataset on Kaggle but didn't. Now I'm gonna try it. Thanks.
@KenJee_ds 4 years ago
That is one of my favorite things to hear! It makes me really happy that my video helped you get started!
@hugochung9909 4 years ago
I've been following your videos for a while now and making my way through all the microcourses on Kaggle. This is the exact video I was looking for to begin the next stage of learning by diving into some data science projects . Top content and keep up the great work Ken!
@KenJee_ds 4 years ago
Thanks for the kind words! This is exactly what I like to hear haha. Glad you found it helpful!
@ashikka5902 4 years ago
Thank you Ken! Doing this first thing in the morning tomorrow!
@KenJee_ds 4 years ago
I hope it helps!!
@wasimraja2980 4 years ago
Done ?
@hendrywijaya1017 2 years ago
Ken, I think the histogram and boxplot step in the project plan should come after missing data. So here's the plan order from the top: understand the type of data; value counts; missing data; histogram and boxplot. Then continue with the steps you laid out, from correlation analysis and exploring interesting facts through to scaling.
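The order suggested above can be sketched roughly as follows. This is only an illustration on a made-up three-column frame, not the notebook's code; the column names mirror the Titanic data, and binned counts stand in for the actual histogram and boxplot calls.

```python
import pandas as pd

# Made-up rows for illustration; not real Titanic records.
training = pd.DataFrame({
    "Age": [22.0, None, 26.0, 35.0],
    "Fare": [7.25, 71.28, 7.93, 53.10],
    "Embarked": ["S", "C", None, "S"],
})

# 1. Understand the type of each column.
dtypes = training.dtypes

# 2. Value counts for a categorical column.
embarked_counts = training["Embarked"].value_counts()

# 3. Missing data per column, checked before looking at distributions.
missing = training.isna().sum()

# 4. Distribution of a numeric column (binned counts in place of a plot).
age_bins = pd.cut(training["Age"].dropna(), bins=[0, 18, 40, 80]).value_counts().sort_index()

print(dtypes, embarked_counts, missing, age_bins, sep="\n\n")
```

Checking missing values first makes it clearer how many rows each distribution is actually based on.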
@josefftan1203 4 years ago
Aw, kaggle series here we goooo ♥️
@KenJee_ds 4 years ago
Enjoy!
@ahmedhassan9379 2 years ago
Thanks so much, I feel happy that I could understand 90% of the content. Months ago I didn't know a thing!
@KenJee_ds 2 years ago
Amazing!!
@justinhuang8034 4 years ago
Love your content man! Keep it up 100k subs is just around the corner!
@KenJee_ds 4 years ago
Thanks Justin! Glad to hear the content has been useful to you!
@aimenbaig6201 4 years ago
You are my absolute Guide to data science. THANKYOU KEN
@KenJee_ds 4 years ago
Thanks for the kind words Aimen!! Glad the videos have been helpful!
@zahinnazhan7200 4 years ago
This is a great walkthrough for a beginner like me. Thanks Ken Jee
@KenJee_ds 4 years ago
Glad it was helpful Zahin!
@DatascienceConcepts 4 years ago
Nice insights Ken Jee. In fact I remember working with this dataset in my early days of ML :)
@KenJee_ds 4 years ago
Awesome! I definitely think this dataset is a great starting point. It was even helpful for me to go back and review some of the basics!
@MonaChangizi 4 months ago
Thank you for this helpful video! I'm really a beginner in machine learning but I love solving problems like this and your content helps me in this journey. 😊
@AdityaKumar-cj2ms 4 years ago
It was a very insightful explanation of this project, really liked it. And, at cell [5] if you execute training.describe(include = "all"), it will also give you the values which appear the most for every categorical variable. Which I think can be really helpful.
@KenJee_ds 4 years ago
I actually didn't know that! Thank you for sharing!
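A small sketch of the describe(include="all") tip from this thread. The frame below is a made-up stand-in for the notebook's training data, so the rows and values are assumptions for illustration only.

```python
import pandas as pd

# Toy stand-in for the Titanic training frame; these rows are invented.
training = pd.DataFrame({
    "Age": [22.0, 38.0, 26.0, 35.0],
    "Sex": ["male", "female", "female", "male"],
    "Embarked": ["S", "C", "S", "S"],
})

# The default describe() summarizes numeric columns only.
numeric_summary = training.describe()

# include="all" adds unique/top/freq rows for the non-numeric columns,
# where "top" is the most frequent value and "freq" is its count.
full_summary = training.describe(include="all")
print(full_summary)
```

Numeric statistics show as NaN for the categorical columns and vice versa, so the combined table is best read column by column.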
@henriquebonacelli2981 4 years ago
Man, great video! I'm starting on data science and this hands-on project explanation was super helpful!
@KenJee_ds 4 years ago
Glad to hear it was helpful! Thank you for watching!
@fahadreda3060 4 years ago
Thanks Ken, I was waiting for this video , Good Luck
@KenJee_ds 4 years ago
I hope you enjoy it Fahad!
@abdelrahmanashraf7636 2 years ago
Thanks a lot for this video. I have been learning a lot of things and didn't know how to tie all the ropes together. This video was just what I needed for that. Thanks a lot Ken Jee :)
@KenJee_ds 2 years ago
Thanks for checking it out!
@augustthenerd4213 2 years ago
Thanks for the video! I have some DS experience but it was very helpful to see how an expert would approach a Kaggle problem.
@KenJee_ds 2 years ago
Thanks for watching!
@chinmaygondhalekar2591 4 years ago
Just the notification I was waiting for thanks man 👍
@KenJee_ds 4 years ago
I hope you enjoy!
@MrBlack-cv8qn 3 years ago
Huge thanks from beginner DS switching from mechanical engineering!
@KenJee_ds 3 years ago
Thanks for watching! Glad to hear it was helpful!
@RichardOnData 4 years ago
Loving this video and the thumbnail dude!
@KenJee_ds 4 years ago
Thanks for noticing the thumbnail Richard! Would love to collab at some point if you're interested!
@RichardOnData 4 years ago
@@KenJee_ds Absolutely! My email is richardondata@gmail.com - I have a number of items on my backlog of videos that I'd love to cover in the future as I'm sure you do too, and some of them I think would make total sense! I'll drop you an email in a day or two myself.
@arthurmlcc 4 years ago
Keep up the great work you've been doing in this channel Ken, really helping us beginners.
@KenJee_ds 4 years ago
I absolutely will! Thanks for watching!
@mustafamegahed7873 4 years ago
Great job! Thank you so much! Sadly, I have some work at college and couldn't finish the video but I will definitely come back to it hopefully next week.
@KenJee_ds 4 years ago
No problem! It is there for you to learn at your own pace!
@Mario-ox5dm 4 years ago
I sense a rising Kaggle Grandmaster in the future!
@KenJee_ds 4 years ago
Haha I don't know about that! Long road ahead
@anoopashware9539 3 years ago
Thank you, sir, for making this video. I can't put into words how much information is in it. It is really helpful for me in becoming a good data scientist. Thank you so much.
@KenJee_ds 3 years ago
Really glad to hear this video helped!
@mimikoko4299 4 years ago
You have the best data science channel, I love you
@KenJee_ds 4 years ago
Thank you Mimi!
@dakadoodle 4 years ago
Literally was looking at this today
@KenJee_ds 4 years ago
Awesome!
@saurabhjoshi4887 4 years ago
Hi ken Great Video, I just completed your 7 part data science from beginning series. I am a beginner in data science and your video helped me a lot. Thanks 😊
@KenJee_ds 4 years ago
Thanks for working through the whole project series! I hope that this video helps you as well!
@daedalusdreamjournal5925 4 years ago
Hello there :) I haven't watched the full video yet, but there's a reason for this and is linked to a suggestion I'd like to propose to you for similar videos in the future: Despite being very VERY green in this, I decided to have a first go at this all by myself ... and boy was and is it still frustrating :P The reason behind this was that I wanted to try a first attempt without a guiding hand. Once I finished my first model, I quickly realized that there were tons of ways where I blundered like a total noob ... which is actually totally fine :) And despite the frustration of the experience, it felt like I gathered valuable experience from this. And it is only now that I am starting to watch this video .. but only bit by bit, as I want to try to do as much by myself as possible (mistake be damned since they are being done at home where it won't hurt anyone and where I can learn safely from the experience). SO my suggestion is this: Could it be possible for future similar videos to have it in several parts? Or, at the very least, to timestamp the different section of your handling of a particular problem? I feel like it could be very valuable, especially for very recent newcomers like me. Anyways, thanks a ton for your videos, very much appreciated ! (especially some of the code where you use apply and lambda functions to handle data transformations, this is definitely something that will be useful for me in the near and long future! :) Signed: A total newbie at this.
@KenJee_ds 4 years ago
This is a great idea! I think I will try the time stamp portion for the next one. I would also recommend my project from scratch series: kzbin.info/aero/PL2zq7klxX5ASFejJj80ob9ZAnBHdz5O1t . I broke this one into each phase of the data science lifecycle. I think your approach is really great though! I highly recommend that for other people going through this.
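For readers curious about the apply-plus-lambda pattern praised above, here is a minimal sketch of one common use on this dataset: pulling the title out of the Name column. The column name `name_title` and the exact split logic are assumptions for illustration and may differ from the notebook.

```python
import pandas as pd

# Three names in the "Surname, Title. Given names" layout used by the dataset.
names = pd.DataFrame({
    "Name": [
        "Braund, Mr. Owen Harris",
        "Cumings, Mrs. John Bradley (Florence Briggs Thayer)",
        "Heikkinen, Miss. Laina",
    ]
})

# Take the text after the first comma, then keep what precedes the first period.
names["name_title"] = names["Name"].apply(
    lambda full_name: full_name.split(",")[1].split(".")[0].strip()
)
print(names["name_title"].tolist())
```

The same pattern works for any row-wise string transformation, although vectorized `.str` methods are often faster on large frames.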
@sauravsahay8803 2 years ago
I keep getting tired and demotivated and I keep picking myself up to learn this :(
@bianchialex 4 years ago
Just came here to see what you got. I used random forest and got .76 on my first try and then a little tuning got it to .77. I think I could make it better so I will continue to play around. I got to the point in my course curriculum mid-lecture I said "this is more advanced than I need to get started on beginner projects" and instantly hopped off to do titanic. I had it all worked up in my head to be some super hard task but it turned out to be relatively painless! I am going to do a couple other smaller projects and then try something of my own, probably using youtube data because I am a massive geek for the algorithm.
@KenJee_ds 4 years ago
I think the smaller projects is a good idea! All about building some momentum!
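The "little tuning" of a random forest described in this thread could look something like the sketch below. It uses synthetic data from `make_classification` rather than the Titanic features, and the parameter grid is an arbitrary assumption chosen to keep the example quick.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic binary classification data standing in for engineered features.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# A deliberately small, hypothetical grid; real searches are usually wider.
param_grid = {"n_estimators": [25, 50], "max_depth": [3, None]}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X, y)

print(search.best_params_, search.best_score_)
```

`best_score_` is a cross-validated estimate, so it will not match a leaderboard score exactly.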
@arick2050 4 years ago
Super informative, thanks Ken!
@KenJee_ds 4 years ago
Thanks for watching Aric!!
@albertosei3558 1 year ago
I will try this very soon. Bookmarking this
@KenJee_ds 1 year ago
💪
@ImportData1 4 years ago
Learned something new - VotingClassifier!
@KenJee_ds 4 years ago
Awesome! Yeah, it is super useful and easy to use! Next time I will probably experiment more with some pipelines to clean up the feature engineering a bit!
@ImportData1 4 years ago
@@KenJee_ds I find the feature engineering/selection process the toughest. Sometimes you think you engineered features well enough, but the model accuracy doesn't necessarily reflect that. Would love to see how you experiment with pipelines!
@KenJee_ds 4 years ago
@@ImportData1 Yep! This is definitely the case where I could have done more!
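For anyone else meeting VotingClassifier for the first time, a minimal sketch follows. The three base models and the synthetic data are assumptions picked for illustration; the notebook's ensemble may combine different estimators.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic data in place of the engineered Titanic features.
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

# Soft voting averages the predicted class probabilities of the base models.
voting_clf = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("knn", KNeighborsClassifier()),
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ],
    voting="soft",
)

scores = cross_val_score(voting_clf, X, y, cv=5)
print(scores.mean())
```

With `voting="hard"` the ensemble takes a majority vote on predicted labels instead, which also works for models that cannot output probabilities.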
@s8x. 9 months ago
thanks for this video. Just started this problem and realized I have no idea what I'm doing
@tomasagustin2243 4 years ago
Amazing! I learn a lot with your videos, thanks for sharing your knowledge, hug from Argentina!
@KenJee_ds 4 years ago
Thank you for watching! I would love to visit Argentina some day!
@tomasagustin2243 4 years ago
Hope you come! There are a lot of beautiful people here and a lot of parties hahahaha
@prabirbiswas440 4 years ago
Wow, what an in-depth analysis. You really put a lot of effort into this. This is my first try on Kaggle too. After spending this much time, I wonder how much time it will take for even tougher data. I also checked the House Rent Competition. It has 81 features. How can we do such a detailed analysis on all the features? Not sure how real-world ML problems are solved where they might have 100 or even more features. I am really excited to know more :)
@KenJee_ds 4 years ago
Thanks for watching! I will be doing the housing dataset next, so stay tuned!!
@alyona1311 2 years ago
I learned so much from your video, thank you!
@KenJee_ds 2 years ago
Amazing! Thank you for watching!
@eminalopatyuk6812 1 month ago
Thank you so much for your tutorial, Ken.
@AIPlayerrrr 4 years ago
I’d be super interested in seeing you competing in a real Kaggle Competition.
@KenJee_ds 4 years ago
I will likely be trying one in a few months! Stay tuned!
@AIPlayerrrr 4 years ago
Ken Jee great! I am excited
@Gamma3 4 years ago
Me too! Great channel
@sadiakamal6866 4 years ago
Great job..Please do these sort of videos more often!
@KenJee_ds 4 years ago
Thank you for watching! Will definitely be trying to make more of these!
@bencantc2548 4 years ago
Amazing video! I hope you do a similar video on regression and clustering problems in the future!
@KenJee_ds 4 years ago
Thanks for watching! I plan to do a regression problem next!
@jfr543 4 years ago
This video is gold!
@KenJee_ds 4 years ago
Thanks for the kind words! I'm glad you found it helpful!
@jonasschroder7244 4 years ago
Great! Very inspiring and helpful!
@KenJee_ds 4 years ago
Thanks for watching Jonas!
@manasagrawal8365 3 years ago
thanks Ken this was really helpful
@KenJee_ds 3 years ago
Thanks for watching!
@kartikeyanamdev4471 4 years ago
First of all Thanks Ken for this, and secondly it's a request if you can make a dedicated video on how data analysis can work in cricket, I know you may not be knowing about the game but I really want to implement some data analysis into the game of cricket, so just need your help and it will do great if you make a video on the same. Have a good day mate.
@KenJee_ds 4 years ago
I will try to bring someone in who is familiar with the game!
@ΧρυσόστομοςΠαπαδόπουλος-κ5π 4 years ago
I think it would be great if you could show how you would present this project in a markdown file in order to add it to your github. Thanks for the great work!!!
@KenJee_ds 4 years ago
I will work on it!
@solaawodiya7360 4 years ago
Hi Ken, thanks for the help on learning about data science. I struggle a lot using Kaggle to learn python. The user experience for me is quite intimidating compared to other platforms I used as there are times even when I know the question, I get lost on how to answer and follow the steps.
@karlduckett 3 years ago
Really awesome! My only minor criticism is that in the first half of the video, most of the pivot tables and charts are displaying counts. When comparing counts between categories (i.e. survival rate by age) it really needs to display the percentage of that grouping... Sadly I'm too much of a noob to figure it out just yet :(
@KenJee_ds 3 years ago
Totally fair, thanks for the feedback!
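One possible way to get the percentages asked for in this thread is to normalize a cross-tabulation by row. The sketch below uses invented rows, so the rates it prints say nothing about the real dataset.

```python
import pandas as pd

# Invented rows for illustration: four passengers in class 1, four in class 3.
df = pd.DataFrame({
    "Pclass": [1, 1, 1, 1, 3, 3, 3, 3],
    "Survived": [1, 1, 1, 0, 1, 0, 0, 0],
})

# Raw counts, similar in spirit to a count-based pivot table.
counts = pd.crosstab(df["Pclass"], df["Survived"])

# normalize="index" divides each row by its total, giving within-class rates.
rates = pd.crosstab(df["Pclass"], df["Survived"], normalize="index")

# Because Survived is 0/1, the group mean is the survival rate directly.
survival_rate = df.groupby("Pclass")["Survived"].mean()

print(counts, rates, survival_rate, sep="\n\n")
```

Row-normalized rates make groups of different sizes comparable, which raw counts do not.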
@moajjem04 4 years ago
This video is a great help!
@KenJee_ds 4 years ago
Glad to hear! Thank you for watching!
@JBB685 3 years ago
Would you consider doing one for the linear regression example you suggested in your 3 beginners’ projects? It’s the Ames housing prices project
@mohithedaoo6968 4 years ago
This was much needed... Thank you very much!!
@KenJee_ds 4 years ago
Happy I could help! Thank you for watching!
@imakonkonvicted 4 years ago
Thanks! I will try to do this alongside your video! :D
@KenJee_ds 4 years ago
Awesome! Would love to hear how it goes!
@dhristovaddx 4 years ago
Thank you for the great video! It's very helpful! ^_^
@KenJee_ds 4 years ago
Thanks for watching! Glad it was helpful!
@dxzgamingtricks5938 2 years ago
you are a genius!!!!
@KenJee_ds 2 years ago
Thanks for watching!
@kefahelhelou9418 1 year ago
Thanks for the great efforts
@KenJee_ds 1 year ago
Thanks for watching!
@NivenShado 4 years ago
Thank you for this. This was my dabble in data science and I may not have understood all of what you did, but I'm happy I tried it. Where should I go from here? I am thinking about taking the data science bootcamp course on Udemy.
@KenJee_ds 4 years ago
I'm happy you tried it as well! I recommend the free micro courses on kaggle.com actually. They are a great way to continue what you started with this walk through!
@NivenShado 4 years ago
@@KenJee_ds Thanks! I'll do that. I also haven't submitted the walk through as it is 99% your code - and am not sure if that is okay to do or if I am supposed to do that. The other 1% was me finding the proportion of how many people named William survived on the boat since there seemed to be a lot of Williams on the Titanic lol
@KenJee_ds 4 years ago
@@NivenShado As long as you don't make it public, I think it is generally fine to submit your results to see what score you get. If you do make it public, you want to make sure you either fork the notebook, or give credit at the top and through the workbook for the code portions you used. I hope this helps!
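The "how many Williams survived" check mentioned in this thread can be sketched in a few lines. The names and outcomes below are invented, so the resulting proportion is purely illustrative.

```python
import pandas as pd

# Invented passengers; not rows from the real dataset.
df = pd.DataFrame({
    "Name": [
        "Smith, Mr. William John",
        "Brown, Mrs. Anna",
        "Jones, Mr. William",
        "Lee, Miss. Mary",
        "Clark, Mr. William Henry",
    ],
    "Survived": [0, 1, 1, 1, 0],
})

# Boolean mask of rows whose name contains "William".
is_william = df["Name"].str.contains("William")

# The mean of a 0/1 column over the masked rows is the survival proportion.
william_survival_rate = df.loc[is_william, "Survived"].mean()
print(is_william.sum(), william_survival_rate)
```

A substring match like this would also catch surnames such as "Williams", so a stricter pattern may be needed on the real data.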
@AlexKite68 3 years ago
Thank you for this great video! I've already subscribed to your channel, digging to find lots of DS insights )) But please improve the audio quality in future videos: background noises are really frustrating, and the background music seems to be a little bit loud. But again, you're making a great resource that is very useful for Data Science beginners like me!
@KenJee_ds 3 years ago
Thanks for watching! I have adjusted the music in the newer videos
@jmiller1095 3 years ago
Ken, I'm a long time listener first time caller :) This is a terrific video .. and I have one question! ... at around 22:00 you clearly tell us that concatenating training and test datasets together and then pre-processing them all together is NOT the way it should be done in real -world (real world way, as I understood: train encoder scaler on training dataset then transform test dataset using encoder scaler trained (only) from training dataset). So .. do you have (or do you know of one) similar to this one but which demos the real world way of doing business?
@KenJee_ds 3 years ago
Good question! I don't have one on my channel, but I think Nicholas Renotte probably has a tutorial on his where he does it correctly. There are plenty of ways to do it correctly. You can create functions that label the data in the same way as you did in the test set, more specifically, you can use the sklearn pipeline. If you go to the kaggle kernels, I expect there will be quite a few that do it correctly. Thanks for watching!
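A minimal sketch of the sklearn pipeline idea mentioned in this reply, assuming a toy frame with one numeric and one categorical column. The imputer, scaler, and encoder learn their parameters from the training rows only and are then applied unchanged to the test rows. The data and the choice of LogisticRegression are assumptions for illustration.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Invented training and test rows; the test frame has no labels.
train = pd.DataFrame({
    "Age": [22.0, 38.0, None, 35.0, 54.0, 2.0],
    "Sex": ["male", "female", "female", "female", "male", "male"],
    "Survived": [0, 1, 1, 1, 0, 0],
})
test = pd.DataFrame({"Age": [30.0, None], "Sex": ["female", "male"]})

# Numeric branch: fill missing values with the median, then standardize.
numeric = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Route each column to its own preprocessing branch.
preprocess = ColumnTransformer([
    ("num", numeric, ["Age"]),
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["Sex"]),
])

model = Pipeline([("prep", preprocess), ("clf", LogisticRegression())])

# fit() learns the median, scaling, and categories from the training data only.
model.fit(train[["Age", "Sex"]], train["Survived"])

# predict() reuses those learned parameters on the test data.
predictions = model.predict(test)
print(predictions)
```

Because every preprocessing step lives inside the pipeline, nothing from the test set can influence the fitted statistics.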
@hemantgautam1 4 years ago
Hi Ken, Please create a separate play list for kaggle videos. 🙂
@KenJee_ds 4 years ago
Will do!
@OnlineGreg 2 years ago
13:27 Why exactly do you choose 'Ticket' for values in the pivot table? I don't understand that.
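One likely reading of the 'Ticket' question, assuming the pivot table at that point uses a count aggregation: the values column is only something to count, so any column without missing entries would give the same table. The rows below are invented to show this.

```python
import pandas as pd

# Invented rows; Ticket and PassengerId have no missing values here.
df = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4, 5],
    "Survived": [0, 1, 1, 0, 1],
    "Pclass": [3, 1, 3, 3, 1],
    "Ticket": ["T1", "T2", "T3", "T4", "T5"],
})

# Count rows per (Survived, Pclass) cell using Ticket as the counted column.
by_ticket = pd.pivot_table(
    df, index="Survived", columns="Pclass", values="Ticket", aggfunc="count"
)

# Counting a different complete column produces the same table.
by_id = pd.pivot_table(
    df, index="Survived", columns="Pclass", values="PassengerId", aggfunc="count"
)

print(by_ticket)
```

A column with missing values (Age or Cabin, for example) would undercount, which may be why a complete column is a convenient choice.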
@deepakshiarora835 4 years ago
take a drink every time ken says actually.
@KenJee_ds 4 years ago
Oof, something I'm working on improving (actually) haha
@deepakshiarora835 4 years ago
@@KenJee_ds you're so humble (actually).
@velevki 1 year ago
I am trying to understand the one-hot encoder method you suggested around 22:03, applying this method to real-world datasets. I ran into this issue before and had to merge train and test data sets to be able to apply a one-hot encoder. What is the solution to the issue of not having the same columns when we apply a one-hot encoder to train and test sets separately? I don't know how to approach that.
@KenJee_ds 1 year ago
You can create a pipeline to create the same columns in the dataset. I do that better in this tutorial here: kzbin.info/www/bejne/hIK0ZHeImb10jqc&ab_channel=KenJee
@adaptronankidu2611 1 year ago
@@KenJee_ds Thank you!
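Besides a pipeline, one lightweight way to handle the mismatched-columns problem raised in this thread is to encode train and test separately and then align the test columns to the training columns. This is a sketch on invented data, not the approach from the linked tutorial.

```python
import pandas as pd

# Invented data: the category "Q" appears in train but not in test.
train = pd.DataFrame({"Embarked": ["S", "C", "Q", "S"]})
test = pd.DataFrame({"Embarked": ["S", "S", "C"]})

train_dummies = pd.get_dummies(train, columns=["Embarked"])
test_dummies = pd.get_dummies(test, columns=["Embarked"])

# Reindex test to the training columns: missing dummy columns are added as 0,
# and any test-only columns would be dropped.
test_aligned = test_dummies.reindex(columns=train_dummies.columns, fill_value=0)

print(test_aligned)
```

sklearn's `OneHotEncoder(handle_unknown="ignore")`, fitted on the training data, reaches the same result inside a pipeline.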
@ultimategohan2190 1 year ago
What an in-depth analysis! By the way, how many possible approaches are there to this dataset?
@KenJee_ds 1 year ago
Thanks for watching! As it turns out, there are really infinite ways you can approach a problem like this haha.
@xdominator365 4 years ago
Thank You Ken for providing this video...your new subscriber from India...🧡✌
@KenJee_ds 4 years ago
Awesome! Thank you for subscribing! I hope my other videos are helpful as well!
@nikhilatluri1569 4 years ago
@@KenJee_ds yes for sure Watched almost all your videos And got a lot of information in building my career
@xdominator365 4 years ago
@@KenJee_ds just finished my data science beginners playlist...🙂✌... after this going to start my first project for beginners....thank you so much, Ken, earlier I was going in the wrong path, I will be your fan...🧡🧡🧡want to get in touch with you please sir...
@surajkumarmaurya8088 4 years ago
Thanks a lot Sir, this helped me a lot to clear my doubts.
@KenJee_ds 4 years ago
Great!! Thanks for watching!
@nailujretuas2093 2 years ago
very helpful, thank you. comment for the algorithm.
@KenJee_ds 2 years ago
Thank you!
@johnyanergos 1 year ago
Actually, it seems Capt is not for the captain of the ship. After some internet searching I found out that Capt refers to a different captain who was onboard the Titanic as a passenger and who unfortunately lost his life in the sinking.
@KenJee_ds 1 year ago
Cool! Did not know that!
@elsins9790 4 years ago
Thank you for this video and your explanation of how you approach data science problems. I was just able to reach the baseline from the Titanic tutorial with my own approach using XGB and GridSearchCV. Did you try stacked denoising autoencoders in your projects, and how did it work out? It is kinda like an automatic unsupervised learning approach that can be fed into a neural network. Your channel is golden! Keep it up and stay healthy!
@KenJee_ds 4 years ago
Thanks for watching! For this one, I didn't stack any models. That would be a good way to go a step past what I did though!
@DarkPrince1996 3 years ago
You did a great job explaining your approach to solving the task at hand and walking us through the process, so I'm wanting to know what the next steps would be for someone wanting to use this competition to learn data science. I don't have a detailed understanding of all the algorithms that you used in this competition, so would it be best to pick the one that produced the best score and learn how to tune that particular model to get a better score, or would it be best to transfer your process to another beginner competition altogether to build a better understanding of the complete data science process as a whole?
@KenJee_ds 3 years ago
I think this is great for learning how to tune the algorithms and seeing what results you get with different ones. It is also a good one for practicing feature engineering like I did with some of the seats, etc. I think transferring things to another competition would be a good idea!
@DarkPrince1996 3 years ago
@@KenJee_ds appreciate your advice and I will definitely do that.
@gisinatrag 2 months ago
Thank you for this video, it's very useful for understanding how a beginner should approach a real problem. After your video I decided to rewrite my notebook using your way of planning the steps. I am trying to get over 80% without a NN; my best score is 0.78468 (top 14%). What would be a good score for a beginner?
@shaikhkashif9973 2 years ago
You did feature engineering first?? Then removed outliers? 🤔
@D4nte-RN 11 months ago
Thanks for your great tutorial on YouTube! I have a question regarding this analysis. At 24:40 you start talking about cross-validation. I'm not sure I clearly understand your code here. You're running cross-validation on the X_train / y_train data set? Is this the correct approach, or should we do this on the whole data set X / y?
@MohamedKhaled-el7vv 4 years ago
I have a question: at 24:15, after making changes to the entire dataset, you split the data into train and test. Doesn't this cause data leakage?
@KenJee_ds 4 years ago
This would in fact cause data leakage. This is one of the things that separates kaggle from real data science problems. In this scenario, we want to maximize performance using all the data available to us.
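To picture the leakage discussed in this thread, here is a minimal stdlib-only sketch. It is not the notebook's code: the ages and the train/test split are made up for illustration, and `fit_median` and `fill_missing` are hypothetical helper names. It shows that a fill value computed on all rows differs from one computed on the training rows only.

```python
from statistics import median

# Hypothetical ages; None marks a missing value. Not real Titanic data.
train_ages = [22.0, 38.0, None, 35.0, 54.0]
test_ages = [2.0, None, 4.0, 1.0]


def fit_median(values):
    """Median of the non-missing values."""
    return median(v for v in values if v is not None)


def fill_missing(values, fill_value):
    """Replace each missing value with fill_value."""
    return [fill_value if v is None else v for v in values]


# Leak-free: the fill value is learned from the training rows only.
train_only_median = fit_median(train_ages)

# Leaky: the fill value has also "seen" the test rows.
all_rows_median = fit_median(train_ages + test_ages)

# Apply the training-only statistic to both parts.
train_filled = fill_missing(train_ages, train_only_median)
test_filled = fill_missing(test_ages, train_only_median)

print(train_only_median, all_rows_median)
```

In scikit-learn, the usual way to get the leak-free behaviour during cross-validation is to put the preprocessing steps inside a `Pipeline`, so each fold fits them on its own training part.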
@gupnir 4 years ago
Hi Ken, your videos are really helpful for beginners like me. Can you do a similar walk-through video for the House Prices problem as well? Thanks in advance.
@KenJee_ds 4 years ago
I plan to! Thanks for watching Nirmit!
@MarsLanding91 4 years ago
Thanks Ken!
@KenJee_ds 4 years ago
Thanks for watching!
@amrelshabasy1183 2 years ago
Thanks, Ken, for this great video. Can you please explain how you determined that the XGBoost model is overfitting?
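A common way to check for overfitting, though not necessarily the one used in the video, is to compare the score on the training data with the score on held-out data; a large gap suggests the model has memorized the training rows. The sketch below uses a deliberately extreme toy "model" (a lookup table) and made-up rows; `fit_memorizer` and `accuracy` are hypothetical names.

```python
from collections import Counter


def fit_memorizer(rows, labels):
    """Toy model: memorize every training row, guess the majority label otherwise."""
    table = dict(zip(rows, labels))
    default = Counter(labels).most_common(1)[0][0]
    return lambda row: table.get(row, default)


def accuracy(model, rows, labels):
    """Fraction of rows whose prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(labels)


# Made-up (pclass, sex) rows and survival labels, for illustration only.
train_rows = [(1, "f"), (2, "f"), (3, "m"), (1, "m"), (3, "f")]
train_labels = [1, 1, 0, 0, 1]
val_rows = [(2, "m"), (1, "f"), (3, "m"), (2, "f")]
val_labels = [0, 1, 0, 0]

model = fit_memorizer(train_rows, train_labels)
train_acc = accuracy(model, train_rows, train_labels)
val_acc = accuracy(model, val_rows, val_labels)

# A big gap between the two is the warning sign.
print(f"train={train_acc:.2f} validation={val_acc:.2f} gap={train_acc - val_acc:.2f}")
```

With a real model the same comparison applies: score it on the rows it was fit on, then on rows it never saw.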
@nateMan5370 4 years ago
Hello Ken, really cool video. I have a question for you. You gave us all the important variables that contributed to the model at around 32:30. I was wondering, with those variables in mind, are we able to dig a little deeper and see which actual values contributed most to the model? If that sounds confusing, here is an example. Age is the second most significant variable in the model. If we could program what I just asked, the model would "spit out" that ages 15-18 (I do not know if that's true) were the most significant and contributed most to the model. This is my first data science project, so I want to apologize in advance if my words do not sound "data sciency" enough.
@KenJee_ds 4 years ago
Hi Nate - Thanks for watching! I'm glad you're starting your journey with my material! We aren't able to isolate the specific range in a continuous variable and split out a certain range like you had illustrated with the current method. If we wanted, we could make a variable for different age ranges and see what the impact would be from running those through the model though!
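The age-range idea in this reply could be sketched as follows: turn the continuous age into a categorical band, then one-hot encode it so each band becomes its own feature whose importance can be inspected. The band edges and labels are arbitrary assumptions, and `age_band` and `one_hot` are hypothetical helper names.

```python
from bisect import bisect_right

# Assumed band edges: a band covers ages from one edge up to (not including) the next.
BAND_EDGES = [13, 19, 40, 60]
BAND_LABELS = ["child", "teen", "adult", "middle_aged", "senior"]


def age_band(age):
    """Map a numeric age to its band label."""
    return BAND_LABELS[bisect_right(BAND_EDGES, age)]


def one_hot(label):
    """Encode a band label as a dict of 0/1 indicator features."""
    return {f"age_{name}": int(name == label) for name in BAND_LABELS}


bands = [age_band(a) for a in [4, 16, 29, 45, 71]]
features = one_hot(age_band(16))
print(bands)
print(features)
```

With pandas, `pandas.cut` does the same kind of binning on a whole column.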
@nateMan5370 4 years ago
@@KenJee_ds Another question: I am using a gradient boosting model for the Titanic dataset. Thankfully I did not run into any errors, but I noticed that the model, specifically the r2, is very inconsistent. Every time I rerun the model, it gives me a different r2. Is that a normal thing to happen, and why does it happen?
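One common cause of this kind of run-to-run variation is unseeded randomness, for example in the train/test split or in the model's own subsampling. The stdlib-only sketch below assumes that is what is happening here; `shuffled_split` is a hypothetical helper, not a library function. It shows that fixing the seed makes a shuffled split reproducible.

```python
import random


def shuffled_split(rows, test_fraction, seed=None):
    """Shuffle a copy of rows and split it; seed=None gives a different split each run."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


rows = list(range(20))

# Same seed, same split: any score computed on it is repeatable.
train_a, test_a = shuffled_split(rows, 0.25, seed=42)
train_b, test_b = shuffled_split(rows, 0.25, seed=42)

# No seed: the split, and therefore the score, can change on every run.
train_c, test_c = shuffled_split(rows, 0.25)

print(test_a == test_b)
```

In scikit-learn the equivalent knob is the `random_state` parameter, which both `train_test_split` and `GradientBoostingClassifier` accept.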
@nateMan5370 4 years ago
Also, sorry for the barrage of questions, but from 25:00 to 28:00 you go through all the models you created, like logistic regression, tree, KNN, etc. Could you explain what the numbers in the array [] represent? I am thinking they are the r2, but I do not think that is correct.
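On the question about the numbers in the array: assuming they come from k-fold cross-validation of a classifier with default scoring (as scikit-learn's `cross_val_score` would produce), each number is the accuracy on one held-out fold, not R². The stdlib-only sketch below computes such per-fold scores for a toy majority-class "model" on made-up labels; `k_fold_accuracies` is a hypothetical helper.

```python
from collections import Counter


def k_fold_accuracies(labels, k):
    """Return one accuracy per fold for a majority-class baseline."""
    fold_size = len(labels) // k
    scores = []
    for i in range(k):
        start = i * fold_size
        # The last fold absorbs any leftover rows.
        stop = len(labels) if i == k - 1 else start + fold_size
        held_out = labels[start:stop]
        training = labels[:start] + labels[stop:]
        # "Fit": predict the most common label seen in the training part.
        prediction = Counter(training).most_common(1)[0][0]
        correct = sum(1 for y in held_out if y == prediction)
        scores.append(correct / len(held_out))
    return scores


# Made-up survival labels, for illustration only.
survived = [0, 0, 1, 0, 1, 0, 0, 1, 0, 0]
fold_scores = k_fold_accuracies(survived, k=5)
mean_score = sum(fold_scores) / len(fold_scores)

print(fold_scores)  # one accuracy per fold
print(mean_score)   # the single summary number usually reported
```

The array has one entry per fold, and its mean is the figure typically quoted when comparing models.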