Thanks for watching! Feel free to upvote the Kaggle notebook if you found it helpful! Kaggle notebook: www.kaggle.com/kenjee/titanic-project-example My Kaggle profile: www.kaggle.com/kenjee Try watching my Kaggle project from scratch series next! kzbin.info/www/bejne/qF6ng2x-d8acprc&ab_channel=KenJee
@olabisioremade4784 4 years ago
Hi Ken, how long does the whole 365datascience data science course take?
@alexmyers3716 1 year ago
I'm here because of GPT4. Before GPT4 was released, I had a decent basic understanding of data science applications, but did not have the time to learn all of the Python syntax. Now, with GPT4, all I have to do is understand how to explain what I want to do, and GPT4 takes care of all the coding. It wouldn't be hard to create this entire notebook in 2-3 hours. Wild times we live in!
@larryhatcher8927 10 months ago
I took several days and went through this. It was a good starting point. You simply can't learn it all in a few days. As Ken said, this is to be used as a framework. Learning the various models and how they are evaluated is extremely important.
@hornfan722 4 years ago
Thanks Ken. I've never used Kaggle or even done any data science projects. The detailed analysis (including the nuances, MOST IMPORTANTLY) is really making this digestible, not to mention applicable.
@KenJee_ds 4 years ago
Will do my best to include even more nuances going forward!
@vedantbhardwaj4058 4 years ago
I gotta be honest here, started learning Data Science on my own but every now and then I become lazy AF and I just stop for a period of 2-3 weeks. It's difficult to be consistently committed to the program and learning. Although I hope I slowly complete the training.
@KenJee_ds 4 years ago
It happens to me as well! I use youtube to have people hold me accountable to continued learning. Maybe find a friend or someone to keep you on top of your learning journey!
@omotayoonike9825 2 years ago
Please don't bring this bad energy here. Everybody who is a data scientist feels the same. Even I don't want to do it sometimes, and God knows it's difficult, but if you stick around the barber shop long enough you will get your hair cut. You can become somebody through data science or by other means, but all of it is difficult.
@davidologunoba4703 2 years ago
Same sort of situation with me. But you know what, let's keep moving, we can do it!
@onii-chan2811 7 months ago
We all go through this bro.. the field is demanding
@MaiNguyen-nl3pp 4 years ago
You have saved us hours of self-exploration! Thank you, Ken :D Hope you can make more videos like this!
@KenJee_ds 4 years ago
You should still definitely self explore as well! Thank you for watching, more to come!
@sandrafield9813 4 years ago
Thank you so much for your videos, I watch them all the time. I'm in a master's DS program, and I feel like I'm actually on the Titanic right now, going down down down. Here you are handing me a raft, a dinghy, and also giving me a map to a huge lush nearby island where there's an escape airport.
@KenJee_ds 4 years ago
Thanks for watching them Sandra!! I also love the analogy haha. Hopefully, one day I will provide you with a more stable yacht so you can enjoy the data science journey in style!
@struanclark5971 4 years ago
Your content is always top class Ken! As a beginner in this field you've taught me so much through your videos. Please keep them coming
@KenJee_ds 4 years ago
Thanks for the kind words Struan! I will definitely keep them coming!
@omjeeshukla5758 4 years ago
I don't understand who these people disliking the video are. If you can't support him, at least stop disliking. Someone is putting in the effort to make knowledge and the learning process easy, so what is your problem, dislikers?
@KenJee_ds 4 years ago
Thanks for looking out for me omjee! All work has its detractors though. I am always looking to improve, so constructive feedback is welcomed!
@sarthaksharma070 4 years ago
Great video dude, exactly what I was looking for. It's really great to see creators actually listening to the audience and working on it. Keep it up pal
@KenJee_ds 4 years ago
Glad it was what you were looking for!! Thanks for watching!
@paigec5017 4 years ago
This video came at such good timing! I just taught myself Python and started the Titanic project today but was feeling so unsure about everything! Thank you for your videos!!
@KenJee_ds 4 years ago
Great stuff! Thank you for watching them!
@emmanuelagyemang3738 2 years ago
How did you teach yourself how to code?
@mehsaniphysicsmathmatics2147 3 years ago
Thank you Ken, just now I finished one course that focused on Titanic survival, your attitude makes more sense for me.
@KenJee_ds 3 years ago
Awesome!! Thank you for following along!
@DataProfessor 4 years ago
Ken, Great video and great initiative! Sounds like fun, I also haven't done a Kaggle submission yet, will follow your path and do one soon.
@KenJee_ds 4 years ago
Let's definitely partner on one!
@salikmalik7631 4 years ago
@@KenJee_ds Yes. It'll be great to watch.
@DataProfessor 4 years ago
@@KenJee_ds Yes, let's definitely do that 😃
@lucrieffel5018 4 years ago
This video was extremely helpful, I have been searching the internet for a video that would walk me through this exact project! Your videos are the best, keep up the good work!
@KenJee_ds 4 years ago
Excellent! Glad it was what you were looking for!
@nikhilatluri1569 4 years ago
Thank you, Ken Jee, for spending your time during this lockdown educating youngsters like us
@KenJee_ds 4 years ago
Glad I could help!
@2ash94 1 year ago
Wow this is a gold mine! Can't believe you went through all that work! Looking through all this, it seems like to become a great data scientist, it's not just about the skill. It is about intelligence and your ability to understand and see things that aren't clear to the normal human being. I have a fairly normal IQ and I am currently wondering if I should continue building my skills in order to become a data scientist.
@KenJee_ds 1 year ago
I don't think you have to have a high IQ. You can learn to ask the right questions and create frameworks for yourself. I could not have done the analysis in the same way when I started. I am certain you can learn to approach the problem in the same way I did!
@Om-id1qr 2 years ago
I'd like to say that I discovered a gem of a channel today.
@KenJee_ds 2 years ago
Makes me really happy to hear!
@adamploof3528 4 years ago
Looking forward to more videos like this. It's incredibly helpful to get an experienced viewpoint on how to think about and dissect these sorts of problems.
@KenJee_ds 4 years ago
Glad you found it helpful! Thanks for watching Adam!
@communicationvast9949 2 years ago
Fantastic video, my friend. I started this project in RStudio, ran into some walls, and got extremely frustrated. Listening to your process is extremely helpful. Thanks for the upload.
@KenJee_ds 2 years ago
Thanks for watching!! Really glad to hear it was helpful
@moghegaurav 4 years ago
Love your videos, Ken. They are no-nonsense and stick to just DS. Your content is well made up and your voice is clear. Thanks for sharing your knowledge. I am sure with such quality content you will soon hit 100k subscribers and more.
@KenJee_ds 4 years ago
Thanks for the kind words and for watching my videos!
@kbillotta 4 years ago
Thanks Ken... I just got my physics degree and I want to become a data scientist. Your videos are helping a lot! Thanks
@KenJee_ds 4 years ago
That's what I like to hear! Thanks for watching!
@denizbalkaya8356 4 years ago
Hi Ken....Deniz is speaking from Turkey! Your videos are helping me a lot! You force me to keep up :)
@KenJee_ds 4 years ago
Glad to hear they are helping! Thank you for watching!
@fablab21 4 years ago
Since you made a confession at the beginning, Imma hit you with one myself: I've been trying to study DS consistently for a year and a half and bruh... I find it incredibly frustrating. I do not feel particularly smart enough to do projects on my own but I really like your content, so I will stick around. 😬
@KenJee_ds 4 years ago
Confessions are important! I am confident you can do it. I think you would actually be quite surprised at the progress you've made. I've come a long way myself, and even now I still have impostor syndrome or feel like I don't know as much as I should. I would watch my "the data scientist's secret" video, I think it may give you a little boost! kzbin.info/www/bejne/jnilmpqgi95le9E
@ramonsantiago4573 4 years ago
IMO it's unlikely that you're not smart enough to learn this stuff; it's probably the way you go about learning it. You need to spend a lot of time on the basics and have a really good understanding of Python. It's hard... I personally kept trying to jump ahead and go through concepts as fast as possible but it didn't really work. However, now that I've been studying at a slower pace everything is starting to make sense, and I managed to complete a few ML projects completely by myself. A really good slow-paced course that teaches the majority of the basics was "Python for Data Science and Machine Learning Bootcamp" by Jose Portilla. Good luck!
@SimonYung-g2g 7 months ago
Thanks! It was insightful and to the point. A lecturer pointed to this website. I'm just starting off but haven't put my learnings together the way this video shows. Now I'm aware I have more self-teaching to do
@anurekha137 4 years ago
I am glad that I came across your channel. Always wanted to try the Titanic dataset on Kaggle but didn't. Now I'm gonna try it. Thanks.
@KenJee_ds 4 years ago
That is one of my favorite things to hear! It makes me really happy that my video helped you get started!
@hugochung9909 4 years ago
I've been following your videos for a while now and making my way through all the microcourses on Kaggle. This is the exact video I was looking for to begin the next stage of learning by diving into some data science projects. Top content and keep up the great work Ken!
@KenJee_ds 4 years ago
Thanks for the kind words! This is exactly what I like to hear haha. Glad you found it helpful!
@ashikka5902 4 years ago
Thank you Ken! Doing this first thing in the morning tomorrow!
@KenJee_ds 4 years ago
I hope it helps!!
@wasimraja2980 4 years ago
Done ?
@hendrywijaya1017 2 years ago
Ken, about the project planning: I think the histograms and boxplots should be placed after the missing-data step. So here's the plan order from the top: understand the type of data, value counts, missing data, histograms and boxplots. Then continue with the steps you made, from correlation analysis and exploring interesting facts through to scaling.
@josefftan1203 4 years ago
Aw, kaggle series here we goooo ♥️
@KenJee_ds 4 years ago
Enjoy!
@ahmedhassan9379 2 years ago
Thanks so much, I feel happy that I could understand 90% of the content. Months ago I didn't know a thing!
@KenJee_ds 2 years ago
Amazing!!
@justinhuang8034 4 years ago
Love your content man! Keep it up 100k subs is just around the corner!
@KenJee_ds 4 years ago
Thanks Justin! Glad to hear the content has been useful to you!
@aimenbaig6201 4 years ago
You are my absolute guide to data science. THANK YOU KEN
@KenJee_ds 4 years ago
Thanks for the kind words Aimen!! Glad the videos have been helpful!
@zahinnazhan7200 4 years ago
This is a great walkthrough for a beginner like me. Thanks Ken Jee
@KenJee_ds 4 years ago
Glad it was helpful Zahin!
@DatascienceConcepts 4 years ago
Nice insights Ken Jee. In fact I remember working with this dataset in my early days of ML :)
@KenJee_ds 4 years ago
Awesome! I definitely think this dataset is a great starting point. It was even helpful for me to go back and review some of the basics!
@MonaChangizi 4 months ago
Thank you for this helpful video! I'm a real beginner in machine learning but I love solving problems like this, and your content helps me on this journey. 😊
@AdityaKumar-cj2ms 4 years ago
It was a very insightful explanation of this project, really liked it. Also, at cell [5], if you execute training.describe(include = "all"), it will also give you the most frequent value for every categorical variable, which I think can be really helpful.
@KenJee_ds 4 years ago
I actually didn't know that! Thank you for sharing!
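A minimal sketch of the tip above, assuming a pandas DataFrame named `training` as in the comment. The toy rows here are made up for illustration and are not the real Titanic data.

```python
import pandas as pd

# Toy stand-in for the Titanic training frame (values are invented).
training = pd.DataFrame({
    "Age": [22.0, 38.0, 26.0, 35.0],
    "Sex": ["male", "female", "female", "male"],
    "Embarked": ["S", "C", "S", "S"],
})

# Default describe() summarizes only the numeric columns.
numeric_summary = training.describe()

# include="all" adds count / unique / top / freq rows for the
# categorical columns; "top" is the most frequent value.
full_summary = training.describe(include="all")

print(full_summary.loc[["unique", "top", "freq"], ["Sex", "Embarked"]])
```

For numeric columns the `unique`, `top` and `freq` rows are filled with NaN, and for categorical columns the mean/std/quantile rows are NaN, so the combined table is wider but sparser than the default one.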
@henriquebonacelli2981 4 years ago
Man, great video! I'm starting on data science and this hands-on project explanation was super helpful!
@KenJee_ds 4 years ago
Glad to hear it was helpful! Thank you for watching!
@fahadreda3060 4 years ago
Thanks Ken, I was waiting for this video , Good Luck
@KenJee_ds 4 years ago
I hope you enjoy it Fahad!
@abdelrahmanashraf7636 2 years ago
Thanks a lot for this video. I had been learning a lot of things and didn't know how to tie all the ropes together. This video was made for that. Thanks a lot Ken Jee :)
@KenJee_ds 2 years ago
Thanks for checking it out!
@augustthenerd4213 2 years ago
Thanks for the video! I have some DS experience but it was very helpful to see how an expert would approach a Kaggle problem.
@KenJee_ds 2 years ago
Thanks for watching!
@chinmaygondhalekar2591 4 years ago
Just the notification I was waiting for thanks man 👍
@KenJee_ds 4 years ago
I hope you enjoy!
@MrBlack-cv8qn 3 years ago
Huge thanks from beginner DS switching from mechanical engineering!
@KenJee_ds 3 years ago
Thanks for watching! Glad to hear it was helpful!
@RichardOnData 4 years ago
Loving this video and the thumbnail dude!
@KenJee_ds 4 years ago
Thanks for noticing the thumbnail Richard! Would love to collab at some point if you're interested!
@RichardOnData 4 years ago
@@KenJee_ds Absolutely! My email is richardondata@gmail.com - I have a number of items on my backlog of videos that I'd love to cover in the future as I'm sure you do too, and some of them I think would make total sense! I'll drop you an email in a day or two myself.
@arthurmlcc 4 years ago
Keep up the great work you've been doing on this channel Ken, it's really helping us beginners.
@KenJee_ds 4 years ago
I absolutely will! Thanks for watching!
@mustafamegahed7873 4 years ago
Great job! Thank you so much! Sadly, I have some work at college and couldn't finish the video but I will definitely come back to it hopefully next week.
@KenJee_ds 4 years ago
No problem! It is there for you to learn at your own pace!
@Mario-ox5dm 4 years ago
I sense a rising Kaggle Grandmaster in the future!
@KenJee_ds 4 years ago
Haha I don't know about that! Long road ahead
@anoopashware9539 3 years ago
Thank you sir for making this video. I can't explain in words how much information is in this video, which is really helpful for me in becoming a good data scientist. Thank you so much
@KenJee_ds 3 years ago
Really glad to hear this video helped!
@mimikoko4299 4 years ago
You have the best data science channel, I love you
@KenJee_ds 4 years ago
Thank you Mimi!
@dakadoodle 4 years ago
Literally was looking at this today
@KenJee_ds 4 years ago
Awesome!
@saurabhjoshi4887 4 years ago
Hi Ken, great video. I just completed your 7-part data science from the beginning series. I am a beginner in data science and your video helped me a lot. Thanks 😊
@KenJee_ds 4 years ago
Thanks for working through the whole project series! I hope that this video helps you as well!
@daedalusdreamjournal5925 4 years ago
Hello there :) I haven't watched the full video yet, but there's a reason for this and it is linked to a suggestion I'd like to propose to you for similar videos in the future: Despite being very VERY green in this, I decided to have a first go at this all by myself ... and boy was and is it still frustrating :P The reason behind this was that I wanted to try a first attempt without a guiding hand. Once I finished my first model, I quickly realized that there were tons of ways where I blundered like a total noob ... which is actually totally fine :) And despite the frustration, it felt like I gathered valuable experience from this. And it is only now that I am starting to watch this video ... but only bit by bit, as I want to try to do as much by myself as possible (mistakes be damned, since they are being made at home where they won't hurt anyone and where I can learn safely from the experience). SO my suggestion is this: Could future similar videos be split into several parts? Or, at the very least, could you timestamp the different sections of your handling of a particular problem? I feel like it could be very valuable, especially for very recent newcomers like me. Anyways, thanks a ton for your videos, very much appreciated! (Especially some of the code where you use apply and lambda functions to handle data transformations, this is definitely something that will be useful for me in the near and long future! :) Signed: A total newbie at this.
@KenJee_ds 4 years ago
This is a great idea! I think I will try the time stamp portion for the next one. I would also recommend my project from scratch series: kzbin.info/aero/PL2zq7klxX5ASFejJj80ob9ZAnBHdz5O1t . I broke this one into each phase of the data science lifecycle. I think your approach is really great though! I highly recommend that for other people going through this.
@sauravsahay8803 2 years ago
I keep getting tired and demotivated and I keep picking myself up to learn this :(
@bianchialex 4 years ago
Just came here to see what you got. I used random forest and got .76 on my first try and then a little tuning got it to .77. I think I could make it better so I will continue to play around. I got to the point in my course curriculum where, mid-lecture, I said "this is more advanced than I need to get started on beginner projects" and instantly hopped off to do Titanic. I had it all worked up in my head to be some super hard task but it turned out to be relatively painless! I am going to do a couple of other smaller projects and then try something of my own, probably using YouTube data because I am a massive geek for the algorithm.
@KenJee_ds 4 years ago
I think the smaller projects are a good idea! All about building some momentum!
@arick2050 4 years ago
Super informative, thanks Ken!
@KenJee_ds 4 years ago
Thanks for watching Aric!!
@albertosei3558 1 year ago
I will try this very soon. Bookmarking this
@KenJee_ds 1 year ago
💪
@ImportData1 4 years ago
Learned something new - VotingClassifier!
@KenJee_ds 4 years ago
Awesome! Yeah, it is super useful and easy to use! Next time I will probably experiment more with some pipelines to clean up the feature engineering a bit!
@ImportData1 4 years ago
@@KenJee_ds I find the feature engineering/selection process the toughest. Sometimes you think you engineered features well enough, but the model accuracy doesn't necessarily resonate. Would love to see how you experiment with pipelines!
@KenJee_ds 4 years ago
@@ImportData1 Yep! This is definitely the case where I could have done more!
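For readers new to the VotingClassifier mentioned in this thread, here is a minimal sketch using scikit-learn. The synthetic data and the three base models are assumptions chosen for illustration; they are not the exact models or features from the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic binary-classification data standing in for engineered features.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)

voting_clf = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("knn", KNeighborsClassifier()),
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ],
    # "soft" averages predicted probabilities across the models;
    # "hard" would take a majority vote of the predicted labels.
    voting="soft",
)

# Evaluate the ensemble the same way as a single model.
scores = cross_val_score(voting_clf, X, y, cv=5)
print(scores.mean())
```

Soft voting needs every base model to support `predict_proba`, which all three models above do.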
@s8x. 9 months ago
thanks for this video. Just started this problem and realized I have no idea what I'm doing
@tomasagustin2243 4 years ago
Amazing! I learn a lot with your videos, thanks for sharing your knowledge, hug from Argentina!
@KenJee_ds 4 years ago
Thank you for watching! I would love to visit Argentina some day!
@tomasagustin2243 4 years ago
Hope you come! There are a lot of beautiful people here and a lot of parties hahahaha
@prabirbiswas440 4 years ago
Wow, what an in-depth analysis. You really put a lot of effort into this. This is my first try in Kaggle too. After spending this much time, I wonder how much time it will take for even tougher data. I also checked the House Rent Competition. It has 81 features. How can we do such a detailed analysis on all the features? Not sure how real-world ML problems are solved where they might have 100 or even more features. I am really excited to know more :)
@KenJee_ds 4 years ago
Thanks for watching! I will be doing the housing dataset next, so stay tuned!!
@alyona1311 2 years ago
I learned so much from your video, thank you!
@KenJee_ds 2 years ago
Amazing! Thank you for watching!
@eminalopatyuk6812 1 month ago
Thank you so much for your tutorial, Ken.
@AIPlayerrrr 4 years ago
I’d be super interested in seeing you competing in a real Kaggle Competition.
@KenJee_ds 4 years ago
I will likely be trying one in a few months! Stay tuned!
@AIPlayerrrr 4 years ago
Ken Jee great! I am excited
@Gamma3 4 years ago
Me too! Great channel
@sadiakamal6866 4 years ago
Great job. Please do these sorts of videos more often!
@KenJee_ds 4 years ago
Thank you for watching! Will definitely be trying to make more of these!
@bencantc2548 4 years ago
Amazing video! I hope you do a similar video on regression and clustering problems in the future!
@KenJee_ds 4 years ago
Thanks for watching! I plan to do a regression problem next!
@jfr543 4 years ago
This video is gold!
@KenJee_ds 4 years ago
Thanks for the kind words! I'm glad you found it helpful!
@jonasschroder7244 4 years ago
Great! Very inspiring and helpful!
@KenJee_ds 4 years ago
Thanks for watching Jonas!
@manasagrawal8365 3 years ago
thanks Ken this was really helpful
@KenJee_ds 3 years ago
Thanks for watching!
@kartikeyanamdev4471 4 years ago
First of all, thanks Ken for this. Secondly, a request: could you make a dedicated video on how data analysis can work in cricket? I know you may not know the game, but I really want to apply some data analysis to cricket, so I need your help, and it would be great if you made a video on it. Have a good day mate.
@KenJee_ds 4 years ago
I will try to bring someone in who is familiar with the game!
@ΧρυσόστομοςΠαπαδόπουλος-κ5π 4 years ago
I think it would be great if you could show how you would present this project in a markdown file in order to add it to your GitHub. Thanks for the great work!!!
@KenJee_ds 4 years ago
I will work on it!
@solaawodiya7360 4 years ago
Hi Ken, thanks for the help on learning about data science. I struggle a lot using Kaggle to learn Python. The user experience is quite intimidating for me compared to other platforms I've used, as there are times when, even though I understand the question, I get lost on how to answer and follow the steps.
@karlduckett 3 years ago
Really awesome! My only minor criticism is that in the first half of the video, most of the pivot tables and charts are displaying counts. When comparing counts between categories (i.e. survival rate by age) it really needs to display the percentage of that grouping... Sadly I'm too much of a noob to figure it out just yet :(
@KenJee_ds 3 years ago
Totally fair, thanks for the feedback!
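One way to show percentages instead of raw counts, as suggested in this thread, is the `normalize` argument of `pd.crosstab`. This is a minimal sketch on invented rows; the column names `Pclass` and `Survived` follow the Titanic dataset, but the values are not the real data.

```python
import pandas as pd

# Invented rows using Titanic-style column names.
training = pd.DataFrame({
    "Pclass":   [1, 1, 1, 2, 2, 3, 3, 3, 3, 3],
    "Survived": [1, 1, 0, 1, 0, 0, 0, 0, 1, 0],
})

# Raw counts per class: large groups dominate visually.
counts = pd.crosstab(training["Pclass"], training["Survived"])

# normalize="index" divides each row by its row total, so every row
# sums to 1 and groups of different sizes become comparable.
rates = pd.crosstab(training["Pclass"], training["Survived"], normalize="index")

# Equivalent shortcut for a 0/1 target: the group mean is the survival rate.
survival_rate = training.groupby("Pclass")["Survived"].mean()

print(counts)
print(rates)
```

For a continuous variable such as age, the same idea applies after binning the values into groups first.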
@moajjem04 4 years ago
This video is a great help!
@KenJee_ds 4 years ago
Glad to hear! Thank you for watching!
@JBB685 3 years ago
Would you consider doing one for the linear regression example you suggested in your 3 beginners' projects? It's the Ames housing prices project
@mohithedaoo6968 4 years ago
This was much needed... Thank you very much!!
@KenJee_ds 4 years ago
Happy I could help! Thank you for watching!
@imakonkonvicted 4 years ago
Thanks! I will try to do this alongside your video! :D
@KenJee_ds 4 years ago
Awesome! Would love to hear how it goes!
@dhristovaddx 4 years ago
Thank you for the great video! It's very helpful! ^_^
@KenJee_ds 4 years ago
Thanks for watching! Glad it was helpful!
@dxzgamingtricks5938 2 years ago
you are a genius!!!!
@KenJee_ds 2 years ago
Thanks for watching!
@kefahelhelou9418 1 year ago
Thanks for the great efforts
@KenJee_ds 1 year ago
Thanks for watching!
@NivenShado 4 years ago
Thank you for this. This was my dabble in data science and I may not have understood all of what you did, but I'm happy I tried it. Where should I go from here? I am thinking about taking the data science bootcamp course on Udemy.
@KenJee_ds 4 years ago
I'm happy you tried it as well! I recommend the free micro courses on kaggle.com actually. They are a great way to continue what you started with this walk through!
@NivenShado 4 years ago
@@KenJee_ds Thanks! I'll do that. I also haven't submitted the walk through as it is 99% your code - and am not sure if that is okay to do or if I am supposed to do that. The other 1% was me finding the proportion of how many people named William survived on the boat since there seemed to be a lot of Williams on the Titanic lol
@KenJee_ds 4 years ago
@@NivenShado As long as you don't make it public, I think it is generally fine to submit your results to see what score you get. If you do make it public, you want to make sure you either fork the notebook, or give credit at the top and through the workbook for the code portions you used. I hope this helps!
@AlexKite68 3 years ago
Thank you for this great video! I've already subscribed to your channel and am digging through it for lots of DS insights )) But please improve the audio quality in future videos: the background noises are really frustrating, and the background music seems to be a little bit loud. But again, you're making a great resource that is very useful for data science beginners like me!
@KenJee_ds 3 years ago
Thanks for watching! I have adjusted the music in the newer videos
@jmiller1095 3 years ago
Ken, I'm a long time listener, first time caller :) This is a terrific video ... and I have one question! At around 22:00 you clearly tell us that concatenating the training and test datasets and then pre-processing them all together is NOT the way it should be done in the real world (the real-world way, as I understood it: fit the encoder and scaler on the training dataset, then transform the test dataset using the encoder and scaler fitted only on the training dataset). So, do you have a video (or do you know of one) similar to this one but which demos the real-world way of doing business?
@KenJee_ds 3 years ago
Good question! I don't have one on my channel, but I think Nicholas Renotte probably has a tutorial on his where he does it correctly. There are plenty of ways to do it correctly. You can create functions that label the data in the test set in the same way as you did in the training set. More specifically, you can use the sklearn pipeline. If you go to the Kaggle kernels, I expect there will be quite a few that do it correctly. Thanks for watching!
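A minimal sketch of the train-only fitting described in this thread, using a scikit-learn `Pipeline` with a `ColumnTransformer`. The toy rows, the chosen columns and the logistic regression model are assumptions for illustration, not the notebook's actual setup.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Invented rows with Titanic-style columns; test has no target column.
train = pd.DataFrame({
    "Age": [22, 38, 26, 35, 54, 2],
    "Fare": [7.25, 71.28, 7.92, 53.10, 51.86, 21.07],
    "Sex": ["male", "female", "female", "female", "male", "male"],
    "Embarked": ["S", "C", "S", "S", "S", "Q"],
    "Survived": [0, 1, 1, 1, 0, 0],
})
test = pd.DataFrame({
    "Age": [30, 14],
    "Fare": [8.05, 30.07],
    "Sex": ["male", "female"],
    "Embarked": ["S", "C"],
})
features = ["Age", "Fare", "Sex", "Embarked"]

preprocess = ColumnTransformer([
    ("num", StandardScaler(), ["Age", "Fare"]),
    # handle_unknown="ignore" encodes categories unseen in training as all zeros.
    ("cat", OneHotEncoder(handle_unknown="ignore"), ["Sex", "Embarked"]),
])
model = Pipeline([("preprocess", preprocess), ("clf", LogisticRegression())])

# fit() learns the scaling statistics and the category list from train only.
model.fit(train[features], train["Survived"])

# predict() re-applies those learned transforms to test without refitting,
# so no information from the test rows leaks into preprocessing.
predictions = model.predict(test[features])
print(predictions)
```

Because the preprocessing lives inside the pipeline, the same object can be passed to cross-validation and the transforms are refitted on each training fold automatically.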
@hemantgautam1 4 years ago
Hi Ken, please create a separate playlist for Kaggle videos. 🙂
@KenJee_ds 4 years ago
Will do!
@OnlineGreg 2 years ago
13:27 Why exactly do you choose 'Ticket' for values in the pivot table? I don't understand that
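A possible explanation, sketched under the assumption that the pivot table at that timestamp uses `aggfunc="count"`: a count only tallies non-null entries, so any column with no missing values gives the same table, and `Ticket` is just a convenient one. The rows below are invented for illustration.

```python
import pandas as pd

# Invented rows; both Ticket and Name have no missing values here.
training = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 0, 1],
    "Pclass":   [3, 1, 3, 3, 2, 1],
    "Ticket":   ["A/5 21171", "PC 17599", "STON/O2", "373450", "244373", "113803"],
    "Name":     ["a", "b", "c", "d", "e", "f"],
})

# With aggfunc="count" the values column is only something to count.
by_ticket = pd.pivot_table(
    training, index="Survived", columns="Pclass",
    values="Ticket", aggfunc="count", fill_value=0,
)
by_name = pd.pivot_table(
    training, index="Survived", columns="Pclass",
    values="Name", aggfunc="count", fill_value=0,
)

print(by_ticket)
print(by_ticket.equals(by_name))
```

A column that does contain missing values would give smaller counts, since `count` skips nulls, which is one reason to pick a fully populated column.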
@deepakshiarora835 4 years ago
take a drink every time Ken says actually.
@KenJee_ds 4 years ago
Oof, something I'm working on improving (actually) haha
@deepakshiarora835 4 years ago
@@KenJee_ds you're so humble (actually).
@velevki 1 year ago
I am trying to understand the one-hot encoder method you suggested around 22:03, applying this method to real-world datasets. I ran into this issue before and had to merge train and test data sets to be able to apply a one-hot encoder. What is the solution to the issue of not having the same columns when we apply a one-hot encoder to train and test sets separately? I don't know how to approach that.
@KenJee_ds 1 year ago
You can create a pipeline to create the same columns in the dataset. I do that better in this tutorial here: kzbin.info/www/bejne/hIK0ZHeImb10jqc&ab_channel=KenJee
@adaptronankidu2611 1 year ago
@@KenJee_ds Thank you!
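Besides a pipeline, a lightweight way to handle the mismatched-columns problem raised in this thread is to encode each set separately and then reindex the test columns to match the training columns. This is a sketch on invented data, not the method from the linked tutorial.

```python
import pandas as pd

train = pd.DataFrame({"Embarked": ["S", "C", "Q", "S"]})
test = pd.DataFrame({"Embarked": ["S", "S", "C"]})  # no "Q" in test

train_dummies = pd.get_dummies(train, columns=["Embarked"])
test_dummies = pd.get_dummies(test, columns=["Embarked"])

# Force the test frame to have exactly the training columns, in the same
# order. Categories missing from test become all-zero columns; categories
# that never appeared in training are dropped.
test_aligned = test_dummies.reindex(columns=train_dummies.columns, fill_value=0)

print(list(test_dummies.columns))
print(list(test_aligned.columns))
```

The training columns act as the single source of truth here, so nothing about the test set influences how the training data is encoded.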
@ultimategohan2190 1 year ago
What an in-depth analysis! By the way, how many possible approaches are there to this dataset?
@KenJee_ds 1 year ago
Thanks for watching! As it turns out, there are really infinite ways you can approach a problem like this haha.
@xdominator365 4 years ago
Thank You Ken for providing this video...your new subscriber from India...🧡✌
@KenJee_ds 4 years ago
Awesome! Thank you for subscribing! I hope my other videos are helpful as well!
@nikhilatluri1569 4 years ago
@@KenJee_ds Yes, for sure. I've watched almost all your videos and got a lot of information for building my career
@xdominator365 4 years ago
@@KenJee_ds just finished my data science beginners playlist...🙂✌... after this I'm going to start my first project for beginners.... thank you so much, Ken, earlier I was going down the wrong path, I will be your fan...🧡🧡🧡 I want to get in touch with you please sir...
@surajkumarmaurya8088 4 years ago
Thanks a lot Sir, this helped me a lot to clear my doubts.
@KenJee_ds 4 years ago
Great!! Thanks for watching!
@nailujretuas2093 2 years ago
very helpful, thank you. comment for the algorithm.
@KenJee_ds 2 years ago
Thank you!
@johnyanergos 1 year ago
Actually, it seems Capt does not refer to the captain of the ship. After some internet searching I found out that Capt refers to a different captain who was on board the Titanic as a passenger and who unfortunately lost his life in the sinking.
@KenJee_ds 1 year ago
Cool! Did not know that!
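For context, titles like Capt can be pulled out of the `Name` column with a small `apply` and lambda. This is a sketch assuming names follow the "Surname, Title. Given names" pattern; the three sample rows and the column name `name_title` are illustrative choices.

```python
import pandas as pd

# Sample names in the "Surname, Title. Given names" format.
training = pd.DataFrame({"Name": [
    "Braund, Mr. Owen Harris",
    "Cumings, Mrs. John Bradley (Florence Briggs Thayer)",
    "Crosby, Capt. Edward Gifford",
]})

# Take the text after the first comma, then the part before the first
# period, and strip the surrounding whitespace.
training["name_title"] = training["Name"].apply(
    lambda full_name: full_name.split(",")[1].split(".")[0].strip()
)

print(training["name_title"].value_counts())
```

Rare titles such as Capt end up with very few rows, so they are often grouped into an "other" bucket before modeling.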
@elsins9790 4 years ago
Thank you for this video and your explanation of how you approach data science problems. I was just able to reach the baseline from the Titanic tutorial with my own approach using XGB and GridSearchCV. Did you try a stacked denoising autoencoder in your projects, and how did it work out? It is kind of like an automatic unsupervised learning approach that can be fed into a neural network. Your channel is golden! Keep it up and stay healthy!
@KenJee_ds 4 years ago
Thanks for watching! For this one, I didn't stack any models. That would be a good way to go a step past what I did though!
@DarkPrince1996 3 years ago
You did a great job explaining your approach to solving the task at hand and walking us through the process, so I want to know: what would be the next steps for someone wanting to use this competition to learn data science? I don't have a detailed understanding of all the algorithms that you used in this competition, so would it be best to pick the one that produced the best score and learn how to tune that particular model to get a better score, or would it be best to transfer your process to another beginner competition altogether to build a better understanding of the complete data science process as a whole?
@KenJee_ds 3 years ago
I think this is great for learning how to tune the algorithms and seeing what results you get with different ones. It is also a good one for practicing feature engineering like I did with some of the seats etc. I think transferring things to another competition would be a good idea!
@DarkPrince1996 3 years ago
@@KenJee_ds appreciate your advice and I will definitely do that.
@gisinatrag 2 months ago
Thank you for this video, it's very useful for understanding how a beginner should face a real problem. After your video I have decided to rewrite my notebook using your way of planning the steps. I am trying to go over 80% without a NN; my best score is 0.78468, top 14%. What would be a good score for a beginner?
@shaikhkashif9973 2 years ago
You did feature engineering first, then removed outliers? 🤔
@D4nte-RN 11 months ago
Thanks for your great tutorial on YouTube! I have a question regarding this analysis. At 24:40 you start talking about cross validation. I'm not sure I clearly understand your code here. You're running cross validation on the X_train/y_train data set? Is this the correct approach, or should we do this on the whole data set X/y?
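A minimal sketch of cross-validating on the training split only, assuming scikit-learn's `cross_val_score` and synthetic data. In the Kaggle setup the test file has no `Survived` labels, so cross validation can only run on the labeled training rows; the held-out split below mimics that by staying untouched until the end.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic labeled data standing in for the prepared Titanic features.
X, y = make_classification(n_samples=400, n_features=6, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LogisticRegression(max_iter=1000)

# 5-fold CV inside the training split: each fold takes a turn as the
# validation set, giving one accuracy value per fold.
cv_scores = cross_val_score(model, X_train, y_train, cv=5)
print(cv_scores, cv_scores.mean())

# The held-out split is used once, after model selection is finished.
model.fit(X_train, y_train)
holdout_accuracy = model.score(X_test, y_test)
print(holdout_accuracy)
```

The printed array holds one score per fold, and for a scikit-learn classifier the default score is accuracy, so its mean is the usual headline number for comparing models.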
@MohamedKhaled-el7vv 4 years ago
I have a question @24:15 after you made some changes for the entire dataset you split the data to train and test so doesn't this cause data leakage?
@KenJee_ds 4 years ago
This would in fact cause data leakage. This is one of the things that separates kaggle from real data science problems. In this scenario, we want to maximize performance using all the data available to us.
@gupnir 4 years ago
Hi Ken, your videos are really helpful for beginners like me. Can you do a similar walk-through video for House Prices problem as well.... thanks in advance.
@KenJee_ds 4 years ago
I plan to! Thanks for watching Nirmit!
@MarsLanding91 4 years ago
Thanks Ken!
@KenJee_ds 4 years ago
Thanks for watching!
@amrelshabasy1183 2 years ago
Thanks, Ken, for this great video. Can you please explain how you determined that the XGBoost model is overfitting?
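One common way to check for overfitting, sketched here as an assumption rather than the method used in the video, is to compare training accuracy with cross-validated accuracy. scikit-learn's `GradientBoostingClassifier` is swapped in for XGBoost so the example only needs scikit-learn, and the data is synthetic.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_validate

# Noisy synthetic data: flip_y randomly flips a share of the labels.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=3, flip_y=0.2, random_state=0
)

# A deliberately flexible model (many deep trees) that can memorize noise.
model = GradientBoostingClassifier(n_estimators=300, max_depth=5, random_state=0)

# return_train_score=True reports accuracy on the training folds as well
# as on the held-out validation folds.
results = cross_validate(model, X, y, cv=5, return_train_score=True)

train_acc = results["train_score"].mean()
valid_acc = results["test_score"].mean()
print(f"train={train_acc:.3f} validation={valid_acc:.3f} gap={train_acc - valid_acc:.3f}")
```

A training score far above the validation score suggests the model is fitting noise. On Kaggle, a cross-validation score well above the leaderboard score is a similar warning sign.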
@nateMan5370 4 years ago
Hello Ken, really cool video. I have a question for you. You gave us all the important variables that contributed to the model at around 32:30. I was wondering, with those variables in mind, are we able to dig a little deeper and see which actual values contributed most to the model? If that sounds confusing, here is an example. Age is the second most significant variable in the model. If we could program what I just asked, the model would "spit out" that ages 15-18 (I do not know if that's true) were the most significant and contributed most to the model. This is my first data science project, so I want to apologize in advance if my words do not sound "data sciency" enough.
@KenJee_ds 4 years ago
Hi Nate - Thanks for watching! I'm glad you're starting your journey with my material! We aren't able to isolate the specific range in a continuous variable and split out a certain range like you had illustrated with the current method. If we wanted, we could make a variable for different age ranges and see what the impact would be from running those through the model though!
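The age-range idea from this reply could be sketched with `pd.cut`. The bin edges, labels and rows below are arbitrary assumptions for illustration and are not taken from the notebook.

```python
import pandas as pd

# Invented rows with Titanic-style columns.
training = pd.DataFrame({
    "Age": [4, 16, 17, 25, 40, 62, 8, 30],
    "Survived": [1, 1, 0, 0, 1, 0, 1, 0],
})

# Hypothetical age ranges; pd.cut intervals include the right edge,
# so 12 falls in "child" and 18 falls in "teen".
bins = [0, 12, 18, 40, 60, 120]
labels = ["child", "teen", "adult", "middle_aged", "senior"]
training["age_group"] = pd.cut(training["Age"], bins=bins, labels=labels)

# Survival rate per age range (empty groups are left out).
survival_by_group = training.groupby("age_group", observed=True)["Survived"].mean()

# One indicator column per range, ready to feed into a model so that
# feature importance can be read per age range instead of for Age overall.
age_dummies = pd.get_dummies(training["age_group"], prefix="age")

print(survival_by_group)
```

Running the indicator columns through the model then gives one importance value per range, which is the comparison asked about above.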
@nateMan5370 4 years ago
@@KenJee_ds Another question: I am using a gradient boosting model for the Titanic dataset. Thankfully I did not run into any errors, but I noticed that the model, specifically the r2, is very inconsistent. Every time I rerun the model, it gives me a different r2. Is that a normal thing to happen, and why does it happen?
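Run-to-run variation like this usually comes from randomness in the train/validation split and in the model itself. A minimal sketch, assuming scikit-learn and synthetic data, of pinning both with `random_state`; the helper `fit_and_score` is a hypothetical name introduced here. Note that for a scikit-learn classifier `.score()` reports accuracy, whereas R² is the default score for regressors.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_features=8, random_state=0)

def fit_and_score(seed):
    """Hypothetical helper: split, fit and score with one fixed seed."""
    # The seed fixes which rows land in the validation set...
    X_train, X_valid, y_train, y_valid = train_test_split(
        X, y, test_size=0.25, random_state=seed
    )
    # ...and the row subsampling done inside the boosting model.
    model = GradientBoostingClassifier(subsample=0.8, random_state=seed)
    model.fit(X_train, y_train)
    return model.score(X_valid, y_valid)  # accuracy for a classifier

first_run = fit_and_score(seed=42)
second_run = fit_and_score(seed=42)
print(first_run, second_run)
```

With the seed fixed, repeated runs give the same score. Leaving `random_state` unset draws a fresh split each run, which is the normal cause of the numbers moving around.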
@nateMan5370 4 years ago
Also, sorry for the barrage of questions, but from 25:00 to 28:00 you mention all the models you created, like logistic regression, tree, KNN, etc. Could you explain what the numbers in the array [] represent? I am thinking they are the r2, but I do not think that is correct.