Kaggle Titanic Survival Prediction Competition Part 1/2 - Exploratory Data Analysis

  20,551 views

Jason Chong


In this video, I walk through my solution and analysis for one of the most popular beginner competitions on Kaggle: the Titanic survival prediction competition. This video is part one of a two-part series.
Kaggle is a subsidiary of Google and an online community of data scientists and machine learning practitioners. On Kaggle, you can find many published datasets as well as data science and machine learning tutorials, but Kaggle is best known for its competitions.
The Titanic survival prediction competition is a great first competition: it introduces beginners not only to the Kaggle platform but also to the process behind an end-to-end machine learning project, from loading and reading datasets to building a fully functional predictive model.
The aim of this competition is to analyse how passenger features such as age, gender and ticket class correlate with survival, and then to train a machine learning model that predicts survival for unseen passengers. This is an example of a binary classification problem in machine learning, where each passenger is classified as either survived or did not survive.
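To make "binary classification" concrete, here is a minimal sketch using only the Python standard library. The passengers are made up for illustration and are not real Titanic records, and the one-rule `predict_survived` function is a hypothetical baseline, not the model built in this series. It only shows the shape of the problem: features go in, a 0 or 1 label comes out, and predictions are scored against known labels.

```python
# Made-up labelled passengers for illustration only; NOT real Titanic records.
# Each entry is (features, label), where label 1 = survived, 0 = did not survive.
LABELLED = [
    ({"Sex": "female", "Pclass": 1}, 1),
    ({"Sex": "male", "Pclass": 3}, 0),
    ({"Sex": "female", "Pclass": 3}, 0),
    ({"Sex": "male", "Pclass": 2}, 0),
    ({"Sex": "male", "Pclass": 1}, 1),
]


def predict_survived(passenger):
    """Hypothetical one-rule baseline: predict 1 for women, 0 otherwise."""
    return 1 if passenger["Sex"] == "female" else 0


def accuracy(examples, predict):
    """Share of examples whose predicted label matches the true label."""
    correct = sum(predict(features) == label for features, label in examples)
    return correct / len(examples)


baseline_accuracy = accuracy(LABELLED, predict_survived)
print(f"Baseline accuracy on the toy sample: {baseline_accuracy:.2f}")
```

A trained model replaces the hand-written rule with one learned from the training data, but it is evaluated in the same way.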
This video covers the exploratory data analysis (EDA) section of my notebook. EDA is the process of exploring our datasets and summarising the key characteristics and trends in our data, such as data types, distributions and correlations between numerical variables.
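As a rough illustration of the first EDA steps (counting missing values and summarising a numerical column), here is a small standard-library sketch. The rows are made up and are not real Titanic records; only the column names (`Survived`, `Pclass`, `Sex`, `Age`) follow the competition's training file. The helper names are hypothetical, and a notebook would more typically use a data analysis library for this.

```python
import csv
import io
from statistics import mean, median

# Made-up rows for illustration only; NOT real Titanic records.
# An empty Age cell stands for a missing value.
SAMPLE_CSV = """Survived,Pclass,Sex,Age
1,1,female,38
0,3,male,22
1,3,female,
0,2,male,54
1,2,female,4
0,3,male,
"""


def load_rows(text):
    """Parse CSV text into a list of dicts, one per passenger."""
    return list(csv.DictReader(io.StringIO(text)))


def missing_counts(rows):
    """Count empty cells per column."""
    counts = {column: 0 for column in rows[0]}
    for row in rows:
        for column, value in row.items():
            if value == "":
                counts[column] += 1
    return counts


def age_summary(rows):
    """Summary statistics for Age, skipping missing values."""
    ages = [float(row["Age"]) for row in rows if row["Age"] != ""]
    return {
        "count": len(ages),
        "mean": mean(ages),
        "median": median(ages),
        "min": min(ages),
        "max": max(ages),
    }


rows = load_rows(SAMPLE_CSV)
print(missing_counts(rows))
print(age_summary(rows))
```

Checking missing values before computing statistics matters because a column such as `Age` can only be summarised over the rows where it is present.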
I managed to gather three important insights as a result of the EDA process:
1. Female passengers were more likely to survive than male passengers
2. First-class passengers were more likely to survive than second-class and third-class passengers
3. Younger passengers, especially children, were more likely to survive than other passengers on the Titanic
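Insights like these come from comparing survival rates across groups. The sketch below shows one way to compute such rates with the standard library. The passengers are made up and are not real Titanic records, so the printed numbers say nothing about the actual dataset. The `age_band` helper and its cutoff of 16 are assumptions chosen purely for illustration.

```python
from collections import defaultdict

# Made-up passengers for illustration only; NOT real Titanic records.
PASSENGERS = [
    {"Survived": 1, "Pclass": 1, "Sex": "female", "Age": 38},
    {"Survived": 0, "Pclass": 3, "Sex": "male", "Age": 22},
    {"Survived": 1, "Pclass": 3, "Sex": "female", "Age": 26},
    {"Survived": 0, "Pclass": 2, "Sex": "male", "Age": 54},
    {"Survived": 1, "Pclass": 2, "Sex": "male", "Age": 4},
    {"Survived": 0, "Pclass": 3, "Sex": "female", "Age": 31},
]


def survival_rate_by(passengers, key):
    """Return {group: share of survivors}; key(passenger) names the group."""
    totals = defaultdict(int)
    survivors = defaultdict(int)
    for passenger in passengers:
        group = key(passenger)
        totals[group] += 1
        survivors[group] += passenger["Survived"]
    return {group: survivors[group] / totals[group] for group in totals}


def age_band(passenger, child_cutoff=16):
    """Hypothetical two-way split; the cutoff of 16 is an assumption."""
    return "child" if passenger["Age"] < child_cutoff else "adult"


by_sex = survival_rate_by(PASSENGERS, lambda p: p["Sex"])
by_class = survival_rate_by(PASSENGERS, lambda p: p["Pclass"])
by_age = survival_rate_by(PASSENGERS, age_band)

print(by_sex)
print(by_class)
print(by_age)
```

Passing the grouping rule as a function keeps one helper usable for categorical columns such as `Sex` and `Pclass` and for derived groups such as age bands.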
A lot of time has gone into preparing the solution notebook as well as this video. So, if you enjoyed it or found it helpful in your own learning, it would mean the world to me if you could like the video and subscribe to my channel.
If you have any questions, feel free to reach out to me. Happy learning!
Timestamps
00:00 - Introduction
05:06 - Import libraries
05:48 - Import and read data
08:20 - Data description
09:30 - Data types, missing data and summary statistics
12:35 - Feature analysis introduction
13:49 - Analyse categorical variables
19:26 - Analyse numerical variables
24:33 - Summary and conclusion
26:04 - Outro
Install Anaconda and Jupyter Notebook
www.anaconda.com/products/ind...
Kaggle Titanic Survival Prediction Competition
www.kaggle.com/c/titanic/over...
Link to my notebook on GitHub
github.com/chongjason914/kagg...
Follow me
Facebook - / chongjason914
Instagram - / chongjason914
Twitter - / chongjason914
Medium - / chongjason
LinkedIn - / chongjason914
#Kaggle #DataScience #MachineLearning

Comments: 31
@jeromeprovensal 3 months ago
Best tutorial on the Kaggle Titanic competition!
@SurajSingh-ff4ki 3 years ago
Thank you Jason, you really helped me create my very first training model.
@faith3220a 2 years ago
Thank you very much for the detailed explanations
@H99x2 2 years ago
This is one of the best guides of the Kaggle Titanic dataset on YT! And you have a very good way of explaining. Subscribed for more like this :)
@fairuzbackup72 3 years ago
Fantastic video! Really useful for beginners such as myself to understand the steps that are performed during a data analysis process :)
@l4dybu9 1 year ago
Thank you so much for this vid ✌🏻🙌🏻💖💖
@zozolovpuppies 3 years ago
Awesome video! Excited to see the next ep!
@JasonChong914 3 years ago
Thanks bb 🥺
@digigoliath 2 years ago
Awesome. TQVM!
@benjamindeworsop8348 3 years ago
Awesome video, your hypothesis-driven approach is a great way to make sense of the crazy amounts of data. Nice conclusion too - really ties it all together
@JasonChong914 3 years ago
Thanks Ben! ❤️
@akhilsoni729 3 years ago
Looking forward to more videos like this being uploaded. Loved it.
@JasonChong914 3 years ago
Thanks Akhil - really appreciate your kind words! 😊
@endernator 2 years ago
pls more content like this... sooo goood
@ai.simplified.. 3 years ago
I've already done the competition, but this will be very useful for any beginner like me. Thanks for sharing. I was already thinking about making a video about it.
@mchafe 3 years ago
Thanks, very educative
@solaawodiya7360 2 years ago
This was so amazing, Jason! Thanks for the direct, clear, and step-by-step guide through this project. I learnt a lot and I'm looking forward to learning more 👌
@JasonChong914 2 years ago
Thanks Sola - glad I was able to help! 😉
@justinzhang4648 1 year ago
Awesome
@ashish-blessings 2 years ago
You are simply awesome!
@JasonChong914 2 years ago
Aww thank you! 😊
@jpsiyyadri 2 years ago
Thanks Jason, u helped me a lot
@JasonChong914 2 years ago
You're most welcome, Jai. Happy I was able to help!
@makotao7218 1 year ago
What an amazing video! I'm starting to learn data science and this was great - honestly, SWE feels so boring after this haha
@franklynchidi4379 1 year ago
Great, it was direct... it's just that I'm new to it, but it was helpful, though I had some issues with the grid
@raylow6213 3 years ago
I find it useful when you explain the reason behind an occurrence, but did you come across any suspicious/interesting data that you might later explore on your own? Btw nice work!
@rohanshava5490 3 years ago
Good job bro :)
@JasonChong914 3 years ago
Thank you!
@pontus_qwerty 7 months ago
18:56: why is survival probability embarking from Q higher than from S when almost all passengers in Q are class 3? Are they most class 3 men embarking from Q as opposed to the class 3 women from S?
@mrreese2342 1 year ago
Is this dataset real? I mean, are the people in the dataset the real people that were on the Titanic?