xs = [i + 0.8 for i, _ in enumerate(titanic['who'])]
sns.barplot(data=titanic, x='who', y='survived')
plt.title('Survival Data for Men, Women and Children')
plt.xlabel('Men, Women and Children')
plt.ylabel('Survival Probability')
There are 177 null values in age, which can affect our results, so we have to fill them with an appropriate value.
# first we find the mean
titanic['age'].mean()
The mean is about 29, but we cannot fill all 177 missing values with 29 because many of them may belong to children. Let us check the median.
The median (28) is almost the same as the mean. We cannot fill all 177 missing values with 28 either, for the same reason. So we now check how many children there are.
# we see that the 'who' column tells us whether the entry is a 'man', 'woman' or 'child'.
# so we check the number of children here
The number of children is 83, so we cannot fill every NaN value with the median (28) or the mean (29). We can fill the NaN entries with the median only where the entry is an adult 'man' or 'woman'. Let us check the NaN entries in titanic['age'] corresponding to titanic['who'] == 'child'.
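The idea of filling missing ages per 'who' category rather than with one global value can be sketched as follows. This is only an illustration under stated assumptions: the small `toy` frame and the `age_filled` column are made up here and are not part of the notebook.

```python
import numpy as np
import pandas as pd

# Toy data standing in for the Titanic frame (values are invented for illustration).
toy = pd.DataFrame({
    "who": ["man", "woman", "child", "man", "woman", "man"],
    "age": [40.0, 30.0, 5.0, np.nan, np.nan, 20.0],
})

# How many missing ages fall in each 'who' category?
missing_by_who = toy.loc[toy["age"].isnull(), "who"].value_counts()

# Median age per category, broadcast back to every row of that category.
group_median = toy.groupby("who")["age"].transform("median")

# Fill only the missing ages; known ages stay untouched.
toy["age_filled"] = toy["age"].fillna(group_median)

print(missing_by_who)
print(toy)
```

Filling per category keeps an adult median from being assigned to a child, which is the concern raised above.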
The above plot and calculations show that First Class had a higher chance of survival than Third Class. After First Class, Second Class had the next-highest chance of survival. Third Class had the lowest chance of survival.
Both males and females had a higher chance of survival in the upper classes, as the data shows.
All the above plots show that a higher cabin class had a higher chance of survival.
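The class and sex comparison can be reproduced numerically with a grouped mean. The following is a minimal sketch on invented data: the `toy` frame is hypothetical, and the column names `pclass`, `sex` and `survived` are assumed to match the Titanic frame.

```python
import pandas as pd

# Toy data (invented); 'survived' is 1 for survived and 0 for died.
toy = pd.DataFrame({
    "pclass":   [1, 1, 1, 2, 2, 3, 3, 3],
    "sex":      ["female", "male", "male", "female", "male", "female", "male", "male"],
    "survived": [1, 1, 0, 1, 0, 0, 0, 1],
})

# The mean of a 0/1 column is the fraction of survivors in each group.
by_class = toy.groupby("pclass")["survived"].mean()

# Same idea, split by class and sex; unstack puts sex in the columns.
by_class_sex = toy.groupby(["pclass", "sex"])["survived"].mean().unstack()

print(by_class)
print(by_class_sex)
```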
titanic_fare.plot(kind='bar', figsize=(15, 10))
plt.ylabel('survival probability')
plt.xlabel('Female, Male and Fare distribution')
We see that the chance of survival in Class-3 is lower for all age and sex groups, except for males between 18 and 80 years, where Class-2 survival is higher than Class-3. Men of all age groups have higher survival in the upper class. Women of
The above graph again shows that first class had a higher chance of survival in all the above age groups.
age_groups = pd.cut(titanic['age'], [0, 20, 40, 60, 81])
titanic.groupby(age_groups).size().plot(kind='bar', stacked=True)
plt.title("Distribution of Age Groups", fontsize=14)
plt.ylabel('Count')
plt.xlabel('Age Group');
p = sns.violinplot(data=titanic, x='survived', y='age')
p.set(title='Survival by Age', xlabel='Survival', ylabel='Age Distribution', xticklabels=['Died', 'Survived']);
plt.show()
The above observations show that the age group between 20 and 40 had the highest chance of survival.
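A bar chart of group sizes shows how many passengers fall in each age band, which is not the same as the survival rate within the band. A small sketch that reports both side by side, using the same bin edges as the cell above, might look like this. The `toy` frame and its values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Toy data (invented); one passenger has a missing age.
toy = pd.DataFrame({
    "age":      [5, 18, 25, 35, 38, 45, 70, np.nan],
    "survived": [1, 0, 1, 1, 0, 0, 0, 1],
})

# Same bin edges as above: (0, 20], (20, 40], (40, 60], (60, 81].
toy["age_group"] = pd.cut(toy["age"], [0, 20, 40, 60, 81])

# count = passengers in the band, sum = survivors, mean = survival rate.
# Rows with a missing age have no band and are left out of the grouping.
summary = toy.groupby("age_group", observed=False)["survived"].agg(["count", "sum", "mean"])

print(summary)
```

Comparing the `sum` and `mean` columns separates "more survivors in absolute numbers" from "a higher chance of survival".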
Question: What is the survival chance for lonely passengers (those travelling without companions)?
# Draw a nested barplot to show survival for children and adults, split by parents
# (sns.factorplot was renamed catplot, and its size argument renamed height, in seaborn 0.9)
g = sns.catplot(x="child", y="survived", hue="parents", data=titanic, height=6, kind="bar", palette="muted")
g.set(title='Children Survival wrt Family');
g.set_axis_labels("Child or Adult", "Survival Probability")
g.set_xticklabels(["Adult", "Child"])
The above graph shows that children with parents had a higher chance of survival than children without parents or with nannies.
The above calculations show that lonely passengers had a lower chance of survival.
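One way to compare lonely passengers with the rest is sketched below. It assumes that "lonely" means no siblings, spouse, parents or children aboard, and that the frame has `sibsp` and `parch` columns holding those counts. The `toy` frame and the `travel` column are hypothetical and only serve the illustration.

```python
import numpy as np
import pandas as pd

# Toy data (invented). sibsp = siblings/spouses aboard, parch = parents/children aboard.
toy = pd.DataFrame({
    "sibsp":    [0, 1, 0, 0, 2, 0],
    "parch":    [0, 0, 2, 0, 1, 0],
    "survived": [0, 1, 1, 0, 1, 1],
})

# Assumption: a passenger is "alone" when no family members are aboard.
is_alone = (toy["sibsp"] + toy["parch"]) == 0
toy["travel"] = np.where(is_alone, "alone", "with family")

# Survival rate for each of the two groups.
rate = toy.groupby("travel")["survived"].mean()

print(rate)
```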
Based on the above calculations we can approximately say that:
1-Females had a higher chance of survival than males
2-First class passengers had a higher chance of survival than the lower classes (economic factor)
3-More passengers who paid higher fares survived (also an economic factor)
4-The age group between 20 and 40 had the highest survival chance
5-Lonely passengers had a lower survival chance than those travelling with companions
6-More children who were travelling without parents died
The above findings cannot be fully accurate, for several reasons. For example, we are missing a lot of age data (177 entries).
Also, we do not know which age range was considered a "child" in those days.
There are also 688 missing entries in the "deck" column, which can affect our finding that Class-1 passengers survived more often. We also do not know the locations of these decks; it is possible that the location of a deck influenced the chance of survival more than the class of the passengers did.
It is also possible that Class-3 passengers were housed in a location that made it difficult for them to survive, or that the iceberg struck near the location of the Class-3 passengers.
So these conclusions cannot be 100% correct, as there are many factors involved about which we have no information, and because there are so many missing values.
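The missing-value counts quoted above can be obtained per column with `isnull().sum()`. A minimal sketch on an invented `toy` frame (the column names mirror the ones discussed, the values do not come from the dataset):

```python
import numpy as np
import pandas as pd

# Toy data (invented) with gaps in 'age' and 'deck'.
toy = pd.DataFrame({
    "age":      [22.0, np.nan, 4.0, np.nan],
    "deck":     ["C", None, None, None],
    "survived": [0, 1, 1, 0],
})

# Number of missing entries in each column.
missing = toy.isnull().sum()

# Share of rows affected, useful for judging how much a column can be trusted.
share = missing / len(toy)

print(missing)
print(share)
```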