Download Free Audio of Welcome to the lesson on exploratory data analysi... - Woord

Text Content or SSML code:

<speak> Welcome to the lesson on exploratory data analysis <break strength="weak"/>(EDA). <break strength="strong"/>In this lesson, <break strength="weak"/>you will perform exploratory data analysis on the data.<break strength="strong"/> Exploratory data analysis means<break strength="weak"/> examining the data closely to get deeper insights into it. <break strength="strong"/>With EDA, <break strength="weak"/>you gain insights <break strength="weak"/>that help you build a machine learning model<break strength="weak"/> that can make predictions on unseen data. <break strength="strong"/>We will analyze the data using statistical measures <break strength="weak"/>and other methods. <break strength="strong"/>EDA helps you both build and evaluate the model.<break strength="x-strong"/> First, <break strength="weak"/>you need to check for missing values. <break strength="x-strong"/> Use the is null function available in pandas to detect missing values. <break strength="strong"/>Missing values are null values. <break strength="strong"/>You can read them as n a n, <break strength="weak"/>which stands for not a number. <break strength="x-strong"/> The is null function checks every value in every column. <break strength="strong"/>We have eight columns in the data. <break strength="strong"/>If a value is null,<break strength="weak"/> is null returns True. <break strength="strong"/>Otherwise, it returns False. <break strength="x-strong"/> True indicates a missing value. <break strength="x-strong"/> The sum function then counts the True values in the is null output, <break strength="weak"/>which gives the number of missing values in each column. <break strength="strong"/>For the serial number column, the count is 0. <break strength="x-strong"/> That means there are no null values in that column. 
<break strength="strong"/>If a column has any null values, <break strength="weak"/>its count will be non-zero. <break strength="strong"/>Here, <break strength="weak"/>every column shows zero, <break strength="weak"/>so there are no null values in the data. <break strength="strong"/>That means there is no need to replace missing values.<break strength="x-strong"/> Next, <break strength="weak"/>you need to identify and remove the outliers. <break strength="strong"/>Our objective is to predict the chance of admission.<break strength="strong"/> It is a continuous output variable, <break strength="weak"/>not a discrete one. <break strength="strong"/>Since the output is continuous,<break strength="weak"/> this is a regression problem.<break strength="x-strong"/> </speak>
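
The missing-value check described in the narration can be sketched in pandas roughly as follows. This is a minimal illustration, not the lesson's actual notebook: the data frame is a small made-up stand-in, the column names are assumptions, and two values are deliberately left missing so that the counts are not all zero (the lesson's real data has no nulls).

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the admissions data; the column names and
# values are assumptions made for this sketch, not the lesson's data set.
df = pd.DataFrame({
    "Serial No.": [1, 2, 3, 4],
    "GRE Score": [337, 324, np.nan, 322],
    "CGPA": [9.65, 8.87, 8.00, np.nan],
    "Chance of Admit": [0.92, 0.76, 0.72, 0.80],
})

# isnull() returns a Boolean table: True where a value is missing, else False.
null_mask = df.isnull()

# sum() counts the True values in each column (True is treated as 1).
null_counts = null_mask.sum()
print(null_counts)

# Summing the per-column counts gives the total number of missing values.
total_missing = int(null_counts.sum())

# A float dtype is only a rough hint that the target is continuous,
# which is what makes this a regression problem rather than classification.
target_is_float = pd.api.types.is_float_dtype(df["Chance of Admit"])
```

A count of zero for a column means it has no missing values. Any non-zero count marks a column that would need its missing values replaced or dropped before modelling.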