This audio was created by Woord's Text to Speech service by content creators from all around the world.
Text Content or SSML code:
<speak> Welcome to the lesson on exploratory data analysis (EDA). <break strength="strong"/>In this lesson, you will perform EDA on the data.<break strength="strong"/> Exploratory data analysis means examining the data to get deeper insights into it. <break strength="strong"/>With EDA, <break strength="weak"/>you gain insights that help you build a machine learning model that can make predictions on unseen data. <break strength="strong"/>We will analyze the data using statistical measures and other methods. <break strength="strong"/>EDA helps you both build and evaluate the model.<break strength="x-strong"/> First, you need to check for missing values. <break strength="x-strong"/> Use the isnull function available in pandas to detect missing values. <break strength="strong"/>Missing values are null values. <break strength="strong"/>You may see them displayed as NaN, or not a number. <break strength="x-strong"/> For each entry in the data, <break strength="weak"/>isnull shows True if the value is null. <break strength="strong"/>Otherwise, it shows False. <break strength="x-strong"/> We have eight columns in the data. <break strength="strong"/>To count the null values in each column, apply the sum function to the isnull output. <break strength="strong"/>Each True indicates a missing value, <break strength="weak"/>so the sum is the number of missing values in that column. <break strength="x-strong"/> For serial number, it shows 0. <break strength="x-strong"/> That means there are no null values in that column. <break strength="strong"/>If a column has any null values, <break strength="weak"/>its number will be non-zero. 
Here, every column, <break strength="weak"/>including serial number, <break strength="weak"/>shows 0.<break strength="strong"/> So there are no null values anywhere in the data, <break strength="weak"/>and there is no need to replace any missing values.<break strength="x-strong"/> Next, you need to identify and remove the outliers. <break strength="strong"/>Our objective is to predict the chance of admission.<break strength="strong"/> It is a continuous output variable, <break strength="weak"/>not a discrete one. <break strength="strong"/>Because the output is continuous,<break strength="weak"/> this is a regression problem.<break strength="x-strong"/> </speak>
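
The missing-value check described in the lesson can be sketched in pandas roughly as follows. This is a minimal illustration, not the lesson's actual notebook: the DataFrame, its column names, and its values are made up, and unlike the lesson's data it deliberately contains a few missing values so the non-zero case is visible.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; column names and values are illustrative only
# and are not taken from the lesson's data set.
df = pd.DataFrame({
    "serial_no": [1, 2, 3, 4],
    "score": [320.0, np.nan, 310.0, 305.0],
    "chance_of_admit": [0.92, 0.76, np.nan, np.nan],
})

# isnull() returns a DataFrame of the same shape:
# True where a value is missing, False otherwise.
null_mask = df.isnull()

# sum() treats True as 1 and False as 0, so summing each column
# counts the missing values in that column.
null_counts = null_mask.sum()
print(null_counts)

# True if at least one column contains a missing value.
has_missing = bool(null_counts.any())
print("Any missing values:", has_missing)
```

A count of 0 for a column means it has no null values, which is the situation the lesson reports for every column of its data.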