Exploratory Data Analysis (EDA)

Data Science | Nov 14, 2025

Introduction

Exploratory Data Analysis (EDA) is the process of examining datasets to summarize their main characteristics, identify patterns, spot anomalies, and form hypotheses.
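The snippets in this tutorial operate on a pandas DataFrame named df. As a minimal sketch for trying them out, the block below builds a small toy dataset. The values are invented purely for illustration; only the column names (age, salary, income, gender, department) are taken from the snippets themselves.

```python
import pandas as pd

# Hypothetical toy data: the values are made up for illustration.
# Only the column names match the snippets used in this tutorial.
df = pd.DataFrame({
    "age": [25, 32, 47, 51, 38, 29],
    "salary": [40000, 52000, 61000, 58000, 250000, 45000],
    "income": [42000, 55000, 65000, 60000, 300000, 47000],
    "gender": ["F", "M", "F", "M", "F", "M"],
    "department": ["HR", "IT", "IT", "Sales", "IT", "HR"],
})

print(df.shape)  # (6, 5)
```

In practice df would more likely come from a file or database, for example via pd.read_csv.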

1. Checking Dataset Structure


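# df is assumed to be a pandas DataFrame that has already been loaded
df.shape       # (number of rows, number of columns)
df.info()      # column dtypes and non-null counts
df.describe()  # summary statistics for the numeric columns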
  

2. Understanding Numerical Features


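df["age"].mean()       # average age
df["salary"].median()  # the median is less sensitive to extreme salaries than the mean
df["age"].hist()       # histogram of the age distribution (requires matplotlib)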
  

3. Understanding Categorical Features


df["gender"].value_counts()  # how often each category occurs
df["department"].unique()    # the distinct department values
  

4. Detecting Outliers


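import seaborn as sns
# Pass the column explicitly as x; points beyond the whiskers (1.5 x IQR by default) are drawn as outliers
sns.boxplot(x=df["salary"])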
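A box plot shows outliers visually. To list them as values, one common option is the 1.5 x IQR rule of thumb that box plot whiskers conventionally follow. The sketch below is one way to do that; the helper name iqr_outliers and the sample salaries are hypothetical and not part of the original tutorial.

```python
import pandas as pd


def iqr_outliers(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Return the entries lying more than k * IQR outside the quartiles."""
    q1 = values.quantile(0.25)
    q3 = values.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # Keep only the values that fall outside the [lower, upper] fence
    return values[(values < lower) | (values > upper)]


# Made-up salaries for illustration only
salaries = pd.Series([40000, 52000, 61000, 58000, 250000, 45000], name="salary")
outliers = iqr_outliers(salaries)
print(outliers)
```

On a real dataset the same helper could be called as iqr_outliers(df["salary"]). Whether a flagged value is an error or a genuine extreme is a judgment call that the rule itself cannot make.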
  

5. Correlation Analysis


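# numeric_only=True skips text columns such as gender, which otherwise raise an error in pandas 2.0+
corr = df.corr(numeric_only=True)
sns.heatmap(corr, annot=True)  # annot=True writes each coefficient into its cell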
  

6. Pair Plot


sns.pairplot(df)  # pairwise scatter plots of the numeric columns, with histograms on the diagonal
  

7. Handling Skewness


df["income"].skew()  # > 0 means a longer right tail, < 0 a longer left tail, near 0 roughly symmetric
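Measuring skewness is only the first step. One commonly used way to reduce right skew in a non-negative column is a log transform. The following is a minimal sketch under that assumption, using NumPy's log1p on made-up income values; it is not the only option, and the original tutorial does not prescribe a specific transform.

```python
import numpy as np
import pandas as pd

# Made-up, right-skewed incomes for illustration only
income = pd.Series([42000, 55000, 65000, 60000, 300000, 47000], name="income")

skew_before = income.skew()

# log1p computes log(1 + x), so it stays defined when x is 0.
# It is only suitable for values greater than -1.
income_log = np.log1p(income)
skew_after = income_log.skew()

print(f"skew before: {skew_before:.2f}, after: {skew_after:.2f}")
```

The transform compresses large values more than small ones, which pulls in the long right tail. A single extreme value can still leave noticeable skew after the transform, so it is worth checking the result rather than assuming it is symmetric.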
  

Conclusion

EDA gives you a strong understanding of your dataset and guides your feature engineering and model selection decisions.

