SIGMA DHYANA
Python EDA (Exploratory Data Analysis)
A Python track covering programming foundations, statistics, NumPy, Pandas, visualization, and the full EDA workflow.
Module 1: Introduction to Python
Objective: To introduce Python, emphasizing its advantages and core concepts, particularly in data analytics.
Topics and Sub-Topics:
- Overview of Data Analytics
- Introduction to Data Analytics
- Importance of Python in Data Analytics
- Real-world Applications of Python in Data Analytics
- Python Programming History & Features
- History of Python
- Key Features of Python
- Setting Up a Python Environment
- Installing Anaconda
- Introduction to Jupyter Notebooks
- Setting Up Visual Studio Code for Python
- Introduction to PyCharm
- Python Syntax Overview
- Basic Syntax
- Indentation
- Comments and Docstrings
Hands-on Exercise:
- Install Anaconda and set up Jupyter Notebooks.
- Write a simple Python program using Jupyter Notebook.
- Set up Visual Studio Code and PyCharm for Python development.
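Illustrative sketch of a first program for the exercise above. It touches the syntax topics listed in this module (indentation, comments, docstrings). The function name `greet` is a hypothetical choice, not part of the course material.

```python
def greet(name):
    """Return a greeting for the given name.

    This triple-quoted string is a docstring: it documents the function
    and is available at runtime as greet.__doc__.
    """
    # This is a comment. Indentation (not braces) marks the function body
    # and the branches of the if/else statement below.
    if name:
        message = f"Hello, {name}!"
    else:
        message = "Hello, world!"
    return message


print(greet("Jupyter"))
```

The same code can be pasted into a Jupyter Notebook cell or saved as a `.py` file and run from Visual Studio Code or PyCharm.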
Module 2: Basic Python
Objective: To master the fundamental elements of Python programming.
Topics and Sub-Topics:
- Identifiers and Variables
- Naming Conventions
- Assigning Values
- Dynamic Typing
- Keywords
- List of Python Keywords
- Reserved Words
- Operators
- Arithmetic Operators
- Comparison Operators
- Logical Operators
- Bitwise Operators
- Assignment Operators
- Identity Operators
- Membership Operators
- Data Types
- Primitive Data Types: Integer, Float, String, Boolean
- Non-Primitive Data Types: List, Tuple, Dictionary, Set
- Comprehensions in Python
- List Comprehensions
- Dictionary Comprehensions
- Set Comprehensions
- Nested Comprehensions
- Control Flow
- Conditional Statements: If, If-else, If-elif-else, Nested if
- Loops: While Loop, For Loop, Break, Continue, Pass
Hands-on Exercise:
- Write programs demonstrating variable assignments, operators, and control flow.
- Use comprehensions to create lists, dictionaries, and sets.
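Illustrative sketch for the comprehension exercise above. The sample data and variable names are hypothetical and chosen only to show each comprehension form listed in this module.

```python
# Hypothetical sample data, chosen only for illustration.
words = ["pandas", "numpy", "matplotlib", "numpy"]

# List comprehension: the length of each word, in order.
lengths = [len(w) for w in words]

# Dictionary comprehension: map each word to its length
# (the repeated key "numpy" is stored once).
length_by_word = {w: len(w) for w in words}

# Set comprehension: the unique first letters.
initials = {w[0] for w in words}

# Nested comprehension: flatten a 2x3 grid into a single list.
grid = [[1, 2, 3], [4, 5, 6]]
flat = [value for row in grid for value in row]

# Comprehension with a condition: keep only the even values.
evens = [value for value in flat if value % 2 == 0]
```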
Module 3: Functions and Modules
Objective: To define and use functions and modules to create modular code.
Topics and Sub-Topics:
- User Defined Functions
- Defining Functions
- Function Arguments
- Return Statement
- Built-in Functions
- Common Built-in Functions
- Using Built-in Functions
- Lambda Functions
- Anonymous Functions
- Syntax and Usage
- Map, Filter, Reduce
- Map: Applying a function to all items in an input list
- Filter: Constructing a list from the elements of the input list for which a function returns true
- Reduce: Applying a rolling computation to sequential pairs of values in a list
Hands-on Exercise:
- Write functions to perform simple tasks.
- Use lambda functions with map, filter, and reduce.
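Illustrative sketch for the exercise above, assuming Python 3, where `map` and `filter` return lazy iterators and `reduce` lives in `functools`. The function `square` and the sample list are hypothetical.

```python
from functools import reduce


def square(x):
    """A user-defined function with one argument and a return value."""
    return x * x


numbers = [1, 2, 3, 4, 5]

# map: apply a function to every item; wrap in list() to materialize the result.
squares = list(map(square, numbers))

# filter: keep the items for which the lambda returns True.
odds = list(filter(lambda n: n % 2 == 1, numbers))

# reduce: fold the list into one value, starting from the initial value 0.
total = reduce(lambda acc, n: acc + n, numbers, 0)
```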
Module 4: File Handling
Objective: To perform file operations in Python for reading and writing data.
Topics and Sub-Topics:
- File Operations
- Overview of File Handling in Python
- Importance of File Handling in Programming
- File Types: CSV, Excel, Text, PDF, JSON
- Opening Files
- Using the open() Function
- Different Modes for Opening Files (r, w, a, x)
- Creating Files
- Creating a New File Using 'w', 'a', or 'x' Mode
- Reading Files
- Reading the Entire Content Using read()
- Reading Line by Line Using readline()
- Reading All Lines into a List Using readlines()
- Writing to Files
- Writing a String to a File Using write()
- Writing Multiple Lines Using writelines()
- Deleting Files
- Using the os Module to Delete Files
Hands-on Exercise:
- Create, read, write, and delete files using Python.
- Perform file operations with CSV and JSON files.
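Illustrative sketch for the exercises above, using only the standard library. The file names and contents are hypothetical, and everything is written to a temporary directory so that no files are left behind.

```python
import csv
import json
import os
import tempfile

with tempfile.TemporaryDirectory() as workdir:
    text_path = os.path.join(workdir, "notes.txt")  # hypothetical file name

    # 'w' creates the file (or truncates an existing one).
    with open(text_path, "w", encoding="utf-8") as f:
        f.write("first line\n")
        f.writelines(["second line\n", "third line\n"])

    # 'a' appends to the end of the file.
    with open(text_path, "a", encoding="utf-8") as f:
        f.write("fourth line\n")

    # readline() returns one line; readlines() returns the remaining lines.
    with open(text_path, "r", encoding="utf-8") as f:
        first = f.readline()
        rest = f.readlines()

    # CSV round trip: values come back as strings.
    csv_path = os.path.join(workdir, "scores.csv")
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows([["name", "score"], ["Asha", 91]])
    with open(csv_path, newline="", encoding="utf-8") as f:
        csv_rows = list(csv.reader(f))

    # JSON round trip: types such as int are preserved.
    json_path = os.path.join(workdir, "config.json")
    with open(json_path, "w", encoding="utf-8") as f:
        json.dump({"rows": 2}, f)
    with open(json_path, encoding="utf-8") as f:
        loaded = json.load(f)

    # Delete a file with the os module.
    os.remove(text_path)
    text_exists_after_delete = os.path.exists(text_path)
```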
Module 5: Exception Handling
Objective: To handle exceptions and errors gracefully in Python.
Topics and Sub-Topics:
- Types of Errors
- Syntax Errors
- Runtime Errors
- Logical Errors
- Exception Handling
- try … except Block
- try … except … finally Block
- try … except … else Block
- Handling Multiple Exceptions
- Raising Exceptions
Hands-on Exercise:
- Write programs to demonstrate exception handling.
- Create custom exceptions and handle them appropriately.
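Illustrative sketch for the exercises above, combining `try`, multiple `except` clauses, `else`, `finally`, and `raise` in one function. The exception class `InvalidAgeError` and the function `parse_age` are hypothetical names.

```python
class InvalidAgeError(ValueError):
    """Hypothetical custom exception used only for this sketch."""


def parse_age(text):
    """Convert text to a non-negative integer age and record what happened."""
    events = []
    try:
        age = int(text)  # may raise ValueError or TypeError
        if age < 0:
            raise InvalidAgeError(f"age cannot be negative: {age}")
    except InvalidAgeError:
        # The subclass must be caught before its parent, ValueError.
        events.append("invalid")
        age = None
    except (ValueError, TypeError):
        # One clause can handle several exception types.
        events.append("not a number")
        age = None
    else:
        # Runs only when the try block raised nothing.
        events.append("ok")
    finally:
        # Runs in every case.
        events.append("done")
    return age, events
```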
Module 6: Regular Expressions
Objective: To use regular expressions for pattern matching in strings.
Topics and Sub-Topics:
- Python re Module
- Functions in the re Module
- Compiling Regular Expressions
- Methods with Regex Usage
- match()
- search()
- findall()
- sub()
- split()
Hands-on Exercise:
- Use regular expressions to search, match, and manipulate strings.
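Illustrative sketch for the exercise above, showing each `re` function listed in this module once. The sample sentence and the date pattern are hypothetical.

```python
import re

text = "Order 66 shipped on 2024-01-15; order 7 shipped on 2024-02-03."

# Compile a pattern once so it can be reused.
date_pattern = re.compile(r"\d{4}-\d{2}-\d{2}")

# match(): succeeds only if the pattern matches at the start of the string.
starts_with_order = re.match(r"Order", text) is not None

# search(): the first match anywhere in the string.
first_date = date_pattern.search(text).group()

# findall(): every non-overlapping match, as a list of strings.
all_dates = date_pattern.findall(text)

# sub(): replace every match.
masked = date_pattern.sub("<date>", text)

# split(): break the string wherever the pattern matches.
parts = re.split(r";\s*", text)
```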
Module 7: Object-Oriented Programming (OOP) in Python
Objective: To master the core concepts of OOP in Python for designing modular code.
Topics and Sub-Topics:
- Classes and Objects
- Defining Classes
- Creating Objects
- Class Attributes and Methods
- OOP Principles
- Polymorphism
- Encapsulation
- Inheritance
Hands-on Exercise:
- Create classes and objects.
- Implement OOP principles in Python programs.
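Illustrative sketch for the exercises above. The classes `Shape`, `Rectangle`, and `Square` are hypothetical and exist only to show class attributes, encapsulation by convention, inheritance, and polymorphism together.

```python
class Shape:
    sides = 0  # class attribute, shared by all instances

    def __init__(self, name):
        # Encapsulation by convention: a leading underscore marks the
        # attribute as internal; it is read through the property below.
        self._name = name

    @property
    def name(self):
        return self._name

    def area(self):
        raise NotImplementedError("subclasses must implement area()")

    def describe(self):
        # Polymorphism: self.area() resolves to the subclass method.
        return f"{self.name} with area {self.area():.2f}"


class Rectangle(Shape):  # inheritance
    sides = 4

    def __init__(self, width, height):
        super().__init__("rectangle")
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


class Square(Rectangle):
    def __init__(self, side):
        super().__init__(side, side)
        self._name = "square"


shapes = [Rectangle(2, 3), Square(4)]
descriptions = [shape.describe() for shape in shapes]
```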
Module 8: Statistics with Python
Objective: To understand basic statistical concepts and perform statistical analysis using Python.
Topics and Sub-Topics:
- Introduction to Statistics
- Importance of Statistics in Data Analysis
- Types of Statistics: Descriptive and Inferential
- Descriptive Statistics
- Measures of Central Tendency: Mean, Median, Mode
- Measures of Dispersion: Range, Variance, Standard Deviation
- Skewness and Kurtosis
- Probability
- Basic Probability Concepts
- Probability Distributions: Normal and Standard Normal
- Correlation and Regression
- Correlation Coefficient
- Coefficient of Determination
- Simple Linear Regression
Hands-on Exercise:
- Implement statistical measures using NumPy, SciPy, and StatsModels.
- Perform linear regression and correlation analysis.
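Illustrative sketch for the exercises above, assuming NumPy is installed. It uses NumPy only; SciPy and StatsModels offer equivalent routines that are not shown here. The five data points are hypothetical.

```python
import numpy as np

# Hypothetical paired observations.
x = np.array([1, 2, 3, 4, 5], dtype=float)
y = np.array([2, 4, 5, 4, 5], dtype=float)

# Central tendency and dispersion of y.
mean_y = np.mean(y)
median_y = np.median(y)
value_range = np.max(y) - np.min(y)
variance = np.var(y, ddof=1)  # ddof=1 gives the sample variance
std_dev = np.std(y, ddof=1)   # sample standard deviation

# Pearson correlation coefficient between x and y.
r = np.corrcoef(x, y)[0, 1]

# Simple linear regression y = slope * x + intercept (least squares).
slope, intercept = np.polyfit(x, y, deg=1)

# For simple linear regression, the coefficient of determination is r squared.
r_squared = r ** 2
```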
Module 9: NumPy
Objective: To introduce NumPy for numerical operations.
Topics and Sub-Topics:
- NumPy Basics
- Difference Between NumPy Arrays and Python Lists
- Introduction to NumPy
- NumPy Array
- Array Operations
- numpy.random Module
- Array Operations
- Vector Operations
- Statistical Functions
- Array Manipulation
- Array Indexing
- Array Manipulation
- Array Broadcasting
Hands-on Exercise:
- Practice with NumPy arrays and perform mathematical operations.
- Manipulate and index arrays.
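Illustrative sketch for the exercises above, assuming NumPy is installed. The arrays are hypothetical and the seed value is arbitrary.

```python
import numpy as np

# A Python list repeats under *, while a NumPy array multiplies element-wise.
repeated_list = [1, 2] * 2
a = np.arange(6).reshape(2, 3)  # [[0, 1, 2], [3, 4, 5]]
doubled = a * 2

# Broadcasting: a (2, 3) array plus a (3,) array adds the row to each row.
row = np.array([10, 20, 30])
shifted = a + row

# Indexing: a column slice and a boolean mask.
second_column = a[:, 1]
large = a[a > 2]

# Statistical function along an axis: the mean of each column.
col_means = a.mean(axis=0)

# numpy.random: a seeded generator gives reproducible draws.
rng = np.random.default_rng(seed=0)
samples = rng.normal(size=4)
```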
Module 10: Data Pre-processing with Pandas
Objective: To manipulate and preprocess data using Pandas.
Topics and Sub-Topics:
- Introduction to Pandas Library
- Series and DataFrame
- Data Structures in Pandas
- Working with Series and DataFrames
- Creating Series and DataFrames
- Basic Operations on Series and DataFrames
- Indexing and Selecting Data
- Selecting Rows and Columns
- Filtering Data
- Data Cleaning and Preprocessing
- Dealing with Duplicate Data
- Handling Outliers
- Feature Scaling and Normalization
- Encoding Categorical Variables
- Pandas Methods
- Creating DataFrames from various sources
- Viewing Data
- Selecting Data
- Filtering Data
- Adding/Modifying Columns
- Removing Data
- Handling Missing Data
- Detecting and Dropping Duplicates
- Aggregation and Grouping
- String Methods
- Merging and Joining
- Date and Time Handling
- Pivot Tables
- Exporting Data
Hands-on Exercise:
- Create and manipulate DataFrames.
- Clean and preprocess data using Pandas methods.
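Illustrative sketch for the exercises above, assuming Pandas is installed. The DataFrame contents and column names are hypothetical, and filling missing values with the median is only one of several reasonable choices.

```python
import pandas as pd

# Hypothetical data with one duplicate row and one missing value.
df = pd.DataFrame({
    "city": ["Pune", "Delhi", "Pune", "Delhi", "Pune"],
    "sales": [100.0, 250.0, 100.0, None, 300.0],
})

# Detect and drop exact duplicate rows.
deduped = df.drop_duplicates()

# Handle missing data: fill with the column median (NaN is skipped by median()).
filled = deduped.assign(sales=deduped["sales"].fillna(deduped["sales"].median()))

# Filter rows with a boolean condition.
high = filled[filled["sales"] > 200]

# Aggregate by group.
totals = filled.groupby("city")["sales"].sum()

# Encode a categorical column as indicator (one-hot) columns.
encoded = pd.get_dummies(filled["city"])
```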
Module 11: Data Visualization with Matplotlib and Seaborn
Objective: To create visualizations using Matplotlib and Seaborn.
Topics and Sub-Topics:
- Introduction to Data Visualization
- Importance of Data Visualization
- Types of Data Visualization
- Matplotlib for Basic Plotting
- Line Plot
- Bar Plot
- Histogram
- Scatter Plot
- Pie Chart
- Box and Whiskers Plot
- Seaborn for Statistical Data Visualization
- Line Plot
- Barplot
- Boxplot
- Heatmap
- Pairplot
- Countplot
- Regplot
- Scatterplot
- Grouping with the hue Parameter
- Violin plot
- Swarmplot
- Stripplot
- Customizing Plots and Charts
- Choosing Axis
- Adding Grids
- Customizing Axis Values
- Adding Titles and Labels
- Customizing Colors and Styles
- Adding Legends
Hands-on Exercise:
- Create various types of plots using Matplotlib and Seaborn.
- Customize plots for better visualization.
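Illustrative sketch for the exercises above, assuming Matplotlib is installed. It shows a single line plot with the customizations listed in this module; Seaborn is not used here. The monthly figures are hypothetical, and the `Agg` backend is selected only so the sketch runs without a display.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt

# Hypothetical data.
months = ["Jan", "Feb", "Mar", "Apr"]
sales = [120, 135, 128, 150]

fig, ax = plt.subplots(figsize=(6, 4))

# Line plot with a custom color, marker, and legend label.
ax.plot(months, sales, color="tab:blue", marker="o", label="Sales")

# Titles, axis labels, axis limits, grid, and legend.
ax.set_title("Monthly sales")
ax.set_xlabel("Month")
ax.set_ylabel("Units sold")
ax.set_ylim(100, 160)
ax.grid(True, linestyle="--", alpha=0.5)
ax.legend()

# Render to an in-memory PNG instead of a file on disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
png_bytes = buffer.getvalue()
plt.close(fig)
```

In a Jupyter Notebook the figure is displayed inline, so the buffer and `savefig` steps can be replaced by `plt.show()`.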
Module 12: Exploratory Data Analysis (EDA)
Objective: To perform exploratory data analysis (EDA) to summarize the main characteristics of a dataset and uncover patterns, spot anomalies, test hypotheses, and check assumptions using visualizations and summary statistics.
Topics and Sub-Topics:
- Introduction to EDA
- Introduction to EDA
- Tools and Libraries for EDA
- Loading Data
- Data Cleaning and Preparation
- Identifying and Handling Missing Data
- Identifying and Handling Duplicates
- Identifying and Handling Outliers
- Feature Engineering
- One Hot Encoding
- Label Encoding
- Range Categorization
- Univariate Analysis
- Summary Statistics
- Visualizations for Univariate Analysis
- Distribution Analysis
- Bivariate Analysis
- Summary Statistics for Bivariate Analysis
- Visualizations for Bivariate Analysis
- Categorical vs. Numerical Analysis
- Multivariate Analysis
- Summary Statistics for Multivariate Analysis
- Visualizations for Multivariate Analysis
- Exploratory Data Analysis (EDA) Practice
- Case Study
- Reporting and Presenting EDA Findings
- Hands-on
- Conducting a complete EDA on a given dataset
- Creating and presenting an EDA report
- Mentored EDA Projects Hands-On
- Analyzing Diwali Sales Trends
- IPL Match Performance Analysis
- Sales Insights from Euromart Data
- Car Manufacturing and Pricing Analysis
- Titanic Survival Data Analysis
Hands-on Exercise:
- Conduct a complete EDA on a given dataset.
- Create and present an EDA report.
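Illustrative sketch of a compact EDA pass, assuming Pandas is installed. The six-row dataset, the column names, the 1.5 × IQR outlier rule, and the age bins are all hypothetical choices made for illustration; they are not taken from the project datasets named above.

```python
import pandas as pd

# Hypothetical passenger-style data with one missing age and one extreme fare.
df = pd.DataFrame({
    "age": [22, 35, None, 41, 29, 35],
    "fare": [7.0, 8.0, 9.0, 8.0, 7.5, 200.0],
    "pclass": ["third", "second", "first", "second", "third", "first"],
})

# Data cleaning: count missing values and duplicate rows, then fill the gap.
missing_counts = df.isna().sum()
duplicate_rows = int(df.duplicated().sum())
df["age"] = df["age"].fillna(df["age"].median())

# Univariate analysis: summary statistics for one column.
fare_summary = df["fare"].describe()

# Outliers: flag values more than 1.5 * IQR outside the quartiles.
q1, q3 = df["fare"].quantile([0.25, 0.75])
iqr = q3 - q1
outliers = df[(df["fare"] < q1 - 1.5 * iqr) | (df["fare"] > q3 + 1.5 * iqr)]

# Feature engineering: range categorization of a numeric column.
df["age_group"] = pd.cut(
    df["age"], bins=[0, 30, 50, 120], labels=["young", "middle", "senior"]
)

# Bivariate analysis: categorical vs. numerical, and numeric correlation.
median_fare_by_class = df.groupby("pclass")["fare"].median()
correlations = df[["age", "fare"]].corr()
```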