MySQL Tutorial: Everything You Need to Know

Arijit DasArijit Das
20 min read

Table of contents

Session 1

1. SHOW DATABASES

The SHOW DATABASES command is used to display all the available databases present in the MySQL server.

Syntax:

SHOW DATABASES;

Explanation:

This will list all the databases created in the MySQL server. It helps in identifying existing databases before creating a new one.


2. CREATE DATABASE if not exists techforallwitharijit

The CREATE DATABASE statement creates a new database.

Syntax:

CREATE DATABASE IF NOT EXISTS techforallwitharijit;

Explanation:

  • IF NOT EXISTS: This ensures that the database is only created if it does not already exist, preventing duplicate database creation errors.

  • techforallwitharijit: This is the name of the database.


3. USE techforallwitharijit

The USE statement selects the database on which all subsequent queries will operate.

Syntax:

USE techforallwitharijit;

Explanation:

This command sets the current database to techforallwitharijit, making it the default for all future queries.


4. SELECT DATABASE()

This query displays the currently selected database.

Syntax:

SELECT DATABASE();

Explanation:

It confirms which database is currently in use.


5. CREATE TABLE employee

The CREATE TABLE statement is used to create a new table within the selected database.

Syntax:

CREATE TABLE employee(
EID INT AUTO_INCREMENT,
FirstName VARCHAR(50) NOT NULL,
LastName VARCHAR(50) NOT NULL,
Age INT NOT NULL,
Salary INT NOT NULL,
Location VARCHAR(50) NOT NULL,
PRIMARY KEY(EID)
);

Explanation:

  • EID INT AUTO_INCREMENT: This creates a unique ID for each employee, automatically incrementing with each new record.

  • VARCHAR(50): Defines the maximum length of string data.

  • NOT NULL: Ensures that the column cannot have NULL values.

  • PRIMARY KEY(EID): Sets the EID column as the primary key, making each value unique and not null.


6. DESC employee

The DESC command displays the structure of the specified table.

Syntax:

DESC employee;

Explanation:

It provides a description of the table's columns, data types, and constraints.


7. SHOW TABLES

This command lists all the tables present in the selected database.

Syntax:

SHOW TABLES;

Explanation:

It helps to confirm that the employee table was created successfully.


8. INSERT INTO employee

The INSERT INTO statement is used to add new records to the table.

Syntax:

INSERT INTO employee(FirstName, LastName, Age, Salary, Location)
VALUES ("Arijit", "Das", 27, 200000, "Tripura");

Explanation:

This inserts a new employee with the specified values into the employee table.

  • The EID field is auto-incremented automatically.

Multiple Insertions:

INSERT INTO employee(FirstName, LastName, Age, Salary, Location) VALUES ("Yasoda", "Nandan", 27, 400000, "Vrndavan");
INSERT INTO employee(FirstName, LastName, Age, Salary, Location) VALUES ("Srimati", "Radharani", 27, 800000, "Barsana");

9. SELECT * FROM employee

The SELECT statement is used to retrieve data from the table.

Syntax:

SELECT * FROM employee;

Explanation:

  • *: Selects all columns.

  • This command will display all the records available in the employee table.

Session 2

1. Database Operations

Show All Databases

SHOW DATABASES;

This query displays all databases available in the system.

Select Database

USE techforallwitharijit;

Selects the database techforallwitharijit for subsequent queries.

Show Tables

SHOW TABLES;

Lists all the tables present in the selected database.

Display All Records from Employee Table

SELECT * FROM employee;

Fetches all records from the employee table.


2. Course Table

Create Course Table

The Course table stores information about various courses offered.

CREATE TABLE Course(
    CourseID INT AUTO_INCREMENT,
    CourseName VARCHAR(50) NOT NULL,
    CourseDuration_Months INT NOT NULL,
    CourseFee INT NOT NULL,
    PRIMARY KEY(CourseID)
);

Table Description

DESC Course;

Displays the structure of the Course table.

Insert Course Records

INSERT INTO Course(CourseName, CourseDuration_Months, CourseFee) VALUES
("The Complete Excel Mastery Course", 3, 1499),
("DSA For Interview Preparation", 2, 4999),
("SQL Bootcamp", 1, 2999);

View Course Data

SELECT * FROM Course;

Retrieves all records from the Course table.


3. Learners Table

Create Learners Table

The Learners table stores information about learners enrolling in the courses.

CREATE TABLE Learners(
    Learner_Id INT AUTO_INCREMENT,
    LearnerFirstName VARCHAR(50) NOT NULL,
    LearnerLastName VARCHAR(50) NOT NULL,
    LearnerPhoneNo VARCHAR(15) NOT NULL,
    LearnerEmailID VARCHAR(50),
    LearnerEnrollmentDate TIMESTAMP NOT NULL,
    SelectedCourses INT NOT NULL,
    YearsOfExperience INT NOT NULL,
    LearnerCompany VARCHAR(50),
    SourceOfJoining VARCHAR(50) NOT NULL,
    Batch_Start_Date TIMESTAMP NOT NULL,
    Location VARCHAR(50) NOT NULL,
    PRIMARY KEY(Learner_Id),
    UNIQUE KEY(LearnerEmailID),
    FOREIGN KEY(SelectedCourses) REFERENCES Course(CourseID)
);

Table Description

DESC Learners;

Displays the structure of the Learners table.

Insert Learners Records

INSERT INTO Learners(LearnerFirstName, LearnerLastName, LearnerPhoneNo, LearnerEmailID, LearnerEnrollmentDate, SelectedCourses, YearsOfExperience, LearnerCompany, SourceOfJoining, Batch_Start_Date, Location)
VALUES
("Akash", "Mishra", '9998887776', "akash@gmail.com", '2024-01-21', 1, 4, "Amazon", "LinkedIn", '2024-02-29', "Bengaluru"),
("Rishikesh", "Joshi", "9950192388", "carjkop@gmail.com", '2024-03-19', 3, 2, "HCL", "Youtube", '2024-03-25', "Chennai"),
("Jeevan", "Hegde", "9657856732", "jeevanhegdek@yahoo.co.in", '2024-01-15', 2, 0, "", "LinkedIn", '2024-01-16', "Noida");

View Learners Data

SELECT * FROM Learners;

Retrieves all records from the Learners table.


4. Data Analysis Queries

1. Employee with Highest Salary

SELECT * FROM employee ORDER BY Salary DESC LIMIT 1;

2. Highest Salary with Age > 30

SELECT * FROM employee WHERE age > 30 ORDER BY Salary DESC LIMIT 1;

3. Total Enrollments

SELECT COUNT(*) AS num_of_enrollments FROM Learners;

4. Enrollments for CourseID = 3 (SQL Bootcamp)

SELECT COUNT(*) AS num_of_learners_SQL FROM Learners WHERE SelectedCourses = 3;

5. Learners Enrolled in January

SELECT COUNT(*) AS num_learners_jan FROM Learners WHERE LearnerEnrollmentDate LIKE '%-01-%';

6. Update Learner Details

UPDATE Learners
SET YearsOfExperience = 1, LearnerCompany = 'Amazon'
WHERE Learner_Id = 4;

7. Count Unique Companies

SELECT COUNT(DISTINCT LearnerCompany) AS distinct_companies FROM Learners;

Conclusion

This documentation outlines the SQL commands used to create and manipulate data for the TechForAllWithArijit database. The queries cover table creation, data insertion, data retrieval, and data analysis tasks to help manage learners, courses, and employees efficiently.

Session 3

1. Displaying Databases

SHOW DATABASES;

Explanation:

This command lists all the databases present in the MySQL server.


2. Selecting a Database

USE techforallwitharijit;

Explanation:

The USE statement selects the database techforallwitharijit to perform further operations.


3. Displaying Tables

SHOW TABLES;

Explanation:

Lists all the tables in the currently selected database.


4. Displaying All Learners

SELECT * FROM learners;

Explanation:

Fetches all records from the learners table.


5. Count Students by Source of Joining

SELECT SourceOfJoining, COUNT(*) as num_of_students
FROM learners
GROUP BY SourceOfJoining;

Explanation:

  • GROUP BY groups rows by the SourceOfJoining column.

  • COUNT(*) calculates the number of students for each group.


6. Grouping by Source of Joining and Location

SELECT SourceOfJoining, Location, COUNT(*) as num_of_students
FROM learners
GROUP BY SourceOfJoining, Location;

Explanation:

  • Groups records by both SourceOfJoining and Location.

  • Counts the number of students in each combination.


7. Students Count by Selected Course

SELECT SelectedCourses, COUNT(*) AS num_of_students  
FROM learners
GROUP BY SelectedCourses;

Explanation:

Counts how many students have enrolled in each course.


8. Display All Courses

SELECT * FROM course;

Explanation:

Fetches all records from the course table.


9. Maximum Years of Experience by Source of Joining

SELECT SourceOfJoining, MAX(YearsOfExperience) as max_years_exp
FROM learners
GROUP BY SourceOfJoining;

Explanation:

Finds the maximum years of experience for each source of joining.


10. Minimum Years of Experience by Source of Joining

SELECT SourceOfJoining, MIN(YearsOfExperience) as min_years_exp
FROM learners
GROUP BY SourceOfJoining;

Explanation:

Finds the minimum years of experience for each source of joining.


11. Average Years of Experience by Source of Joining

SELECT SourceOfJoining, AVG(YearsOfExperience) as avg_years_exp
FROM learners
GROUP BY SourceOfJoining;

Explanation:

Calculates the average years of experience for each source of joining.


12. Summation of Experience by Source of Joining

SELECT SourceOfJoining, SUM(YearsOfExperience) as sum_years_exp
FROM learners
GROUP BY SourceOfJoining;

Explanation:

Calculates the total sum of experience for each source of joining. This does not make much sense for practical purposes.


13. Filtering Aggregated Data with HAVING

SELECT SourceOfJoining, COUNT(*) AS num_of_students  
FROM learners
GROUP BY SourceOfJoining
HAVING num_of_students > 2;

Explanation:

  • HAVING filters aggregated data after GROUP BY.

  • It shows only those groups where the count of students is greater than 2.


14. Inserting New Learners

INSERT INTO Learners(LearnerFirstName,LearnerLastName,LearnerPhoneNo,LearnerEmailID,LearnerEnrollmentDate,SelectedCourses,YearsOfExperience,LearnerCompany,SourceOfJoining,Batch_Start_Date,Location)
VALUES ("Sidhish", "Kumar", '9998827776', "sidhish@gmail.com", '2024-01-21', 1, 4, "Amazon", "LinkedIn", '2024-02-29', "Gurugram");

INSERT INTO Learners(LearnerFirstName,LearnerLastName,LearnerPhoneNo,LearnerEmailID,LearnerEnrollmentDate,SelectedCourses,YearsOfExperience,LearnerCompany,SourceOfJoining,Batch_Start_Date,Location)
VALUES ("Kaneesha", "Mishra", '9128887776', "kaneesha@gmail.com", '2024-01-21', 1, 5, "Google", "LinkedIn", '2024-02-29', "Bengaluru");

Explanation:

Inserts two new learners into the learners table.


15. Count Students Joined via YouTube

SELECT SourceOfJoining, COUNT(*) AS num_of_students  
FROM LEARNERS
GROUP BY SourceOfJoining
HAVING SourceOfJoining = "YouTube";

Explanation:

Filters the aggregated result to show only those who joined via YouTube.


16. Filtering Courses without "Excel"

SELECT * FROM course
WHERE CourseName NOT LIKE "%Excel%";

Explanation:

Fetches all courses excluding those containing the word "Excel".


17. Filtering Students with Multiple Conditions

SELECT * FROM learners
WHERE YearsOfExperience < 4 AND SourceOfJoining = "YouTube" AND Location = "Chennai";

Explanation:

Fetches learners who satisfy all the three conditions.


18. Filtering Experience Range

SELECT * FROM learners
WHERE YearsOfExperience BETWEEN 1 AND 4;

Explanation:

Fetches learners having experience between 1 and 4 years.


19. OR Operator Usage

SELECT * FROM learners
WHERE YearsOfExperience < 4 OR SourceOfJoining = "YouTube" OR Location = "Chennai";

Explanation:

Fetches learners satisfying any one of the three conditions.


20. ALTER Commands

DESC learners;
ALTER TABLE employee ADD column jobPosition varchar(50);
ALTER TABLE employee MODIFY column FirstName varchar(40);
ALTER TABLE employee DROP column jobPosition;

Explanation:

  • ADD: Adds a new column.

  • MODIFY: Changes column data type.

  • DROP: Deletes a column.


21. TRUNCATE Command

Truncates the table (removes all rows) without logging individual row deletions.


22. Implicit Typecasting

INSERT INTO Course(CourseName,CourseDuration_Months,CourseFee) VALUES("Foundations of Machine Learning",3.5,4999);
SELECT * FROM course;

Explanation:

Inserts a course with a decimal duration.


23. Creating Table with Timestamp

CREATE TABLE Course_Update(
CourseID INT AUTO_INCREMENT,
CourseName VARCHAR(50) NOT NULL,
CourseDuration_Months DECIMAL(3,1) NOT NULL,
CourseFee INT NOT NULL,
Changed_at TIMESTAMP DEFAULT NOW() ON UPDATE NOW(),
PRIMARY KEY(CourseID));

Explanation:

Creates a table with an AUTO_INCREMENT primary key and timestamp column.


24. Updating Course Fee

UPDATE course_update SET CourseFee = 3999
WHERE CourseID = 3;

Explanation:

Updates the CourseFee for the course with CourseID = 3.


This documentation covers all commands with detailed explanations of their functionality and significance.

Session 4

1. Database Selection

USE techforallwitharijit;

Explanation:

The USE statement selects the database techforallwitharijit to perform all subsequent operations.

2. Show Tables

SHOW TABLES;

Explanation:

Displays all the tables present in the techforallwitharijit database.

3. Retrieve All Records from Employee Table

SELECT * FROM employee;

Explanation:

Retrieves all the columns and rows from the employee table.

4. Count and Average Salary Grouped by Location

SELECT Location, count(Location) as total_employee, avg(Salary) as avg_salary
FROM employee
GROUP BY Location;

Explanation:

  • Location: Groups employees by their location.

  • count(Location): Counts the number of employees in each location.

  • avg(Salary): Calculates the average salary for employees in each location.

  • GROUP BY Location: Groups the result set by the Location column.

Output Example:

LocationTotal EmployeeAvg Salary
Bengaluru321666.67
Noida145000
Pune1100000
Hyderabad1250000

5. Join Approach to Display Employee Details with Grouped Information

SELECT firstName, lastName, employee.Location, total_employee, avg_salary
FROM employee
JOIN
(SELECT Location, count(Location) as total_employee, avg(Salary) as avg_salary
FROM employee
GROUP BY Location) as temp
ON employee.Location = temp.Location;

Explanation:

  • Subquery temp: Groups location-wise employee count and average salary.

  • JOIN: Combines employee records with the subquery on the Location column.

  • Displays firstName, lastName, Location, total_employee, and avg_salary.

6. Window Functions Approach

SELECT firstName, lastName, Location,
COUNT(Location) OVER (PARTITION BY Location) as total_employee,
AVG(Salary) OVER (PARTITION BY Location) as avg_salary
FROM employee;

Explanation:

  • PARTITION BY Location: Divides the result set into partitions by Location.

  • COUNT() OVER: Calculates the total employees in each location.

  • AVG() OVER: Calculates the average salary in each location without grouping.

7. Employee Priorities by Salary in Descending Order

SELECT FirstName, LastName, Salary,
ROW_NUMBER() OVER (ORDER BY Salary DESC) as Priority_Salary
FROM employee;

Explanation:

  • ROW_NUMBER(): Assigns a unique row number to each record.

  • ORDER BY Salary DESC: Orders salaries in descending order.

  • Priority_Salary: Ranks employees based on salary with unique ranks.

8. Insert New Employee Records

INSERT INTO employee(FirstName, LastName, Age, Salary, Location) VALUES ("Pramod", "Kumar", 26, 10000, "Noida");
INSERT INTO employee(FirstName, LastName, Age, Salary, Location) VALUES ("Rohan", "Bhatia", 27, 45000, "Hyderabad");

Explanation:

Inserts two new employee records into the employee table.

9. Ranking with Skipping Numeric Data (RANK)

SELECT FirstName, LastName, Salary,
RANK() OVER (ORDER BY Salary DESC) as Priority_Salary
FROM employee;

Explanation:

  • RANK(): Assigns rank to employees based on salary.

  • Identical salaries receive the same rank, and the next rank is skipped.

10. Ranking without Skipping Numeric Data (DENSE_RANK)

SELECT FirstName, LastName, Salary,
DENSE_RANK() OVER (ORDER BY Salary DESC) as Priority_Salary
FROM employee;

Explanation:

  • DENSE_RANK(): Assigns rank to employees based on salary without skipping ranks.

11. Employees with 2nd Highest Salary

SELECT * FROM
(SELECT FirstName, LastName, Salary,
DENSE_RANK() OVER (ORDER BY Salary DESC) as Priority_Salary
FROM employee) as temp
WHERE Priority_Salary = 2;

Explanation:

  • Subquery: Assigns ranks using DENSE_RANK().

  • WHERE Priority_Salary = 2: Filters employees with the second highest salary.

Conclusion:

This documentation provides an in-depth explanation of MySQL queries used to manipulate and retrieve employee data. It covers basic table operations, grouping, joins, window functions, and ranking functions to generate comprehensive employee reports.

Session 5

1. Database Selection

Command:

USE techforallwitharijit;

Description: This command selects the techforallwitharijit database as the active database for subsequent operations. It ensures that all queries are executed within the context of this database.


2. Show Tables

Command:

SHOW TABLES;

Description: Displays all the tables present in the selected database techforallwitharijit. This helps to verify which tables are available for querying.


3. Describe Table Structure

Command:

DESC learners;

Description: Describes the structure of the learners table. This command provides the following information about each column:

  • Field Name

  • Data Type

  • Nullability (whether NULL values are allowed)

  • Key Information (Primary Key, Foreign Key)

  • Default Value

  • Extra Information (e.g., Auto Increment)


4. Select All Records from Tables

Command:

SELECT * FROM learners;
SELECT * FROM course_update;
SELECT * FROM course;

Description: These queries fetch all records from the learners, course_update, and course tables respectively. The * wildcard is used to select all columns from the table.


5. Describe Table Structure for course_update

Command:

DESC course_update;

Description: Displays the structure of the course_update table, showing detailed information about the table columns similar to the DESC learners command.


6. Fetch Most Enrolled Course Using Subquery (Optimized Approach)

Command:

SELECT CourseID, CourseName, temptable.EnrollmentCount
FROM course
JOIN
(SELECT SelectedCourses, COUNT(*) AS EnrollmentCount
FROM Learners
GROUP BY SelectedCourses
ORDER BY EnrollmentCount DESC
LIMIT 1) AS temptable
ON course.CourseID = temptable.SelectedCourses;

Explanation:

This query finds the most enrolled course in an optimized manner:

  • Subquery (temptable):

    • It selects SelectedCourses from the Learners table.

    • Groups the records by SelectedCourses.

    • Counts the number of learners enrolled in each course (COUNT(*) AS EnrollmentCount).

    • Orders the result in descending order of enrollment count.

    • Limits the result to only the top record (course with the highest enrollment).

  • JOIN Operation:

    • The subquery result (aliased as temptable) is joined with the course table on the matching CourseID and SelectedCourses.

    • This allows fetching the CourseName corresponding to the most enrolled course.

Advantages:

  • The subquery first filters the records before performing the join, making the operation more optimized and reducing the number of join operations.

7. Fetch Most Enrolled Course Using Direct Join

Command:

SELECT CourseID, CourseName, COUNT(*) AS EnrollmentCount
FROM Learners
JOIN course
ON Learners.SelectedCourses = course.CourseID
GROUP BY SelectedCourses
ORDER BY EnrollmentCount DESC
LIMIT 1;

Explanation:

This query achieves the same result but in a different way:

  • Performs a direct JOIN between the Learners and course tables.

  • Groups the result by SelectedCourses.

  • Counts the number of learners enrolled in each course.

  • Orders the result in descending order of enrollment count.

  • Limits the result to only one record (highest enrolled course).

Disadvantages:

  • This query performs the join operation first before filtering records, making it less optimized compared to the subquery approach.

8. Comparison Between Both Queries

ApproachOptimizationJoin OperationsComplexity
Subquery ApproachHigherFewerModerate
Direct JoinLowerMoreSimple

Conclusion: The subquery approach is more optimized as it reduces the number of join operations by filtering the records first.


9. Final Remarks

  • Always use subqueries when filtering large datasets before performing join operations to improve performance.

  • Use DESC to understand table structures before writing queries.

  • Properly alias subqueries to improve readability and avoid conflicts.

  • Use LIMIT to optimize result set size for performance-sensitive queries.

Session 6

Database Selection

USE techforallwitharijit;

Explanation: This command selects the database named techforallwitharijit to perform subsequent queries.

Table Selection

SELECT * FROM course_update;
SELECT * FROM learners;

Explanation: These commands display all records from the course_update and learners tables.


CASE Statements in SQL

Creating Course Fee Status Column

SELECT CourseID, CourseName, CourseFee,
    CASE
        WHEN CourseFee > 3999 THEN 'Expensive Course'
        WHEN CourseFee > 1499 THEN 'Moderate Course'
        ELSE 'Cheap Course'
    END AS CourseFeeStatus
FROM course_update;

Explanation:

  • The CASE statement creates a new column CourseFeeStatus.

  • It assigns the following labels based on the CourseFee:

    • Greater than 3999: 'Expensive Course'

    • Greater than 1499 but less than or equal to 3999: 'Moderate Course'

    • Else: 'Cheap Course'


CASE Expressions in SQL

Categorizing Course Type

SELECT CourseID, CourseName, CourseFee,
    CASE CourseFee
        WHEN 4999 THEN 'Premium Course'
        WHEN 3999 THEN 'Plus Course'
        ELSE 'Regular Course'
    END AS CourseType
FROM course_update;

Explanation:

  • This CASE expression assigns course categories based on exact CourseFee values.

  • It assigns:

    • 4999 -> 'Premium Course'

    • 3999 -> 'Plus Course'

    • Any other value -> 'Regular Course'


Creating Orders Table

CREATE TABLE Orders(
    OrderID INT AUTO_INCREMENT,
    Order_Date TIMESTAMP NOT NULL,
    Order_Learner_Id INT NOT NULL,
    OrderStatus VARCHAR(30) NOT NULL,
    PRIMARY KEY(OrderID),
    FOREIGN KEY(Order_Learner_Id) REFERENCES Learners(Learner_Id)
);

Explanation:

  • OrderID: Primary key with auto-increment functionality.

  • Order_Date: Stores order date and time (mandatory).

  • Order_Learner_Id: Foreign key referencing Learners table.

  • OrderStatus: Stores the order status with possible values like 'Complete', 'Pending', 'Closed'.

Viewing Table Structure

SHOW TABLES;
DESC Orders;

Explanation:

  • SHOW TABLES: Lists all tables in the database.

  • DESC Orders: Describes the table structure (columns, types, and constraints).

Dropping Table

DROP TABLE Orders;

Explanation: This command deletes the Orders table along with its data.


Inserting Records into Orders Table

INSERT INTO Orders(Order_Date, Order_Learner_Id, OrderStatus) VALUES ('2024-01-21',1,'COMPLETE');

Explanation: Inserts a record into the Orders table with the specified Order_Date, Order_Learner_Id, and OrderStatus.

Multiple INSERT statements follow the same pattern.

Viewing Inserted Records

SELECT * FROM Orders;

Explanation: Displays all records from the Orders table.


Aggregation Queries

Total Orders per Student

SELECT Order_Learner_Id, COUNT(*) AS Total_Orders
FROM Orders
GROUP BY Order_Learner_Id;

Explanation:

  • Groups records by Order_Learner_Id.

  • Counts total orders per student.

Joining Learners and Orders

SELECT temp.Order_Learner_Id, LearnerFirstName, LearnerLastName, temp.Total_Orders
FROM learners
JOIN
    (SELECT Order_Learner_Id, COUNT(*) AS Total_Orders
    FROM orders
    GROUP BY Order_Learner_Id) as temp
ON Learners.Learner_Id = temp.Order_Learner_Id;

Explanation:

  • The subquery (temp) calculates the total orders per student.

  • The main query joins Learners and temp on Learner_Id to fetch the learner's name along with their order count.


Complex Queries with CTE (Common Table Expressions)

Average Orders Per Student

SELECT temp.Order_Learner_Id, LearnerFirstName, LearnerLastName, temp.total_orders,  AVG(SUM(temp.total_orders)) OVER()  as Avg_Orders_Entire_Student
FROM
    (SELECT Order_Learner_Id, COUNT(*) AS total_orders
    FROM Orders
    GROUP BY Order_Learner_Id) as temp
JOIN Learners ON temp.Order_Learner_Id = Learners.Learner_Id
GROUP BY temp.Order_Learner_Id;

Explanation:

  • The subquery calculates the total orders per learner.

  • The AVG(SUM(temp.total_orders)) OVER() calculates the average of total orders across all learners.

  • GROUP BY groups the result by learner IDs.


Conclusion

This MySQL script demonstrates various SQL concepts, including:

  • Basic SELECT queries

  • CASE statements and expressions

  • Table creation with primary and foreign keys

  • Data insertion

  • Aggregation functions

  • JOIN operations

  • Common Table Expressions (CTEs)

It provides a comprehensive overview of how to manage a learner's order data within a relational database.

Session 7

1. Creating the Database

Command:

CREATE DATABASE IF NOT EXISTS ecommerce_sales_data;
SHOW DATABASES;
USE ecommerce_sales_data;

Explanation:

  • CREATE DATABASE IF NOT EXISTS: This command creates a database named ecommerce_sales_data only if it does not already exist.

  • SHOW DATABASES: Lists all the databases available on the MySQL server.

  • USE ecommerce_sales_data: Selects the ecommerce_sales_data database to execute further queries.


2. Creating the Table

Command:

CREATE TABLE `Sales_Dataset` (
    order_id VARCHAR(15) NOT NULL,
    order_date DATE NOT NULL,
    ship_date DATE NOT NULL,
    ship_mode VARCHAR(14) NOT NULL,
    customer_name VARCHAR(22) NOT NULL,
    segment VARCHAR(11) NOT NULL,
    state VARCHAR(36) NOT NULL,
    country VARCHAR(32) NOT NULL,
    market VARCHAR(6) NOT NULL,
    region VARCHAR(14) NOT NULL,
    product_id VARCHAR(16) NOT NULL,
    category VARCHAR(15) NOT NULL,
    sub_category VARCHAR(11) NOT NULL,
    product_name VARCHAR(127) NOT NULL,
    sales DECIMAL(38, 0) NOT NULL,
    quantity DECIMAL(38, 0) NOT NULL,
    discount DECIMAL(38, 3) NOT NULL,
    profit DECIMAL(38, 5) NOT NULL,
    shipping_cost DECIMAL(38, 2) NOT NULL,
    order_priority VARCHAR(8) NOT NULL,
    year DECIMAL(38, 0) NOT NULL
);

Explanation:

  • CREATE TABLE: Defines a table named Sales_Dataset.

  • VARCHAR: Variable-length string data type with a maximum length specified.

  • DATE: Date data type.

  • DECIMAL(m, d): Numeric data type with m as the total number of digits and d as the number of digits after the decimal point.

  • NOT NULL: Ensures that the column cannot have NULL values.

Additional Commands:

SHOW TABLES;
DESC sales_dataset;
  • SHOW TABLES: Lists all tables in the selected database.

  • DESC sales_dataset: Describes the table structure, showing column names, types, and constraints.


3. Loading Data into the Table

Command:

LOAD DATA INFILE 'C:/Users/HP/OneDrive/Desktop/MYSQL_BOOTCAMP/Session_7/Sales_Dataset.csv'
INTO TABLE Sales_Dataset
FIELDS TERMINATED BY ','
ENCLOSED BY '"'
LINES TERMINATED BY '\n'
IGNORE 1 ROWS;

Explanation:

  • LOAD DATA INFILE: Loads data from an external file into the table.

  • FIELDS TERMINATED BY ',': Specifies that fields in the CSV file are separated by commas.

  • ENCLOSED BY '"': Specifies that fields are enclosed by double quotes.

  • LINES TERMINATED BY '\n': Specifies that each line of the file ends with a newline character.

  • IGNORE 1 ROWS: Skips the first row (header row) of the CSV file.

Common Error:

If MySQL throws the secure-file-priv error, set the secure-file-priv path in MySQL configuration.


4. Retrieve Data

Command:

SELECT * FROM Sales_Dataset;

Explanation:

  • SELECT*: Retrieves all columns from the Sales_Dataset table.

5. Top 3 States with Highest Shipping Costs

Command:

SELECT state, SUM(shipping_cost) AS sum_shipping_cost
FROM Sales_Dataset
GROUP BY state
ORDER BY sum_shipping_cost DESC
LIMIT 3;

Explanation:

  • SUM(): Calculates the total shipping cost for each state.

  • GROUP BY: Groups the result set by state.

  • ORDER BY: Sorts the result set by sum_shipping_cost in descending order.

  • LIMIT 3: Returns only the top 3 records.


6. Using Common Table Expressions (CTE)

Command:

WITH states AS (
    SELECT state, SUM(shipping_cost) AS sum_shipping_cost
    FROM Sales_Dataset
    GROUP BY state
    ORDER BY sum_shipping_cost DESC
    LIMIT 3
)
SELECT * FROM states;

Explanation:

  • WITH: Defines a temporary result set named states.

  • The subquery calculates the top 3 states with the highest shipping costs.

  • The final SELECT statement retrieves the result set from the CTE.

Use Case:

CTEs improve the readability and reusability of complex queries.


7. Total Sales by Year

Command:

SELECT year, SUM(sales) AS total_sales
FROM sales_dataset
GROUP BY year;

Explanation:

  • SUM(sales): Calculates the total sales amount for each year.

  • GROUP BY: Groups the result set by the year column.


Conclusion

This documentation covers the creation, population, and querying of the ecommerce_sales_data database. It demonstrates various MySQL techniques including table creation, data import, aggregation functions, and the use of Common Table Expressions (CTEs). These queries are essential for performing data analysis on ecommerce sales datasets.

0
Subscribe to my newsletter

Read articles from Arijit Das directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

Arijit Das
Arijit Das