Difference between Fact and Dimension tables in SQL/Power BI with example.

Quick Introduction

Fact and dimension tables are two fundamental concepts in data warehousing and database design. They serve different roles in organizing and storing data for efficient analysis. Here's the difference between them:

Key Differences between Fact and Dimension Tables

Fact Table:

  • A fact table contains the quantitative data or facts that you want to analyze.

  • It typically consists of numerical values, measures, or metrics.

  • Fact tables are often quite large, containing millions or even billions of records.

  • They are associated with business transactions or events.

  • Fact tables usually have foreign keys that link to dimension tables.

  • They store data about what happened, such as sales, orders, or customer interactions.

Dimension Table:

  • A dimension table contains descriptive attributes that provide context to the facts in the fact table.

  • It holds textual or categorical data that helps in filtering, grouping, and labeling facts.

  • Dimension tables are generally smaller compared to fact tables.

  • They provide additional information for slicing and dicing the data.

  • Dimension tables often have primary keys that are referenced as foreign keys in the fact table.

  • They store data about who, what, where, when, and why something happened.

Example - [HumanResource Schema]

Let's take an example of HumanResource Schema from the AdventureWorks2022 Database. Here are the tables and data from the HumanResources schema in the AdventureWorks2022 database:

  • Department: This table contains information about the departments in the company, such as the department name, the department manager, and the department location.
Column NameData TypeDescription
DepartmentIDintThe unique identifier for the department.
Namenvarchar(50)The name of the department.
ManagerIDintThe employee ID of the department manager.
LocationIDintThe location ID of the department.
  • Employee: This table contains information about the employees of the company, such as their name, address, and phone number.
Column NameData TypeDescription
EmployeeIDintThe unique identifier for the employee.
Namenvarchar(50)The name of the employee.
Titlenvarchar(50)The title of the employee.
BirthDatedatetimeThe date of birth of the employee.
HireDatedatetimeThe date the employee was hired.
DepartmentIDintThe department ID of the employee.
ManagerIDintThe employee ID of the employee's manager.
Phonenvarchar(25)The phone number of the employee.
Emailnvarchar(50)The email address of the employee.
Addressnvarchar(60)The address of the employee.
Citynvarchar(30)The city of the employee's address.
StateProvincenvarchar(2)The state or province of the employee's address.
CountryRegionnvarchar(20)The country of the employee's address.
PostalCodenvarchar(10)The postal code of the employee's address.
  • EmployeeDepartmentHistory: This table contains information about the departments that employees have worked in, such as the department name, the department manager, and the department start and end dates.
Column NameData TypeDescription
EmployeeIDintThe unique identifier for the employee.
DepartmentIDintThe department ID of the employee.
DepartmentNamenvarchar(50)The name of the department.
ManagerIDintThe employee ID of the department manager.
StartDatedatetimeThe start date of the employee's employment in the department.
EndDatedatetimeThe end date of the employee's employment in the department.
  • EmployeePayHistory: This table contains information about the pay history of employees, such as their salary, bonus, and overtime pay.
Column NameData TypeDescription
EmployeeIDintThe unique identifier for the employee.
PayDatedatetimeThe date the pay was earned.
RatemoneyThe hourly rate of pay for the employee.
HoursWorkedintThe number of hours worked by the employee.
RegularPaymoneyThe regular pay for the employee.
OvertimePaymoneyThe overtime pay for the employee.
BonusPaymoneyThe bonus pay for the employee.
TotalPaymoneyThe total pay for the employee.
  • JobCandidate: This table contains information about job candidates, such as their name, address, and resume.
Column NameData TypeDescription
JobCandidateIDintThe unique identifier for the job candidate.
Namenvarchar(50)The name of the job candidate.
Addressnvarchar(60)The address of the job candidate.
Citynvarchar(30)The city of the job candidate's address.
StateProvincenvarchar(2)The state or province of the job candidate's address.
CountryRegionnvarchar(20)The country of the job candidate's address.
PostalCodenvarchar(10)The postal code of the job candidate's address.
Phonenvarchar(25)The phone number of the job candidate.
Emailnvarchar(50)The email address of the job candidate.
Resumevarbinary(max)The resume of the job candidate.

So Let's Identify which are Fact and dimension Tables from the HumanResource Schema.

So we have added prefixes to whichever tables are Fact as Fact and Dimension ad Dim.

fig. Data Model of HumanResource Schema

The following tables in the HumanResources schema are fact tables:

  • EmployeePayHistory: This table contains information about the pay history of employees.

  • EmployeeDepartmentHistory: This table contains information about the departments that employees have worked in.

The following tables in the HumanResources schema are dimension tables:

  • Department: This table contains information about the departments in the company.

  • Employee: This table contains information about the employees of the company.

  • JobCandidate: This table contains information about job candidates.

  • Shift: This table contains information about the shifts that employees work.

How tables were categorized as Fact and Dimension tables?

The EmployeePayHistory and EmployeeDepartmentHistory tables are fact tables because they store measurements of business events. The EmployeePayHistory table stores information about the pay of employees, such as their salary, bonus, and overtime pay. The EmployeeDepartmentHistory table stores information about the departments that employees have worked in, such as the department name, the department manager, and the department start and end dates.

The Department, Employee, JobCandidate, and Shift tables are dimension tables because they provide context for the measurements in the fact tables. The Department table provides information about the departments in the company, such as the department name, the department manager, and the department location. The Employee table provides information about the employees of the company, such as their name, address, and phone number. The JobCandidate table provides information about job candidates, such as their name, address, and resume. The Shift table provides information about the shifts that employees work, such as the shift start and end times, and the number of employees on each shift.

Thank you for reading till the end!

0
Subscribe to my newsletter

Read articles from The Analyst Geek directly inside your inbox. Subscribe to the newsletter, and don't miss out.

Written by

The Analyst Geek
The Analyst Geek

I am a data enthusiast who finds joy in working with data, utilizing my skills to extract meaningful and valuable insights from it.