Power BI Power Query M Code: Table.Group, Table.RowCount
let
Source = Sql.Database("YourServer", "YourDatabase"),
Impressions = Source{[Schema = "dbo", Item = "Impressions"]}[Data],
GroupedImpressions = Table.Group(
Impressions,
{"Ad_id", "Site_name", "Impression_date"},
{{"ImpressionCount", each Table.RowCount(_), Int64.Type}}
)
in
GroupedImpressions
Group and Aggregate in Power Query is a powerful technique for transforming and summarizing large datasets, making it easier to analyze and derive insights. By grouping data based on specific columns and aggregating it using functions, you can efficiently reduce the data volume and focus on key metrics. This approach not only enhances performance but also simplifies the data model, making it more manageable and insightful for reporting and analysis.
Scenario
Given meta data:
Table Name | Column Name |
campaigns | campaign_id |
campaigns | name |
ads | ad_id |
ads | name |
ads | campaign_id |
impressions | impression_id |
impressions | ad_id |
impressions | site_name |
impressions | impression_time |
impressions | impression_date |
Required tasks
The Impressions table contains approximately 30 million records per month. You need to create an ad analytics system to meet the following requirements:
Present ad impression counts for the day, campaign, and site_name. The analytics for the last year are required.
Minimize the data model size.
Solution
- Group the Impressions Query in Power Query:
• By grouping the Impressions query in Power Query by Ad_id, Site_name, and Impression_date, and then aggregating using the CountRows function, you can significantly reduce the data volume. This approach creates a summarized view of the data, which is more efficient for analysis and reporting.
- Create One-to-Many Relationships Between the Tables:
• Establishing one-to-many relationships between the campaigns, ads, and impressions tables will ensure data integrity and enable efficient querying. This allows you to easily join tables and retrieve related data, which is crucial for accurate ad analytics.
Step-by-step outline
- Group the Impressions Query in Power Query:
• Load the Impressions Table into Power Query.
• Group By: Go to the Home tab and select Group By. Group by Ad_id, Site_name, and Impression_date.
• Aggregate: Add a new aggregation column using the CountRows function to count the number of impressions for each group.
- Create One-to-Many Relationships Between the Tables:
• Campaigns to Ads: campaigns.campaign_id to ads.campaign_id
• Ads to Impressions: ads.ad_id to impressions.ad_id
This sequence ensures that you first summarize the data, making it more efficient to establish relationships and perform further analysis.
Sample Data
Impressions Table:
Ad_id | Site_name | Impression_date | Impression_id |
1 | SiteA | 2023-08-01 | 1001 |
1 | SiteA | 2023-08-01 | 1002 |
2 | SiteB | 2023-08-01 | 1003 |
1 | SiteA | 2023-08-02 | 1004 |
2 | SiteB | 2023-08-02 | 1005 |
3 | SiteC | 2023-08-02 | 1006 |
Step 1: Group and Aggregate in Power Query
Load the Impressions Table into Power Query.
Group By:
• Go to the Home tab and select Group By.
• Group by Ad_id, Site_name, and Impression_date.
- Aggregate:
• Add a new aggregation column e.g. Impression_count
using the CountRows function to count the number of impressions for each group.
Resulting Power M Query
let
Source = Sql.Database("YourServer", "YourDatabase"),
Impressions = Source{[Schema = "dbo", Item = "Impressions"]}[Data],
GroupedImpressions = Table.Group(
Impressions,
{"Ad_id", "Site_name", "Impression_date"},
{{"ImpressionCount", each Table.RowCount(_), Int64.Type}}
)
in
GroupedImpressions
Resulting Aggregated Table:
Ad_id | Site_name | Impression_date | Impression_count |
1 | SiteA | 2023-08-01 | 2 |
2 | SiteB | 2023-08-01 | 1 |
1 | SiteA | 2023-08-02 | 1 |
2 | SiteB | 2023-08-02 | 1 |
3 | SiteC | 2023-08-02 | 1 |
Step 2: Create One-to-Many Relationships Between the Tables
- Campaigns to Ads:
• campaigns.campaign_id to ads.campaign_id
- Ads to Impressions:
• ads.ad_id to impressions.ad_id
Example Relationships
Campaigns Table:
campaign_id | name |
1 | Campaign A |
2 | Campaign B |
Ads Table:
ad_id | campaign_id | name |
1 | 1 | Ad A |
2 | 2 | Ad B |
3 | 1 | Ad C |
By following these steps, you can efficiently summarize your data and establish relationships, making your ad analytics system more manageable and insightful.
Subscribe to my newsletter
Read articles from Mohamad Mahmood directly inside your inbox. Subscribe to the newsletter, and don't miss out.
Written by
Mohamad Mahmood
Mohamad Mahmood
Mohamad's interest is in Programming (Mobile, Web, Database and Machine Learning). He studies at the Center For Artificial Intelligence Technology (CAIT), Universiti Kebangsaan Malaysia (UKM).