Exploring LINQ in .NET 6: Mastering DistinctBy and More

Jan Doubek

Since the inception of LINQ, methods suffixed with 'By' have played an important role in simplifying LINQ operations by introducing an additional 'selector' parameter.

A typical example is the OrderBy method. Without the selector parameter, even a straightforward task like sorting a list of objects by one of their properties would become needlessly complex, requiring a custom comparer implementation to select the appropriate property or properties for sorting.

// Using OrderBy to sort a list of Points based on the "X" member
var orderedByX = (new List<Point> { new Point(2, -2), new Point(-2, 2), new Point(0, 0) })
    .OrderBy(pt => pt.X);
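
To make the contrast concrete, here is a minimal sketch of both approaches side by side. The Point record struct and the PointXComparer class are hypothetical stand-ins defined only for this illustration, since the snippet above does not show which Point type it uses.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var points = new List<Point> { new(2, -2), new(-2, 2), new(0, 0) };

// Without a selector: copy the list and sort it with a hand-written comparer.
var sortedWithComparer = new List<Point>(points);
sortedWithComparer.Sort(new PointXComparer());

// With a selector: OrderBy extracts the sort key for us.
var sortedWithOrderBy = points.OrderBy(pt => pt.X).ToList();

Console.WriteLine(string.Join(", ", sortedWithComparer.Select(pt => pt.X))); // -2, 0, 2
Console.WriteLine(sortedWithComparer.SequenceEqual(sortedWithOrderBy));      // True

// Hypothetical Point type, assumed for this sketch only.
readonly record struct Point(int X, int Y);

// Hypothetical comparer that orders points by their X member.
sealed class PointXComparer : IComparer<Point>
{
    public int Compare(Point a, Point b) => a.X.CompareTo(b.X);
}
```

Both versions produce the same ordering, but the selector version needs no extra type and keeps the sort key visible at the call site.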

For as long as we've had these invaluable LINQ methods, there's been a desire for similar selector-based versions of other LINQ operators. A notable case is the LINQ Distinct method, where this extended functionality has been sorely missed. (I'm aware of the third-party libraries that have provided these capabilities, but having these features as part of the core framework is always preferable.)

This longing has finally been addressed with the arrival of .NET 6, which introduces out-of-the-box support for six new selector-based methods. These additions let us express key-based queries directly, without resorting to custom comparers or GroupBy workarounds.

The new methods are:

  • DistinctBy

  • ExceptBy

  • IntersectBy

  • MinBy/MaxBy

  • UnionBy

DistinctBy

DistinctBy returns the distinct items of a sequence, where two items count as duplicates when the selector function returns equal keys for them. In the current implementation, the first item encountered for each key is kept and later items with the same key are skipped. The method is evaluated lazily and makes a single pass over the source, which keeps key-based deduplication practical even for large datasets.

To illustrate, consider extracting unique Person objects from a list based on their Name. Entries that share a name are treated as duplicates, regardless of other properties such as age.

var people = new List<Person>
{
    new Person("Alice", 30),
    new Person("Bob", 25),
    new Person("Alice", 35)
};
// The result contains Person objects (one per distinct Name), not just the names
var distinctPeople = people.DistinctBy(p => p.Name);
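
For readers who want to run this, the following is a self-contained sketch of the same idea. The Person record is an assumed definition (the post does not show one), and the GroupBy query is included only as a point of comparison with how this was commonly written before .NET 6.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var people = new List<Person>
{
    new("Alice", 30),
    new("Bob", 25),
    new("Alice", 35)
};

// Keep one Person per distinct Name.
var distinctPeople = people.DistinctBy(p => p.Name).ToList();

foreach (var person in distinctPeople)
    Console.WriteLine($"{person.Name} ({person.Age})");
// Alice (30)
// Bob (25)

// A common pre-.NET 6 workaround: group by the key and take the first of each group.
var viaGroupBy = people.GroupBy(p => p.Name).Select(g => g.First()).ToList();
Console.WriteLine(distinctPeople.SequenceEqual(viaGroupBy)); // True

// Hypothetical Person type, assumed for this sketch only.
record Person(string Name, int Age);
```

Note that the second "Alice" (age 35) is dropped because an element with the same key was already seen.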

MinBy, MaxBy

In the world of relational databases, functions like MIN and MAX are fundamental tools for retrieving the smallest and largest values from a dataset. The MinBy and MaxBy methods introduced in .NET 6 offer a related capability for C# collections: rather than returning the smallest or largest value itself, they return the element whose selected key is the smallest or largest.

Suppose you have a list of products, each with a name and a price. You want to find the product with the lowest price. MinBy makes this straightforward:

var products = new List<Product>
{
    new Product("Laptop", 1200),
    new Product("Smartphone", 800),
    new Product("Tablet", 600)
};
var cheapestProduct = products.MinBy(p => p.Price);

Here is an example identifying the most recent log entry from a list of entries by selecting the one with the latest timestamp:

var logEntries = new List<LogEntry>
{
    new LogEntry(DateTime.Parse("2023-03-01"), "System startup"),
    new LogEntry(DateTime.Parse("2023-03-05"), "User logged in"),
    new LogEntry(DateTime.Parse("2023-03-03"), "System update completed")
};
var latestEntry = logEntries.MaxBy(entry => entry.Timestamp);
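
The difference between Min/Max and MinBy/MaxBy is easiest to see side by side. The sketch below assumes a simple Product record with a decimal Price (a hypothetical definition, since the post does not show one) and also illustrates what happens with an empty sequence of a reference type.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var products = new List<Product>
{
    new("Laptop", 1200m),
    new("Smartphone", 800m),
    new("Tablet", 600m)
};

// Min returns the smallest key value; MinBy returns the element that carries it.
var lowestPrice = products.Min(p => p.Price);
var cheapest = products.MinBy(p => p.Price);
var priciest = products.MaxBy(p => p.Price);

Console.WriteLine(lowestPrice);      // 600
Console.WriteLine(cheapest?.Name);   // Tablet
Console.WriteLine(priciest?.Name);   // Laptop

// For an empty sequence of a reference type, MinBy returns null rather than throwing.
var none = new List<Product>().MinBy(p => p.Price);
Console.WriteLine(none is null);     // True

// Hypothetical Product type, assumed for this sketch only.
record Product(string Name, decimal Price);
```

Because the result can be null for an empty source, it is worth guarding the access (here with `?.`) when the collection might contain no elements.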

UnionBy, ExceptBy, IntersectBy

The last three methods we'll cover in this post are UnionBy, ExceptBy and IntersectBy. These methods provide enhanced control over how sets are combined or compared based on specific properties or criteria.

UnionBy is used to combine two collections into one while eliminating duplicates based on a specified key selector.

  • Example: combining two lists of products from different sources and ensuring unique products based on product ID.

ExceptBy is used to create a collection that contains the elements of the first collection whose keys are not present in the second, again based on a specified key selector. Note that the second argument is a sequence of keys rather than a sequence of elements.

  • Example: removing products that are out of stock (present in an out-of-stock list) from a master product list.

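IntersectBy returns the elements of the first collection whose keys also appear in a second sequence, based on a specified key selector. Like ExceptBy, it expects that second sequence to contain keys, not elements.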

  • Example: finding common products in two different store inventories.

Here is an example scenario demonstrating all three methods to manage product inventories across multiple stores: using UnionBy to create a combined inventory list, ExceptBy to remove items that are out of stock, and IntersectBy to identify items that are common to all stores for a promotional campaign.

var store1Products = new List<Product>
{
    new Product(1, "Laptop"),
    new Product(2, "Smartphone"),
    new Product(3, "Tablet")
};

var store2Products = new List<Product>
{
    new Product(4, "Camera"),
    new Product(2, "Smartphone"),
    new Product(3, "Tablet")
};

var outOfStockProducts = new List<Product>
{
    new Product(2, "Smartphone")
};

// UnionBy: Combine product lists from both stores, avoiding duplicate IDs
var combinedInventory = store1Products
    .UnionBy(store2Products, p => p.ID);

// ExceptBy: Remove out-of-stock products from the combined inventory
var availableInventory = combinedInventory
    .ExceptBy(outOfStockProducts.Select(p => p.ID), p => p.ID);

// IntersectBy: Find common products available in both stores for a promotion
var commonProductsForPromotion = store1Products
    .IntersectBy(store2Products.Select(p => p.ID), p => p.ID);
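
To check what each query actually yields, here is a runnable sketch of the same scenario that prints the resulting product IDs. The Product record and the Ids helper are hypothetical and exist only for this illustration; the element order shown reflects the behaviour of the current implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var store1Products = new List<Product> { new(1, "Laptop"), new(2, "Smartphone"), new(3, "Tablet") };
var store2Products = new List<Product> { new(4, "Camera"), new(2, "Smartphone"), new(3, "Tablet") };
var outOfStockProducts = new List<Product> { new(2, "Smartphone") };

// UnionBy takes a second sequence of elements.
var combinedInventory = store1Products
    .UnionBy(store2Products, p => p.ID)
    .ToList();

// ExceptBy and IntersectBy take a second sequence of keys, hence the Select.
var availableInventory = combinedInventory
    .ExceptBy(outOfStockProducts.Select(p => p.ID), p => p.ID)
    .ToList();
var commonProductsForPromotion = store1Products
    .IntersectBy(store2Products.Select(p => p.ID), p => p.ID)
    .ToList();

Console.WriteLine(Ids(combinedInventory));           // 1, 2, 3, 4
Console.WriteLine(Ids(availableInventory));          // 1, 3, 4
Console.WriteLine(Ids(commonProductsForPromotion));  // 2, 3

// Hypothetical helper that formats the IDs of a product sequence.
static string Ids(IEnumerable<Product> items) => string.Join(", ", items.Select(p => p.ID));

// Hypothetical Product type, assumed for this sketch only.
record Product(int ID, string Name);
```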

EF Core

It's worth noting that these new methods are primarily designed for in-memory LINQ queries (LINQ to Objects). At the time of writing, Entity Framework Core (currently at EF Core 8) does not yet translate them to SQL, so they cannot be used directly in database queries.

Your Turn

Have you had a chance to work with these selector-based LINQ methods yet? I'm curious to hear how you're integrating the new .NET 6 methods into your coding projects. Are there any other methods you're hoping to see in future updates? Feel free to share your experiences and insights in the comments below.

For more deep dives into LINQ, don't forget to explore other posts in my LINQ Gems series.

