Removing duplicates in installed applications list in OSHI

Rohan Sarnad

PR: https://github.com/oshi/oshi/pull/2902

In a recent contribution to OSHI, a key improvement was made to how installed applications are fetched across OS types: deduplication of application entries while preserving their original order.

When querying Windows registry entries across both KEY_WOW64_64KEY and KEY_WOW64_32KEY, it is common to encounter duplicate entries for the same application. To solve this, we used Java's LinkedHashSet and LinkedHashMap, so that the final list is unique and consistently ordered.

Problem: Duplicate Application Entries

On 64-bit Windows systems, applications may be listed in both the 64-bit and 32-bit registry views, so a scan that reads both views can return the same application twice.

Solution: Deduplicate with LinkedHashSet

We changed the internal data structure used to accumulate applications to a LinkedHashSet<ApplicationInfo>:

Set<ApplicationInfo> appInfoSet = new LinkedHashSet<>();

// Populate the set
appInfoSet.add(app);

// Convert back to list for return
return new ArrayList<>(appInfoSet);

This deduplicates using equals()/hashCode() on ApplicationInfo while maintaining the original discovery order, something that HashSet would not guarantee.

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof ApplicationInfo)) {
            return false;
        }
        ApplicationInfo that = (ApplicationInfo) o;
        return timestamp == that.timestamp && Objects.equals(name, that.name) && Objects.equals(version, that.version)
                && Objects.equals(vendor, that.vendor) && Objects.equals(additionalInfo, that.additionalInfo);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, version, vendor, timestamp, additionalInfo);
    }

Code: https://github.com/oshi/oshi/blob/master/oshi-core/src/main/java/oshi/software/os/ApplicationInfo.java
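To make the effect concrete, here is a minimal, self-contained sketch of the same idea. It is not OSHI code: the App class is a hypothetical, simplified stand-in for ApplicationInfo (name and version only), and mergeViews and the sample application names are made up for illustration. It assumes two lists that play the role of the 64-bit and 32-bit registry views.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

public class DedupSketch {

    // Hypothetical, simplified stand-in for ApplicationInfo (name and version only).
    static final class App {
        final String name;
        final String version;

        App(String name, String version) {
            this.name = name;
            this.version = version;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) {
                return true;
            }
            if (!(o instanceof App)) {
                return false;
            }
            App that = (App) o;
            return Objects.equals(name, that.name) && Objects.equals(version, that.version);
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, version);
        }

        @Override
        public String toString() {
            return name + " " + version;
        }
    }

    // Merge two sources, keeping only the first occurrence of each equal entry.
    static List<App> mergeViews(List<App> view64, List<App> view32) {
        Set<App> seen = new LinkedHashSet<>();
        seen.addAll(view64);
        seen.addAll(view32); // entries equal to ones already present are ignored
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        List<App> view64 = Arrays.asList(new App("Zeta Tool", "1.0"), new App("Alpha Editor", "2.1"));
        List<App> view32 = Arrays.asList(new App("Alpha Editor", "2.1"), new App("Beta Viewer", "0.9"));

        System.out.println(mergeViews(view64, view32));
        // prints [Zeta Tool 1.0, Alpha Editor 2.1, Beta Viewer 0.9]
    }
}
```

The entry that appears in both lists is kept once, at the position where it was first seen, and the result is not re-sorted.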

Ordering of additionalInfo and equals()/hashCode()

The ApplicationInfo class includes a field: Map<String, String> additionalInfo. Originally, this was a plain HashMap, which does not preserve insertion order, so the same entries could be iterated or printed in a different order from one object to the next. Note that equals() and hashCode() themselves do not depend on that order: the java.util.Map contract defines both in terms of the map's entries, so two maps holding the same key-value pairs are equal and share a hash code, whichever implementation stores them.

To keep iteration order predictable, we replaced it with a LinkedHashMap, which guarantees insertion order:

this.additionalInfo = additionalInfo != null
        ? new LinkedHashMap<>(additionalInfo)
        : Collections.emptyMap();

This change was subtle but useful. Because the constructor copies the map it is given, later changes to the caller's map cannot alter an ApplicationInfo that is already stored in a Set; a hash code that changes after insertion would break our deduplication logic.
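How the two map types behave with respect to equals(), hashCode(), and iteration order can be checked directly. The sketch below is a standalone illustration, not OSHI code: the class MapOrderSketch and its helper methods are hypothetical, and the keys are borrowed from the test data later in this post. It relies only on documented java.util behaviour, namely that the Map contract defines equality and hash codes over the entries, and that LinkedHashMap iterates in insertion order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MapOrderSketch {

    // Build a LinkedHashMap with two illustrative keys, inserted in the given order.
    static Map<String, String> linked(String firstKey, String secondKey) {
        Map<String, String> m = new LinkedHashMap<>();
        m.put(firstKey, "value-of-" + firstKey);
        m.put(secondKey, "value-of-" + secondKey);
        return m;
    }

    // Snapshot of the keys in the order the map iterates them.
    static List<String> keysInIterationOrder(Map<String, String> m) {
        return new ArrayList<>(m.keySet());
    }

    public static void main(String[] args) {
        Map<String, String> a = linked("installLocation", "installSource");
        Map<String, String> b = linked("installSource", "installLocation");
        Map<String, String> hash = new HashMap<>(a);

        // Same entries, different insertion order: equal, with equal hash codes.
        System.out.println(a.equals(b));                  // true
        System.out.println(a.hashCode() == b.hashCode()); // true
        // Equality also holds across Map implementations.
        System.out.println(a.equals(hash));               // true

        // Iteration order is where LinkedHashMap differs: it follows insertion order.
        System.out.println(keysInIterationOrder(a)); // [installLocation, installSource]
        System.out.println(keysInIterationOrder(b)); // [installSource, installLocation]
    }
}
```

In other words, the map implementation affects the order in which additionalInfo entries are iterated or printed, while set membership of ApplicationInfo depends only on the entries themselves.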

Unit Tests for Deduplication and Hash Consistency

We added test cases to confirm ApplicationInfo behaves correctly in sets.

    @Test
    public void testEqualsAndHashCodeSameValues() {
        Map<String, String> info1 = new LinkedHashMap<>();
        info1.put("installLocation", null);
        info1.put("installSource",
                "C:\\ProgramData\\Package Cache\\{FE8C7838-D3E6-4CEA-87BE-216E42391827}v20.2.37.0\\");

        ApplicationInfo app1 = new ApplicationInfo("SQL Server Management Studio", "20.2.37.0", "Microsoft Corp.",
                1746576000000L, info1);
        ApplicationInfo app2 = new ApplicationInfo("SQL Server Management Studio", "20.2.37.0", "Microsoft Corp.",
                1746576000000L, new LinkedHashMap<>(info1));

        assertEquals(app1, app2);
        assertEquals(app1.hashCode(), app2.hashCode());
    }

    @Test
    public void testEqualsAndHashCodeDifferentVersion() {
        ApplicationInfo app1 = new ApplicationInfo("SQL Server Management Studio", "20.2.37.0", "Microsoft Corp.",
                1746576000000L, new LinkedHashMap<>());
        ApplicationInfo app2 = new ApplicationInfo("SQL Server Management Studio", "20.3.37.0", "Microsoft Corp.",
                1746576000000L, new LinkedHashMap<>());

        assertNotEquals(app1, app2);
    }

    @Test
    public void testDeduplicationWithListResult() {
        ApplicationInfo app1 = new ApplicationInfo("SQL Server Management Studio", "20.2.37.0", "Microsoft Corp.",
                1746576000000L, new LinkedHashMap<>());

        ApplicationInfo app2 = new ApplicationInfo("SQL Server Management Studio", "20.2.37.0", "Microsoft Corp.",
                1746576000000L, new LinkedHashMap<>());

        Set<ApplicationInfo> dedupedSet = new LinkedHashSet<>();
        dedupedSet.add(app1);
        dedupedSet.add(app2); // Duplicate

        List<ApplicationInfo> resultList = new ArrayList<>(dedupedSet);

        assertEquals(1, resultList.size());
        assertEquals(app1, resultList.get(0));
    }

Code: https://github.com/oshi/oshi/blob/master/oshi-core/src/test/java/oshi/software/os/ApplicationInfoTest.java

Conclusion

This contribution shows how choosing the right data structures (LinkedHashSet, LinkedHashMap) can improve data quality, stability, and correctness.
