Cracking the hardest part: Consolidating Product Data

Fabian Wesner

The most important, and also the most complicated, feature of a marketplace is transforming sellers' product data into a great product catalog that shops can present to their end customers.

Why is this so hard?

Sellers are often manufacturers that don't have a proper product data management team in-house. Their data is often pulled directly from their warehouse management systems, and it's typically in poor shape. There are countless challenges:

- Incomplete product data (e.g., missing descriptions or images)
- Poor quality (e.g., there is a description, but it's unusable)
- The provided data is often in a different language than the shop's
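The first two problems can at least be detected automatically before any repair is attempted. Here is a minimal sketch of such a check. The field names (`title`, `description`, `images`) and the length threshold are assumptions for illustration, not part of any existing schema.

```python
# Hypothetical required fields and threshold; adjust to the shop's own schema.
REQUIRED_FIELDS = ("title", "description", "images")
MIN_DESCRIPTION_LENGTH = 40


def find_issues(product: dict) -> list[str]:
    """Return a list of human-readable data problems for one seller product."""
    issues = []
    for field in REQUIRED_FIELDS:
        # Treats None, "" and [] alike as "missing".
        if not product.get(field):
            issues.append(f"missing {field}")
    description = product.get("description") or ""
    # A description exists but is too short to be useful.
    if description and len(description.strip()) < MIN_DESCRIPTION_LENGTH:
        issues.append("description too short")
    return issues


print(find_issues({"title": "Oak table", "description": "table", "images": []}))
```

A length check is only a crude stand-in for "unusable"; judging real quality is one of the places where an LLM could help.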

In addition, when products are loaded into a marketplace, they need to be harmonized with the shop's data:

- Products must be categorized within the shop's category tree
- Duplicate products need to be detected and merged (e.g., two sellers offering the same product)
- Images must be evaluated and selected for quality and consistency
- Product attributes from sellers must be mapped to the shop's schema
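To make the duplicate-detection step concrete, here is a small sketch that groups offers from different sellers. It assumes each offer is a dict with an optional `gtin` plus `brand` and `title` fields; these names are hypothetical, and real data will need fuzzier matching than this.

```python
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not matter."""
    return " ".join(text.lower().split())


def group_duplicates(offers: list[dict]) -> list[list[dict]]:
    """Group offers that appear to describe the same product.

    Assumption: a shared GTIN identifies a product; without one, fall back
    to the normalized brand and title.
    """
    groups = defaultdict(list)
    for offer in offers:
        key = offer.get("gtin") or (normalize(offer["brand"]), normalize(offer["title"]))
        groups[key].append(offer)
    # Only groups with more than one offer are merge candidates.
    return [group for group in groups.values() if len(group) > 1]


offers = [
    {"seller": "A", "gtin": "4006381333931", "brand": "Acme", "title": "Oak Table"},
    {"seller": "B", "gtin": "4006381333931", "brand": "ACME", "title": "Table, oak"},
    {"seller": "C", "gtin": None, "brand": "Acme", "title": "Pine Shelf"},
]
print(group_duplicates(offers))
```

Exact-key grouping only catches the easy cases. Offers without a shared identifier and with differently worded titles slip through.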

The product attribute mapping is by far the hardest part because attributes may have different names, structures, or units. Imagine a shop selling tables with attributes like length and width in centimeters, while a seller provides a field called "尺寸" (dimension in Mandarin) containing data like "160x120", with units in inches, not centimeters. Before AI, solving this required either labor-intensive manual work or fragile, developer-maintained transformation rules that often broke when incoming data varied even slightly.
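For illustration, this is roughly what one such hand-written transformation rule looks like for the table example. It is a sketch under stated assumptions: the function name, the output keys `length_cm` and `width_cm`, and the "length first, then width" ordering are all made up for this example.

```python
import re

# Conversion factors to centimeters (1 inch is exactly 2.54 cm).
TO_CM = {"in": 2.54, "cm": 1.0}

# Matches two numbers separated by "x", "X", "×" or "*", e.g. "160x120".
DIMENSION_PATTERN = re.compile(
    r"\s*(\d+(?:\.\d+)?)\s*[x×*]\s*(\d+(?:\.\d+)?)\s*", re.IGNORECASE
)


def parse_dimensions(raw: str, unit: str = "in") -> dict[str, float]:
    """Turn a seller value like "160x120" into length and width in centimeters.

    Assumption: the first number is the length and the second is the width.
    """
    match = DIMENSION_PATTERN.fullmatch(raw)
    if match is None:
        raise ValueError(f"unrecognized dimension format: {raw!r}")
    if unit not in TO_CM:
        raise ValueError(f"unknown unit: {unit!r}")
    length, width = (round(float(value) * TO_CM[unit], 1) for value in match.groups())
    return {"length_cm": length, "width_cm": width}


print(parse_dimensions("160x120"))  # {'length_cm': 406.4, 'width_cm': 304.8}
```

The rule works for exactly this format and nothing else. A value such as "160 by 120 inches" or a third number for height already raises an error, which is the fragility described above.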

For my open-source seller center, I plan to use AI to handle some of these challenges. The LLM can be chosen per project, and prompts will be editable. If I find the time, I might start building it this week or next. Stay tuned!

Written by

Fabian Wesner

Hi, I'm Fabian, a hands-on CTO based in Berlin. My proudest achievement is co-founding and architecting Spryker, an enterprise e-commerce platform that Gartner recognizes as a Leader. Previously, I served as CTO at Project A Ventures and Rocket Internet, where I built and led development teams for startups across the globe.