Merging multiple OpenAPI spec files into one

Hamza Waleed
I recently had the opportunity to work within the Open edX ecosystem, and I plan to keep sharing the interesting parts of that experience. One of my tasks was the Enterprise Docs MFE, which requires you to be logged in to access it. Here's a screenshot (nothing confidential):

💡
The cover image is a snapshot of my desk setup 🫣, featuring my trusty standing desk—because I was done with too much sitting. Now, the next problem is I often forget to switch to standing mode! 💀

We use the Redoc package, which renders API documentation from endpoint definitions in OpenAPI format.

For more details, check out the Redoc documentation.

The intriguing challenge we faced was displaying endpoints from different Django services. Copying and pasting every endpoint by hand, along with its HTTP response formats (e.g., 200, 203, 4xx, 5xx), was not a realistic option. Conveniently, the OpenAPI definition files are auto-generated on the Django side: once you set up some Swagger views, the schema file is generated automatically.

Once this setup is complete, hitting the /schema endpoint returns all the endpoint definitions in this format:

Sample:

{
  "openapi": "3.0.0",
  "info": { "title": "API", "version": "1.0" },
  "paths": { "/example": { "get": { "description": "Example endpoint" } } }
}

Now, to the interesting part. We wanted to display a few endpoints from one service, a couple from another, and a few from a third. Our first approach was to write a custom script that would fetch and merge these YAML files.

Initially, this seemed straightforward: convert the YAML to JSON, iterate through the paths, and select the ones we needed. But here’s where it got tricky—an endpoint definition might reference other components, like this:

responses:
  '200':
    description: A detailed description of the Course.
    content:
      application/json:
        schema:
          $ref: '#/components/schemas/CourseResponse'

So even if we copied the endpoint, we’d still be missing the component definition that the endpoint references. To fix this, we had to update our script to handle these references and add the components at the end of the file if they hadn’t already been defined. This turned out to be a bit of a headache. But then, I found a solution—an existing package called openapi-merge.
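For anyone curious what that hand-rolled approach involves, here is a minimal sketch. It is not our actual script: the helper names (`collectRefs`, `resolvePointer`, `setPointer`, `pickPaths`) are made up for illustration, and it only follows local `#/...` references.

```javascript
// Sketch only: copy selected paths plus every component they reference.
// All helper names here are hypothetical, not taken from our code base.

// Undo JSON Pointer escaping ("~1" is "/", "~0" is "~").
const unescapePart = (part) => part.replace(/~1/g, '/').replace(/~0/g, '~');

// Walk any JSON value and record every local "$ref" string found in it.
function collectRefs(node, found = new Set()) {
  if (Array.isArray(node)) {
    node.forEach((item) => collectRefs(item, found));
  } else if (node && typeof node === 'object') {
    for (const [key, value] of Object.entries(node)) {
      if (key === '$ref' && typeof value === 'string' && value.startsWith('#/')) {
        found.add(value);
      } else {
        collectRefs(value, found);
      }
    }
  }
  return found;
}

// Look up a local pointer such as "#/components/schemas/CourseResponse".
function resolvePointer(doc, ref) {
  return ref
    .slice(2)
    .split('/')
    .map(unescapePart)
    .reduce((current, part) => (current == null ? undefined : current[part]), doc);
}

// Store a value at the same pointer in another document, creating parents as needed.
function setPointer(doc, ref, value) {
  const parts = ref.slice(2).split('/').map(unescapePart);
  const last = parts.pop();
  let current = doc;
  for (const part of parts) {
    current[part] = current[part] ?? {};
    current = current[part];
  }
  current[last] = value;
}

// Build a new spec containing only the wanted paths and whatever they reference.
function pickPaths(doc, wantedPaths) {
  const result = { openapi: doc.openapi, info: doc.info, paths: {} };
  for (const path of wantedPaths) {
    if (doc.paths?.[path]) result.paths[path] = doc.paths[path];
  }

  // Components can reference other components, so keep going until nothing new shows up.
  const seen = new Set();
  const queue = [...collectRefs(result.paths)];
  while (queue.length > 0) {
    const ref = queue.pop();
    if (seen.has(ref)) continue;
    seen.add(ref);
    const target = resolvePointer(doc, ref);
    if (target === undefined) continue; // dangling reference: nothing to copy
    setPointer(result, ref, target);
    collectRefs(target).forEach((nested) => queue.push(nested));
  }
  return result;
}
```

Calling `pickPaths(spec, ['/example'])` returns a smaller spec with just that path and its transitive components. Getting this right across three services, including name clashes between them, is roughly the bookkeeping that openapi-merge takes over.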

It’s simple to use. You pass it the parsed OpenAPI documents, and it merges them. Here’s a basic example:

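import { merge } from 'openapi-merge';

// oas1 and oas2 are OpenAPI documents already parsed into plain objects.
const mergeResult = merge([
  {
    oas: oas1,
    pathModification: { prepend: '/one' }
  },
  {
    oas: oas2,
    pathModification: { prepend: '/two' }
  }
]);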

The package merges the paths, components, and tags, giving priority to the first input and adjusting subsequent inputs to fit smoothly.

However, since we were dealing with remote files, we needed to adjust our approach:

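import yaml from 'js-yaml';

const [response1, response2, response3] = await Promise.all([
  fetch(`${process.env.URL_1}/api/schema/`),
  fetch(`${process.env.URL_2}/api/schema/`),
  fetch(`${process.env.URL_3}/api/schema/`)
]);

// fetch() only rejects on network errors, so check the HTTP status explicitly.
const failed = [response1, response2, response3].find((response) => !response.ok);
if (failed) throw new Error(`Schema request failed with status ${failed.status}`);

const [data1, data2, data3] = await Promise.all([
  response1.text(),
  response2.text(),
  response3.text(),
]);

// Keys match the ones mergeSchemas reads below; the URL-to-service order is assumed.
const schemas = {
  enterpriseAccess: yaml.load(data1),
  lm: yaml.load(data2),
  catalog: yaml.load(data3),
};

// LS_KEY is the cache key defined in the cache check further down.
localStorage.setItem(LS_KEY, JSON.stringify({ schemas, timestamp: Date.now() }));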

We use localStorage to cache the parsed schema files so they are not refetched on every page load. The cached copy is reused for 24 hours, which saves both network requests and processing time.

Next, we use the merge function to combine the schemas:

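import { merge, isErrorResult } from 'openapi-merge';

export const mergeSchemas = (schemas) => {
  const mergeResult = merge([
    // localSpecs (defined elsewhere) goes first, so it takes priority.
    { oas: localSpecs },
    { oas: schemas.enterpriseAccess, pathModification: { prepend: 'service-1' } },
    { oas: schemas.lm, pathModification: { prepend: 'service-2' } },
    { oas: schemas.catalog, pathModification: { prepend: 'service-3' } }
  ]);
  if (isErrorResult(mergeResult)) {
    throw new Error(`${mergeResult.message} (${mergeResult.type})`);
  }
  return mergeResult; // the merged spec lives in mergeResult.output
};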

Note: The prepend modification avoids conflicts by ensuring unique paths for each service. In some cases, we need to override the public endpoints for enterprise use, but that’s a different topic we can ignore for now.
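To see why the prefix matters, here is a small illustration. It only mimics the effect of a path prefix on plain objects and is not how openapi-merge works internally; `prependToPaths` and the `service-a::` style prefixes are hypothetical names chosen for the example.

```javascript
// Illustration only: a naive prefixing helper, not openapi-merge's implementation.
function prependToPaths(paths, prefix) {
  return Object.fromEntries(
    Object.entries(paths).map(([path, item]) => [`${prefix}${path}`, item]),
  );
}

// Two services that happen to expose the same path.
const serviceA = { '/api/v1/users/': { get: { summary: 'Users in service A' } } };
const serviceB = { '/api/v1/users/': { get: { summary: 'Users in service B' } } };

// Combined naively, the two entries share one key, so only one definition survives.
const clashing = { ...serviceA, ...serviceB };

// With a per-service prefix, both definitions survive under distinct keys.
const combined = {
  ...prependToPaths(serviceA, 'service-a::'),
  ...prependToPaths(serviceB, 'service-b::'),
};

console.log(Object.keys(clashing).length); // 1
console.log(Object.keys(combined).length); // 2
```

The prefixed keys are not meant to be pretty. They only need to be unique until the paths are renamed for display.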

Once the merge is done, we need to override specific parts of the schema before rendering it. This is where we define the endpoints we want to include and the modifications we need to make:

const API_ENDPOINTS = [
  {
    original_link: 'dummy::/api/v1/policy-allocation/{policy_uuid}/allocate/',
    new_link: '/dummy/v1/allocation/{policy_uuid}/allocate/',
    tag: 'Learner Credit',
    method: ['post']
  },
  {
    original_link: 'dummy::/api/v1/subsidy-access-policies/',
    new_link: '/dummy/v1/subsidy-policies/',
    tag: 'Learner Credit',
    method: ['get']
  }
];

We then iterate through these endpoints and modify them accordingly:

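// Holds the renamed path items that end up in the rendered spec.
const updatedPaths = {};
// Path items can also carry non-method keys (e.g. `parameters`), which are kept.
const HTTP_METHODS = ['get', 'put', 'post', 'delete', 'options', 'head', 'patch', 'trace'];

API_ENDPOINTS.forEach(({ original_link, new_link, tag, summary, method }) => {
  const pathItem = mergeResult.output.paths[original_link];
  if (!pathItem) return; // endpoint not present in the merged spec

  if (method) {
    method.forEach((methodType) => {
      // Only operations that already carry tags are re-tagged.
      if (pathItem[methodType]?.tags) {
        pathItem[methodType].tags = [tag];
        if (summary) {
          pathItem[methodType].summary = summary;
        }
      }
    });
    // Drop every HTTP method that was not asked for.
    Object.keys(pathItem).forEach((existingMethod) => {
      if (HTTP_METHODS.includes(existingMethod) && !method.includes(existingMethod)) {
        delete pathItem[existingMethod];
      }
    });
  }
  updatedPaths[new_link] = pathItem;
});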

This script ensures that only the required methods and tags are kept, and we remove any unwanted ones. It also updates the summary and swaps the original_link with new_link.

To avoid repeating all of this on every page load, we check localStorage first and reuse anything cached within the last 24 hours:

// Runs at the top of the schema-loading function, before any fetch.
const LS_KEY = 'apiSchemas';
const storedData = localStorage.getItem(LS_KEY);
const currentTime = new Date().getTime();

if (storedData) {
  const parsedData = JSON.parse(storedData);
  // Reuse the cached schemas while they are less than 24 hours old.
  if (currentTime - parsedData.timestamp < 24 * 60 * 60 * 1000) {
    return parsedData.schemas;
  }
}
// Otherwise, fetch and merge...

This checks if the cached data exists and was fetched within the last 24 hours. If not, it fetches, merges, transforms, and stores the new schema.
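A slightly more defensive variant of the same idea could look like the sketch below. The function names are hypothetical, and passing the storage object in as a parameter is my own choice so the logic can be exercised without a browser. It also treats a corrupt cache entry as a miss instead of letting `JSON.parse` throw.

```javascript
// Sketch only: a time-limited cache on top of any localStorage-like object.
const ONE_DAY_MS = 24 * 60 * 60 * 1000;

// Return the cached schemas if present, parseable, and younger than ttlMs; otherwise null.
function readCachedSchemas(storage, key, now = Date.now(), ttlMs = ONE_DAY_MS) {
  const raw = storage.getItem(key);
  if (!raw) return null;
  try {
    const { schemas, timestamp } = JSON.parse(raw);
    if (typeof timestamp !== 'number' || now - timestamp >= ttlMs) return null;
    return schemas ?? null;
  } catch {
    return null; // corrupt entry: treat it as a cache miss
  }
}

// Store the schemas together with the time they were fetched.
function writeCachedSchemas(storage, key, schemas, now = Date.now()) {
  try {
    storage.setItem(key, JSON.stringify({ schemas, timestamp: now }));
  } catch {
    // Quota exceeded or storage unavailable: caching is best-effort.
  }
}

// Use the cache when it is fresh, otherwise call fetchSchemas and store the result.
async function loadSchemas(storage, key, fetchSchemas) {
  const cached = readCachedSchemas(storage, key);
  if (cached) return cached;
  const fresh = await fetchSchemas();
  writeCachedSchemas(storage, key, fresh);
  return fresh;
}
```

In the browser this would be called as `loadSchemas(localStorage, 'apiSchemas', fetchAllSchemas)`, where `fetchAllSchemas` stands in for the fetch-and-parse step shown earlier.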

One odd issue I encountered: everything worked fine for me after deployment, but my teammates hit CORS errors, specifically in the stage environment. After some investigation, I realized that I had a CORS-unblocking extension installed in Chrome 🤦🏻‍♂️, which was hiding the error on my machine. Once I disabled it, the issue appeared for me too. The fix was straightforward: update the API URLs to point at the appropriate instance for each environment, driven by environment variables, and everything worked perfectly.

That’s it! I’ll continue sharing my learnings. I’ve got more on my to-do list, so fingers crossed!
