Introducing TopVault

Teddy Reed

Welcome to the TopVault Tech blog! With this inaugural post, I will start going behind the scenes on how TopVault is built, a theme this series will continue as it describes the technology that runs TopVault.

The main themes are simple design and development, and zero or extremely minimal recurring costs.

Keeping these in mind, the application set out to solve three main problems:

  1. Be a highly accurate index for collectibles.

  2. Facilitate creating high scores for collectors.

  3. Enable detailed note taking for collectors.

These were problems I was facing as a collector of trading cards. No existing app had a detailed index of all the obscure variants of trading cards; the Jr. Rally Eevee Pokemon Card variation, for example, rarely shows up on collection-tracking apps, so tracking it meant leaving myself unstructured notes. High scores were a fun way to create competition, and detailed note taking meant recording time, place, and the other details needed for assembling a journal of sorts. I am a slow collector who focuses on the journey rather than the destination, and I like looking back on how and where I finally acquired certain hard-to-find items.

I personally wanted a solution for these, and I wanted it to be mobile first.

Assembling the tech stack

I had used the Ionic Framework with React to build an iOS app in the past. I knew how it worked, some of the limitations, and where it shined. Now that TopVault is released, looking back at this selection, it was indeed correct. I would choose it again and recommend it to others.

I am an advocate for the simplicity of monoliths, so the backend is a single Express service I call api-service. It runs on a small desktop machine with a GPU attached via Thunderbolt (more on that later). The database is SQLite with a small transaction management middleware.
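The transaction-management middleware can be sketched as a small wrapper that brackets a unit of work in BEGIN/COMMIT and rolls back on failure. This is a minimal sketch, not TopVault's actual code: the `SqliteLike` interface and `withTransaction` name are hypothetical stand-ins for whatever handle the real api-service uses.

```typescript
// Minimal sketch of transaction management around a SQLite handle.
// Assumption: the driver exposes an exec(sql) method (hypothetical interface).
type SqliteLike = { exec(sql: string): void };

// Run a unit of work in a transaction: BEGIN before, COMMIT on success,
// ROLLBACK (and re-throw) if the work throws.
function withTransaction<T>(db: SqliteLike, work: () => T): T {
  db.exec("BEGIN");
  try {
    const result = work();
    db.exec("COMMIT");
    return result;
  } catch (err) {
    db.exec("ROLLBACK");
    throw err;
  }
}
```

In an Express service, a wrapper like this can sit in route handlers (or be applied as middleware) so every request's writes either fully commit or fully roll back.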

All of the code lives in a single monolithic GitHub repo: the data model for all collectibles, any sample photos collected, the frontend, the backend, website HTML, and other resources that benefit from change management. Almost all of the code is TypeScript, due to Ionic and React and how simple it is to stay in a single language. The exception is some string and fuzzy matching, which is optimized to run in golang.

The backend is served via Cloudflare using their tunnel client, and the single service is brought up with a Docker Compose configuration, with a fair amount of best-practice hardening: Docker in rootless mode and a few other sandboxing techniques.

The app uses machine learning in two specific cases. The first is on-device collectible matching, where OpenCV and an OCR client are bundled as WASM executables. These optimize user images to do bounding-box detection, then extract perceptual hashes and text content, which are matched within api-service. Second, the FLUX.1-schnell open-weight model is used to generate friendly avatars for collectors.
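The server-side matching step can be illustrated with Hamming distance over perceptual hashes. This is a sketch under assumptions, not TopVault's implementation: it assumes 64-bit hashes encoded as 16-character hex strings, and the `matchHash` function and distance threshold are hypothetical.

```typescript
// Hamming distance between two equal-length hex-encoded hashes:
// XOR each 4-bit nibble and count the set bits.
function hammingDistance(a: string, b: string): number {
  let dist = 0;
  for (let i = 0; i < a.length; i++) {
    let x = parseInt(a[i], 16) ^ parseInt(b[i], 16);
    while (x) { dist += x & 1; x >>= 1; }
  }
  return dist;
}

// Return the id of the closest known collectible within a bit-distance
// threshold, or null if nothing is close enough (threshold is illustrative).
function matchHash(
  query: string,
  known: { id: string; hash: string }[],
  maxDist = 10,
): string | null {
  let best: { id: string; dist: number } | null = null;
  for (const item of known) {
    const d = hammingDistance(query, item.hash);
    if (d <= maxDist && (!best || d < best.dist)) best = { id: item.id, dist: d };
  }
  return best ? best.id : null;
}
```

Perceptual hashes tolerate small image differences (lighting, slight crops), which is why a distance threshold, rather than exact equality, is the natural match criterion.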

Design challenges

After completing the app and analyzing how these decisions worked in practice, I found two main drawbacks:

  1. Using Express for the backend means I cannot easily optimize code paths with multi-processing.

  2. Using a single-page-app (SPA) design means the web version cannot easily have SEO and Open Graph headers present for various share features.

The problem with (1) is nicely solved by breaking up the monolith, but that slowly erodes simplicity. A better choice would have been to relax the simplicity goal and write the entire backend in golang.
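For completeness, Node does offer an escape hatch short of rewriting the backend: CPU-bound work can be pushed off the event loop with `worker_threads`. This is a sketch of that alternative technique, not what TopVault does (TopVault moved its hot path to golang instead); the inline summing stands in for heavy work like fuzzy matching.

```typescript
import { Worker } from "node:worker_threads";

// Offload a CPU-bound job to a worker thread so the Express event loop
// stays responsive. The worker body is inlined via eval for a
// self-contained sketch; a real service would point Worker at a file.
function runInWorker(input: number[]): Promise<number> {
  const body = `
    const { parentPort, workerData } = require("node:worker_threads");
    // Stand-in for heavy work (e.g. fuzzy matching a large index).
    const sum = workerData.reduce((a, b) => a + b, 0);
    parentPort.postMessage(sum);
  `;
  return new Promise((resolve, reject) => {
    const w = new Worker(body, { eval: true, workerData: input });
    w.once("message", resolve);
    w.once("error", reject);
  });
}
```

The friction is visible even in this sketch: data must be serialized across the thread boundary, and worker lifecycle management adds code to every hot path, which is why this never feels "easy" in an Express monolith.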

The problem with (2) comes from the SPA needing to load, talk to a backend, and execute JavaScript to update elements in `<head>` before social previews can work. Almost no social apps support this; they expect the metadata to be rendered server-side.
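The fix crawlers expect can be sketched as a small server-side step that builds the Open Graph tags before the SPA's HTML is returned. This is a hypothetical helper, not TopVault's code; the field names mirror the standard `og:` properties.

```typescript
// Escape text destined for an HTML attribute value.
function escapeHtml(s: string): string {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;")
          .replace(/>/g, "&gt;").replace(/"/g, "&quot;");
}

// Build the <meta> tags a share URL needs. A small server-side handler
// (or edge worker) would inject this into the SPA's index.html <head>
// before responding, so crawlers see the metadata without running JS.
function openGraphHead(meta: { title: string; image: string; url: string }): string {
  return [
    `<meta property="og:title" content="${escapeHtml(meta.title)}">`,
    `<meta property="og:image" content="${escapeHtml(meta.image)}">`,
    `<meta property="og:url" content="${escapeHtml(meta.url)}">`,
  ].join("\n");
}
```

Because the rest of the page is still the SPA, only this thin rendering layer needs to run server-side, which keeps most of the single-page design intact.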

The single recurring cost

The data storage for all collectible sample photos is a GitHub repo, and over time this has grown to over 10GB of photos. Each photo is fairly small after intense compression, so the set fits nicely into git; however, it makes CI/CD cumbersome if this data is needed every time tests run, even with a shallow clone. To accommodate this, the data is placed into Git LFS, and that bandwidth costs $5/month.

There is a way to self-host the LFS service, but the authentication solutions are not simple and require exposing too much for my tolerance. In this case, the monthly fee is well worth the simple authentication GitHub offers, and it can be migrated to a self-hosted solution if needed.
