Building interm.ai – A Real-Time Helper with Electron, Swift & Bolt

Xin Shen

TL;DR
interm.ai is my cross-platform AI assistant that listens to calls and whispers suggestions in real time—think Cluely, but laser-focused on interviews & sales demos.

  • Electron desktop app for macOS / Windows

  • Swift CLI on macOS for lossless system-audio + screenshots

  • Google Speech-to-Text (Deepgram soon) → GPT-4o for sub-500 ms suggestions

  • Landing page on Bolt.new; demo videos generated by Claude Code


🛠️ What We Built

| Layer | Stack | Why It Matters |
| --- | --- | --- |
| Desktop app | Electron 30 + React | One code-base → .dmg, .exe, AppImage |
| Real-time AI | Google Speech-to-Text → GPT-4o (Deepgram planned) | < 500 ms latency |
| macOS helper | Swift CLI (ScreenCaptureKit) | Captures system audio + screenshots that Electron can’t |
| Marketing & demos | Bolt.new site + Claude Code | Ship landing pages & walkthroughs in hours |
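
The "< 500 ms" target is easiest to reason about as a hard budget on the suggestion step. Here is a minimal sketch of one way to enforce such a budget: race the model call against a timer and drop anything that arrives late. The `Suggest` callback and `suggestWithinBudget` are hypothetical names standing in for the real GPT-4o call, which the post does not show.

```typescript
// Hypothetical stage signature: the real app would wrap its GPT-4o request here.
type Suggest = (transcript: string) => Promise<string>;

interface TimedSuggestion {
  text: string | null; // null when the budget was exceeded
  elapsedMs: number;   // wall-clock time spent waiting
}

// Resolve with the suggestion if it arrives within budgetMs, otherwise with null.
async function suggestWithinBudget(
  suggest: Suggest,
  transcript: string,
  budgetMs = 500,
): Promise<TimedSuggestion> {
  const started = Date.now();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<null>((resolve) => {
    timer = setTimeout(() => resolve(null), budgetMs);
  });
  try {
    // Whichever settles first wins: the suggestion or the timeout.
    const text = await Promise.race([suggest(transcript), timeout]);
    return { text, elapsedMs: Date.now() - started };
  } finally {
    clearTimeout(timer); // avoid leaving a pending timer behind
  }
}
```

A late suggestion during a live call is usually worse than none, which is why this sketch returns `null` instead of waiting. Logging `elapsedMs` per call would also show how often the budget is actually met.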

🌐 Demos

Architecture Overview 🛠️

Swift CLI: Capturing macOS Audio & Screenshots 🎙️🖼️

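// AudioScreenshot.swift (needs macOS 14+ and the Screen Recording permission)
import AVFoundation
import ImageIO
import ScreenCaptureKit
import UniformTypeIdentifiers

// Receives system-audio buffers; forward them to the Electron app from here.
final class AudioSink: NSObject, SCStreamOutput {
  func stream(_ stream: SCStream, didOutputSampleBuffer buffer: CMSampleBuffer, of type: SCStreamOutputType) {
    guard type == .audio else { return }
  }
}

@main
struct AudioScreenshot {
  static func main() async throws {
    guard let display = try await SCShareableContent.current.displays.first else { return }
    let filter = SCContentFilter(display: display, excludingWindows: [])
    let config = SCStreamConfiguration()
    config.capturesAudio = true
    // 1️⃣ Capture raw system audio (the microphone is not part of this stream)
    let sink = AudioSink()
    let stream = SCStream(filter: filter, configuration: config, delegate: nil)
    try stream.addStreamOutput(sink, type: .audio, sampleHandlerQueue: DispatchQueue.global())
    try await stream.startCapture()
    // 2️⃣ Screenshot every 5 s (whole display here; narrow the filter for the active window)
    let url = URL.cachesDirectory.appending(path: "interm/frame.png")
    try FileManager.default.createDirectory(at: url.deletingLastPathComponent(), withIntermediateDirectories: true)
    while true {
      let image = try await SCScreenshotManager.captureImage(contentFilter: filter, configuration: config)
      if let dest = CGImageDestinationCreateWithURL(url as CFURL, UTType.png.identifier as CFString, 1, nil) {
        CGImageDestinationAddImage(dest, image, nil)
        CGImageDestinationFinalize(dest)  // → ~/Library/Caches/interm/frame.png
      }
      try await Task.sleep(for: .seconds(5))
    }
  }
}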
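
On the Electron side, the captured audio still has to match what the speech backend accepts. Google Speech-to-Text's `LINEAR16` encoding is uncompressed 16-bit signed PCM, so if the helper hands over 32-bit float samples (an assumption, since the post does not say which format the CLI emits), a small conversion step is needed. The function name `floatTo16BitPcm` is made up for this sketch.

```typescript
// Convert 32-bit float PCM samples (nominal range -1..1) to 16-bit signed PCM.
// Assumption: the Swift helper delivers Float32 samples to the Electron app.
function floatTo16BitPcm(samples: Float32Array): Int16Array {
  const out = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    // Clamp first so out-of-range samples saturate instead of wrapping around.
    const s = Math.max(-1, Math.min(1, samples[i]));
    // Negative and positive halves scale differently because int16 is asymmetric.
    out[i] = Math.round(s < 0 ? s * 32768 : s * 32767);
  }
  return out;
}

const pcm = floatTo16BitPcm(new Float32Array([0, 1, -1, 0.5, 2]));
// pcm holds [0, 32767, -32768, 16384, 32767]
```

The resulting `Int16Array` buffer can then be chunked and sent to the streaming recognizer. Clamping matters because system audio can briefly exceed the nominal range.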

Roadmap 🗺️

  1. On-device fallback with a tiny LLM for offline flights.

  2. CRM webhook – push call highlights straight into HubSpot.

  3. Switch speech backend to Deepgram once their low-latency best-word model exits beta.

  4. Open-source the Swift CLI (after tidying up the build script).


Try it Out & Give Feedback 🙌

Try Interview Terminator
