Announcing v1.0.0 of the "oreilly-highlight-parser" library

Ruben Rangel

I’m an avid technical book reader and enjoy learning about software engineering. One of my favorite sources of information is the O’Reilly Learning Platform. I don’t make much use of the in-app notes feature, but I do use the highlight feature very often.

At the start of the year, I also started building a Second Brain in Capacities. My O’Reilly highlights would be great pieces of data for it, but I had already built up a sizable store of highlights from various books. Since “laziness is the father of innovation,” I looked for a way to engineer my way out of manually copying highlights from the app into Capacities over and over again. So I looked at the two tools and at how I could bridge the gap between them.

O'Reilly allows you to export your highlights as a CSV file, but raw CSV isn't the most convenient format to work with in an object-based knowledge app like Capacities. So, I built oreilly-highlight-parser, a small, open-source TypeScript library that makes it easy to parse and format O'Reilly highlights into JSON or Markdown for ease of use in different workflows.

What Does oreilly-highlight-parser Do?

This library helps you extract and format your O'Reilly highlights from a CSV export.

Key Features:

  • Multiple Parsing Methods – Choose between synchronous, callback-based, or streaming parsers based on your needs.

  • Formatted Output – Convert highlights into structured JSON or a clean Markdown format.

  • Optimized for Note-Taking – The Markdown formatter structures highlights neatly for use in apps like Capacities, Obsidian, or Notion. Each entry includes the quoted text from the highlight, the chapter name, the book title, and a hyperlink to the highlight in the O’Reilly app.
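To give a rough idea of what "CSV in, structured objects out" involves, here is a minimal synchronous sketch. It is not the library's API: the `Highlight` interface and the `parseCsv` and `parseHighlights` functions are hypothetical names used only for illustration, and the column order is assumed to match the example export shown later in this post.

```typescript
// Hypothetical shape for one highlight. Field names are illustrative,
// not the library's actual types.
interface Highlight {
  bookTitle: string;
  chapterTitle: string;
  dateOfHighlight: string;
  bookUrl: string;
  chapterUrl: string;
  annotationUrl: string;
  highlight: string;
  color: string;
  personalNote: string;
}

// Split CSV text into rows of fields, honoring double-quoted fields
// (which may contain commas, newlines, and "" as an escaped quote).
function parseCsv(text: string): string[][] {
  const rows: string[][] = [];
  let row: string[] = [];
  let field = "";
  let inQuotes = false;

  for (let i = 0; i < text.length; i++) {
    const ch = text.charAt(i);
    if (inQuotes) {
      if (ch === '"' && text.charAt(i + 1) === '"') {
        field += '"';
        i++; // skip the second quote of the escaped pair
      } else if (ch === '"') {
        inQuotes = false;
      } else {
        field += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === ",") {
      row.push(field);
      field = "";
    } else if (ch === "\n") {
      row.push(field);
      rows.push(row);
      row = [];
      field = "";
    } else if (ch !== "\r") {
      field += ch;
    }
  }
  // Flush the last row when the text does not end with a newline.
  if (field !== "" || row.length > 0) {
    row.push(field);
    rows.push(row);
  }
  return rows;
}

// Map each data row to a Highlight, assuming the column order of the
// example export (an assumption, not a guarantee about O'Reilly's format).
function parseHighlights(csv: string): Highlight[] {
  const [, ...dataRows] = parseCsv(csv); // drop the header row
  return dataRows.map((cols) => ({
    bookTitle: cols[0] ?? "",
    chapterTitle: cols[1] ?? "",
    dateOfHighlight: cols[2] ?? "",
    bookUrl: cols[3] ?? "",
    chapterUrl: cols[4] ?? "",
    annotationUrl: cols[5] ?? "",
    highlight: cols[6] ?? "",
    color: cols[7] ?? "",
    personalNote: cols[8] ?? "",
  }));
}
```

Calling `JSON.stringify(parseHighlights(csvText), null, 2)` on the result would give a JSON view of the same data. A streaming variant would follow the same state machine but consume the input chunk by chunk instead of as one string.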

Converting O'Reilly Highlights to Markdown

While oreilly-highlight-parser provides multiple output formats, Markdown was the primary driver for creating this tool. Here's an example of how the Markdown formatter works:

Example Highlight in CSV

| Book Title | Chapter Title | Date of Highlight | Book URL | Chapter URL | Annotation URL | Highlight | Color | Personal Note |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Clean Code: A Handbook of Agile Software Craftsmanship | Chapter 2: Meaningful Names | 2025-02-01 | https://learning.oreilly.com/library/view/-/9780136083238/ | https://learning.oreilly.com/library/view/-/9780136083238/chapter02.xhtml | https://learning.oreilly.com/library/view/-/9780136083238/chapter02.xhtml#5c2df9fe-5777-4f3c-9ee3-f5edfaa1 | Choosing good names takes time but saves more than it takes. | YELLOW | |

Converted Markdown Output

> Choosing good names takes time but saves more than it takes.
>
> \- Chapter 2: Meaningful Names, [Clean Code](https://learning.oreilly.com/library/view/-/9780136083238/chapter02.xhtml#5c2df9fe-5777-4f3c-9ee3-f5edfaa1)

This structured format makes it easy to copy into your note-taking system while keeping essential metadata like the book title and source link.

In my Second Brain, I keep a page for each book’s notes and link to those blocks from other pages.
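For readers curious how a conversion like the one above could work, here is a minimal sketch that produces the same blockquote shape. It is an illustration under assumptions, not the library's implementation: `HighlightForMarkdown` and `toMarkdownQuote` are hypothetical names, and the book title is used as given for the link text.

```typescript
// Hypothetical input shape; only the fields the Markdown output needs.
interface HighlightForMarkdown {
  highlight: string;
  chapterTitle: string;
  bookTitle: string;
  annotationUrl: string;
}

// Render one highlight as a Markdown blockquote with an attribution line.
function toMarkdownQuote(h: HighlightForMarkdown): string {
  // Prefix every line of the highlight so multi-line quotes stay inside
  // the blockquote; empty lines become a bare ">".
  const quoted = h.highlight
    .split(/\r?\n/)
    .map((line) => (line === "" ? ">" : `> ${line}`))
    .join("\n");

  // The hyphen is escaped ("\-") so Markdown renders a literal dash
  // instead of starting a list item inside the quote.
  const attribution = `> \\- ${h.chapterTitle}, [${h.bookTitle}](${h.annotationUrl})`;

  return [quoted, ">", attribution].join("\n");
}
```

The sketch does not escape Markdown special characters inside the highlight text or the titles, which a more careful formatter would likely need to handle.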

Tradeoffs and Future Ideas

When designing oreilly-highlight-parser, I aimed to keep it lightweight and extensible, but there are always tradeoffs and areas for improvement:

  • Parsing flexibility – Right now, the library expects a specific column structure from the O'Reilly CSV export. Future iterations might support more robust parsing options.

  • Additional output formats – Markdown and JSON are supported, but it could be useful to add formats like HTML, different Markdown formats, or maybe some kind of plugin system for direct-to-app integration.
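On the parsing flexibility point, one low-cost step is to check the header row up front and report exactly which columns differ, rather than silently producing misaligned data. The sketch below is only an illustration: `EXPECTED_COLUMNS` is taken from the example export in this post, and `findHeaderProblems` is a hypothetical helper, not part of the library.

```typescript
// Column names as they appear in the example export in this post.
// Treating this list as the full, stable schema is an assumption.
const EXPECTED_COLUMNS = [
  "Book Title",
  "Chapter Title",
  "Date of Highlight",
  "Book URL",
  "Chapter URL",
  "Annotation URL",
  "Highlight",
  "Color",
  "Personal Note",
] as const;

// Compare a parsed header row against the expected columns and return a
// human-readable list of mismatches. An empty list means the header matches.
function findHeaderProblems(header: readonly string[]): string[] {
  const problems: string[] = [];
  const actual = header.map((name) => name.trim());

  EXPECTED_COLUMNS.forEach((expected, index) => {
    const found = actual[index];
    if (found !== expected) {
      problems.push(
        `column ${index + 1}: expected "${expected}", found "${found ?? "(missing)"}"`
      );
    }
  });

  if (actual.length > EXPECTED_COLUMNS.length) {
    const extras = actual.slice(EXPECTED_COLUMNS.length).join(", ");
    problems.push(`unexpected extra columns: ${extras}`);
  }
  return problems;
}
```

A more flexible parser could go further and look columns up by name instead of by position, so that a reordered export would still parse.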

If any of these ideas resonate with you, or if you have feature requests, I'd love to hear your thoughts!


Try It Out and Contribute!

oreilly-highlight-parser is open source and available now on NPM and GitHub. If you're looking for a way to clean up and format your O'Reilly highlights effortlessly, give it a try.

🔗 GitHub: github.com/rubenrangel/oreilly-highlight-parser

📦 NPM: oreilly-highlight-parser

I'd love to hear how you're using it and what features you'd like to see next!
