Docling in Working with Texts, Languages, and Knowledge


Introduction
Hi everyone. In the context of our research project, we were solving the problem of automating academic submission workflows, which led us to discover a platform called Docling.
Together, we explore the role of Docling in reshaping how research data can be represented, reused, and reasoned over in both human and machine-readable formats.
As part of developing the “Advanced Scientific Research Projects” (ASRP) initiative, aimed at creating a new-generation academic journal named ASRP.science, we encountered a number of methodological and technical challenges. One of the key issues was automating the parsing of academic research documents in order to streamline the submission, processing, and archiving of materials provided by researchers. Modern digital publications increasingly require tools that not only accept text submissions but also structure knowledge, annotate content, handle metadata, terminology, and interlinked concepts, and ensure that outputs are machine-readable for downstream use.
During the initial stages of analysis and prototyping, we formulated the following goal:
Develop or adapt a platform capable of effectively handling linguistically and conceptually annotated texts, compatible with academic publication formats, and supporting export and interoperability with LLM pipelines and knowledge bases.
While exploring available solutions, we came across an open-source project called Docling, designed for linguists, researchers, and digital humanists. Although originally built for working with language and text in a linguistic context, Docling turned out to be surprisingly well-aligned with our needs.
Why Docling Was a Relevant Choice:
Structured data format (JSON): Docling stores data in a structured, lightweight JSON format, which integrates easily into research pipelines and software development workflows. This made it easy for us to feed Docling outputs into other tools.
Graph- and tree-based knowledge representation: It supports graph and tree representations of knowledge, crucial for semantic parsing and linking concepts. A research paper in Docling isn’t just text; it becomes a network of interrelated nodes (e.g., sections, definitions, examples).
Flexible corpora creation: We can create flexible corpora, including lexical databases or grammatical descriptions, within the same environment. This was useful for building a “language archive” of terms and definitions encountered in our papers.
Visualization and extensibility: Docling offers visualization capabilities (tree views, tables) and is modular/extensible. We could visualize argument structures or sentence parse trees, then extend the platform with custom scripts.
Interface with AI systems: Perhaps most interestingly, Docling can serve as an interface for interaction with AI systems, including LLMs. The structured outputs (in JSON) can be fed into machine learning pipelines or used to improve prompt engineering by providing context in a structured way.
In the following sections, we will explore the architecture of Docling, its core features, example use cases, and the platform’s potential for adaptation within our research infrastructure.
Docling in Academic Submission Workflows
🧠 Key Points to Cover:
- Automatic structuring of research content: Docling allows users to break down complex research documents into modular, interlinked nodes (e.g. hypotheses, arguments, citations, definitions) instead of treating a document as one big block of text. For example, in our project pipeline we use Docling’s API to parse an uploaded PDF into a graph of nodes. Each node might correspond to a section heading, a paragraph, a bibliography entry, etc., all linked together by the document’s structure. Programmatically, this looks like:
```python
from docling.document_converter import DocumentConverter

# Parse a PDF file into a DoclingDocument and iterate over its nodes
converter = DocumentConverter()
docling_doc = converter.convert("sample_article.pdf").document

for node, _ in docling_doc.iterate_items():
    data = node.model_dump()  # convert the DocItem to a Python dict
    print(f"{data['label']}: {data['text'][:50]}...")
```
Code example: Using Docling to convert a PDF into structured nodes. In this snippet, each node is a Docling `DocItem` with a `label` (the type of content, e.g. Title, Heading, Paragraph) and its `text` content. This kind of automated segmentation means a submitted paper is not just a single blob of text; it’s a hierarchy of interrelated parts. For instance, running the above on a sample article prints something like:
```
Title: Forecasting Social, Geopolitical, and Economic Events Using the 'Banchenko-Technology'
Paragraph: The unconscious can be understood as an entity subject to certain laws and in a dynamic relationship with consciousness...
```
Each line represents a node (with its type and an excerpt of text). Under the hood, Docling has taken the document and broken it into pieces, classifying them (e.g., that first node was recognized as a Title).
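Once the nodes are dumped to plain dicts, even simple standard-library tooling can summarize the segmentation. A minimal sketch, assuming the nodes have already been converted via `model_dump()`; the labels and texts below are illustrative, not real Docling output:

```python
from collections import Counter

# Hypothetical node dicts as produced by model_dump(); fields simplified.
nodes = [
    {"label": "title", "text": "Forecasting Social, Geopolitical, and Economic Events"},
    {"label": "paragraph", "text": "The unconscious can be understood as..."},
    {"label": "paragraph", "text": "Relying on theoretical research..."},
]

# Tally how many nodes of each type the segmentation produced
counts = Counter(node["label"] for node in nodes)
print(counts["paragraph"])  # 2
```

Even this trivial tally is useful as a sanity check that a submission was segmented as expected (e.g., exactly one title node).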
Text + metadata hybrid: Each node in Docling can contain text (the content of that segment, such as a paragraph or example sentence) and associated metadata. Metadata might include the author, source, tags, time period, related terms, or any custom fields you define. This is ideal for academic articles that need to be parsed, sorted, and indexed. In our pipeline, for instance, we could attach metadata like page numbers or confidence scores to each node. Docling’s data model (built on Pydantic) makes it easy to handle these metadata fields as Python objects. (In code, we simply call `node.model_dump()` to get a JSON-ready dict of all of a node’s fields, including text and any annotations.)
Knowledge graph of the paper: A submitted paper is not just a static PDF: once in Docling, it becomes a mini knowledge graph. Sections, arguments, and concepts are connected and queryable. For example, a Conclusion section node might have links to multiple Results section nodes that it references. Docling inherently supports linking any node to any other, allowing us to represent relationships like “Figure X illustrates Theory Y” or “Definition A is used in Section B”. The result is that the linear document transforms into a network of information. Reviewers (or algorithms) can trace the logic of an argument through this tree/graph structure, instead of being confined to linear reading. This was a huge plus for us in thinking about machine-assisted peer review.
Better peer-review preparation: Because of the structured, node-based approach, reviewers can traverse the argumentation structure more intuitively. For example, they can quickly isolate the central hypothesis node and see all evidence nodes linked to it, rather than hunting through the text. This tree-like structure supports critical thinking and logic validation by making the paper’s knowledge graph explicit.
Interoperability with research platforms: Docling’s JSON structure can be easily integrated into other tools:
Research management tools (like Zotero): We can export bibliographic metadata or structured references from Docling and import them into citation managers.
Semantic indexing engines: Because each piece of content is a node with metadata, we can feed the collection into a semantic search or indexing system. For instance, one could dump all Docling nodes into an Elasticsearch index, enabling fine-grained search (find all occurrences of a certain concept, or all evidence supporting a given claim).
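A minimal in-memory sketch of this idea (not Elasticsearch): an inverted index over hypothetical Docling-style node dicts, mapping each term to the ids of the nodes that mention it. Node ids, labels, and texts are made up for illustration:

```python
from collections import defaultdict

# Hypothetical Docling-style node dicts
nodes = [
    {"id": "n1", "label": "paragraph", "text": "The digital divide affects rural schools"},
    {"id": "n2", "label": "conclusion", "text": "Investment narrows the digital divide"},
]

# Build the inverted index: term -> set of node ids containing it
index = defaultdict(set)
for node in nodes:
    for term in node["text"].lower().split():
        index[term].add(node["id"])

# Fine-grained search: which nodes mention "divide"?
print(sorted(index["divide"]))  # ['n1', 'n2']
```

A production setup would add tokenization, stemming, and label-aware filtering, but the principle is the same: because content arrives as discrete nodes, search granularity comes for free.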
Machine learning pipelines: Perhaps most significantly, the clean JSON output can feed ML pipelines for automated analysis or review support. We experimented with an AI agent that consumes Docling-structured data to extract insights. For example, given a Docling-parsed document, our AI agent can answer questions like “What are the author affiliations?” by traversing the JSON structure. We simply provide the agent with the structured text and ask for what we need. In code, it looked like this:
```python
import json

query = "Extract all metadata from the document and return a single JSON object."
input_data = {
    "messages": [("user", query)],
    "file_path": "sample_article.pdf",
}

# Invoke the LLM agent with the structured document as context
# (ainvoke is a coroutine, so it must be awaited inside an async function)
result_state = await agent_executor.ainvoke(input_data)
final_answer = result_state["messages"][-1].content
metadata = json.loads(final_answer)
```
Code example: Using a LangChain-powered agent to query a Docling document. In our FastAPI service, we pass the user’s query and the document’s file path into a LangChain agent (`agent_executor`). The agent has been configured with tools that interface with Docling: for example, one tool can get the document’s title, another can get the authors, and so on, all by utilizing the Docling-parsed content under the hood. The agent’s final answer (here, `final_answer`) is a JSON string, which we then parse into a Python dict, `metadata`. This metadata JSON is the result of the LLM analyzing the Docling structure of the document. It might look something like:

```json
{
  "title": "Forecasting Social, Geopolitical, and Economic Events Using the 'Banchenko-Technology'",
  "authors": ["Denis Banchenko", "Mykhailo Kapustin"],
  "abstract": "This article presents a study on the interdependence between subjective experience gained through lucid dreaming and objectively observable processes dependent on collective consciousness. Relying on theoretical research in the field of consciousness, collective conscious and unconscious, and experimental data, assumptions about the nature of such interrelation have been made. A concept is proposed for discussion on the formation of structures capable of utilizing such phenomena for controlled types of activities such as: making economic-political decisions, managing investments, and shaping social transformations. The article introduces a digital system for managing event forecasts and market trends developed by BlackRock Corporation. Various aspects of the market, including overvalued companies, potential risks, the impact of political projects on the financial world, as well as possible areas of future financial crises, are under the purview of the artificial intelligence used within BlackRock.",
  "doi": "10.33425/2690-8077.1119",
  "keywords": ["states", "predictions", "forecasting", "event", "synchronization"]
}
```
Example output: Here the agent has identified key fields from the document: title, authors, abstract, DOI, and keywords. This JSON was generated by the LLM after it utilized Docling to navigate the document’s structure. We could directly feed this structured output into downstream systems: for example, a database of article metadata, or a training corpus for an ML model.
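Before storing such LLM-generated JSON, it pays to validate it. A sketch of the sanity check we could run on the agent’s answer; the required field names mirror the example output above and are our own convention, not a Docling schema:

```python
import json

# Fields we expect the agent to return (our convention, illustrative only)
REQUIRED_FIELDS = {"title", "authors", "abstract"}

def parse_metadata(raw_answer: str) -> dict:
    """Parse the agent's JSON answer and fail loudly if key fields are missing."""
    data = json.loads(raw_answer)
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"agent answer missing fields: {sorted(missing)}")
    return data

answer = '{"title": "T", "authors": ["A. Author"], "abstract": "..."}'
metadata = parse_metadata(answer)
print(metadata["authors"])  # ['A. Author']
```

This keeps malformed or truncated LLM answers out of the downstream database instead of failing silently later.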
Reusable components: Once a term, concept, or quote is added as a node in Docling, it can be reused across documents or become part of a larger corpus. This is ideal for journals with recurring themes or researchers building a cross-referenced library of concepts. In our use, if multiple papers defined the same technical term, we could merge those into a single lexical node referenced by all occurrences — effectively creating a dynamic glossary.
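A sketch of that “dynamic glossary” idea: duplicate term definitions from several papers are merged into one lexical node that records every paper in which the term is used. Paper ids and entries here are illustrative:

```python
# Hypothetical per-paper term entries (illustrative data)
papers = {
    "paper-1": [{"term": "Digital Divide", "definition": "Unequal access to ICT."}],
    "paper-2": [{"term": "digital divide", "definition": "Unequal access to ICT."}],
}

glossary = {}
for paper_id, entries in papers.items():
    for entry in entries:
        key = entry["term"].lower()  # normalize so casing variants merge
        node = glossary.setdefault(key, {"definitions": set(), "used_in": []})
        node["definitions"].add(entry["definition"])
        node["used_in"].append(paper_id)

print(glossary["digital divide"]["used_in"])  # ['paper-1', 'paper-2']
```

Every occurrence now references the same lexical node, so updating a definition in one place updates it for the whole corpus.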
🎯 Goal: Demonstrate how Docling transforms the way we submit and process academic work — from static PDFs to dynamic, structured, and interoperable knowledge objects.
Docling’s Internal Structure: JSON, Nodes, and Knowledge Graphs
🔍 Key Points to Cover:
Everything is a Node: In Docling, both linguistic elements and text segments are stored as nodes in a graph. A word, a sentence, a comment, a lexical entry — all are treated as objects that can be linked. This uniform “node” abstraction means even metadata or annotations can be nodes. For example, an author name could be a node that links to a bibliography entry node or an affiliation node. In a linguistic context, a morpheme or gloss is a node that links to others (like a word node or a meaning node). For a research paper, we treated sections and paragraphs as nodes.
JSON-Based Format: Docling stores all data in clean, human-readable JSON. Each object/node has fields such as an `id` (unique identifier), a `type` or `label` (what kind of node it is), the `text` content, and potentially references to other nodes (via IDs in a `links` array). This structure is ideal for versioning, exporting, or feeding into NLP/LLM pipelines, because it’s both human-readable and machine-readable. For instance, a simplified node representation might look like:

```json
{
  "id": "node123",
  "type": "text",
  "label": "Sentence",
  "text": "The quick brown fox jumps over the lazy dog.",
  "links": ["node124", "node125"]
}
```
Example: The above JSON could represent a sentence node with two links (perhaps linking to nodes 124 and 125, which could be its translation or related notes). In practice, Docling’s JSON might have additional fields (like metadata or provenance info), but the principle is that everything is stored as JSON objects. In our project, we leveraged this by storing entire documents as collections of JSON nodes. Because it’s just JSON, we could easily serialize or deserialize the data, send it over an API, or store it in Git. In fact, Docling’s Python models use Pydantic, so we often used `model_dump()` and `model_dump_json()` to get JSON serializations of nodes or whole documents. This made integration with other systems trivial.
Graph Relationships: Relationships between nodes can encode various structures:
Parent-child (hierarchical structure): e.g., a paragraph node might have child nodes for each sentence, or a chapter node might have children for sections. In treebanking or syntactic analysis, parent-child links capture the parse tree.
Semantic links: e.g., relations like "related-to", "supports", "contradicts". For a scholarly article, you might link a Methodology section node to a Data node with a relation "uses data from". Or link a Conclusion node to a Hypothesis node with "supports" if the conclusion supports the initial hypothesis.
Alignment links: e.g., linking parallel texts or translations. In linguistics, this could link a sentence node in English to its Spanish translation node. In our academic context, we didn’t use this as much, but one could imagine linking an original article node to a node containing an AI-generated summary, for instance.
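These typed relationships can be sketched with plain dicts: evidence “supports” a hypothesis, a counterpoint “contradicts” it, and a query follows the links. Node ids, labels, and texts below are hypothetical:

```python
# Hypothetical nodes with typed links, stored as (relation, target-id) pairs
nodes = {
    "h1": {"label": "Hypothesis", "text": "Access correlates with performance."},
    "e1": {"label": "Evidence", "text": "Survey results.", "links": [("supports", "h1")]},
    "e2": {"label": "Evidence", "text": "Prior study.", "links": [("supports", "h1")]},
    "c1": {"label": "Counterpoint", "text": "Urban outlier.", "links": [("contradicts", "h1")]},
}

def linked(relation, target):
    """Return ids of nodes carrying the given relation to the target node."""
    return sorted(nid for nid, node in nodes.items()
                  if (relation, target) in node.get("links", []))

print(linked("supports", "h1"))  # ['e1', 'e2']
```

The same query shape answers review-style questions (“what contradicts this hypothesis?”) just by changing the relation name.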
Custom Annotations: Users can define their own annotation layers as needed. Docling isn’t limited to a fixed schema. Some examples of annotations one might include:
Grammatical tags (for linguistic corpora, e.g., marking noun, verb, tense on nodes representing words)
Glosses or translations of terms
Comments or notes (for peer review or personal notes, attached to any node)
Metadata like source, author, year, confidence scores, etc.
In our use case, we annotated nodes with provenance information: each DocItem carried a reference to the page number of the PDF it came from, via a `prov` field. This way, if our AI agent extracted a quote or a fact, we knew exactly which page of the original PDF it came from (which is important for verification). Because Docling’s data model allowed arbitrary fields, we simply added a `prov: {page_no: X}` field to each node when parsing.
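A minimal sketch of that provenance convention on plain dicts, assuming the page number was recorded for each parsed item; field names follow the `prov: {page_no: X}` shape described above:

```python
# Hypothetical parsed items with the page each one came from
parsed_items = [
    {"text": "Abstract of the paper...", "page_no": 1},
    {"text": "Conclusion of the paper...", "page_no": 9},
]

# Attach provenance to each node as it is created
nodes = [
    {"text": item["text"], "prov": {"page_no": item["page_no"]}}
    for item in parsed_items
]

# Verification step: which page does the second node come from?
print(nodes[1]["prov"]["page_no"])  # 9
```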
Multimodal Content Support: Though focused on text, Docling supports attachments like audio or images and can link these to transcript nodes. For example, a corpus of oral histories might have audio files linked to text transcripts as nodes. While our project dealt mainly with PDFs and text, this feature is great for digital humanists or linguists working with spoken language data — you can keep the media and the transcription aligned in the graph.
Export Capabilities: You can export data from Docling in multiple formats:
Complete project JSON: The entire corpus or project can be dumped as one JSON (or a folder of JSON files), which is perfect for backups or interchanging data with other systems.
Filtered subsets: For instance, you might export just the lexicon nodes, or just a particular branch of the tree (subtree export) if you only want a portion of the data.
CSV: Tabular export for lexicons or structured data (e.g., you could export a list of all example sentences with their translations in a CSV).
Static HTML: Useful for archiving or sharing, Docling can generate a read-only HTML view of your project (for example, a nicely formatted tree or a dictionary).
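The CSV path for lexicon-style nodes can be sketched with the standard library alone; the column names here are our own choice, not a fixed Docling export schema:

```python
import csv
import io

# Hypothetical lexicon nodes, one dict per entry
lexicon = [
    {"term": "digital divide", "gloss": "unequal ICT access", "source": "paper-1"},
    {"term": "infrastructure", "gloss": "network capacity", "source": "paper-2"},
]

# Write one CSV row per lexicon entry (StringIO stands in for a real file)
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["term", "gloss", "source"])
writer.writeheader()
writer.writerows(lexicon)

print(buffer.getvalue().splitlines()[0])  # term,gloss,source
```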
Visualization Options: Docling includes built-in tree and table views. Trees help linguists see sentence structure or help a researcher visually map the logic of an argument. Tables make it easy to scan lexical entries or grammatical paradigms. We found the tree view especially insightful when mapping the structure of arguments in a paper—seeing a visual tree of how evidence nodes branched from hypothesis nodes, for example, helped in both human understanding and designing AI prompts.
🧠 Goal: Emphasize that Docling isn't just a storage system — it's a modular, machine-readable representation of structured thought, designed for downstream applications in AI, linguistics, and academic publishing. The data model’s transparency (JSON and nodes) means it’s both human-readable for collaboration and machine-readable for computation.
Use Case: Docling in Action
🧪 Example Scenario: Research Paper on “Digital Inequality”
Imagine a researcher working on a study titled "Digital Inequality in Rural Education Systems." Rather than submitting it as a static PDF, the researcher imports the document into Docling to structure it as a dynamic knowledge object. Here’s how that might play out:
🧩 Step-by-Step Breakdown:
Import and Segmentation: The original text is ingested into Docling and segmented into its logical parts – say, introduction, hypotheses, methodology, findings, conclusion, etc. Each paragraph (or even each sentence, depending on granularity) becomes a node in the Docling environment. For instance, the introduction might be a parent node that contains child nodes for each paragraph.
Semantic Tagging and Annotation: Key terms like “digital divide,” “infrastructure,” “policy intervention,” and “socioeconomic status” are identified and tagged. In Docling, you might create nodes for each of these concepts and then link every mention of them throughout the document. These terms become clickable threads running through the study. So if “digital divide” is mentioned in both the introduction and the findings, both instances link back to the same concept node Digital Divide in the knowledge graph, forming semantic connections across the paper.
Argument Mapping: The central hypothesis — “Access to digital infrastructure strongly correlates with academic performance in rural regions” — is represented as a central node (perhaps of type Hypothesis). Supporting evidence nodes (data tables, citations of prior studies, survey results) are added as child nodes underneath, visually connected in a reasoning tree. In a Docling tree view, you would literally see the hypothesis at the center with branches to each piece of evidence supporting it. If there are counter-arguments or conflicting data, those could be nodes as well, linked with a relationship like contradicts or challenges.
Peer Collaboration: Docling’s collaborative features let a co-author or reviewer join in. A co-author might add comments to a specific argument chain, e.g. suggesting an alternative interpretation of the survey results node. Another researcher could even link this study’s corpus to a separate Docling corpus on urban digital policy, drawing connections between ideas across projects (for example, linking the concept node Digital Infrastructure in this rural study to a similar node in the urban study). This cross-project linking can enrich both projects’ context.
Export and Integration: Once the paper is structured, it can be exported for various uses. The entire graph of the paper can be exported as a JSON file for use in an LLM training pipeline or other AI analysis. (In our case, we actually did this for a test: after structuring a paper, we exported the metadata and content to a JSON and fed it to a language model to see if it could generate a summary. The structured JSON input made the summaries more accurate because the model could see the labeled sections and relationships, not just raw text.) A simplified HTML view of the paper’s graph can also be generated for a public repository or website, allowing readers to interact with the content without using Docling directly. Key metadata (like the fields shown in the earlier JSON example: title, authors, keywords, etc.) can be extracted and fed into indexing engines, so that the article is findable in a digital library with all its details.
Outcome: The study is no longer a flat document. It’s now:
Searchable: Because every piece of information is a node with explicit content and metadata, you can query the corpus (e.g., find all evidence nodes that support a certain type of claim, or search within conclusions across a corpus of papers).
Interactive: Readers or reviewers can click through the graph of nodes. One can jump from reading a sentence to seeing related data or definitions instantly.
Modular: Pieces of this paper can be detached or remixed. For example, the literature review section (as a subtree of nodes) could be reused or compared with the literature review of another paper on a similar topic.
Integrated with a broader ecosystem of knowledge: Since the content is structured and linked, it can connect to external corpora or knowledge bases. Our imagined Digital Inequality corpus could link to data sets, to policy documents, or to educational resources, making the paper a living part of a knowledge network rather than an isolated PDF on someone’s hard drive.
🎯 Goal: Illustrate how Docling transforms a traditional academic article into a living, connected artifact — one that supports richer interpretation, reuse, collaboration, and AI-driven exploration.
Connection to LLMs and AI
Docling’s structured approach to text makes it particularly powerful in the age of AI and large language models:
Structured corpora for training and analysis: Large Language Models (LLMs) learn better from structured, annotated corpora. Docling can provide a trove of well-structured text: for example, a treebank of sentences for a low-resource language, or a graph of arguments and evidence from academic papers. Instead of feeding an LLM raw text, we could feed it JSON from Docling that explicitly labels sections, relationships, and metadata. This could improve fine-tuning processes or prompt engineering, as the model can be guided by the structure (e.g., “use the Conclusion nodes of papers as training data for summarization”). The clear delineation of pieces (titles, abstracts, etc.) means we can easily assemble training sets of just those pieces.
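Assembling such a label-filtered training set is a one-liner once the content is node-structured. A sketch with illustrative labels and texts, following the “use the Conclusion nodes as training data” idea above:

```python
# Hypothetical labeled nodes from a parsed paper
nodes = [
    {"label": "Title", "text": "Digital Inequality in Rural Education Systems"},
    {"label": "Paragraph", "text": "We surveyed forty rural schools..."},
    {"label": "Conclusion", "text": "Access strongly correlates with performance."},
    {"label": "Conclusion", "text": "Policy intervention narrows the gap."},
]

# Keep only the pieces labeled as conclusions
training_set = [node["text"] for node in nodes if node["label"] == "Conclusion"]
print(len(training_set))  # 2
```

With raw text, the same selection would require fragile heading heuristics; with labeled nodes it is a simple filter.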
Docling as a reasoning interface for LLMs: One intriguing idea is to use Docling in combination with an LLM to perform logical reasoning or question-answering. Since Docling provides a knowledge graph of a document (or multiple documents), an LLM can navigate that graph via tools. In our project’s prototype, we set up an LLM-based agent with tools that knew how to fetch parts of the Docling graph (like “get_title”, “get_authors”, or even “find_node containing X”). The agent didn’t have to read the entire text blindly; it could ask for specific pieces via those tools. This MRKL-style (Modular Reasoning, Knowledge and Language) approach allowed the LLM to, for example, first retrieve the abstract node, then look for all nodes labeled as Conclusion, and then form an answer. Docling essentially acted as the database and interface through which the LLM reasoned about the content. The result was more accurate and traceable, because we knew which nodes the AI consulted for an answer.
Integration pipelines (JSON export to LLM input): We’ve shown an example of exporting a Docling project to JSON and using it as an input for an LLM. It’s worth noting that this integration doesn’t have to be custom: one could imagine a future where Docling has a direct plugin to feed data to an LLM or where an LLM tool is built into Docling for querying the corpus. In our use, we manually orchestrated it with Python code and LangChain, but the principle is general: *Docling provides structured knowledge; LLMs provide reasoning and language generation.* Together they can form an AI that is grounded in a curated knowledge base.
In summary, Docling’s compatibility with AI workflows comes from its structured, open format. JSON outputs and an object model mean you can easily write a script to take all nodes of type “Example Sentence” and send them to a translation model, or use an LLM to fill in missing links in the graph (imagine an AI that suggests new connections between nodes or auto-populates metadata based on content). This symbiosis of Docling and LLMs holds a lot of promise for both academic research and advanced NLP applications.
Expansion and Publishing
Finally, let’s consider how Docling scales up to collaborative projects, other tools, and educational settings beyond just individual research use.
Collaboration Features
Multi-user editing: Docling’s structure makes it easy for multiple researchers to work on the same corpus simultaneously. Because each piece of data is discrete (as nodes), collaborators can add or edit different parts of the project without stepping on each other’s toes. For instance, one person can be annotating examples in the corpus while another is writing commentary nodes elsewhere. (On a technical note, since everything is JSON, using version control like Git allows merging changes from different contributors relatively easily.)
Version control: Because data is stored as text-based JSON files, changes can be tracked using Git or similar systems. Every edit to a node shows up as a diff. In our experience, we put our Docling project under Git, and we could see line-by-line what changed in the JSON after each editing session. This transparency is fantastic for academic teams who need to maintain a clear history of how data (or analyses) evolve over time.
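One practical detail that keeps those Git diffs readable: serialize nodes deterministically, with sorted keys and a fixed indent, so the same node always produces the same lines. A minimal sketch:

```python
import json

# A node dict; key order in the source doesn't matter once we sort on output
node = {"text": "The quick brown fox...", "label": "Sentence", "id": "node123"}

# Sorted keys + fixed indent give byte-identical output for identical data
stable = json.dumps(node, indent=2, sort_keys=True)
print(stable.splitlines()[1])  #   "id": "node123",
```

Without deterministic serialization, two equivalent saves can produce spurious diffs that bury the real edits.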
Peer-review mode: Imagine a mode where a reviewer or editor can leave structured comments directly linked to specific nodes (rather than margin notes on a PDF). Docling can support this by treating reviewer comments as nodes linked to the sections or sentences they pertain to. This way, a peer review becomes part of the knowledge graph – the comment is anchored to the exact point of critique, and the author can address it, even track its resolution through versions.
Integration with Other Tools
Zotero: Citation managers like Zotero could integrate with Docling. For example, you could export all references from a Docling project (since they might be nodes of type Reference) to a format Zotero understands, or vice versa, import from Zotero to Docling. This helps keep your Docling corpus and your bibliography manager in sync.
ELAN and FLEx: Many linguists use tools like ELAN (for annotating audio/video transcripts) and FLEx (FieldWorks Language Explorer for building dictionaries and grammars). Docling can serve as a unifying platform by importing data from these tools. If you have an ELAN transcript of an interview, you could import it into Docling’s graph structure to link parts of the transcript to linguistic notes or translations. FLEx lexicons could be imported as Docling lexicon nodes. Conversely, Docling’s data could be exported to these formats if needed for analysis in those tools.
Obsidian & Notion: Personal knowledge management systems like Obsidian or Notion thrive on linking notes – something Docling does inherently. With Docling’s lightweight JSON exports or possible API endpoints, you could sync data to an Obsidian vault or a Notion database. For example, each Docling node could become a note in Obsidian (with backlinks corresponding to Docling links). This would bridge academic research data and personal notes, allowing researchers to move seamlessly between a Docling corpus and their broader note-taking system.
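A sketch of that bridge: each hypothetical node becomes a Markdown note, and its Docling links are rendered as Obsidian-style `[[wiki-links]]`. Node names and fields here are our own invention:

```python
# Hypothetical nodes with names, text, and outgoing links
nodes = {
    "Hypothesis": {"text": "Access correlates with performance.",
                   "links": ["Survey Results"]},
    "Survey Results": {"text": "Data from forty rural schools.", "links": []},
}

def to_note(name):
    """Render one node as a Markdown note with wiki-link backlinks."""
    node = nodes[name]
    backlinks = " ".join(f"[[{target}]]" for target in node["links"])
    return f"# {name}\n\n{node['text']}\n\n{backlinks}".rstrip()

print(to_note("Hypothesis"))
```

Writing each rendered note to a `.md` file inside an Obsidian vault would make the Docling graph browsable with Obsidian’s native backlink view.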
Educational Potential
Interactive assignments: Docling can be used in the classroom. Instead of giving students a static text, a teacher could give them a Docling corpus of an article or a short story. Students could be asked to explore it non-linearly – for instance, follow the argument nodes and identify where the argument might have weaknesses, or find all definitions of key terms via the graph. This trains a more critical, exploratory reading habit.
Custom teaching corpora: In language learning or linguistics education, an instructor can build a tailored corpus (a set of sentences, a mini-dictionary, etc.) in Docling for the class. Students can then use Docling to see the structure of sentences (treebanks), check glosses of words, hear audio linked to text, etc. It becomes a hands-on learning platform.
Case studies: An academic article ingested into Docling can become an interactive learning module. For example, a history paper in Docling might allow students to click on a date and see a timeline, or click on a reference and see the full source. Each section of the paper could link to background information nodes, dataset nodes, or multimedia. Essentially, Docling can turn a linear text into a multimedia knowledge experience, which is great for teaching complex material.
🧠 Goal: Show that Docling is not just a technical parsing tool, but also a collaboration platform, an educational resource, and an archival framework for future-oriented research workflows. Its uses extend from solo researchers structuring their notes, to teams building shared knowledge graphs, to teachers creating interactive content for students.
Conclusion: Why Docling Matters
✅ What We Especially Liked
Modular structure and JSON backbone: The platform’s reliance on simple JSON and discrete nodes makes it incredibly flexible and interoperable. We were able to integrate Docling into our existing workflows with minimal friction, and we know we can always export our data and use it elsewhere if needed. Machine-readability by default is a huge plus in an era of data-driven research.
Graph-based interface: Navigating complex research relationships and syntactic structures is intuitive with Docling’s graph view. It’s like having a mind-map of your document. This visual and structural approach is a refreshing change from the typical linear doc editor and unveils connections you might otherwise miss.
Open-source and lightweight: Docling is open-source and not a massive enterprise system, which means it’s easy to adopt, fork, or extend. We appreciate not being locked into a vendor and being able to contribute or customize the platform to fit our needs (for example, we wrote custom scripts on top of Docling’s data model, which was feasible because the code and data format are accessible).
Multidisciplinary scope: Docling appeals to linguists, AI researchers, digital humanists, and educators alike. Whether you’re documenting an endangered language, parsing legal documents, or teaching a literature class, the core idea of node-based, annotated text is universally useful. It’s rare to find a tool that can cater to such different audiences without feeling too narrow or too generic — Docling strikes a good balance.
🛠️ What Could Be Improved
Documentation: While the platform is conceptually elegant, its onboarding could benefit from more tutorials, templates, and real-world examples. We had to read through some code and community examples to fully grasp the best practices. For wider adoption, a gentle introduction (maybe a “Docling for Dummies” guide or more sample projects) would be valuable.
User Interface (UI): The current UI, though functional, may feel a bit bare-bones or technical for less tech-savvy users. A more polished, user-friendly interface (without losing the advanced features) could broaden Docling’s adoption. For instance, more drag-and-drop actions, or a guided mode for newbies, would help non-programmers embrace the tool.
Localization and accessibility: Better internationalization (i18n) support would make Docling more inclusive for non-English or multi-lingual research communities. Also, ensuring the interface meets accessibility standards (for users with disabilities) could improve its utility in educational settings.
🤔 Why We Chose Docling: In the context of building a research publication platform, we needed a tool that could structure scientific thought, not just store it. Docling stood out as a rare blend of linguistic depth, technical openness, and adaptability to modern AI workflows. Its potential as a reasoning interface, not just a text editor, aligned closely with our goal of developing next-generation academic tools that treat knowledge as dynamic and connected.
🔭 What’s Next: Going forward, we plan to do a pilot implementation in our journal’s submission pipeline, using Docling to parse and enrich selected articles. This means authors might submit a paper and get back a Docling representation alongside the usual PDF, which could then be used in peer review or in generating enhanced publication formats. We’re also building a custom JSON-to-LLM pipeline: essentially, using Docling’s export to feed an LLM that helps with semantic analysis (like automated highlights or consistency checks in a paper). Additionally, we are exploring developing teaching modules based on Docling for training students in argument mapping and linguistic analysis – imagine a classroom where students collaboratively annotate a text in Docling and then query it with an AI assistant.
🎯 Final Thought: Docling isn’t just another digital tool — it’s a philosophical shift in how we think about text, structure, and meaning. In a world increasingly driven by large language models and automated workflows, platforms like Docling help us preserve intention, logic, and nuance in our documents — one node at a time.