IN THE MEDIA | Docugami Home

Memory for the machine: How vector databases power the next generation of AI assistants

By Kyt Dawson, SiliconANGLE, May 28, 2025

Docugami needed to convert long-form unstructured enterprise documents, such as insurance policies or legal filings, into structured knowledge graphs for intelligent querying and validation. To do this, the company uses 10 to 15 open-source large language models and the fast, in-memory data store from Redis Inc. to orchestrate them.

Going beyond retrieval-augmented generation (RAG), Docugami uses vector databases to support its agentic systems by building knowledge graphs, in which the AI system can track semantic elements and relationships across hundreds of pages and data points.

“We developed some agentic algorithms that double-check the values and find the ones that might be wrong,” Docugami CEO Jean Paoli said. “Our application uses vector databases at every level, not just for RAG, including to support these kinds of agentic systems.”


The Desperate Hunt for the A.I. Boom’s Most Indispensable Prize

By Erin Griffith, New York Times, August 2023

For the past year, Jean Paoli, chief executive of the artificial intelligence start-up Docugami, has been scrounging for what has become the hottest commodity in tech: computer chips.

In particular, Mr. Paoli needs a type of chip known as a graphics processing unit, or GPU, because it is the fastest and most efficient way to run the calculations that allow cutting-edge A.I. companies to analyze enormous amounts of data….

Two weeks ago, he struck gold: Docugami secured access to the computing power it needed through a government program called Access, which is run by the National Science Foundation, a federal agency that funds science and engineering. Docugami had previously won a grant from the agency, which qualified it to apply for the chips.


Should you build or buy generative AI?

By Mary Branscombe, CIO.com, July 2023

There are multiple collections with hundreds of pre-trained LLMs and other foundation models you can start with. Some are general, others more targeted. Generative AI startup Docugami, for instance, began training its own LLM five years ago, specifically to generate the XML semantic model for business documents, marking up elements like tables, lists and paragraphs rather than the phrases and sentences most LLMs work with. Based on that experience, Docugami CEO Jean Paoli suggests that specialized LLMs are going to outperform bigger or more expensive LLMs created for another purpose.


Docugami’s new model for understanding documents cuts its teeth on NASA archives

By Devin Coldewey, TechCrunch, April 2021

You hear so much about data these days that you might forget that a huge amount of the world runs on documents: a veritable menagerie of heterogeneous files and formats holding enormous value yet incompatible with the new era of clean, structured databases. Docugami plans to change that with a system that intuitively understands any set of documents and intelligently indexes their contents — and NASA is already on board.


Google made sense of the web. Docugami does that for documents

By Josh Constine, Principal investor & Head of Content at SignalFire, November 2020

When data gets structured, value emerges. We’ve seen it over and over. Google structured web links into PageRank. Facebook structured your social graph into content ranking. Tesla is turning footage of city streets into navigation algorithms. Documents are another near-infinite naturally occurring resource of unstructured data. Docugami can become a generational technology company by distilling what’s inside.


Announcing Grammarly’s Investment in Docugami

By Brad Hoover, CEO, Grammarly corporate blog, May 13, 2020

Grammarly exists to improve lives by improving communication. Our AI-powered writing assistant helps more than 20 million people around the world improve their English writing every day—and we’re always on the lookout for ways to help even more people!

Today I’m glad to announce a new step in our journey: an investment in the Seattle-based company Docugami.


In the Media...

Memory for the machine: How vector databases power the next generation of AI assistants

The Desperate Hunt for the A.I. Boom’s Most Indispensable Prize

Should you build or buy generative AI?

Grounding Transformer Large Language Models with Vector Databases

Docugami Vendor Vignette

Docugami’s new model for understanding documents cuts its teeth on NASA archives

Google made sense of the web. Docugami does that for documents

NLP Poised to Revolutionize the Enterprise

Announcing Grammarly’s Investment in Docugami

AI document engineering startup Docugami raises $10M seed round in unusually large early stage deal

Using machine learning to solve your dark data nightmare

Docugami unveils exec team

AI superpowers for everyday documents: Microsoft vet and XML co-creator unveils startup Docugami

Images

The Docugami team, with Ilya Kirnos of SignalFire, and Bob Muglia, celebrating the closing of the $10 million seed round in Seattle, February 2020.

Docugami board members Bob Muglia (left), Jean Paoli (center), and Ilya Kirnos (right), in Seattle, February 2020.

For Media Inquiries