In the Media
See the buzz around Docugami in The New York Times, CIO, GeekWire, and more.
For Press Inquiries: media@docugami.com
Memory for the machine: How vector databases power the next generation of AI assistants
By Kyt Dawson, SiliconANGLE, May 28, 2025
Docugami needed to convert long-form unstructured enterprise documents, such as insurance policies or legal filings, into structured knowledge graphs for intelligent querying and validation. To approach this, the company uses 10 to 15 open-source large language models and the fast, in-memory data store from Redis Inc. to orchestrate them.
Going beyond RAG, Docugami uses vector databases to support its agentic systems by building knowledge graphs, where the AI system can track semantic elements and relationships across hundreds of pages and data points.
“We developed some agentic algorithms that double-check the values and find the ones that might be wrong,” Paoli said. “Our application uses vector databases at every level, not just for RAG, including to support these kinds of agentic systems.”
The Desperate Hunt for the A.I. Boom’s Most Indispensable Prize
By Erin Griffith, New York Times, August 2023
For the past year, Jean Paoli, chief executive of the artificial intelligence start-up Docugami, has been scrounging for what has become the hottest commodity in tech: computer chips.
In particular, Mr. Paoli needs a type of chip known as a graphics processing unit, or GPU, because it is the fastest and most efficient way to run the calculations that allow cutting-edge A.I. companies to analyze enormous amounts of data….
Two weeks ago, he struck gold: Docugami secured access to the computing power it needed through a government program called Access, which is run by the National Science Foundation, a federal agency that funds science and engineering. Docugami had previously won a grant from the agency, which qualified it to apply for the chip.
Should you build or buy generative AI?
By Mary Branscombe, CIO.com, July 2023
There are multiple collections with hundreds of pre-trained LLMs and other foundation models you can start with. Some are general, others more targeted. Generative AI startup Docugami, for instance, began training its own LLM five years ago, specifically to generate the XML semantic model for business documents, marking up elements like tables, lists and paragraphs rather than the phrases and sentences most LLMs work with. Based on that experience, Docugami CEO Jean Paoli suggests that specialized LLMs are going to outperform bigger or more expensive LLMs created for another purpose.
More Buzz About Docugami
Grounding Transformer Large Language Models with Vector Databases
By Simon Bisson, The New Stack, July 2023
Docugami Vendor Vignette
By Deep Analysis, April 2022
Docugami’s new model for understanding documents cuts its teeth on NASA archives
By Devin Coldewey, TechCrunch, April 2021
Google made sense of the web. Docugami does that for documents
By Josh Constine, Principal investor & Head of Content at SignalFire, November 2020
NLP Poised to Revolutionize the Enterprise
By Mary Branscombe, CIO, July 17, 2020
Announcing Grammarly’s Investment in Docugami
By Brad Hoover, CEO, Grammarly corporate blog, May 13, 2020
AI document engineering startup Docugami raises $10M seed round in unusually large early stage deal
By Todd Bishop, GeekWire, February 11, 2020