
Introducing the Docugami KG-RAG Template for LangChain: Better Results than OpenAI Assistants


Updated 12/12/2023 with a new eval using LangSmith (SEC 10Q Filings 2023-12-11) and a new dataset that reflects real-life customer usage patterns.

At Docugami, we have built the world’s most advanced Foundation Model to convert long-form business documents (scanned PDFs, digital PDFs, DOCX, DOC) into semantic XML Knowledge Graphs. Real-world documents are more than just flat text, and Docugami is built to handle multi-page, long-form documents, including complex tables and multi-column flows, while producing an XML Knowledge Graph that faithfully represents each entire document, semantically and structurally.

Retrieval Augmented Generation (RAG) has recently gained traction as a popular use case that allows Large Language Models (LLMs) to reason over business-critical data that is often private to enterprises.  

RAG over simple text is a start, but RAG over semantic XML Knowledge Graphs (KG-RAG) is a game changer. Today, we are shipping the Docugami KG-RAG Template for LangChain, which allows customers to send their Docugami XML Knowledge Graph as input to LLMs. Our preliminary results indicate that Docugami KG-RAG significantly outperforms the document retrieval built into OpenAI’s GPTs and Assistants API.

|                                       | OpenAI Assistants | Docugami KG-RAG |
|---------------------------------------|-------------------|-----------------|
| Answer Correctness - SEC 10-Q Dataset | 33%               | 48%             |

It is important to note that these are unassisted results. Docugami is designed so business users can provide point-and-click feedback to the model. With just a few minutes of feedback, Docugami's results approach 100% accuracy. We will provide examples of this with more detail in the next few days.

We are sharing the results of the evaluation run here in LangSmith: SEC 10Q Filings 2023-12-11.

As a reminder, over the past year, Docugami has shipped integrations with open-source frameworks like LangChain and LlamaIndex that allow developers to build their own RAG solutions, and OpenAI has also recently announced built-in support for retrieval over documents in GPTs and the Assistants API. 
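
For developers who have not used that integration yet, here is a minimal sketch of loading a processed Docugami docset through the LangChain loader. The docset ID is a placeholder, a DOCUGAMI_API_KEY environment variable is assumed, and older LangChain versions import the loader from langchain.document_loaders instead.

```python
# Minimal sketch: load semantically chunked documents from a Docugami docset.
# Assumes a DOCUGAMI_API_KEY environment variable and a placeholder docset ID.
from langchain_community.document_loaders import DocugamiLoader

loader = DocugamiLoader(
    docset_id="YOUR_DOCSET_ID",  # placeholder: a docset already processed by Docugami
    document_ids=None,           # None loads every document in the docset
)
chunks = loader.load()

# Each chunk carries XML Knowledge Graph metadata (semantic tag, structure, source),
# ready to be embedded and indexed for retrieval.
print(chunks[0].metadata)
```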

We love all this momentum around RAG, but feel that there are still some gaps that need to be addressed: 

  1. Semantic tagging: RAG over simple text achieves a certain level of accuracy, but adding structural and semantic markup improves results significantly. We clearly saw this ourselves and are starting to see some independent research that agrees, e.g., here. 
  2. Handling complex structures: The adage “garbage in, garbage out” applies to RAG as well. The size of your LLM does not matter if your RAG pipeline gets bad text as input due to an inability to handle tables, multi-column layouts, complex reading orders like text that flows around inset boxes, document concepts like page headers/footers, etc. Docugami has been hardened over years of research, supported by the NSF and NASA, to work well across a wide variety of documents.
  3. Multi-modal RAG: Modern foundation models operate on more than just text inputs. Docugami’s XML markup contains visual bounding boxes of key document elements that can be sent as inputs to multi-modal LLMs for joint RAG with text. Docugami does this seamlessly for most common document formats, many of which are non-trivial to render as page images at scale. 
  4. Loss of context due to over-chunking: A core aspect of RAG is chunking the input documents so that only relevant chunks are retrieved (typically using a vector database) and sent to the LLM within reasonable context window limits. Even with very large context windows, cost and performance considerations make chunking a critical piece of the RAG puzzle. However, over-chunking can lead to loss of context, where the LLM cannot determine anything beyond what is inside the retrieved chunk. Docugami’s semantic XML Knowledge Graph lets us address this in several ways: semantic tags normalized across sets of documents, automatically generated XML summaries of entire documents that can be included in the RAG context, and user-curated views into the XML Knowledge Graph that support structured computations using XQuery and SQL. A generic sketch of the retrieve-small-chunks-but-return-parent-context idea is shown after this list.
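
To make the over-chunking point concrete, here is a generic sketch of the retrieve-small-chunks-but-return-parent-context pattern using LangChain’s MultiVectorRetriever. This is an illustration of the idea, not the template’s exact logic; the chunk text and parent summary below are placeholders.

```python
# Generic sketch: search over small semantic chunks, but hand the LLM larger parent
# context (e.g. a whole-document summary) so answers are not limited to one chunk.
import uuid

from langchain.retrievers.multi_vector import MultiVectorRetriever
from langchain.storage import InMemoryStore
from langchain_community.vectorstores import Chroma
from langchain_core.documents import Document
from langchain_openai import OpenAIEmbeddings

# Illustrative placeholders standing in for DocugamiLoader output.
chunks = [Document(page_content="Placeholder chunk: revenue figures for the quarter.")]
full_docs = [Document(page_content="Placeholder parent context: whole-document XML summary.")]

id_key = "doc_id"
vectorstore = Chroma(collection_name="kg_rag_sketch", embedding_function=OpenAIEmbeddings())
docstore = InMemoryStore()
retriever = MultiVectorRetriever(vectorstore=vectorstore, docstore=docstore, id_key=id_key)

# Index the small chunks for vector search, keyed back to their parent documents.
doc_ids = [str(uuid.uuid4()) for _ in full_docs]
for chunk, doc_id in zip(chunks, doc_ids):
    chunk.metadata[id_key] = doc_id
vectorstore.add_documents(chunks)
docstore.mset(list(zip(doc_ids, full_docs)))

# Retrieval matches on chunks but returns the parent documents, preserving context.
parents = retriever.get_relevant_documents("What was total revenue in the latest quarter?")
```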

We recently released a cookbook that allows any developer to do RAG over XML Knowledge Graphs (KG-RAG).  Today, we are excited to go beyond the simple cookbook and release a new end-to-end LangChain template for Docugami KG-RAG. With this template, you can quickly get up and running with KG-RAG in your own applications. You can check out the template here: https://github.com/docugami/langchain-template-docugami-kg-rag.
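
To give a feel for the developer experience, here is a minimal sketch of calling the template once it is installed and served locally with LangServe, the usual workflow for LangChain templates. The route name and the plain-string input are assumptions; check the template’s README for the exact details.

```python
# Minimal sketch: call the locally served KG-RAG template over LangServe.
# The route name and input shape are assumptions; see the template README.
from langserve import RemoteRunnable

kg_rag = RemoteRunnable("http://localhost:8000/docugami-kg-rag")
answer = kg_rag.invoke("What was the total revenue reported in the most recent 10-Q?")
print(answer)
```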

We have done some preliminary comparisons of Docugami KG-RAG using this new template against OpenAI GPTs under the following conditions: 

  1. We used the following set of PDFs (publicly available SEC 10-Qs): here 
  2. We set up an OpenAI Assistant with these PDFs enabled for retrieval. The system prompt used was the same as the one specified here for Docugami KG-RAG.
  3. For Docugami KG-RAG, we built the XML Knowledge Graph using our proprietary Foundation Model. Once the Knowledge Graph was created, we sent it as input to OpenAI’s GPT-4 Turbo LLM with 128k context, along with OpenAI’s text-embedding-ada-002 embeddings, using the Docugami KG-RAG template.
  4. We ran this eval notebook to compare the results between the OpenAI Assistant and Docugami KG-RAG: sec-10-q.ipynb. This notebook uses the LangSmith eval framework, with GPT-4 evaluating the correctness of the answers against the ground truth. A hedged sketch of this kind of evaluation run is shown after this list.
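
For readers who want to reproduce this kind of comparison, here is a hedged sketch of a LangSmith evaluation run similar to the one in the linked notebook: GPT-4 grades each generated answer against the dataset’s reference answer. The locally served chain, its route, and the exact evaluator configuration are assumptions; sec-10-q.ipynb has the real setup.

```python
# Hedged sketch of a LangSmith correctness eval: GPT-4 judges answers vs. ground truth.
from langchain.smith import RunEvalConfig, run_on_dataset
from langchain_openai import ChatOpenAI
from langserve import RemoteRunnable
from langsmith import Client

client = Client()  # requires a LangSmith API key in the environment

# Chain under test -- assumed here to be the KG-RAG template served locally via LangServe.
kg_rag_chain = RemoteRunnable("http://localhost:8000/docugami-kg-rag")

# The built-in "qa" evaluator uses an LLM to grade correctness against reference answers.
eval_config = RunEvalConfig(
    evaluators=["qa"],
    eval_llm=ChatOpenAI(model="gpt-4", temperature=0),
)

run_on_dataset(
    client=client,
    dataset_name="SEC 10Q Filings 2023-12-11",  # LangSmith dataset of questions + ground truth
    llm_or_chain_factory=kg_rag_chain,
    evaluation=eval_config,
)
```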

As mentioned, our preliminary results indicate that OpenAI Assistants are correct approximately 33% of the time, while Docugami KG-RAG is correct approximately 48% of the time. We are sharing the results of the evaluation run here in LangSmith: SEC 10Q Filings 2023-12-11.

We invite community feedback on these preliminary results and will be updating this template over time to add other capabilities, for example multi-modal RAG over figures and images inside documents. We will also be adding more test documents and questions, and will be asking the community to contribute.

Tag us @docugami to share your results and experience using our new Docugami KG-RAG template on your documents, or just reach out at https://www.docugami.com/contact-us.      
