Legal search is one of the hardest things you can ask an AI system to do. If it misses a clause, ignores a date filter, or pulls from the wrong contract type, the answer is not just slightly off. It is useless. That is why the Weaviate demo is interesting: it shows how to build a legal assistant that can handle structured questions and still point back to the exact source passages.
Here is the simple version of the approach described in Weaviate’s Legal RAG App blog post.
Architecture Overview
The app uses three main pieces.
- Multimodal ingestion – Legal PDFs are embedded with a multivector model and Muvera compression. Each page is encoded directly as visual tokens, so layout, tables, and page structure are preserved without relying on OCR or manual chunking.
- Structured collections – Instead of one flat bucket of documents, contracts are split into three collections: Commercial Agreements, Corporate & IP Agreements, and Operational Agreements. That narrows the search space and gives the agent something useful to route against.
- Query Agent – The agent acts as the reasoning layer. It inspects the schema, builds filters, reranks results, and synthesizes answers from grounded source passages.
Why Agentic Search Beats Naive RAG
Traditional RAG is linear. A user asks a question, the system finds similar text, and the model writes an answer. That can work for broad questions, but legal work is different. A question like "What are the notice periods in our 2024 service agreements?" needs more than semantic similarity. It needs date filters, contract filters, jurisdiction awareness, and the ability to reason about search strategy.
The Query Agent fixes that by thinking about the query before it searches. It can inspect the schema, split a complex question into subqueries, create native filters, run hybrid, vector, or keyword search, and then rerank the results before it writes a final answer. Because the filters are applied before anything is ranked, the answer is built only from passages that actually match the question's constraints.
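To make the difference concrete, here is a minimal pure-Python sketch of filtering before ranking. It does not call Weaviate or the Query Agent; the `Clause` records, the keyword scorer, and both search functions are hypothetical stand-ins chosen only to illustrate the idea.

```python
from dataclasses import dataclass

@dataclass
class Clause:
    doc_id: str
    contract_type: str
    year: int
    text: str

# Hypothetical toy corpus, not data from the demo.
CORPUS = [
    Clause("msa-2023", "service", 2023, "Either party may terminate with 60 days written notice."),
    Clause("msa-2024", "service", 2024, "Either party may terminate with 30 days written notice."),
    Clause("nda-2024", "nda", 2024, "Notice of disclosure must be given within 10 days."),
    Clause("lease-2024", "lease", 2024, "Tenant shall maintain the premises."),
]

def keyword_score(query, text):
    """Count how many query words appear in the text (a stand-in for similarity)."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().replace(".", "").split())
    return len(query_words & text_words)

def naive_search(query, corpus):
    """Rank everything by similarity alone, ignoring metadata."""
    return sorted(corpus, key=lambda c: keyword_score(query, c.text), reverse=True)

def filtered_search(query, corpus, contract_type, year):
    """Apply structured filters first, then rank what is left."""
    candidates = [c for c in corpus if c.contract_type == contract_type and c.year == year]
    return naive_search(query, candidates)
```

With the query "notice", the naive ranking scores the 2023 service agreement and the NDA as highly as the 2024 service agreement, while `filtered_search("notice", CORPUS, "service", 2024)` returns only the 2024 service agreement.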
How To Build It
Step 1: Install The Weaviate Agent Skills
The blog post says you can install the Weaviate agent skills plugin directly in your coding environment. The point is to load the documentation and best practices into the agent so it can build the app with the right context.
npx skills add weaviate/agent-skills
/plugin marketplace add weaviate/agent-skills
/plugin install weaviate@weaviate-plugins
If you are using Cursor or another coding agent, the idea is the same: give the tool enough Weaviate context before you start generating the app.
Step 2: Set Up A Weaviate Cluster
Once the tooling is in place, run the quickstart flow and add your API keys. The demo uses the quickstart so the environment is ready before any documents are ingested.
/weaviate:quickstart
At this point, you can start with an empty collection. The prompt you give the coding agent will handle the rest.
Step 3: Define The Schema
Each object in the collection represents one PDF page and stores the page image, extracted text, and basic metadata such as contract type, document ID, and page number. The blog post also shows a multivector configuration with Muvera encoding.
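One way to picture a single page object is as a small record. The sketch below is only an illustration: `doc_page` matches the image field named in the blog post's configuration, but the other field names (`page_text`, `contract_type`, `document_id`, `page_number`) and the `make_page` helper are assumptions. The page image is base64-encoded because Weaviate stores BLOB properties as base64 strings.

```python
import base64
from dataclasses import dataclass, asdict

@dataclass
class ContractPage:
    doc_page: str       # base64-encoded page image (BLOB property)
    page_text: str      # extracted text for the page (hypothetical field name)
    contract_type: str  # hypothetical field name
    document_id: str    # hypothetical field name
    page_number: int    # hypothetical field name, 1-based

def make_page(image_bytes, text, contract_type, document_id, page_number):
    """Build one page record, encoding the raw image bytes as base64."""
    if page_number < 1:
        raise ValueError("page_number is 1-based")
    return ContractPage(
        doc_page=base64.b64encode(image_bytes).decode("ascii"),
        page_text=text,
        contract_type=contract_type,
        document_id=document_id,
        page_number=page_number,
    )

# asdict() gives a plain dict that could be used as an object's properties.
page = make_page(b"\x89PNG", "Section 1. Term.", "Commercial Agreements", "doc-001", 1)
properties = asdict(page)
```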
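import weaviate.classes as wvc  # provides the wvc.config namespace used below

# Multivector configuration for the page-image field, with Muvera encoding.
wvc.config.Configure.MultiVectors.multi2vec_weaviate(
    name='doc_vector',
    image_field='doc_page',
    model='ModernVBERT/colmodernvbert',
    encoding=wvc.config.Configure.VectorIndex.MultiVector.Encoding.muvera(
        ksim=4, dprojections=16, repetitions=20
    ),
)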
That structure matters because it keeps the page image and the metadata tied together instead of flattening everything into one blob of text.
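The three Muvera parameters in the configuration above control how large the compressed vector is. As a rough sketch, assuming the standard MUVERA fixed dimensional encoding (each repetition splits the space into 2^ksim buckets and stores a dprojections-sized vector per bucket), the encoded size can be estimated like this. The function name is hypothetical.

```python
def muvera_fde_dimension(ksim, dprojections, repetitions):
    """Estimate the fixed dimensional encoding size under the standard MUVERA scheme."""
    buckets = 2 ** ksim  # SimHash partitions per repetition
    return repetitions * buckets * dprojections

# Values from the configuration above: 20 * 16 * 16
print(muvera_fde_dimension(ksim=4, dprojections=16, repetitions=20))  # 5120
```

So every page, however many visual tokens it produces, is represented by one fixed-length vector for the initial search. Raising any of the three parameters increases that length.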
Step 4: Split The Data Into Three Collections
The demo separates the data into Commercial Agreements, Corporate & IP Agreements, and Operational Agreements. That way the Query Agent can route a question to the right part of the corpus instead of scanning everything.
This is a small design choice, but it is one of the reasons the system works. The agent gets structure, and structure makes retrieval smarter.
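A toy version of routing shows why the split helps. In the demo the Query Agent decides where to search by reasoning over the schema; the keyword table below is a hypothetical stand-in for that decision, and the keywords assigned to each collection are assumptions made only for this sketch.

```python
import re

# Hypothetical routing hints; the real agent reasons over the schema instead.
ROUTING_KEYWORDS = {
    "Commercial Agreements": {"service", "supply", "distribution", "reseller"},
    "Corporate & IP Agreements": {"license", "trademark", "merger", "ip"},
    "Operational Agreements": {"lease", "maintenance", "employment", "outsourcing"},
}

def route(question):
    """Return the collections whose keywords appear in the question.

    Falls back to every collection when nothing matches, so a vague
    question is searched broadly rather than not at all.
    """
    words = set(re.findall(r"[a-z]+", question.lower()))
    matches = [name for name, keywords in ROUTING_KEYWORDS.items() if words & keywords]
    return matches or list(ROUTING_KEYWORDS)
```

For the question "What are the notice periods in our 2024 service agreements?", this sketch searches only Commercial Agreements, while a question with no matching keyword falls back to all three collections.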
Step 5: Use Search Mode And Ask Mode
The Query Agent operates in two modes.
- Search mode focuses on discovery. It retrieves and reranks the most relevant contract sections for manual review.
- Ask mode turns the retrieved context into a direct answer with cited source passages.
That split keeps the system useful in both cases: when a user wants to inspect the documents and when they want a direct answer.
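The contract between the two modes can be sketched in plain Python. This is not the Query Agent's API: `Passage`, `search_mode`, and `ask_mode` are hypothetical names, and the "answer" here is just the top passages joined together where a real system would have a model synthesize it.

```python
from dataclasses import dataclass

@dataclass
class Passage:
    document_id: str
    page_number: int
    text: str
    score: float  # relevance after reranking; higher is better

def search_mode(passages, limit=3):
    """Discovery: return the top-scoring passages for manual review."""
    return sorted(passages, key=lambda p: p.score, reverse=True)[:limit]

def ask_mode(question, passages, limit=2):
    """Answering: build a reply from the top passages and cite each one."""
    top = search_mode(passages, limit)
    # Placeholder synthesis: a real system would generate the answer with a model.
    answer = " ".join(p.text for p in top)
    sources = [(p.document_id, p.page_number) for p in top]
    return {"question": question, "answer": answer, "sources": sources}
```

Both modes share the same retrieval step. Ask mode adds synthesis on top and keeps the document ID and page number of every passage it used, which is what lets the user check the answer against the source.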
Key Implementation Notes
- Use weaviate.WeaviateAsyncClient for the async backend.
- Import the module, not the variable, for dependency injection so the client is not None at request time.
- Request BLOB fields such as doc_page explicitly with return_properties; they are not returned by default.
- Split the COLLECTIONS environment variable on commas before passing it into AsyncQueryAgent.
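Two of these notes are plain Python behavior and can be shown without Weaviate. The sketch below uses a simulated `state` module, a placeholder string instead of a real client, and a hypothetical `parse_collections` helper; none of these names come from the demo.

```python
import os
import types

def parse_collections(raw: str) -> list[str]:
    """Split a comma-separated COLLECTIONS value into clean collection names."""
    return [name.strip() for name in raw.split(",") if name.strip()]

# Read the variable; an unset COLLECTIONS yields an empty list.
collections = parse_collections(os.environ.get("COLLECTIONS", ""))

# Simulated `state.py` module holding the shared client (hypothetical).
state = types.ModuleType("state")
state.client = None  # nothing is connected at import time

# Equivalent of `from state import client`: this copies the current value, None.
client = state.client

def startup():
    """Stand-in for an app startup hook that connects the client."""
    state.client = "connected-client"  # placeholder for a real client object

def get_client_stale():
    """Dependency using the imported variable: still None after startup."""
    return client

def get_client_fresh():
    """Dependency using the module attribute: sees the connected client."""
    return state.client

startup()
```

After `startup()` runs, `get_client_stale()` still returns None because the name was bound before the client existed, while `get_client_fresh()` looks the attribute up at request time and gets the connected client.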
What The Front End Looks Like
The demo is wrapped in a simple chat interface backed by FastAPI and Next.js. Users can ask natural-language questions across multiple collections, get grounded answers, and see the exact source pages those answers came from. The front end does not need to be flashy. It just needs to make retrieval, reasoning, and citation easy to use together.
Why This Matters
The bigger lesson is that naive RAG is only enough for simple use cases. As soon as the problem includes structure, filters, or any real need for precision, you need a reasoning layer on top. Weaviate gives you that layer by helping the agent search the right collection, build the right filters, and return answers that can be traced back to the source.
That is what makes the legal app production-ready. It is not just answering questions. It is answering them in a way the user can verify.
Background
What is Weaviate? A vector database and AI search platform for semantic retrieval, hybrid search, and structured query workflows.
Why is legal AI hard? Because legal questions depend on dates, jurisdictions, contract types, and exact clauses. Small retrieval mistakes can completely change the answer.