Legal Gen AI ChatBot





Industry: Legal Tech
Customer: Ozmosys
Technologies: Python, Next.js, ChromaDB, PostgreSQL, Elastic

Areas of expertise: Legal Tech, AI, Generative AI, NLP, LLM, Data Science, Model Training


Law firms and legal departments deal with a vast number of legal documents, including contracts, case files, briefs, and research materials. Efficiently searching through these documents to find relevant information is critical for lawyers and legal professionals to build strong cases, provide accurate advice, and make informed decisions.

The Ozmosys solution can be customized to create a powerful legal document search and retrieval system. It leverages large language models such as LLaMA-2-13B for tasks including text summarization, text tagging, and text citation. We use Predibase for text embedding; Predibase also simplifies integrating different LLMs and switching between them with relative ease. The resulting collection of embeddings, called a vector store, can be stored efficiently in several vector databases, including Pinecone, AtlasDB, and ChromaDB. We chose ChromaDB for this project.

When a user asks a question, we find the relevant documents by first embedding the user’s question with Predibase and then performing a similarity search over the vector store. This process ensures that the most relevant documents are identified and presented to the user.
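The core of this step is a similarity search over embedding vectors. A minimal sketch of the idea, using tiny hand-written 3-dimensional vectors in place of real Predibase embeddings and a plain dictionary in place of ChromaDB (document IDs such as `contract_17` are invented for illustration):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def find_relevant(question_vec, doc_vecs, top_k=2):
    """Rank document IDs by similarity of their vectors to the question vector."""
    scored = sorted(doc_vecs.items(),
                    key=lambda kv: cosine_similarity(question_vec, kv[1]),
                    reverse=True)
    return [doc_id for doc_id, _ in scored[:top_k]]

# Toy embeddings standing in for Predibase output.
docs = {"contract_17": [0.9, 0.1, 0.0],
        "brief_04":    [0.1, 0.8, 0.3],
        "case_file_2": [0.2, 0.2, 0.9]}
question = [0.85, 0.15, 0.05]  # embedding of the user's question
print(find_relevant(question, docs, top_k=1))  # → ['contract_17']
```

In production the ranking is delegated to the vector database rather than computed in application code; the sketch only shows what "similarity search" means.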




The goal of the project was to develop an innovative solution that lets legal professionals find specific information related to a case or legal issue through a user-friendly chat interface: they ask questions or enter keywords describing the information they need, and the system performs a similarity search over the indexed legal documents and identifies the most relevant sections or passages.

The system must then present the attorney with the relevant excerpts from the documents, along with citations and links to the original sources. This allows the attorney to quickly review the information and determine its relevance to his or her case or research.

In addition, the system can provide a summary of the key points from the identified documents, saving the attorney time and effort in reading through lengthy materials. The attorney can further refine his or her search by asking follow-up questions or providing additional context, allowing for interactive exploration of the legal documents.

The most important requirement for the search and knowledge retrieval tool was data security.



By implementing this solution, law firms and legal departments can significantly improve their document search and retrieval efficiency. Lawyers can find the information they need faster, allowing them to focus on analyzing content and developing strong legal strategies. The system can also help uncover relevant information that may have been missed in manual searches, potentially strengthening cases and improving outcomes.

The Ozmosys solution can be adapted to create a powerful legal document search and retrieval system. The system architecture consists of the following components:

Document ingestion and pre-processing

Legal documents in various formats (PDF, DOC, TXT, etc.) are ingested into the system.

The documents are pre-processed, which includes text extraction, optical character recognition (OCR) for scanned documents, and data cleaning to remove irrelevant information.
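The data-cleaning step can be sketched as follows. This is a deliberately minimal example (the production pipeline also handles OCR and multiple file formats); the page-number pattern and the sample text are invented for illustration:

```python
import re

def clean_text(raw: str) -> str:
    """Minimal clean-up: drop blank lines and bare page numbers,
    then collapse runs of whitespace."""
    lines = []
    for line in raw.splitlines():
        line = line.strip()
        if not line or re.fullmatch(r"(Page\s+)?\d+", line):
            continue  # skip blanks and page-number artifacts
        lines.append(line)
    return re.sub(r"\s+", " ", " ".join(lines))

raw = "CONTRACT OF SALE\n\nPage 1\nThe   Seller agrees to\ntransfer title.\n2\n"
print(clean_text(raw))
# → CONTRACT OF SALE The Seller agrees to transfer title.
```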


Document indexing

The pre-processed documents are indexed using the LLaMA model for text summarization and the Predibase API for text embedding.

Each document is summarized using the LLaMA model to capture its key points and main ideas.

The summarized documents are then embedded using the Predibase API to create a vector representation of each document.

The vector embeddings are stored in a vector database such as Pinecone, AtlasDB, or ChromaDB, which allows for efficient similarity searches.
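The indexing flow can be illustrated with an in-memory stand-in for the vector database. The `add`/`query` shape loosely mirrors how vector stores such as ChromaDB are used, but this class, its 2-dimensional embeddings, and the sample documents are all invented for the sketch:

```python
import math

class InMemoryVectorStore:
    """Tiny stand-in for a vector database such as ChromaDB:
    similar add/query shape, but no persistence or indexing."""

    def __init__(self):
        self._vectors = {}    # id -> embedding
        self._documents = {}  # id -> summarized text

    def add(self, ids, embeddings, documents):
        for i, e, d in zip(ids, embeddings, documents):
            self._vectors[i] = e
            self._documents[i] = d

    def query(self, query_embedding, n_results=3):
        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(y * y for y in b))
            return dot / (na * nb) if na and nb else 0.0
        ranked = sorted(self._vectors,
                        key=lambda i: cos(query_embedding, self._vectors[i]),
                        reverse=True)[:n_results]
        return [(i, self._documents[i]) for i in ranked]

store = InMemoryVectorStore()
store.add(ids=["doc1", "doc2"],
          embeddings=[[1.0, 0.0], [0.0, 1.0]],
          documents=["Lease summary", "Merger brief summary"])
print(store.query([0.9, 0.1], n_results=1))  # → [('doc1', 'Lease summary')]
```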

User interaction and query processing

Lawyers interact with the system through a user-friendly chat interface built using web technologies such as HTML, CSS, and React.js.

When an attorney enters a query or question, the system processes the input with the LLaMA model to produce a condensed question that captures the context of the conversation. The condensed question is then embedded using the Predibase API.
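Condensing a follow-up into a standalone question is done by prompting the LLM with the conversation so far. A sketch of such a prompt builder; the template wording and the sample conversation are hypothetical, not the production prompt:

```python
def build_condense_prompt(history, follow_up):
    """Assemble a prompt asking the LLM to rewrite a follow-up
    into a single standalone question (illustrative template)."""
    transcript = "\n".join(f"{role}: {text}" for role, text in history)
    return (
        "Given the conversation below, rewrite the follow-up question "
        "as a single standalone question.\n\n"
        f"Conversation:\n{transcript}\n\n"
        f"Follow-up: {follow_up}\nStandalone question:"
    )

history = [("Lawyer", "What does the 2021 lease say about subletting?"),
           ("Assistant", "Clause 9 forbids subletting without consent.")]
prompt = build_condense_prompt(history, "And what penalties apply?")
```

The model's completion of this prompt is the condensed question that gets embedded and searched.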


Document retrieval

The system performs a similarity search over the vector database using the embedded condensed question. The similarity search identifies the most relevant document vectors based on their cosine similarity to the question vector. The corresponding documents are retrieved from the document repository.


Document segmentation and context extraction

The retrieved documents are segmented into smaller chunks or passages. Each segment is embedded using the Predibase API. The system performs a similarity search over the segment embeddings using the condensed question embedding. The most relevant segments or passages are identified as the context for answering the question.
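Segmentation is typically done with overlapping windows so that a sentence straddling a chunk boundary still appears whole in at least one chunk. A minimal character-based sketch (the production system may split on tokens, sentences, or sections instead; the sizes here are arbitrary):

```python
def segment(text, chunk_size=50, overlap=10):
    """Split text into overlapping fixed-size character windows.
    Consecutive chunks share `overlap` characters."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

text = "".join(chr(97 + i % 26) for i in range(120))  # sample 120-char text
chunks = segment(text, chunk_size=50, overlap=10)
print(len(chunks))  # → 3
```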


Answer generation

The system uses the LLaMA model to generate an answer to the condensed question based on the extracted context. The generated answer is presented to the lawyer along with the relevant document excerpts and citations.
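The answer-generation step amounts to packing the retrieved passages and the condensed question into a single prompt for the LLM. A sketch of one way to format it, with bracketed source tags to support the citations mentioned above; the template and sample passage are illustrative, not the production prompt:

```python
def build_answer_prompt(question, passages):
    """Format retrieved passages and the condensed question into
    an LLM prompt that asks for a cited answer (illustrative template)."""
    context = "\n\n".join(f"[{src}] {text}" for src, text in passages)
    return (
        "Answer the question using only the context below. "
        "Cite the bracketed source for every claim.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )

passages = [("contract_17",
             "Clause 9 forbids subletting without written consent.")]
prompt = build_answer_prompt("Can the tenant sublet?", passages)
```

The source tags in the completion can then be mapped back to document excerpts and links for display in the chat interface.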

Iterative refinement

The lawyer can further refine the search by asking follow-up questions or providing additional context.

The system processes the new input and repeats the query processing, retrieval, context extraction, and answer generation steps to retrieve more relevant information and generate updated answers.



To ensure the highest level of data protection, the entire setup used dedicated servers hosted by both Predibase and Ozmosys. This approach ensured that each client’s proprietary data remained in a separate and secure environment, preventing unauthorized access and potential data breaches.


By using dedicated servers, the system eliminated the risks associated with shared hosting environments where multiple clients’ data could coexist on the same server. This separation of client data provided an additional layer of security by minimizing the potential for data leakage or cross-contamination between different clients.



Our team is ready to boost your business

“Tailored. Flexible. Competitive.”