The following example sets up an LLM with Retrieval Augmented Generation (RAG) capability to serve as an expert system that answers user queries from documentation.
The required PDF documents are provided to the system as a knowledge base.
Since the whole setup runs offline, confidential documents can be provided to the LLM without the data leaving the local machine.
Overview of RAG
An expert system backed by a knowledge base of PDF documents can be built using Retrieval Augmented Generation (RAG).
This requires an LLM and a searchable knowledge base, i.e. the document content indexed in a vector database so that similarity searches can be performed.
The user query is used to retrieve the most relevant content from the knowledge base via a semantic search. The retrieved content is then passed to the LLM as context, and the LLM composes a response grounded in that content. A minimal sketch of this flow appears after the illustrations below.
[Figure: Illustration of RAG]
[Figure: Illustration of a vector database for documents]
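To make the retrieve-then-generate flow concrete, here is a minimal sketch of what happens under the hood (in the setup described in this post, Open WebUI handles all of this through its UI). It assumes the sentence-transformers and ollama Python packages are installed and that Ollama is running locally; the toy documents and the embedding model name are placeholders, not part of the actual setup.

    # Minimal RAG sketch: embed documents, retrieve by similarity, answer with a local LLM.
    import numpy as np
    from sentence_transformers import SentenceTransformer
    import ollama

    # Toy knowledge base standing in for text extracted from the PDF documents.
    documents = [
        "The warranty period for the X100 pump is 24 months from date of purchase.",
        "The X100 pump must be serviced every 500 operating hours.",
        "Replacement seals for the X100 are part number SL-4415.",
    ]

    # 1. Index: embed every document chunk into a vector.
    embedder = SentenceTransformer("all-MiniLM-L6-v2")
    doc_vectors = embedder.encode(documents, normalize_embeddings=True)

    # 2. Retrieve: embed the query and rank chunks by cosine similarity
    #    (a dot product, since the vectors are normalized).
    query = "How often does the X100 need servicing?"
    query_vector = embedder.encode([query], normalize_embeddings=True)[0]
    scores = doc_vectors @ query_vector
    best = documents[int(np.argmax(scores))]

    # 3. Generate: pass the retrieved content to the LLM as context.
    response = ollama.chat(
        model="llama3.2",
        messages=[
            {"role": "system", "content": f"Answer using only this context:\n{best}"},
            {"role": "user", "content": query},
        ],
    )
    print(response["message"]["content"])

A production setup would split the PDFs into chunks, store the vectors in a dedicated vector database, and retrieve the top few matches rather than a single one, but the three steps (index, retrieve, generate) remain the same.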
Solution Architecture
Ollama is a tool for running various large language models (LLMs) locally on a machine.
Open WebUI provides a web interface for interacting with large language models and also provides a UI for adding RAG capability to the models. It is a web server that can be run as a Docker container, and it connects to Ollama's local API to serve the models.
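Ollama exposes its API on port 11434 by default. Once Ollama is running, the connection that Open WebUI relies on can be verified from the command line; this endpoint lists the locally installed models:

    curl http://localhost:11434/api/tags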
Installation
Ollama is installed on the machine (installers are available from ollama.com).
A base language model such as Llama 3.2 (a 3-billion-parameter model) can be downloaded and started with Ollama using a command like ollama run llama3.2.
Docker Desktop is installed, and the Open WebUI container is created with a docker run command.
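A typical invocation, per the Open WebUI documentation (the port mapping and volume name can be adjusted as needed), is:

    docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main

After the container starts, the UI is reachable at http://localhost:3000. The --add-host flag lets the container reach the Ollama API running on the host machine.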