By Christian OFOEFULE
In this new age of ChatGPT, large language models have taken centre stage, showcasing remarkable capabilities with minimal effort from the user.
In this article, we are going to build a simple application that enables us to chat with our documents, which can be in PDF or plain-text format.
We will use libraries like Langchain and learn the concept of vector embeddings: what they are and why they matter, both for saving costs and for working with large files.
The Foundation: Understanding Embeddings and Vector Databases
Unleashing the Power of Embeddings
Vector embeddings are a way of representing textual data in the form of numerical points. With embeddings, we can efficiently capture the semantic meaning of words, phrases, or entire documents. This approach allows us to overcome the limitations of direct text processing and facilitates advanced operations like similarity searches.
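To make this concrete, here is a minimal sketch (assuming an OpenAI API key is set in your environment) that embeds two paraphrased sentences and measures their similarity; semantically close texts produce vectors whose cosine similarity approaches 1:
from langchain.embeddings.openai import OpenAIEmbeddings

embeddings = OpenAIEmbeddings()
v1 = embeddings.embed_query("The cat sat on the mat")
v2 = embeddings.embed_query("A feline rested on the rug")

# cosine similarity: close to 1 for semantically similar texts
dot = sum(a * b for a, b in zip(v1, v2))
norms = sum(a * a for a in v1) ** 0.5 * sum(b * b for b in v2) ** 0.5
print(dot / norms)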
The Role of Vector Databases
Vector databases store vector embeddings. Notable players include Qdrant, Pinecone, and Weaviate. These databases provide a structured environment for storing and querying embeddings.
In essence, a cluster in Qdrant can house multiple collections, each functioning as a database. Within these collections, vectors (numerical representations) of data or text are stored as points, enabling seamless similarity searches.
Cost-Efficient AI: Storing and Querying Embeddings
With great power comes a great cost: when using powerful large language models like ChatGPT, the cost of generating embeddings for every search query can become a significant concern, especially when utilising the OpenAI API. The solution lies in the strategic use of vector databases: embed your documents once, store the embeddings, and at query time only the (much shorter) question needs to be embedded and matched against what is already stored.
Qdrant: Building Clusters and Collections
Qdrant is a powerful vector similarity search engine with a user-friendly API that enables you to effortlessly store, search, and manage vectors along with additional payloads.
In Qdrant, the process begins by creating a cluster, a high-level organizational unit. Within a cluster, multiple collections can coexist, each serving as a repository for vectors and enabling targeted searches within a specified data set. When you embed your text and send it to the Qdrant database, the resulting embeddings are stored as points within the designated collection.
Pinecone and Weaviate: Alternative Vector Database Solutions
Pinecone and Weaviate offer alternative avenues for storing and querying embeddings. With Pinecone’s robust infrastructure or Weaviate’s feature-rich capabilities, developers can explore diverse options based on their specific needs and preferences.
Langchain
Langchain is a framework for developing applications powered by language models. It provides the libraries and APIs that establish communication between the models and the vector database.
You can install it with pip using the following command:
pip install langchain
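The code below also leans on a few companion packages (the Qdrant client, the OpenAI SDK, and the unstructured library that powers the PDF loader), so depending on your environment you will likely need something like:
pip install qdrant-client openai unstructured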
Code Sample
We will now walk through a simple code sample showing how to chat with our documents. The steps are divided into two sections:
- Storing/Uploading our document as embeddings
- Chatting with the document embeddings using ChatGPT
Step 1: Import Libraries
import os

import qdrant_client
from qdrant_client.http import models

from langchain.vectorstores import Qdrant
from langchain.document_loaders import TextLoader, UnstructuredPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.chains import RetrievalQA
from langchain.llms import OpenAI
Step 2: Read the document and split into chunks
loader = UnstructuredPDFLoader("/path-to-pdf-document")
documents = loader.load()
# split the file into small chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=800, chunk_overlap=80)
chunks = text_splitter.split_documents(documents)
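The same pipeline works for plain-text files; that is what the TextLoader imported earlier is for. If your document is a .txt file, simply swap the loader:
# plain-text alternative to the PDF loader
loader = TextLoader("/path-to-text-document")
documents = loader.load()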
Step 3: Create Embedding and Language Model Instances
embeddings = OpenAIEmbeddings()
llm = OpenAI()
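Both instances read your OpenAI credentials from the OPENAI_API_KEY environment variable, so make sure it is set (exported in your shell, or assigned in code) before running the script:
# set your key before creating the embeddings/LLM instances
os.environ["OPENAI_API_KEY"] = "your-openai-api-key"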
Step 4: Create Qdrant Client
client = qdrant_client.QdrantClient(
    os.environ["QDRANT_HOST"],
    api_key=os.environ["QDRANT_API_KEY"],
)
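QDRANT_HOST and QDRANT_API_KEY here come from your Qdrant cluster dashboard. If you just want to experiment without a remote cluster, the client also supports an in-memory mode:
# ephemeral in-process instance, useful for local testing (nothing is persisted)
client = qdrant_client.QdrantClient(":memory:")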
Step 5: Create Vector Store
vector_store = Qdrant(
    client=client,
    collection_name=os.environ["QDRANT_COLLECTION_NAME"],
    embeddings=embeddings,
)
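Note that the collection referenced here must already exist in your cluster. If it does not, you can create it first using the models module imported in Step 1; the vector size must match your embedding model (OpenAI's default embeddings are 1536-dimensional):
# create the collection once, before uploading any embeddings
client.create_collection(
    collection_name=os.environ["QDRANT_COLLECTION_NAME"],
    vectors_config=models.VectorParams(
        size=1536,  # dimensionality of OpenAI embeddings
        distance=models.Distance.COSINE,
    ),
)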
Step 6: Upload the document chunks as embeddings to Qdrant vector store
vector_store.add_documents(chunks)
Now that we have successfully uploaded our documents as embeddings to a vector store, we can chat with them. For this, we will use Langchain's RetrievalQA API.
Step 7: Create a RetrievalQA chain type
qa = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type="stuff",
    retriever=vector_store.as_retriever(),
)
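The "stuff" chain type simply stuffs all retrieved chunks into a single prompt, which works well as long as the retrieved text fits in the model's context window. If you want to control how many chunks are retrieved per question, the retriever accepts search parameters, for example:
# retrieve only the 4 most similar chunks for each question
retriever = vector_store.as_retriever(search_kwargs={"k": 4})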
Step 8: Run the Query and Return the Answer
question = "enter your question about the document"
answer = qa.run(question)
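The answer comes back as a plain string, so you can simply print it:
print(answer)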
As we chat with our documents using AI, this fusion of technologies empowers developers and businesses to navigate the evolving landscape of conversational AI. By understanding the nuances of embeddings, harnessing the capabilities of vector databases, and connecting them to language models through frameworks like Langchain, we open up avenues for AI-powered conversations that are efficient, cost-effective, and seamlessly integrated with the wealth of information stored in our documents.
Christian Ofoefule is a Software Engineer.