Local Chat-with-pdf app with preview feature using LlamaIndex, Ollama & NextJS
In this tutorial we’ll build a fully local chat-with-pdf app using LlamaIndexTS, Ollama, and Next.JS.
https://github.com/rsrohan99/local-pdf-ai/assets/62835870/6f2497ea-15b4-47ea-9482-dade56434b2b
Stack used:
We’ll use Ollama to run the embedding models and LLMs locally.
Install Ollama
$ curl -fsSL https://ollama.com/install.sh | sh
For this guide, I’ve used phi2 as the LLM and nomic-embed-text as the embed model.
To use these models, we first need to download their weights.
$ ollama pull phi
$ ollama pull nomic-embed-text
But feel free to use any model you want.
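Before moving on, it can help to confirm that both models are available locally. The sketch below assumes Ollama is listening on its default address, http://localhost:11434, and uses its /api/tags endpoint, which lists locally pulled models. The helper names missingModels and checkOllama are hypothetical and not part of the project.

```typescript
// The part of the /api/tags response that this sketch relies on.
interface OllamaTags {
  models: { name: string }[];
}

// Returns the required models that are absent from the tags response.
// Ollama reports names with a tag suffix (for example "phi:latest"),
// so we compare on the part before the colon.
function missingModels(tags: OllamaTags, required: string[]): string[] {
  const installed = tags.models.map((m) => m.name.split(":")[0]);
  return required.filter((name) => installed.indexOf(name) === -1);
}

// Hypothetical helper: asks a local Ollama server which models it has.
// Assumes the default address; adjust baseUrl if yours differs.
async function checkOllama(baseUrl = "http://localhost:11434"): Promise<string[]> {
  const res = await fetch(`${baseUrl}/api/tags`);
  if (!res.ok) throw new Error(`Ollama responded with ${res.status}`);
  const tags = (await res.json()) as OllamaTags;
  return missingModels(tags, ["phi", "nomic-embed-text"]);
}
```

If checkOllama() resolves to an empty array, both models used in this guide have been pulled.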
FilePicker.tsx - Drag-n-drop the PDF
This component is the entry point to our app.
It’s used for uploading the PDF file, either by clicking the upload button or by dragging and dropping the file.
return (
<div
className='flex flex-col gap-7 justify-center items-center h-[80vh]'>
<Label htmlFor="pdf" className="text-xl font-bold tracking-tight text-gray-600 cursor-pointer">
Select PDF to chat
</Label>
<Input
onDragOver={() => setStatus("Drop PDF file to chat")}
onDragLeave={() => setStatus("")}
onDrop={handleFileDrop}
id="pdf"
type="file"
accept='.pdf'
className="cursor-pointer"
onChange={(e) => {
if (e.target.files) {
setSelectedFile(e.target.files[0])
setPage(1)
}
}}
/>
<div className="text-lg font-medium">{status}</div>
</div>
)
After a successful upload, it sets the state variable selectedFile to the newly uploaded file.
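The handleFileDrop handler is not shown above. As a rough sketch of what such a handler might do, the hypothetical helper below picks the first PDF from a list of dropped files, checking both the MIME type and the file extension. The name pickFirstPdf and the minimal FileLike type are assumptions for illustration; in the browser you would pass e.dataTransfer.files.

```typescript
// Minimal structural type so the helper works with browser File objects
// as well as plain objects in tests.
interface FileLike {
  name: string;
  type: string;
}

// Returns the first entry that looks like a PDF, or null if there is none.
function pickFirstPdf<T extends FileLike>(files: ArrayLike<T>): T | null {
  for (let i = 0; i < files.length; i++) {
    const file = files[i];
    const isPdfType = file.type === "application/pdf";
    const isPdfName = file.name.toLowerCase().endsWith(".pdf");
    if (isPdfType || isPdfName) return file;
  }
  return null;
}
```

Inside a drop handler, a non-null result would be passed to setSelectedFile, followed by setPage(1), mirroring the onChange handler above.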
Preview.tsx - Preview of the PDF
Once the state variable selectedFile is set, the ChatWindow and Preview components are rendered instead of FilePicker.
First we get the base64-encoded data URL of the PDF from the File using FileReader. Next we use this string to preview the PDF.
The Preview component uses the PDFObject package to render the PDF.
It also takes page as prop to scroll to the relevant page. It’s set to 1 initially and then updated as we chat with the PDF.
useEffect(() => {
const options = {
title: fileToPreview.name,
pdfOpenParams: {
view: "fitH",
page: page || 1,
zoom: "scale,left,top",
pageMode: 'none'
}
}
console.log(`Page: ${page}`)
const reader = new FileReader()
reader.onload = () => {
setb64String(reader.result as string);
}
reader.readAsDataURL(fileToPreview)
// Embed only once the file has been read; b64String is empty on the first run
if (b64String) pdfobject.embed(b64String as string, "#pdfobject", options)
}, [page, b64String])
return (
<div className="flex-grow rounded-xl" id="pdfobject">
</div>
)
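The pdfOpenParams option maps onto PDF open parameters, which PDF viewers read from the URL fragment (for example file.pdf#page=3). The sketch below illustrates that idea with a hypothetical helper, withPdfOpenParams. It is not PDFObject's internal code, and viewer support for individual parameters varies.

```typescript
type PdfOpenParams = Record<string, string | number>;

// Appends PDF open parameters to a URL as a fragment, e.g. "#page=3&view=FitH".
// Values are left unescaped because some parameters use commas as separators.
function withPdfOpenParams(url: string, params: PdfOpenParams): string {
  const pairs = Object.keys(params).map((key) => `${key}=${params[key]}`);
  if (pairs.length === 0) return url;
  const base = url.split("#")[0]; // drop any existing fragment
  return `${base}#${pairs.join("&")}`;
}
```

Changing only the page value in the fragment is what lets the preview jump to a different page as the chat progresses.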
ProcessPDF() Next.JS server action
We also have to process the PDF for RAG.
We first use the LangChain WebPDFLoader to parse the uploaded PDF. We use WebPDFLoader because it runs in the browser and doesn’t require Node.js.
const loader = new WebPDFLoader(
selectedFile,
{ parsedItemSeparator: " " }
);
const lcDocs = (await loader.load()).map(lcDoc => ({
pageContent: lcDoc.pageContent,
metadata: lcDoc.metadata,
}))
Next, we pass the parsed documents to a Next.JS server action that initiates the RAG pipeline using LlamaIndexTS.
if (lcDocs.length == 0) return;
const docs = lcDocs.map(lcDoc => new Document({
text: lcDoc.pageContent,
metadata: lcDoc.metadata
}))
Here we create LlamaIndex Documents from the parsed documents.
Next we create a VectorStoreIndex with those Documents, passing configuration info like which embed model and llm to use.
const index = await VectorStoreIndex.fromDocuments(docs, {
serviceContext: serviceContextFromDefaults({
chunkSize: 300,
chunkOverlap: 20,
embedModel, llm
})
})
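To get a feel for what chunkSize and chunkOverlap control, here is a simplified sliding-window chunker. It splits on whitespace and counts words, whereas LlamaIndex's own splitter counts tokens and tries to respect sentence boundaries, so treat this purely as an illustration. The function name chunkWords is hypothetical.

```typescript
// Splits text into chunks of at most `chunkSize` words, where each chunk
// repeats the last `chunkOverlap` words of the previous one.
function chunkWords(text: string, chunkSize: number, chunkOverlap: number): string[] {
  if (chunkSize <= 0) throw new Error("chunkSize must be positive");
  if (chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new Error("chunkOverlap must be in [0, chunkSize)");
  }
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const step = chunkSize - chunkOverlap; // how far the window advances
  const chunks: string[] = [];
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(" "));
    if (start + chunkSize >= words.length) break; // last window reached the end
  }
  return chunks;
}
```

The overlap means that text straddling a chunk boundary appears in both neighbouring chunks, at the cost of some duplicated text in the index.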
We use Ollama for the LLM and OllamaEmbedding for the embedding model.
const embedModel = new OllamaEmbedding({
model: 'nomic-embed-text'
})
const llm = new Ollama({
model: "phi",
modelMetadata: {
temperature: 0,
maxTokens: 25,
}
})
We then create a VectorIndexRetriever from the index, which will be used to create a chat engine.
const retriever = index.asRetriever({
similarityTopK: 2,
})
if (chatEngine) {
chatEngine.reset()
}
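Conceptually, a retriever with similarityTopK: 2 embeds the query, scores it against the stored chunk embeddings, and returns the two best matches. The sketch below shows that idea with cosine similarity over in-memory vectors. The types and function names are hypothetical, and the real VectorStoreIndex may store and score vectors differently.

```typescript
interface ScoredChunk {
  text: string;
  score: number;
}

// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Returns the k chunks whose embeddings are most similar to the query.
function topK(
  query: number[],
  chunks: { text: string; embedding: number[] }[],
  k: number
): ScoredChunk[] {
  return chunks
    .map((c) => ({ text: c.text, score: cosine(query, c.embedding) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

A small k keeps the prompt short, which matters for small local models, but it also limits how much of the PDF each answer can draw on.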
Finally, we create a LlamaIndex ContextChatEngine from the retriever.
chatEngine = new ContextChatEngine({
retriever,
chatModel: llm
})
We pass in the LLM as well.
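Roughly speaking, a context chat engine retrieves the chunks most relevant to each message and hands them to the LLM alongside the conversation so far. The sketch below shows one way such a prompt could be assembled. The prompt wording, the ChatMessage shape, and the buildMessages helper are assumptions for illustration, not LlamaIndex's actual implementation.

```typescript
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Builds the message list for one turn: retrieved context goes into a
// system message, followed by earlier turns and the new user message.
function buildMessages(
  contextChunks: string[],
  history: ChatMessage[],
  userMessage: string
): ChatMessage[] {
  const context = contextChunks.join("\n---\n");
  const system: ChatMessage = {
    role: "system",
    content: `Answer using only the context below.\n\n${context}`,
  };
  const user: ChatMessage = { role: "user", content: userMessage };
  return [system, ...history, user];
}
```

Because the history is included on every turn, resetting the engine when a new PDF is loaded keeps answers from mixing in context from the previous document.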
ChatWindow.tsx
This component is used to handle the chat logic.
<ChatWindow
isLoading={isLoading}
loadingMessage={loadingMessage}
startChat={startChat}
messages={messages}
setSelectedFile={setSelectedFile}
setMessages={setMessages}
setPage={setPage}
/>
chat() server action
This server action uses the previously created ChatEngine to generate the chat response.
In addition to the text response, it also returns the source nodes used to generate the response, which we’ll use later to update which page to show in the PDF preview.
const queryResult = await chatEngine.chat({
message: query
})
const response = queryResult.response
const metadata = queryResult.sourceNodes?.map(node => node.metadata)
return { response, metadata };
We use the response and metadata from the above server action (chat()) to update the messages, and update the page to show in the PDF preview.
setMessages(
[
...messages,
{ role: 'human', statement: input },
{ role: 'ai', statement: response }
]
)
// console.log(metadata)
if (metadata.length > 0) {
setPage(metadata[0].loc.pageNumber)
}
setLoadingMessage("Got response from AI.")
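Since sourceNodes is accessed with optional chaining, metadata can be undefined, and a node's metadata may lack a page number. A defensive version of the page lookup might look like the sketch below. The loc.pageNumber field comes from the snippet above; the helper name firstPageNumber and the fallback behaviour are assumptions.

```typescript
// Only the fields this sketch reads; real metadata carries more.
interface SourceMetadata {
  loc?: { pageNumber?: number };
}

// Returns the page of the first source node that has one, else the fallback.
function firstPageNumber(
  metadata: SourceMetadata[] | undefined,
  fallback: number
): number {
  for (const entry of metadata ?? []) {
    const page = entry.loc?.pageNumber;
    if (typeof page === "number" && page >= 1) return page;
  }
  return fallback;
}
```

In the component this could be called as setPage(firstPageNumber(metadata, 1)), where the fallback is whichever page you want to show when no source is returned.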
There are a few things to consider for this project:
Node-only modules such as fs must be disabled in the browser bundle, otherwise pdf-parse will not work. We need to put this in the webpack section of next.config.js:
if (!isServer) {
config.resolve.fallback = {
fs: false,
"node:fs/promises": false,
assert: false,
module: false,
perf_hooks: false,
};
}
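For context, the fragment above lives inside the webpack function of the Next.js config, which receives the webpack config and an options object containing isServer. Here is a minimal sketch of that function, written in TypeScript with simplified, hypothetical types. The real file is next.config.js and would export an object containing this function.

```typescript
// Simplified stand-in for the webpack config type.
interface WebpackConfigLike {
  resolve: { fallback?: Record<string, false | string> };
}

// Stubs out Node-only modules in the browser bundle and leaves the
// server bundle untouched.
function webpack(
  config: WebpackConfigLike,
  options: { isServer: boolean }
): WebpackConfigLike {
  if (!options.isServer) {
    config.resolve.fallback = {
      // Assumption: keep fallbacks set elsewhere (the original overwrites them).
      ...config.resolve.fallback,
      fs: false,
      "node:fs/promises": false,
      assert: false,
      module: false,
      perf_hooks: false,
    };
  }
  return config; // Next.js expects the modified config to be returned
}

// In next.config.js this would be wired up as: module.exports = { webpack }
const nextConfig = { webpack };
```

Setting a module to false tells webpack to resolve it to an empty module in the client bundle, so imports of fs no longer break the browser build.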
Thanks for reading. Stay tuned for more.
I tweet about these topics and anything I’m exploring on a regular basis.
Follow me on twitter