Day 7 – Looking at the RAG Problem Again

🔥

Today I looked at the issue from earlier again to really understand what was going on.

Someone in the community gave me a simple and very helpful tip: just extract the text from the PDF and convert it into a .md file. For now this is the easiest solution.

I tried it and it worked. I pulled the text out of the PDF, saved it as a markdown file, uploaded it, and Pinecone finally accepted it. The text was processed without any problems.
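For anyone who wants to try the same thing, here is a minimal sketch of the PDF-to-markdown step. It is not exactly what I ran, just one way to do it. It assumes the PDF has a real text layer and that the third-party `pypdf` package is installed. The function names `pages_to_markdown` and `pdf_to_markdown` and the "Page N" headings are my own choices for illustration, not something Pinecone requires.

```python
from pathlib import Path


def pages_to_markdown(pages, title):
    """Join per-page text into one markdown string, skipping empty pages."""
    parts = [f"# {title}"]
    for number, text in enumerate(pages, start=1):
        cleaned = text.strip()
        if not cleaned:
            continue  # nothing extractable on this page
        parts.append(f"## Page {number}\n\n{cleaned}")
    return "\n\n".join(parts) + "\n"


def pdf_to_markdown(pdf_path, md_path):
    """Extract text from a PDF with pypdf and save it as a .md file."""
    # Assumption: pypdf is installed (pip install pypdf).
    from pypdf import PdfReader

    reader = PdfReader(pdf_path)
    # extract_text() can return an empty string for scanned pages.
    pages = [page.extract_text() or "" for page in reader.pages]
    markdown = pages_to_markdown(pages, Path(pdf_path).stem)
    Path(md_path).write_text(markdown, encoding="utf-8")
    return markdown
```

Calling `pdf_to_markdown("report.pdf", "report.md")` (hypothetical file names) would write the markdown file that then gets uploaded. If the result is almost empty, the PDF is probably a scan with no text layer.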

I still don’t fully understand why the original PDF didn’t work. But I will figure it out sooner or later.

In the meantime I started thinking about how to automate this. Maybe I can use an OCR step to extract text from any PDF and then pass that text to Pinecone. Mistral has OCR. Do you know any other good options I should look at?
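A rough sketch of how that automation could be wired: use the text the PDF already contains and call OCR only for pages where almost nothing comes out. This is an idea, not something I have built yet. `ocr_page` is a hypothetical callable that I would have to implement for whichever OCR service I end up with (Mistral or another), and the `min_chars` threshold of 20 is an arbitrary assumption.

```python
def extract_with_ocr_fallback(page_texts, ocr_page, min_chars=20):
    """Return one text per page, calling OCR only where the text layer is too thin.

    page_texts: list of strings (or None) from a normal PDF text extraction.
    ocr_page:   hypothetical callable taking a page index and returning OCR text.
    min_chars:  assumed threshold below which a page counts as "no usable text".
    """
    results = []
    for index, text in enumerate(page_texts):
        cleaned = (text or "").strip()
        if len(cleaned) >= min_chars:
            results.append(cleaned)  # the text layer is good enough
        else:
            results.append(ocr_page(index).strip())  # fall back to OCR
    return results
```

The point of passing `ocr_page` in as a function is that the OCR provider can be swapped without touching the rest of the pipeline, which makes it easier to compare options.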

I also want to test Supabase later to compare how it works with a simple RAG setup.

Now the next step is to send everything into n8n and see what happens.

Let’s keep going. On to my next fight.
