How to Get Your Data Ready for AI Agents (Docs, PDFs, Websites)
When building AI agents, you need them to understand your data, whether it’s PDFs, websites, or internal documents.
Most tools for this are closed-source, requiring API keys and external platforms. But what if you could do it all in Python with an open-source library?
In this week’s video, I show you how to build a fully open-source document extraction pipeline using Docling. You’ll learn how to:
  • Extract, parse, and chunk documents for AI processing.
  • Store and retrieve data efficiently with vector databases.
  • Build a working chat application that can answer questions based on your documents.
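As a rough illustration of the three steps above, here is a minimal sketch, not the code from the video. The `extract_markdown` helper uses Docling's `DocumentConverter` as I understand its documented usage and is only imported when called. Everything else (`chunk_text`, `embed`, `InMemoryStore`) is a hypothetical toy stand-in: a word-count "embedding" and an in-memory list in place of a real embedding model and vector database.

```python
import math
import re
from collections import Counter


def extract_markdown(source: str) -> str:
    """Convert a PDF path or URL to Markdown with Docling.

    Assumes `pip install docling`; imported lazily so the rest of the
    sketch runs without it.
    """
    from docling.document_converter import DocumentConverter

    result = DocumentConverter().convert(source)
    return result.document.export_to_markdown()


def chunk_text(text: str, max_words: int = 40) -> list[str]:
    """Greedily pack paragraphs into chunks of roughly max_words words.

    Toy chunker: a single paragraph longer than max_words is kept whole.
    """
    chunks, current = [], []
    for para in re.split(r"\n\s*\n", text):
        words = para.split()
        if not words:
            continue
        if current and len(current) + len(words) > max_words:
            chunks.append(" ".join(current))
            current = []
        current.extend(words)
    if current:
        chunks.append(" ".join(current))
    return chunks


def embed(text: str) -> Counter:
    """Stand-in embedding: lowercase word counts (a real pipeline would use a model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


class InMemoryStore:
    """Hypothetical stand-in for a vector database."""

    def __init__(self) -> None:
        self.rows: list[tuple[str, Counter]] = []

    def add(self, chunks: list[str]) -> None:
        for chunk in chunks:
            self.rows.append((chunk, embed(chunk)))

    def search(self, query: str, k: int = 2) -> list[str]:
        """Return the k chunks most similar to the query."""
        q = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


# Sample text standing in for the Markdown that extract_markdown would return.
sample = (
    "Docling converts PDFs into structured text.\n\n"
    "Chunks are stored in a vector database for retrieval.\n\n"
    "The chat app answers questions from retrieved chunks."
)
store = InMemoryStore()
store.add(chunk_text(sample, max_words=10))
top = store.search("vector database", k=1)
print(top[0])
```

In a chat application, the chunks returned by `search` would be passed to a language model as context for answering the user's question.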
Dave Ebbelaar
Data Alchemy
skool.com/data-alchemy
Your Community to Master the Fundamentals of Working with Data and AI — by Datalumina®