Cloud Bagtas • 9d (edited) • 💡 Help
Efficient Parallel LLM Execution in FastAPI?
In our pipeline, a FastAPI endpoint receives the request. Within a single request, several agents can run: two from the Orchestrator (planning and combining), four from the ExcelAgent, one from the Researcher (which handles RAG and web search), and one from the Image Generator. That means up to 8 LLM calls in total per request. Right now I'm using concurrent.futures to run the agents in parallel, and everything runs on Celery. Is this the best approach, or is there a more standard/efficient way to handle this concurrency?
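Since LLM calls are I/O-bound rather than CPU-bound, a common alternative to thread pools here is asyncio with an async client (e.g. `AsyncOpenAI` or `httpx`), fanning out the independent calls with `asyncio.gather`. Below is a minimal sketch of that pattern; the agent names and `call_llm` stub are hypothetical stand-ins for your real agents, and it assumes the Orchestrator's planning call must run before the fan-out and its combining call after:

```python
import asyncio

# Hypothetical stand-in for a real agent call; in practice each agent would
# await an async LLM client (e.g. AsyncOpenAI) instead of sleeping.
async def call_llm(agent: str, prompt: str) -> str:
    await asyncio.sleep(0.5)  # simulated network-bound LLM latency
    return f"{agent}: result for {prompt!r}"

async def handle_request(prompt: str) -> dict:
    # Planning call runs first because the fan-out depends on its output.
    plan = await call_llm("orchestrator/plan", prompt)

    # The six independent agent calls fan out concurrently on one event
    # loop; no threads are needed because the work is I/O-bound.
    tasks = [
        *(call_llm(f"excel/{i}", plan) for i in range(4)),
        call_llm("researcher", plan),
        call_llm("image-generator", plan),
    ]
    results = await asyncio.gather(*tasks)

    # Final combining call consumes the fan-out results (8 LLM calls total).
    combined = await call_llm("orchestrator/combine", " | ".join(results))
    return {"plan": plan, "results": results, "combined": combined}

if __name__ == "__main__":
    print(asyncio.run(handle_request("quarterly sales report")))
```

In a FastAPI endpoint you would just `await handle_request(...)`; if the work is handed to a synchronous Celery task instead, the task can drive the same coroutines with `asyncio.run(...)`, which should be fine in a standard prefork worker since no event loop is already running there.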