Dan Quixote

I built a voice AI assistant for my website using Pipecat and Gemini's native audio model. People kept calling it trying to reverse-engineer how it works, so I just open-sourced the whole thing. It's a good starting point if you want to build your own web-based voice AI demo with low latency and multilingual support.

Repo: https://github.com/askjohngeorge/askjg-demo-gemini-pcc

System prompt: https://github.com/askjohngeorge/askjg-demo-gemini-pcc/blob/main/bot/prompts/demo_system_prompt.md

You can try the live demo at https://askjohngeorge.com/demo (click the mic). Happy to answer questions if you have any.

New comment 3d ago

Dan Quixote

2 likes • 5d

@John George - that's amazing! Suddenly the demo I just finished building on LiveKit feels so ordinary. Back to the drawing board🙈.

Dan Quixote

1 like • 5d

@John George I had to Google Kevin Gates - showing my age. It's brilliant - I just tried speaking to it in Spanish: Accent from Andalucía✅ Accent from Argentina✅ Then.. Accent from Manchester (Oasis style)✅

Jin Park

19d •

LiveKit

I cooked up a raw Voice AI orchestration engine from scratch using 𝗟𝗶𝘃𝗲𝗞𝗶𝘁 & 𝗣𝘆𝘁𝗵𝗼𝗻. 🍳

While wrappers are great for MVPs, building your own orchestration layer gives you 𝗳𝘂𝗹𝗹 𝗼𝘄𝗻𝗲𝗿𝘀𝗵𝗶𝗽, 𝘀𝗶𝗴𝗻𝗶𝗳𝗶𝗰𝗮𝗻𝘁𝗹𝘆 𝗹𝗼𝘄𝗲𝗿 𝗰𝗼𝘀𝘁𝘀, 𝗮𝗻𝗱 𝗴𝗿𝗮𝗻𝘂𝗹𝗮𝗿 𝗰𝗼𝗻𝘁𝗿𝗼𝗹 over the entire conversational pipeline. I designed this engine to fully replace third-party wrappers like Vapi & Retell AI. Here is a deep dive into what’s under the hood:

🔄 𝗗𝘆𝗻𝗮𝗺𝗶𝗰 𝗔𝗴𝗲𝗻𝘁 𝗖𝗼𝗻𝗳𝗶𝗴𝘂𝗿𝗮𝘁𝗶𝗼𝗻 (𝗥𝗲𝗮𝗹-𝗧𝗶𝗺𝗲 𝗛𝘆𝗱𝗿𝗮𝘁𝗶𝗼𝗻)
Hardcoding agents is a trap. I implemented a system that executes an API call upon call initialization.
• 𝗛𝗼𝘁-𝗦𝘄𝗮𝗽𝗽𝗮𝗯𝗹𝗲 𝗣𝗲𝗿𝘀𝗼𝗻𝗮𝘀: A single engine instance can instantly apply unique System Prompts, Voice IDs, and Temperature settings based on backend parameters.
• 𝗥𝗲𝘀𝘂𝗹𝘁: You can power thousands of unique agents (e.g., specific to different businesses) without ever redeploying the core code or creating a new instance.

🛠️ 𝗖𝗼𝗻𝘁𝗲𝘅𝘁-𝗔𝘄𝗮𝗿𝗲 𝗙𝘂𝗻𝗰𝘁𝗶𝗼𝗻 𝗥𝗼𝘂𝘁𝗲𝗿
When building raw infrastructure, manually mapping tools to agents is a major architectural hassle. I built specialized helper logic for 𝗗𝘆𝗻𝗮𝗺𝗶𝗰 𝗧𝗼𝗼𝗹 𝗜𝗻𝗷𝗲𝗰𝘁𝗶𝗼𝗻 to solve this.
• 𝗠𝗼𝗱𝘂𝗹𝗮𝗿 𝗟𝗼𝗴𝗶𝗰: The router decouples the orchestration engine from business logic. It parses the backend setup and assigns only the specific tools defined in that agent's configuration (e.g., loading "Appointment Booking" tools only when the specific use-case demands it).

💾 𝗗𝗮𝘁𝗮 𝗣𝗲𝗿𝘀𝗶𝘀𝘁𝗲𝗻𝗰𝗲 & 𝗣𝗼𝘀𝘁-𝗖𝗮𝗹𝗹 𝗜𝗻𝘁𝗲𝗹𝗹𝗶𝗴𝗲𝗻𝗰𝗲
Logs aren't enough. I built a save_conversation function that aggregates the full session payload and triggers intelligent sub-functions immediately after the call:
• 𝗖𝗮𝗹𝗹 𝗦𝘂𝗺𝗺𝗮𝗿𝘆: Generates a natural language recap via LLM.
• 𝗖𝗮𝗹𝗹 𝗘𝘃𝗮𝗹𝘂𝗮𝘁𝗶𝗼𝗻: Structurally classifies the outcome (e.g., "Booked", "Inquiry", "Failed").
• 𝗧𝗲𝗹𝗲𝗺𝗲𝘁𝗿𝘆: Captures precise Token Usage (for billing) and Latency statistics alongside the transcript.

🛡️ 𝗣𝗿𝗼𝗱𝘂𝗰𝘁𝗶𝗼𝗻 𝗚𝘂𝗮𝗿𝗱𝗿𝗮𝗶𝗹𝘀
To prevent runaway costs and "zombie" connections, I engineered active background monitors:
• 𝗜𝗻𝗮𝗰𝘁𝗶𝘃𝗶𝘁𝘆 𝗠𝗼𝗻𝗶𝘁𝗼𝗿: Detects silence (30s default) and gracefully terminates the session.
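For readers who want a concrete picture of the hydration and tool-routing idea described in the post, here is a minimal Python sketch. It is not the author's code and uses no LiveKit APIs. Every name in it (`TOOL_REGISTRY`, `AgentConfig`, `fetch_agent_config`, `route_tools`, the agent IDs and the tool names) is hypothetical, and the backend API call is replaced by an in-memory dictionary so the example runs on its own.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical registry of every tool the engine knows about.
TOOL_REGISTRY: Dict[str, Callable[..., str]] = {}


def tool(name: str):
    """Decorator that registers a function under a tool name."""
    def register(fn: Callable[..., str]) -> Callable[..., str]:
        TOOL_REGISTRY[name] = fn
        return fn
    return register


@tool("book_appointment")
def book_appointment(date: str) -> str:
    return f"booked:{date}"


@tool("lookup_order")
def lookup_order(order_id: str) -> str:
    return f"order:{order_id}"


@dataclass
class AgentConfig:
    """Per-agent settings applied at call start (assumed shape)."""
    system_prompt: str
    voice_id: str
    temperature: float
    tools: List[str] = field(default_factory=list)


# Stand-in for the backend; a real engine would make an API call here.
_FAKE_BACKEND = {
    "dental-clinic": {
        "system_prompt": "You are a receptionist for a dental clinic.",
        "voice_id": "voice-a",
        "temperature": 0.4,
        "tools": ["book_appointment"],
    },
    "shop-support": {
        "system_prompt": "You help customers track orders.",
        "voice_id": "voice-b",
        "temperature": 0.2,
        "tools": ["lookup_order"],
    },
}


def fetch_agent_config(agent_id: str) -> AgentConfig:
    """Hydrate an agent's configuration when the call is initialized."""
    return AgentConfig(**_FAKE_BACKEND[agent_id])


def route_tools(config: AgentConfig) -> Dict[str, Callable[..., str]]:
    """Return only the tools named in this agent's configuration."""
    missing = [name for name in config.tools if name not in TOOL_REGISTRY]
    if missing:
        raise KeyError(f"unknown tools in config: {missing}")
    return {name: TOOL_REGISTRY[name] for name in config.tools}


if __name__ == "__main__":
    cfg = fetch_agent_config("dental-clinic")
    print(sorted(route_tools(cfg)))
```

The point of the split is that adding a new business means adding a backend record, not redeploying the engine: the same process serves both agents above and each one only ever sees its own tools.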

New comment 6d ago

Dan Quixote

0 likes • 12d

Hi @Jin Park, thanks for sharing. I particularly like this: ->🛠️ 𝗖𝗼𝗻𝘁𝗲𝘅𝘁-𝗔𝘄𝗮𝗿𝗲 𝗙𝘂𝗻𝗰𝘁𝗶𝗼𝗻 𝗥𝗼𝘂𝘁𝗲𝗿 - that makes this agent truly scalable.

Ankur Golwa

Dec '25 •

General discussion

Pipecat vs Livekit

Might be a silly one, but is one better than the other in terms of (1) latency and (2) agent orchestration?

New comment Jan 2

Dan Quixote

3 likes • Jan 2

There was a discussion on this a couple of months ago if it helps: https://www.skool.com/open-source-voice-ai-community-6088/pipecat-vs-livekit

Mohammad Mussab

Nov '25 •

General discussion

Best Observability Tools for Voice AI Frameworks?

What observability tools are others using with Pipecat or similar voice AI frameworks? I've built a production voice agent using Pipecat and currently track basic metrics (call duration, sentiment, summary, transcripts) in a custom dashboard. It goes into production tomorrow, and the problem I expect to face is that debugging will be painful when errors occur. My current logging approach creates massive log files that are nearly impossible to analyze efficiently when tracking down issues.

New comment 25d ago

Dan Quixote

1 like • Nov '25

@Mohammad Mussab I've just started using Logfire from Pydantic (https://pydantic.dev/logfire). I'm working on a WhatsApp chatbot at the moment so I haven't tested it with voice, but it's really handy for quickly identifying errors through its logs. Will be checking out Whisker though for sure.

Robert Figueroa

Nov '25 •

General discussion

Just sayin

Honestly, I've been building voice AI agents with Ultravox AI and the help of Claude. I understand that this community is really about LiveKit and Pipecat and open source, but Ultravox is also open source. I've had a lot of success using Ultravox AI. I've had some success with LiveKit and Pipecat, but I've had the most success with Ultravox AI. I think it's undervalued and overlooked as an open source option for building AI agents. Oh, and… Claude is king!

New comment Nov '25

Dan Quixote

1 like • Nov '25

@Robert Figueroa Loving Claude Code too - (except the token limits - $20 to $100 is a big jump!).

Dan Quixote

1 like • Nov '25

@Robert Figueroa Not using much in the way of sub-agents yet. But I did introduce Serena recently which apparently helps reduce token use significantly.

Dan Quixote

@dan-quixote-8098

Madrid based

Active 4h ago

Joined Nov 8, 2025

Spain / UK