🧠 Identical AI Agents, Different Results - What’s Going On?
Hey everyone, I’ve run into something strange and wanted to see if anyone else has experienced this.

We’ve built two AI agents using VAPI to place part orders by calling warehouse reps:

- One agent handles website orders
- The other handles eBay orders

Both agents:

- Use the exact same VAPI prompt
- Pull identical part details, trims, and rep names from Airtable
- Use the same phone number
- Follow the same call structure and automation logic

Yet somehow, the website agent consistently performs better, while the eBay agent has more failed calls, even though the inputs are literally the same.

We’ve already ruled out:

- Trim formatting differences
- Time-of-day patterns
- Call volume differences
- Assistant prompt mismatches
- Session overlap or stale calls

Everything looks clean, but the eBay agent’s calls still fail more often (reps hang up or get confused, or the call flow breaks).

Has anyone seen something similar before? Could there be something subtle like session memory bleed, invisible differences in the way the call is initialized, or even rep behavior bias based on caller history?

Would love to hear your thoughts or debugging tips 👇
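On the "invisible differences in call initialization" angle, here is a minimal sketch of the kind of check that might surface them. It assumes the request payload each agent sends when starting a call can be captured as a Python dict (for example from automation logs). The field names in the example (`assistant_id`, `variables`, `metadata`) are hypothetical placeholders, not VAPI's actual schema. The idea is to flatten both payloads and compare them field by field, printing values with `ascii()` so that trailing spaces, non-breaking spaces, and other invisible characters show up as escapes.

```python
from typing import Any

_MISSING = object()  # sentinel for "field not present in this payload"


def flatten(obj: Any, prefix: str = "") -> dict[str, Any]:
    """Flatten nested dicts/lists into {"a.b[0].c": value} pairs."""
    items: dict[str, Any] = {}
    if isinstance(obj, dict) and obj:
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            items.update(flatten(value, path))
    elif isinstance(obj, list) and obj:
        for i, value in enumerate(obj):
            items.update(flatten(value, f"{prefix}[{i}]"))
    else:
        # Scalars and empty containers are treated as leaf values.
        items[prefix] = obj
    return items


def _show(value: Any) -> str:
    """Render a value with non-ASCII and invisible characters escaped."""
    return "<missing>" if value is _MISSING else ascii(value)


def diff_payloads(a: dict, b: dict) -> dict[str, tuple[str, str]]:
    """Return {path: (value_in_a, value_in_b)} for every field that differs."""
    flat_a, flat_b = flatten(a), flatten(b)
    diffs: dict[str, tuple[str, str]] = {}
    for path in sorted(flat_a.keys() | flat_b.keys()):
        va = flat_a.get(path, _MISSING)
        vb = flat_b.get(path, _MISSING)
        if va is _MISSING or vb is _MISSING or va != vb:
            diffs[path] = (_show(va), _show(vb))
    return diffs


if __name__ == "__main__":
    # Hypothetical captured payloads; field names are illustrative only.
    website = {
        "assistant_id": "asst_123",
        "variables": {"rep_name": "Dana", "source": "website"},
    }
    ebay = {
        "assistant_id": "asst_123",
        "variables": {"rep_name": "Dana\u00a0", "source": "ebay"},
        "metadata": {"channel": "ebay"},
    }
    for path, (w, e) in diff_payloads(website, ebay).items():
        print(f"{path}: website={w} ebay={e}")
```

If the diff comes back empty for a matched pair of calls (same part, same rep), the initialization payloads are probably not the cause, and caller history or rep behavior becomes the more likely suspect. If it shows something like a stray non-breaking space in a rep name or an extra metadata field on the eBay side, that would be worth chasing first.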