No $, no latency local TTS for any need? Chats? Your agent? Yes, it's possible and easy NOW.
ElevenLabs is cool but pricey. Local models are GPU intensive. Introducing Kokoro... a high-quality, emotive text-to-speech model. It's been out for a bit, but the local setup can be a pain. And there's no Ollama version available yet (though it's rumored that Ollama is building that functionality).

Now some great people have built a complete Docker version with a built-in API you can hit from your script for whatever REALTIME voice needs you have in your app, with GPU and CPU builds plus a Gradio and web interface. I tried the latest beta branch and there was no noticeable latency, with some great high-quality voices (even some weird ASMR ones... not judging ;) ). Don't be intimidated by Docker if you haven't used it: it's free, and it's basically a complete .venv and app combo that builds itself and runs in a container on your own machine, not in the cloud.

This is a work in progress, so I tried the 1.2 pre-beta GitHub branch using the web interface. The API is only set up on the main 1.0 branch for now, it seems. Did I mention it's basically a free voice service for whatever you're building?
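If you want a feel for what "hitting the API from your script" could look like, here's a rough sketch. Everything specific in it is an assumption, not something confirmed in this post: the host and port (`localhost:8880`), the route (`/v1/audio/speech`), the JSON field names, the model name `kokoro`, and the voice name `af_bella` are all placeholders. Check the project's README for the real values on whichever branch you run.

```python
import json
import urllib.request

# Assumed values: adjust to whatever your container actually exposes.
BASE_URL = "http://localhost:8880"   # hypothetical host and port
SPEECH_PATH = "/v1/audio/speech"     # hypothetical speech route


def build_speech_request(text, voice="af_bella", base_url=BASE_URL):
    """Build (but do not send) a POST request asking the server to speak `text`."""
    payload = {
        "model": "kokoro",           # assumed model identifier
        "input": text,               # the text to synthesize
        "voice": voice,              # assumed voice name
        "response_format": "mp3",    # assumed output format field
    }
    return urllib.request.Request(
        base_url + SPEECH_PATH,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path="speech.mp3", **kwargs):
    """Send the request and write the returned audio bytes to `out_path`."""
    request = build_speech_request(text, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```

With the container running, calling `synthesize("Hello from a local voice model.")` would save the audio to `speech.mp3`, assuming the placeholder route and fields match your build. It only uses the standard library, so there's nothing extra to install on the script side.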
Kevin Dragan