Speech To Speech: an effort for an open-sourced and modular GPT4-o · Data Alchemy

Marcio Pacheco

Sep '24 • 💬 General

The repository implements a cascaded speech-to-speech pipeline made of four consecutive stages, each feeding its output to the next:

Voice Activity Detection (VAD): silero VAD v5
Speech to Text (STT): Whisper checkpoints (including distilled versions)
Language Model (LM): Any instruct model available on the Hugging Face Hub! 🤗
Text to Speech (TTS): Parler-TTS🤗

https://github.com/huggingface/speech-to-speech
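
To make the "cascaded" idea concrete, here is a minimal sketch of four stages chained by queues, each running in its own thread. This is only an illustration under my own assumptions, not the repository's actual API: the `vad`, `stt`, `lm` and `tts` functions are hypothetical stand-ins that work on strings, and `run_stage` / `run_pipeline` are names made up for this example. Real stages would wrap Silero VAD, Whisper, an instruct model and Parler-TTS.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the stream


def run_stage(fn, inbox, outbox):
    """Pull items from inbox, apply fn, and push non-None results to outbox."""
    while True:
        item = inbox.get()
        if item is SENTINEL:
            outbox.put(SENTINEL)  # propagate shutdown to the next stage
            break
        result = fn(item)
        if result is not None:  # a stage may drop an item (e.g. VAD on silence)
            outbox.put(result)


# Hypothetical stand-ins for the four stages; real ones would wrap models.
def vad(chunk):
    return chunk if chunk.strip() else None  # treat blank chunks as silence


def stt(speech):
    return f"text({speech})"


def lm(prompt):
    return f"reply({prompt})"


def tts(reply):
    return f"audio({reply})"


def run_pipeline(chunks):
    """Feed chunks through VAD -> STT -> LM -> TTS and collect the outputs."""
    stages = [vad, stt, lm, tts]
    # One queue before each stage, plus one after the last stage.
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=run_stage, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(stages)
    ]
    for t in threads:
        t.start()

    for chunk in chunks:
        queues[0].put(chunk)
    queues[0].put(SENTINEL)

    outputs = []
    while True:
        out = queues[-1].get()
        if out is SENTINEL:
            break
        outputs.append(out)

    for t in threads:
        t.join()
    return outputs


if __name__ == "__main__":
    print(run_pipeline(["hello", "   ", "world"]))
```

Because every stage only talks to its neighbours through a queue, any one of them can be swapped (a different Whisper checkpoint, a different instruct model) without touching the others, which is the point of the modular design.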
