Continue speech from an audio prompt!

Input (Audio)

Recommended length: 5–20 seconds. Avoid long silence at the start or end of the audio, as it can cause the model to get stuck in a loop and produce only silence.

Automatically trim silence from the beginning and end of input audio (for more stability)
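
As an illustration of what trimming silence can mean, here is a minimal sketch that uses a simple amplitude threshold. The demo's actual trimming method is not stated; the function name `trim_silence` and the default `threshold` are hypothetical choices for this example.

```python
from typing import List, Sequence


def trim_silence(samples: Sequence[float], threshold: float = 0.01) -> List[float]:
    """Drop leading and trailing samples whose absolute amplitude is <= threshold.

    Assumption: samples are floats in [-1.0, 1.0]; the threshold is illustrative only.
    """
    # Indices of every sample that counts as non-silent.
    loud = [i for i, s in enumerate(samples) if abs(s) > threshold]
    if not loud:
        # The whole clip is silent, so nothing is kept.
        return []
    # Keep everything between the first and last non-silent sample, inclusive.
    return list(samples[loud[0]:loud[-1] + 1])
```

Quiet samples between the first and last loud sample are kept, so pauses inside the speech are preserved and only the edges are trimmed.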

Generation Parameters (100 tokens ≈ 1 second)

Slider ranges (min to max): 0.1 to 2; 0.1 to 1; 100 to 3000; 0 to 1000.
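
The rule of thumb in the heading (100 tokens ≈ 1 second) can be turned into a small conversion helper. This is a rough sketch: the function names are hypothetical, and the ratio is only approximate.

```python
TOKENS_PER_SECOND = 100  # approximate ratio, from "100 tokens ≈ 1 second"


def tokens_to_seconds(n_tokens: int) -> float:
    """Estimate the audio duration in seconds for a given token count."""
    return n_tokens / TOKENS_PER_SECOND


def seconds_to_tokens(seconds: float) -> int:
    """Estimate the token count needed for a given audio duration."""
    return round(seconds * TOKENS_PER_SECOND)
```

For example, 3000 tokens correspond to roughly 30 seconds of audio, and a 5-second continuation needs about 500 tokens.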

Output

We thank Marin and OpenAthena for enabling this project with open-development LLM and training infrastructure.