Stability AI

Stable Audio 2.5 | Text to Audio

Music & Audio
Text to Music
Audio Enhancement
Sound Effects
Vocal Separation

Turn plain text into studio-quality music and sound effects in seconds. Describe mood, genre, instruments, and structure (intro, build-up, climax, outro) to generate rich, multi-part tracks up to three minutes long. The system captures nuanced directions like “uplifting” or “lush synthesizers,” delivering realistic instrument timbres, stereo depth, and strong alignment to your prompt. Iterate quickly: refine descriptors, adjust complexity, and re-generate to dial in feel and pacing. Ideal for film, games, ads, podcasts, and ambient soundscapes, it supports rapid prototyping and professional delivery on desktop and mobile. Clear, specific prompts yield the most coherent, engaging results.
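One way to keep prompts clear and specific is to assemble them from the same ingredients listed above: mood, genre, instruments, and structure. The sketch below is only an illustration. The helper name `build_prompt` and the semicolon-separated layout are assumptions made here, not a format prescribed by Stability AI, and the code only builds a string without calling any service.

```python
def build_prompt(genre, mood, instruments, structure):
    """Combine descriptive fields into one prompt string.

    Hypothetical helper: the field names and the output layout are
    illustrative assumptions, not an official prompt format.
    """
    parts = [
        f"{mood} {genre}".strip(),
        ("featuring " + ", ".join(instruments)) if instruments else "",
        ("structure: " + ", ".join(structure)) if structure else "",
    ]
    # Drop empty sections so optional fields can be left out cleanly.
    return "; ".join(p for p in parts if p)


prompt = build_prompt(
    genre="orchestral",
    mood="uplifting",
    instruments=["acoustic guitar", "lush synthesizers"],
    structure=["intro", "build-up", "climax", "outro"],
)
print(prompt)
```

Keeping the fields separate makes it easy to change one descriptor at a time and re-generate, which matches the iterate-and-refine workflow described above.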

Music Generation
Prompt Aligned Sound

Output Example

Used Prompt

A gentle guitar melody slowly evolves into a powerful symphonic crescendo