Alibaba

Wan | 2.5 | Preview | Text to Video

This text-to-video system turns detailed prompts into realistic 1080p clips with smooth motion and synchronized audio. It follows complex instructions closely, giving you precise control over dialogue, camera work, and visual style across cinematic, anime, or illustrated looks. Powered by a pose-latent transformer, it improves character expression and natural movement, reducing stiffness and common AI artifacts. For best results, write clear prompts that specify the scene, rhythm, and sound cues, then iterate to refine timing and style. It suits short films, ads, music videos, and rapid creative tests, balancing quality and speed; outputs are typically capped at around 10 seconds.

Synchronized Audio
Character Motion

Output Example

Used Prompt

Hyperspeed POV shot of a motorcycle ride, the rider’s hands gripping the handlebars clearly visible. Dodging explosions while weaving through smoke, rubble, and blasts, the camera races forward as the chaotic environment blurs in rapid motion all around.

Negative Prompt

low resolution, error, worst quality, low quality, defects