Qwen

Qwen Image

Image
Text to Image
Style Transfer
Image Enhancement
Character Design

Qwen-Image is an open-source foundation model for image generation and editing, built on a Multimodal Diffusion Transformer (MMDiT). It renders clean, accurate text directly in images, in both English and Chinese, and keeps multi-line and paragraph layouts coherent. Beyond text-to-image, it supports edits such as style transfer, object insertion and removal, pose manipulation, and detail enhancement, as well as multi-image editing for consistent person-to-product or scene compositions. It integrates with ComfyUI, and GGUF-quantized versions are available for local use. For best results, write specific, structured prompts, and use ControlNet inputs (depth, edges, keypoints) when you need precise control. It is well suited to marketing visuals, e-commerce posters, comics, and multilingual design.
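One way to keep prompts specific and structured is to assemble them from named parts rather than writing them free-form. The sketch below is only an illustration: `PromptSpec` and `build_prompt` are hypothetical names, not part of Qwen-Image or any library, and the breakdown into subject, setting, details, lighting, rendered text, and style tags is an assumption about what a useful structure might look like.

```python
from dataclasses import dataclass, field


@dataclass
class PromptSpec:
    """Hypothetical container for the parts of a structured prompt."""
    subject: str
    setting: str = ""
    details: list[str] = field(default_factory=list)
    lighting: str = ""
    text_to_render: str = ""  # exact wording that should appear in the image
    style_tags: list[str] = field(default_factory=list)


def _sentence(fragment: str) -> str:
    """Trim a fragment and make sure it ends with sentence punctuation."""
    fragment = fragment.strip()
    if not fragment:
        return ""
    return fragment if fragment.endswith((".", "!", "?")) else fragment + "."


def build_prompt(spec: PromptSpec) -> str:
    """Join the non-empty parts into one prompt string, in a fixed order."""
    parts = [_sentence(spec.subject), _sentence(spec.setting)]
    parts += [_sentence(d) for d in spec.details]
    parts.append(_sentence(spec.lighting))
    if spec.text_to_render:
        # Quote the text so the requested wording is unambiguous.
        parts.append(f'The image contains the text "{spec.text_to_render}".')
    if spec.style_tags:
        parts.append(_sentence(", ".join(t.strip() for t in spec.style_tags)))
    return " ".join(p for p in parts if p)


# Example: a poster prompt with explicit text to render.
poster = PromptSpec(
    subject="A minimalist e-commerce poster for a ceramic tea set",
    setting="Soft beige studio background",
    details=["The teapot sits slightly left of center"],
    lighting="Warm, diffuse light from the upper right",
    text_to_render="Autumn Collection",
    style_tags=["clean layout", "high detail"],
)
print(build_prompt(poster))
```

Keeping the rendered text in its own field makes it easy to change the wording without touching the rest of the scene description. The resulting string can be passed to whichever interface you use to run the model.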

Multilingual Text Rendering
Advanced Image Editing
Multi-Image Composition

Output Example

Used Prompt

A steampunk astronaut playing a grand piano on the edge of a floating cliff in the sky, under a golden sunset. The cliff is covered in moss and rusted metal pipes, with small mechanical birds perched around. The astronaut’s suit is detailed with brass, leather straps, and glowing blue tubes. Clouds drift below, while distant airships pass in the background. The lighting is dramatic, casting long shadows and warm reflections on the piano’s surface. Ultra-detailed, cinematic composition, dreamy and surreal atmosphere, 8K.
Model Output Example