Skip to main content
WAN 2.7 Video Prompting Guide

Three modes, one model.
Text, image, extend.

WAN 2.7 Video is Alibaba's flagship video model on PixelDojo — text-to-video, image-to-video, and a video-extend mode that continues an existing clip. 720p (2.5 credits/s) and 1080p (3 credits/s), durations 2-15 seconds, 5 aspect ratios, with prompt-enhancement built in.

Overview

WAN 2.7 Video is the top-of-line in the WAN family — Alibaba's flagship video generation model. Where it stands out: three distinct modes in one model (text-to-video for composing from a prompt, image-to-video for animating a starting frame, video-extend for continuing an existing clip), durations from 2 to 15 seconds, and built-in prompt enhancement.

Two resolution tiers (720p at 2.5 credits/s, 1080p at 3 credits/s) and five aspect ratios — 16:9 for cinematic, 9:16 for vertical, 1:1 for square social, 4:3 / 3:4 for traditional photo proportions. Up to 15 seconds in a single clip — longer than most video models on PixelDojo.

3

Modes (T2V · I2V · Extend)

2–15s

Duration range

720p · 1080p

Resolution tiers

Key Features

T2V · I2V · Video-Extend

Three Modes in One Model

Text-to-video composes from a prompt alone. Image-to-video animates a starting frame from your prompt. Video-extend takes an existing clip and writes a continuation — useful for stretching short clips into longer narratives without re-rendering.

2.5 vs 3 credits per second

720p and 1080p

720p (2.5 credits/s) for iteration and social. 1080p (3 credits/s) for hero shots, atmospheric scenes, and material-heavy product work. 20% credit premium for 1080p — usually worth it for the final render.

Longer than most

Duration 2-15 Seconds

WAN 2.7 supports up to 15 seconds in a single clip — most video models cap at 6 or 10. Useful for atmospheric scenes that need breathing room, multi-shot sequences, or narrative beats that don't fit in 5 seconds.

Continue an existing clip

Video-Extend Mode

Pass a source video and prompt — WAN 2.7 writes a continuation that picks up from the final frame. Great for stretching a 5-second clip you like into a 10-15 second sequence without re-rendering the original.

Example Videos

Each example shows the exact prompt that produced the result. Copy any prompt with one click.

Cinematic Establishing Shot

1080p · 16:9 · 8s · 24 credits

Vibrant cinematic establishing shot drifting forward over a futuristic coastal city at golden hour, neon-lined boardwalks, glass towers reflecting amber light, distant ocean waves crashing, warm sunset color grade, the camera moves at a very slow forward dolly, hyper-detailed 8k

Cinematic establishing shots reward 1080p — atmospheric detail (waves, neon reflections, distant towers) compresses badly at 720p. Pair the 16:9 widescreen with ONE dominant camera move ('slow forward dolly', 'crane up', 'pull back') and let WAN handle pacing.

Character Close-Up

720p · 4:3 · 5s · 12.5 credits

Medium close-up of a young woman with curly auburn hair sitting at a sunlit cafe window, she lifts a porcelain teacup to her lips, takes a slow sip, then turns her head toward the camera with a soft smile, shallow depth of field, warm afternoon light, hyper-detailed skin texture and fabric weave

4:3 aspect lends portrait close-ups a slightly classical feel. WAN 2.7 reads micro-beats well — "lifts cup, takes a sip, turns head, smiles" plays cleanly. 720p is plenty for character close-ups; save the credit premium for atmospheric work.

Product Reveal

1080p · 1:1 · 6s · 18 credits

Studio product shot of a matte-black ceramic coffee mug on a polished walnut surface, steam curling upward, the camera does a slow 90-degree rotational pan around the mug, soft top-down rim lighting against a charcoal gradient backdrop, photoreal advertising quality

1:1 square at 1080p with one explicit camera move (rotational pan, dolly in, push back) is the canonical product spot. WAN 2.7 handles material textures (matte ceramic, polished walnut, brushed metal) cleanly at 1080p.

Atmospheric Landscape

1080p · 21:9 cropped to 16:9 · 10s · 30 credits

Wide cinematic landscape of a misty Norwegian fjord at dawn, mountain peaks emerging through low fog, a single distant fishing boat leaves a thin wake on glassy water, the camera does a very slow forward dolly, cool blue-green palette with warm amber highlight on one peak, sounds of distant water and gentle wind

10-second atmospheric clips reward 1080p — fog, water surface, distant detail all preserved. One slow camera move plus a "warm spot" element to break the cool palette. 16:9 widescreen is the natural fit; ultra-wide 21:9 isn't natively supported, so 16:9 is your widest.

Multi-Shot Action

720p · 16:9 · 12s · 30 credits

12-second action sequence. Shot 1: low-angle of a parkour runner sprinting toward a rooftop edge, wind whipping their jacket. Shot 2: mid-air shot of the runner leaping across a gap between buildings, city lights blurred behind. Shot 3: landing roll into a sprint that continues offscreen. Dynamic handheld camera throughout.

12s gives WAN 2.7 room for a true multi-shot sequence — three distinct beats with clean cuts. 720p is enough for action; cinematic atmosphere is where 1080p shines. Name each shot explicitly ("Shot 1: ... Shot 2: ...") for cleaner cuts.

Prompting Tips

Pick the mode by source material

Text-to-video when you have only a prompt. Image-to-video when you have a strong starting frame you want to animate. Video-extend when you have a 5-10 second clip you want to stretch to 15s without re-rendering. Same prompt language across all three.

Use 1080p for atmospheric, 720p for action

1080p preserves fog, water, foliage, and material textures. 720p is plenty for human action, character close-ups, and most prompts where motion matters more than fine detail. 20% credit premium for 1080p — pick by use case.

Lean into longer durations

WAN 2.7's 2-15s range is wider than most video models. Use 8-12 seconds for atmospheric scenes that need breathing room; 10-15s for genuine multi-shot sequences. Don't default to 5s if your story needs more.

Name one camera move per shot

'Slow forward dolly', 'crane up', '90-degree rotational pan', 'handheld follow' — WAN reads these literally. Combining moves ("dolly in then pan left") usually produces confused output. One move per shot is the rule.

Multi-shot for long durations

10-15s clips benefit from explicit "Shot 1: ... Shot 2: ... Shot 3: ..." structure. WAN 2.7 renders cuts cleanly when you name them. Stuffing one beat into 15s wastes the duration — use multi-shot or pick a shorter clip.

Negative prompt is supported

WAN exposes negative_prompt for things to avoid (distorted hands, blurry motion, weird artifacts). Keep it short and focused. The default works for most prompts; only customize when you're fighting a specific failure mode.

Settings Reference

SettingValuesNotes
Modetext-to-video · image-to-video · video-extendI2V needs first_frame_image; video-extend needs source video.
Resolution720p (2.5 cr/s) · 1080p (3 cr/s)20% credit premium for 1080p. Worth it for atmospheric / material-heavy clips.
Duration2-15 seconds (integer)Longer than most video models. Use 8-12s for atmospheric, 10-15s for multi-shot.
Aspect ratio16:9 · 9:16 · 1:1 · 4:3 · 3:45 presets covering common output frames.
Negative promptstring (optional, max 500 chars)Default works for most prompts.
Prompt enhancementOn / OffOptional WAN-side prompt expansion. Try both; sometimes off produces cleaner output.

FAQ

720p (2.5 cr/s) for iteration, social posts, character work, and human action — plenty of fidelity for these. 1080p (3 cr/s) for atmospheric scenes (fog, water, foliage), product shots with detailed materials, and hero clips. The 20% credit premium is usually worth it for the final render.