Pixverse V6 Prompting Guide

Pixverse V6.
Multi-clip transitions, stylized presets.

Pixverse V6 is the latest model from Pixverse: text-to-video, image-to-video, and a multi-clip mode (first frame + last frame = automatic transition). It offers six style options: five built-in presets (anime, 3D animation, clay, comic, cyberpunk) plus 'none' for photoreal.

Overview

Pixverse V6 is Pixverse's latest video model — text-to-video and image-to-video like most competitors, plus a unique multi-clip mode that takes a first frame AND a last frame and generates the transition between them. Useful for narrative beats where you have anchor shots but need the in-between motion.

The model offers six style options (anime, 3D animation, clay, comic, cyberpunk, plus 'none' for the photoreal default), five aspect ratios (16:9, 9:16, 1:1, 4:3, 3:4), four resolution tiers from 360p through 1080p, clip durations starting at a few seconds, and optional audio generation.

3 modes: T2V · I2V · Multi-clip

6 style presets

5 aspect ratios

Key Features

First frame + last frame → motion

Multi-Clip Transitions

Pass image AND last_frame_image and Pixverse V6 generates the motion that connects them — useful when you have two anchor frames (start of a beat, end of a beat) and want the model to fill in the transition naturally. Few other models offer this directly.

Anime · 3D · Clay · Comic · Cyberpunk · None

Six Style Presets

Built-in style presets snap the output toward a specific aesthetic, so you don't have to engineer the look entirely in the prompt. 'Anime' for cel-shaded cartoon work. 'Clay' for a stop-motion feel. 'Comic' for graphic-novel panels. 'None' for the photoreal default.

360p iteration → 1080p hero

Resolution Flexibility

Four resolution tiers (360p, 540p, 720p, 1080p) let you trade quality for cost. 360p is iteration-cheap for prompt testing. 720p is the production sweet spot. 1080p for hero shots.

16:9 · 9:16 · 1:1 · 4:3 · 3:4

Five Aspect Ratios

Standard aspect coverage from landscape cinematic through square social to vertical reels and traditional 4:3 / 3:4 framings. Pick by downstream channel rather than upscaling/cropping later.

Example Videos

Each example shows the exact prompt that produced the result.

Cinematic Train Station

720p · 16:9 · 5s · style: none

A vintage train pulls into a steam-clouded station at dawn, the conductor steps off and looks up at the clock tower, soft cinematic lighting, painterly atmosphere, slow tracking shot

"Style: none" leaves the model in photoreal default mode. Pair atmospheric language ("steam-clouded", "painterly atmosphere") with a clear camera move ("slow tracking shot") for cinematic motion that reads as directed.

Anime Rooftop Scene

720p · 16:9 · 5s · style: anime

Anime-style: a young woman with windswept hair stands at the edge of a sunlit rooftop overlooking a sprawling city, her ribbon flutters, the camera does a slow gentle pull-back, soft cel-shaded coloring, golden hour palette

Setting style=anime nudges the model toward cel-shaded animation defaults. Reinforcing the preset with anime vocabulary in the prompt ("soft cel-shaded coloring", "golden hour palette") amplifies the effect and keeps the motion clean.

Vertical Action Beat

720p · 9:16 · 5s · style: none

A skateboarder ollies over a curb on a sunlit boardwalk, palm trees behind, vertical handheld follow shot, summer warmth, cinematic color grade

9:16 vertical at 720p is the mobile-social sweet spot. Name the camera type explicitly ("vertical handheld follow shot"); Pixverse renders handheld motion convincingly when prompted. "Cinematic color grade" pushes the look from raw documentary toward polished content.

Prompting Tips

Match style preset to language

If you set style='anime', the prompt should also use anime vocabulary ('cel-shaded', 'ribbon flutters', 'expressive eyes'). Pairing a stylized preset with photoreal language produces hybrid output. The same applies to clay (use stop-motion language) and comic (use graphic-novel language).
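The pairing can be checked before a prompt is submitted. The sketch below is only an illustration: the cue lists and the `style_matches_prompt` helper are hypothetical, built from the vocabulary mentioned above, and are not part of any Pixverse tooling.

```python
# Hypothetical cue lists, taken loosely from the vocabulary in this guide.
# Hyphens are normalized to spaces so "cel-shaded" and "cel shaded" both match.
STYLE_CUES = {
    "anime": ("cel shaded", "expressive eyes", "anime"),
    "clay": ("stop motion", "clay"),
    "comic": ("graphic novel", "comic"),
}


def style_matches_prompt(style: str, prompt: str) -> bool:
    """Return True if the prompt contains at least one cue for the style.

    Styles without a cue list (for example 'none') always pass.
    """
    cues = STYLE_CUES.get(style)
    if cues is None:
        return True
    text = prompt.lower().replace("-", " ")
    return any(cue in text for cue in cues)
```

A prompt that fails the check is not necessarily wrong; it is a hint that the preset and the wording may be pulling in different directions.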

Iterate at 360p, finalize at 720p+

360p is much cheaper for prompt iteration. Once you've nailed the prompt, regenerate at 720p (production sweet spot) or 1080p (hero shots) for the keeper. Don't burn 1080p credits on first drafts.
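One way to keep drafts and keepers consistent is to hold every setting fixed and swap only the resolution. This is a minimal sketch; the stage names and the `resolution` key are assumptions for illustration, not a documented request schema.

```python
# Hypothetical stage names mapped to the resolution tiers named in this guide.
RESOLUTION_BY_STAGE = {"draft": "360p", "final": "720p", "hero": "1080p"}


def settings_for_stage(base: dict, stage: str) -> dict:
    """Return a copy of the base settings with the resolution for the stage."""
    if stage not in RESOLUTION_BY_STAGE:
        raise ValueError(f"unknown stage: {stage!r}")
    return {**base, "resolution": RESOLUTION_BY_STAGE[stage]}
```

Because the base dict is copied rather than mutated, the same prompt and style can be reused unchanged when moving from draft to final.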

Multi-clip mode for narrative beats

Got two anchor frames? Pass both as image and last_frame_image, set multi_clip=true, and let Pixverse generate the transition. Strong for storyboard-driven work where you've already drawn the start and end of a beat.
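As a rough sketch of how the three modes differ in their inputs, the helper below builds a payload dict from whichever frames are supplied. The parameter names `image`, `last_frame_image`, and `multi_clip` come from this guide; the `build_request` function and the overall dict layout are assumptions, not the documented request format.

```python
from typing import Optional


def build_request(prompt: str,
                  image: Optional[str] = None,
                  last_frame_image: Optional[str] = None) -> dict:
    """Build a payload dict and select the mode from the frames supplied.

    The dict layout is an assumption for illustration only.
    """
    payload = {"prompt": prompt}
    if last_frame_image is not None:
        # Multi-clip needs both anchors: the first frame and the last frame.
        if image is None:
            raise ValueError("multi-clip requires both image and last_frame_image")
        payload["image"] = image
        payload["last_frame_image"] = last_frame_image
        payload["multi_clip"] = True
    elif image is not None:
        # A single image means image-to-video.
        payload["image"] = image
    # With no images at all, the payload stays prompt-only (text-to-video).
    return payload
```

Rejecting a lone `last_frame_image` early keeps a storyboard mistake from turning into a failed or unexpected generation.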

Moderation skews conservative

Pixverse rejects prompts that contain certain brand references ('Pixar', some named characters) or content it perceives as edgy. If a prompt fails with error code 500063, rephrase it using generic descriptions ('Pixar-quality' → '3D animated', etc.).
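The rephrasing step can be applied mechanically before resubmitting. In the sketch below, the replacement table holds only the one substitution given above; the table, the `soften_prompt` name, and the idea of extending it with your own entries are assumptions, not a list of terms Pixverse is known to block.

```python
import re

# Brand-flavoured phrase -> generic description. Only the first entry comes
# from this guide; add your own entries as you discover rejected terms.
GENERIC_REPLACEMENTS = {
    "Pixar-quality": "3D animated",
}


def soften_prompt(prompt: str) -> str:
    """Replace known brand phrases with generic wording, ignoring case."""
    for phrase, generic in GENERIC_REPLACEMENTS.items():
        prompt = re.sub(re.escape(phrase), generic, prompt, flags=re.IGNORECASE)
    return prompt
```

For example, `soften_prompt("A Pixar-quality robot waves")` yields `"A 3D animated robot waves"`, which can then be retried.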

Pick aspect by channel

16:9 for YouTube / standard horizontal. 9:16 for TikTok / Reels / Shorts. 1:1 for square social spots. 4:3 / 3:4 for traditional / retro framings. Generate at the right aspect — Pixverse composes differently per ratio.
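The channel-to-aspect choice above can be captured as a small lookup so the ratio is fixed at generation time. The channel keys and the `aspect_for` helper are hypothetical labels chosen for this sketch; only the ratios themselves come from the guide.

```python
# Hypothetical channel keys mapped to the aspect ratios listed in this guide.
ASPECT_BY_CHANNEL = {
    "youtube": "16:9",
    "tiktok": "9:16",
    "reels": "9:16",
    "shorts": "9:16",
    "square_social": "1:1",
    "retro_landscape": "4:3",
    "retro_portrait": "3:4",
}


def aspect_for(channel: str) -> str:
    """Look up the aspect ratio for a channel key (case-insensitive)."""
    key = channel.strip().lower()
    if key not in ASPECT_BY_CHANNEL:
        raise ValueError(f"no aspect ratio configured for channel {channel!r}")
    return ASPECT_BY_CHANNEL[key]
```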

Camera move + atmosphere = polish

Pair a named camera move ('slow tracking', 'gentle pull-back', 'handheld follow') with an atmospheric anchor ('golden hour', 'cinematic color grade', 'painterly atmosphere'). Together they produce directed-looking output even from sparse prompts.
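A tiny composer can enforce that both pieces are present in every prompt. This is a sketch under the assumption that prompts are comma-separated phrases, as in the examples in this guide; `compose_prompt` is a hypothetical helper.

```python
def compose_prompt(subject: str, camera_move: str, atmosphere: str) -> str:
    """Join subject, camera move, and atmosphere into one comma-separated prompt.

    Raises ValueError if any part is empty, so a prompt never goes out
    without both a camera move and an atmospheric anchor.
    """
    parts = [subject.strip(), camera_move.strip(), atmosphere.strip()]
    if not all(parts):
        raise ValueError("subject, camera_move, and atmosphere are all required")
    return ", ".join(parts)
```

Called with the skateboarder subject, "vertical handheld follow shot", and "cinematic color grade", it produces a prompt in the same shape as the Vertical Action Beat example.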

Settings Reference

| Setting | Values | Notes |
| --- | --- | --- |
| Mode | Text-to-video · Image-to-video · Multi-clip | Multi-clip requires both image and last_frame_image. T2V uses prompt only. |
| Style | none · anime · 3d_animation · clay · comic · cyberpunk | Snap the aesthetic. Match style preset to prompt vocabulary. |
| Resolution | 360p · 540p · 720p · 1080p | 720p is the production sweet spot. 360p for cheap iteration. |
| Aspect ratio | 16:9 · 9:16 · 1:1 · 4:3 · 3:4 | Five aspects. Pick by downstream channel. |
| Audio | Boolean toggle | Optional synced audio generation. |
| Thinking type | auto · disabled · enabled | Optional motion-planning step. "Auto" is the safe default. |
| Negative prompt | String (optional) | Keep short and focused. |
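The allowed values in the table lend themselves to a quick local check before a request is sent. The sketch below assumes settings are held in a plain dict with keys such as `style`, `resolution`, `aspect_ratio`, `thinking_type`, and `audio`; those key names and the `validate_settings` helper are hypothetical, while the value lists are copied from the table.

```python
from typing import Dict, List

# Allowed values copied from the settings table; the key names are assumed.
ALLOWED_VALUES = {
    "style": ("none", "anime", "3d_animation", "clay", "comic", "cyberpunk"),
    "resolution": ("360p", "540p", "720p", "1080p"),
    "aspect_ratio": ("16:9", "9:16", "1:1", "4:3", "3:4"),
    "thinking_type": ("auto", "disabled", "enabled"),
}


def validate_settings(settings: Dict[str, object]) -> List[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for key, allowed in ALLOWED_VALUES.items():
        if key in settings and settings[key] not in allowed:
            problems.append(f"{key}={settings[key]!r} is not one of {list(allowed)}")
    if "audio" in settings and not isinstance(settings["audio"], bool):
        problems.append("audio must be a boolean")
    return problems
```

Returning a list rather than raising on the first error lets a caller report every bad setting in one pass.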

FAQ

Multi-clip takes a first frame AND a last frame and generates the motion that connects them. Great for storyboard-driven work where you have two anchor shots but need the in-between motion. Few other video models offer this — typically you'd have to generate two clips and edit them together.