AI Image and Video API Platform
Compare models, inspect endpoints, and access docs in one place.
API Reference
Public docs
Endpoints, schemas, and examples.
OpenAPI 3.1
Typed integrations
SDK generation and machine-readable specs.
LLM Docs
Agent-ready docs
`llm.txt` and agent integration help.
API Keys
Authentication
Create keys for apps and agents.
Usage Dashboard
Requests and credits
Monitor volume, logs, and balance.
Buy Credits
Prepaid capacity
Top up for image and video usage.
Available Models
127 AI models for image and video generation behind one async control plane
Change Camera Angle
Camera-aware editing via fal.ai Qwen Image Edit 2511 with multi-angle LoRA — 360° orbit, tilt, and zoom.
/models/change-camera-angle/run
Consistent Characters
Generate consistent character variations with FLUX Kontext, Nano Banana Pro/2, Flux 2 Dev, or Qwen Image 2 Pro.
/models/consistent-characters/run
Creative Upscale
Clarity Upscaler (creative upscale) via Replicate — boost detail with stable-diffusion refinement.
/models/creative-upscale/run
Dreamina 3.1
ByteDance Dreamina 3.1 — 4MP cinematic text-to-image with precise style control.
/models/dreamina/run
Ernie
Baidu Ernie text-to-image (fal.ai). Multilingual prompts and built-in prompt expansion.
/models/ernie/run
Face Enhance
Crystal Upscaler via Replicate — face-detail preserving upscale, cost scales with output megapixels.
/models/face-enhance/run
FLUX
FLUX family on Replicate — Schnell, Dev, Pro, Kontext, Ultra, and LoRA remix variants in one entrypoint.
/models/flux/run
Flux 2 Flex
Max-quality with up to 10 reference images
/models/flux-2-flex/run
Flux 2 Klein 4B
Very fast generation and editing with up to 5 reference images
/models/flux-2-klein-4b/run
Flux 2 Klein 9B
4-step distilled FLUX.2 [klein] foundation model for flexible control
/models/flux-2-klein-9b/run
Flux 2 Pro
High-quality with up to 8 reference images
/models/flux-2-pro/run
Flux 2 Max
The highest fidelity image model from Black Forest Labs
/models/flux-2-max/run
Flux 2 Dev
Fast quality with up to 4 reference images
/models/flux-2-dev/run
Flux 2 Dev + LoRA
Dev model with custom LoRA support
/models/flux-2-lora/run
Flux Edit (Kontext)
Black Forest Labs FLUX.1 Kontext for text-driven image editing — Dev (open-weight), Pro (state-of-the-art), and Max (premium typography).
/models/flux-edit/run
Flux Dev
High-quality development model with configurable steps, guidance, and LoRA support.
/models/flux-dev/run
Flux Krea Dev
Photorealistic generation that avoids the oversaturated AI look. LoRA compatible.
/models/flux-krea-dev/run
Flux Dev Multi LoRA
Supports multiple custom LoRAs simultaneously for complex style combinations.
/models/flux-dev-multi-lora/run
Flux 1.1 Pro
Latest pro model with enhanced quality and strong prompt adherence.
/models/flux-1.1-pro/run
Flux 1.1 Pro Ultra
Highest quality Flux model with raw mode for natural-looking images.
/models/flux-1.1-pro-ultra/run
Flux Kontext Pro
Advanced model with state-of-the-art performance for both generation and editing.
/models/flux-kontext-pro/run
Flux Kontext Max
Premium model with maximum performance and improved typography for generation and editing.
/models/flux-kontext-max/run
Google Gemini Flash
Fast generation with Gemini 2.5 Flash
/models/gemini-flash/run
Google Nano Banana Pro
State-of-the-art quality with accurate typography and reasoning
/models/nano-banana-pro/run
Google Nano Banana 2
Next-generation SOTA model with stronger consistency
/models/nano-banana-2/run
Nano Banana Edit
Google Nano Banana image editing — multi-image fusion + edit instruction with Standard/Pro/Pro-fal tiers and 1K/2K/4K resolution.
/models/google-nano-banana/run
GPT-Image 1.5 Low
Fast, lower detail generation
/models/gpt-image-low/run
GPT-Image 1.5 Medium
Balanced quality and speed
/models/gpt-image-medium/run
GPT-Image 1.5 High
Maximum detail and quality
/models/gpt-image-high/run
Grok Imagine R2V
xAI Grok Imagine reference-to-video via Replicate — 1–7 reference images plus prompt for 1–10s clips at 480p or 720p.
/models/grok-r2v/run
Grok Video Extend
xAI Grok Imagine video extension — continue an existing MP4 with a prompt-directed extension (2–10s).
/models/grok-video-extend/run
Hailuo Standard
Premium quality text-to-video and image-to-video
/models/hailuo-standard/run
Hailuo Fast
Fast image-to-video generation
/models/hailuo-fast/run
Heygen Avatar
Heygen Avatar 4 via fal.ai — animate a portrait with prompt-driven speech or an audio track, with optional background and captions.
/models/heygen-avatar/run
HiDream L1 Fast
HiDream L1 Fast - Fast generation
/models/hidream-l1-fast/run
HiDream L1 Dev
HiDream L1 Dev - Fast generation
/models/hidream-l1-dev/run
HiDream L1 Full
HiDream L1 Full - Highest quality
/models/hidream-l1-full/run
HiDream E1.1
HiDream E1.1 - Fast generation
/models/hidream-e1.1/run
Hunyuan 3D
Tencent Hunyuan 3D 3.1 — generate 3D meshes from a text prompt or a single image.
/models/hunyuan-3d/run
Ideogram Character
Generate consistent characters from a single reference image in many styles.
/models/ideogram-character/run
Character Stylist
One-shot FLUX Kontext variants — filters, cartoonify, iconic locations, haircut swap, headshots, renaissance, face-to-many, and more.
/models/image-editor/run
Image Relighting
Relight images with Magic Lighting, Nano Banana Pro/2, or Qwen Image Edit — multi-provider routing with per-model credit rates.
/models/image-relighting/run
Image to 3D
Firtoz TRELLIS — generate textured 3D meshes from one or more reference images.
/models/image-to-3d/run
Flux Image to Image
FLUX Dev LoRA image-to-image on Replicate — prompt + source image + optional LoRA weights.
/models/image-to-image-flux/run
Imagineart
Imagineart 1.5 Pro image generation (fal.ai).
/models/imagineart/run
Kling Image V3
Kling Image V3 (fal.ai) — high-quality text-to-image with flexible aspect ratios.
/models/kling-image/run
Kling Image Edit
Kling Image V3 (fal.ai) image-to-image editing with a text instruction.
/models/kling-image-edit/run
Kling Motion Control v3 Standard
Kling Video v3 Standard motion control endpoint
/models/kling-motion-control/run
Kling Motion Control v3 Pro
Kling Video v3 Pro motion control endpoint
/models/kling-motion-control-pro/run
Kling Reference to Video
Kling O3 reference-driven video generation — image or video references, Standard or Pro tier.
/models/kling-reference-to-video/run
Kling 2.6 Pro
Kling Video v2.6 Pro (fal.ai) — text-to-video or image-to-video, 5 or 10 seconds, with audio generation.
/models/kling-v2-6/run
Kling Video v3 Standard (Text)
Standard text-to-video with native audio
/models/kling-video-v3-standard-text/run
Kling Video v3 Standard (Image)
Standard image-to-video with native audio
/models/kling-video-v3-standard-image/run
Kling Video v3 Pro (Text)
Pro text-to-video with cinematic quality and native audio
/models/kling-video-v3-pro-text/run
Kling Video v3 Pro (Image)
Pro image-to-video with cinematic quality and native audio
/models/kling-video-v3-pro-image/run
Kling Video Edit
Kling O3 video-to-video edit — Standard or Pro, with optional reference images and audio preservation.
/models/kling-video-edit/run
Lip Sync
Replicate sync/lipsync-2 — align mouth movements in a video to a separate audio track.
/models/lip-sync/run
LTX 2.3 Fast Text-to-Video
Fast text-to-video generation (6-20s, 1080p-2160p).
/models/ltx-2-fast-t2v/run
LTX 2.3 Fast Image-to-Video
Fast image-to-video generation (6-20s, 1080p-2160p).
/models/ltx-2-fast-i2v/run
LTX 2.3 Pro Text-to-Video
Higher quality text-to-video generation (6-10s, 1080p-2160p).
/models/ltx-2-pro-t2v/run
LTX 2.3 Pro Image-to-Video
Higher quality image-to-video generation (6-10s, 1080p-2160p).
/models/ltx-2-pro-i2v/run
LTX 2.3 Pro Extend Video
Extend an existing video clip from the start or end (1-20s, Pro tier only).
/models/ltx-2-pro-extend/run
Magnific Upscaler
Freepik Magnific upscaler — creative or precision mode, up to 16x.
/models/magnific-upscaler/run
OmniHuman 1.5
ByteDance OmniHuman 1.5 via Replicate — audio-driven talking-head video with lip sync.
/models/omnihuman/run
OpenAI Image 1
OpenAI GPT Image 1 Mini — text-to-image via Replicate.
/models/openai-image-1/run
Outpaint
fal.ai Image Apps V2 outpainting — expand an image beyond its original edges.
/models/outpaint/run
P-Image
Pruna P-Image — sub-second text-to-image with optional custom dimensions.
/models/p-image/run
P-Image Edit
Pruna P-Image Edit — fast image editing with up to 5 reference images.
/models/p-image-edit/run
P-Video
Pruna P-Video — video generation with text/image/audio conditioning, draft mode, and 720p/1080p outputs.
/models/p-video/run
Pixverse v5.6
Pixverse v5.6 video generation via Replicate — text-to-video or image-to-video with optional audio, at 360p–1080p.
/models/pixverse/run
Pixverse V6
Pixverse V6 video generation via Runware. Text-to-video, image-to-video (start frame), or multi-clip (start + end frame).
/models/pixverse-v6/run
Pony Realism
Pony Realism - Stylized anime generation
/models/ponyxl-ponyrealism-v23/run
Pony NAI
Pony NAI - Stylized anime generation
/models/ponyxl-tponynai3-v7/run
Wai ANI
Wai ANI - Stylized anime generation
/models/ponyxl-waianinsfwponyxl-v140/run
QWEN Image Plus
Fast generation with excellent quality
/models/qwen-image-plus/run
QWEN Image Max
Highest quality output
/models/qwen-image-max/run
QWEN Image 2.0
Fast, balanced image generation and editing
/models/qwen-image-2.0/run
QWEN Image 2.0 Pro
Enhanced text rendering, realistic textures, and semantic adherence
/models/qwen-image-2.0-pro/run
Recraft V4
Recraft's latest image model. Strong prompt accuracy, art-directed composition, integrated text rendering. Fast and cost-efficient at standard resolution.
/models/recraft-v4/run
Recraft V4 Pro
Recraft V4 at ~2048px resolution. Same design taste and prompt accuracy as V4, with higher resolution for print-ready and large-scale work.
/models/recraft-v4-pro/run
Recraft V4 SVG
Production-ready SVG vector images from text. Recraft V4's design taste applied to vector output — clean geometry, structured layers, editable paths.
/models/recraft-v4-svg/run
Recraft V4 Pro SVG
Detailed SVG vector graphics from text. Recraft V4 Pro's design taste with more geometric detail and finer paths — clean layers, editable output, scalable to any size.
/models/recraft-v4-pro-svg/run
Flux Redux
Black Forest Labs Flux Redux image variations — feed a source image, get stylistic riffs.
/models/redux-flux/run
Runway Gen-4.5 Video
Runway Gen-4.5 video generation — text-to-video or image-to-video, 5 or 10 seconds.
/models/runway-gen4-video/run
Runway
Canonical version-agnostic Runway video API ID.
/models/runway-video/run
Runway Gen-4 (Legacy API ID)
Legacy alias for clients pinned to runway-gen4; maps to the current Runway model.
/models/runway-gen4/run
Seedance 1
ByteDance Seedance 1 video generation — text-to-video or image-to-video with optional end frame.
/models/seedance-1.5/run
Seedance 2 High
Higher-quality Seedance 2.0 video generation via fal.ai (supports 1080p)
/models/seedance-2-high/run
Seedance 2 Reference to Video
Seedance 2.0 multimodal reference-to-video. Combine up to 9 images, 3 video clips, and 3 audio tracks to guide characters, motion, and sound.
/models/seedance-2-reference/run
Seedance 2 Video Edit
Edit source videos with Seedance 2.0 on fal.ai using prompted changes, optional reference images, and 480p or 720p output.
/models/seedance-video-edit/run
Seedream 3
ByteDance Seedream 3 text-to-image via Replicate.
/models/seedream-3/run
Seedream 4.5
ByteDance Seedream 4.5 — new-generation image creation with superior aesthetics, text rendering, and up to 4K resolution.
/models/seedream-4/run
Seedream 5 Lite
ByteDance Seedream 5.0 Lite — fast, high-quality image generation and editing with strong aesthetics and text rendering.
/models/seedream-5-lite/run
Subject Control
FLUX Subject Control via Replicate — preserve subject identity across generations.
/models/subject-control/run
Text to Music
ElevenLabs Music via Replicate — generate music from a text prompt.
/models/text-to-music/run
Text to Speech
MiniMax Speech 2.8 Turbo via Replicate — convert text into natural-sounding speech.
/models/text-to-speech/run
VEO 3.1 Fast
Faster generation at 3 credits per second
/models/veo-3.1-fast/run
VEO 3.1 Standard
Higher quality at 8 credits per second
/models/veo-3.1-standard/run
VEO 3.1 Lite
Runware-powered Lite variant at 1.5 credits/sec for 720p and 2 credits/sec for 1080p. No reference images, no audio generation, no 1:1 aspect ratio.
/models/veo-3.1-lite/run
Video Autocaption
TikTok-style auto-captioning via Replicate.
/models/video-autocaption/run
Video Reframe
Luma Reframe Video via Replicate — change a video's aspect ratio intelligently.
/models/video-reframe/run
Video to Sound
ThinkSound via Replicate — generate a sound effect track from a video.
/models/video-to-sound/run
Video Transform
Runway Gen4 Aleph via Replicate — transform the first 5 seconds of a video with a prompt.
/models/video-transform/run
Video Upscaler
Topaz Labs Video Upscale via Replicate — upscale video resolution and FPS.
/models/video-upscaler/run
WAN 2.2 Standard
Premium quality with enhanced detail
/models/wan-2.2-standard/run
WAN 2.2 Plus
Official Alibaba model with 1080p support
/models/wan-2.2-plus/run
WAN 2.2 Extended
fal.ai WAN 2.2 with up to 10-second videos and dual LoRA support
/models/wan-2.2-extended/run
WAN 2.2 Animate
WAN 2.2 video animation — drive a character image with a motion reference video.
/models/wan-2.2-animate/run
WAN 2.2 Replace
WAN 2.2 character replacement — swap a character in a source video while preserving scene and motion.
/models/wan-2.2-replace/run
WAN 2.6 Standard
Higher quality, 720p/1080p support
/models/wan-2.6-standard/run
WAN 2.6 Flash
Fast and affordable image-to-video
/models/wan-2.6-flash/run
WAN 2.6 Image
Alibaba WAN 2.6 text-to-image with prompt enhancement and multi-image output.
/models/wan-2.6-image/run
WAN 2.6 Image Edit
Alibaba WAN 2.6 image editing — up to 4 reference images.
/models/wan-2.6-image-edit/run
WAN 2.7 Standard
Faster WAN 2.7 image generation and editing
/models/wan-2.7-image/run
WAN 2.7 Pro
Higher-quality WAN 2.7 tier with 4K support for text-to-image
/models/wan-2.7-image-pro/run
WAN 2.7 Image Edit
Alibaba WAN 2.7 image editing — Standard and Pro tiers, supports 1-4 input images for fusion edits.
/models/wan-2.7-image-edit/run
WAN 2.7 Text-to-Video
Text-to-video with audio sync, 720p/1080p output, and 2-15 second durations
/models/wan-2.7-t2v/run
WAN 2.7 Image-to-Video
Image-to-video and video continuation with optional last-frame control and audio sync
/models/wan-2.7-i2v/run
WAN 2.2 Image
Fast cinematic image generation (3-6 seconds) with up to 4MP output and optional LoRA support.
/models/wan-image/run
WAN Reference to Video
Alibaba WAN reference-to-video — up to 5 image/video references with multi-shot support.
/models/wan-reference-to-video/run
WAN Video Character Swap
Alibaba WAN character swap — combine a character image with a reference video to produce a new clip.
/models/wan-video-character-swap/run
WAN 2.7 Video Edit
Alibaba WAN 2.7 video editing — modify an existing clip via prompt with optional reference images.
/models/wan-video-edit/run
Grok Imagine
xAI Grok Imagine — sync image generation with fast results and natural aesthetics.
/models/xai-image/run
Grok Image Edit
xAI Grok image editing — sync response (no polling). Provide an image URL and a text edit instruction.
/models/xai-image-edit/run
Grok Imagine Video
xAI Grok Imagine video — text-to-video or image-to-video, 1-15 seconds at 480p or 720p.
/models/xai-video/run
Grok Video Edit
xAI Grok Imagine Video edit — transform short clips via Replicate.
/models/xai-video-edit/run
Z Image Turbo
Super-fast 6B parameter text-to-image with great text rendering and LoRA support.
/models/z-image-turbo/run
Quick start:
https://pixeldojo.ai/api/v1
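
As a rough illustration of how the base URL and the per-model paths listed above might fit together, here is a minimal Python sketch. Several things in it are assumptions and not taken from this page: that each `/models/<slug>/run` path is appended to the base URL, that runs are started with a JSON `POST`, that API keys are sent as a Bearer token, and that the request body has a `prompt` field. The helper names `run_url` and `build_run_request` are hypothetical. The sketch only builds the request and does not send it; check the API reference for the actual schema before relying on any of this.

```python
import json
import urllib.request

BASE_URL = "https://pixeldojo.ai/api/v1"  # base URL from the quick start


def run_url(slug: str) -> str:
    """Assumed convention: append the catalog path /models/<slug>/run to the base URL."""
    return f"{BASE_URL}/models/{slug}/run"


def build_run_request(slug: str, api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a run request for one model.

    Assumptions: JSON POST body and Bearer-token auth. Neither is confirmed
    by this page; consult the API reference for the real scheme.
    """
    return urllib.request.Request(
        run_url(slug),
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical payload: the "prompt" field name is an assumption.
req = build_run_request("flux-2-dev", "YOUR_API_KEY", {"prompt": "a red bicycle"})
print(req.full_url)
```

Passing the request to `urllib.request.urlopen(req)` would actually submit it. Because most models are described as asynchronous, the response is likely a job handle to poll and not the finished asset, though the exact response shape is not stated here.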