# PixelDojo

> AI image and video generation platform with a REST API.

## API

PixelDojo provides an async job-based API for programmatic access to AI image and video generation models.

- Base URL: https://pixeldojo.ai/api/v1
- Auth: Bearer token (API key)
- Endpoint: POST /models/{modelId}/run → GET /jobs/{jobId}
- Models available: 137

### Available Models

- change-camera-angle: Camera-aware editing via fal.ai Qwen Image Edit 2511 with multi-angle LoRA. 360° orbit, tilt, and zoom. (1 credit(s))
- consistent-characters: Generate consistent character variations with FLUX Kontext, Nano Banana Pro/2, Flux 2 Dev, or Qwen Image 2 Pro. (1 credit(s))
- creative-upscale: Clarity Upscaler (creative upscale) via Replicate. Boost detail with stable-diffusion refinement. (0.5 credit(s))
- dreamina: ByteDance Dreamina 3.1. 4MP cinematic text-to-image with precise style control. (1 credit(s))
- ernie: Baidu Ernie text-to-image (fal.ai). Multilingual prompts and built-in prompt expansion. (1 credit(s))
- face-enhance: Crystal Upscaler via Replicate. Face-detail preserving upscale, cost scales with output megapixels. (2 credit(s))
- flux: FLUX family on Replicate. Schnell, Dev, Pro, Kontext, Ultra, and LoRA remix variants in one entrypoint. (1 credit(s))
- flux-2-flex: Max-quality with up to 10 reference images (1.5 credit(s))
- flux-2-klein-4b: Very fast generation and editing with up to 5 reference images (0.1 credit(s))
- flux-2-klein-9b: 4-step distilled FLUX.2 [klein] foundation model for flexible control (0.5 credit(s))
- flux-2-pro: High-quality with up to 8 reference images (1.5 credit(s))
- flux-2-max: The highest fidelity image model from Black Forest Labs (2 credit(s))
- flux-2-dev: Fast quality with up to 4 reference images (1 credit(s))
- flux-2-lora: Dev model with custom LoRA support (1 credit(s))
- flux-edit: Black Forest Labs FLUX.1 Kontext for text-driven image editing. Dev (open-weight), Pro (state-of-the-art), and Max (premium typography). (1 credit(s))
- flux-dev: High-quality development model with configurable steps, guidance, and LoRA support. (1 credit(s))
- flux-krea-dev: Photorealistic generation that avoids the oversaturated AI look. LoRA compatible. (1 credit(s))
- flux-dev-multi-lora: Supports multiple custom LoRAs simultaneously for complex style combinations. (1 credit(s))
- flux-1.1-pro: Latest pro model with enhanced quality and strong prompt adherence. (1 credit(s))
- flux-1.1-pro-ultra: Highest quality Flux model with raw mode for natural-looking images. (1.5 credit(s))
- flux-kontext-pro: Advanced model with state-of-the-art performance for both generation and editing. (1 credit(s))
- flux-kontext-max: Premium model with maximum performance and improved typography for generation and editing. (2 credit(s))
- gemini-flash: Fast generation with Gemini 2.5 Flash (1 credit(s))
- nano-banana-pro: SOTA with accurate typography and reasoning (3 credit(s))
- nano-banana-2: Next-generation SOTA model with stronger consistency (3 credit(s))
- google-nano-banana: Google Nano Banana image editing. Multi-image fusion + edit instruction with Standard/Pro/Pro-fal tiers and 1K/2K/4K resolution. (3 credit(s))
- gpt-image-low: Fast, lower detail generation (1 credit(s))
- gpt-image-medium: Balanced quality and speed (1 credit(s))
- gpt-image-high: Maximum detail and quality (4 credit(s))
- gpt-image-1-5-edit: OpenAI GPT-Image 1.5 image editing. Supply 1-8 reference images plus an edit instruction. Optional transparent background and high-fidelity input mode. (4 credit(s))
- gpt-image-2: OpenAI GPT Image 2 via fal.ai. Next-generation image model with 4K rendering and sharper text fidelity. (5 credit(s))
- gpt-image-2-edit: OpenAI GPT Image 2 image editing. Supply 1-8 reference images plus an edit instruction. Optional mask for inpainting. 4K-capable; pricing varies by quality + size. (5 credit(s))
- grok-r2v: xAI Grok Imagine reference-to-video via Replicate. 1 to 7 reference images plus prompt for 1 to 10 second clips at 480p or 720p. (10 credit(s))
- grok-video-extend: xAI Grok Imagine video extension. Continue an existing MP4 with a prompt-directed extension (2 to 10 seconds). (12 credit(s))
- hailuo-standard: Premium quality text-to-video and image-to-video (8 credit(s))
- hailuo-fast: Fast image-to-video generation (4 credit(s))
- happyhorse-1.0-r2v: Alibaba Happy Horse 1.0 reference-to-video. Multi-reference image input that preserves subject characters, driven by a text prompt. 720p / 1080p, 3-15 second clips. (4 credits/sec)
- happyhorse-1.0-t2v: Text-to-video with 720p/1080p output and 2-15 second durations (4 credits/sec)
- happyhorse-1.0-i2v: Image-to-video animation with 720p/1080p output and 2-15 second durations (4 credits/sec)
- happyhorse-1.0-video-edit: Alibaba Happy Horse 1.0 video edit. Apply style transfer or local replacement to a source video using text prompts and optional reference images. 720p / 1080p, 3-15 second output. (4 credits/sec)
- heygen-avatar: Heygen Avatar 4 via fal.ai. Animate a portrait with prompt-driven speech or an audio track, with optional background and captions. (2 credits/sec)
- hidream-edit: HiDream O1 image-conditioned editing. Provide a source image and an instruction. (1 credit(s))
- hunyuan-3d: Tencent Hunyuan 3D 3.1. Generate 3D meshes from a text prompt or a single image. (4 credit(s))
- ideogram-character: Generate consistent characters from a single reference image in many styles. (5 credit(s))
- image-editor: One-shot FLUX Kontext variants: filters, cartoonify, iconic locations, haircut swap, headshots, renaissance, face-to-many, and more. (1 credit(s))
- image-relighting: Relight images with Magic Lighting, Nano Banana Pro/2, or Qwen Image Edit. Multi-provider routing with per-model credit rates. (1 credit(s))
- image-to-image-flux: FLUX Dev LoRA image-to-image on Replicate. Prompt + source image + optional LoRA weights. (1 credit(s))
- imagineart: ImagineArt family: 1.0 (Mixture-of-Experts photorealism), 1.5, 1.5 Pro, and the 2.0 preview. (1.5 credit(s))
- kling-image: Kling Image V3 (fal.ai). High-quality text-to-image with flexible aspect ratios. (1 credit(s))
- kling-image-edit: Kling Image V3 (fal.ai) image-to-image editing with a text instruction. (1 credit(s))
- kling-motion-control: Kling Video v3 Standard motion control endpoint (3 credits/sec)
- kling-motion-control-pro: Kling Video v3 Pro motion control endpoint (4 credits/sec)
- kling-reference-to-video: Kling O3 reference-driven video generation. Image or video references, Standard or Pro tier. (15 credit(s))
- kling-v2-6: Kling Video v2.6 Pro (fal.ai). Text-to-video or image-to-video, 5 or 10 seconds, with audio generation. (15 credit(s))
- kling-video-v3-standard-text: Standard text-to-video with native audio (6 credits/sec)
- kling-video-v3-standard-image: Standard image-to-video with native audio (6 credits/sec)
- kling-video-v3-pro-text: Pro text-to-video with cinematic quality and native audio (8 credits/sec)
- kling-video-v3-pro-image: Pro image-to-video with cinematic quality and native audio (8 credits/sec)
- kling-video-edit: Kling O3 video-to-video edit. Standard or Pro, with optional reference images and audio preservation. (40 credit(s))
- lip-sync: Replicate sync/lipsync-2. Align mouth movements in a video to a separate audio track. (5 credit(s))
- ltx-2-fast-t2v: Fast text-to-video generation (6-20s, 1080p-2160p). (2 credits/sec)
- ltx-2-fast-i2v: Fast image-to-video generation (6-20s, 1080p-2160p). (2 credits/sec)
- ltx-2-pro-t2v: Higher quality text-to-video generation (6-10s, 1080p-2160p). (2 credits/sec)
- ltx-2-pro-i2v: Higher quality image-to-video generation (6-10s, 1080p-2160p). (2 credits/sec)
- ltx-2-pro-extend: Extend an existing video clip from the start or end (1-20s, Pro tier only). (2 credits/sec)
- magnific-upscaler: Freepik Magnific upscaler. Creative or precision mode, up to 16x. (3 credit(s))
- omnihuman: ByteDance OmniHuman 1.5 via Replicate. Audio-driven talking-head video with lip sync. (45 credit(s))
- openai-image-1: OpenAI GPT Image 1 Mini. Text-to-image via Replicate. (1 credit(s))
- openai-image-1-edit: OpenAI GPT Image 1 Mini image editing. Combine 1-8 reference images with a text edit instruction. Supports transparent or opaque backgrounds. (1 credit(s))
- outpaint: fal.ai Image Apps V2 outpainting. Expand an image beyond its original edges. (1 credit(s))
- p-image: Pruna P-Image. Sub-second text-to-image with optional custom dimensions. (0.1 credit(s))
- p-image-edit: Pruna P-Image Edit. Fast image editing with up to 5 reference images. (0.25 credit(s))
- p-video: Pruna P-Video. Video generation with text/image/audio conditioning, draft mode, and 720p/1080p outputs. (0.5 credits/sec)
- p-video-avatar: Pruna P Video Avatar. Animate a portrait into a talking avatar from a script or an audio file. 30 voices, 10 languages, 720p / 1080p. (1 credit/sec)
- pixverse: Pixverse v5.6 video generation via Replicate. Text-to-video or image-to-video with optional audio, at 360p–1080p. (7.5 credit(s))
- pixverse-v6: Pixverse V6 video generation via Runware. Text-to-video, image-to-video (start frame), or multi-clip (start + end frame). (10 credit(s))
- ponyxl-ponyrealism-v23: Pony Realism - Stylized anime generation (1 credit(s))
- ponyxl-tponynai3-v7: Pony NAI - Stylized anime generation (1 credit(s))
- ponyxl-waianinsfwponyxl-v140: Wai ANI - Stylized anime generation (1 credit(s))
- qwen-image-plus: Fast generation with excellent quality (1 credit(s))
- qwen-image-max: Highest quality output (2 credit(s))
- qwen-image-2.0: Fast, balanced image generation and editing (1 credit(s))
- qwen-image-2.0-pro: Enhanced text rendering, realistic textures, and semantic adherence (2 credit(s))
- qwen-image-2-edit: Alibaba DashScope Qwen Image 2 edit. Supply 1-3 reference images plus an edit instruction. Standard and Pro variants. (1 credit(s))
- qwen-image-edit: Alibaba DashScope Qwen Image edit. Supply 1-3 reference images plus an edit instruction. Plus and Max model variants. (1 credit(s))
- qwen-image-edit-spicy: Qwen Image Edit Spicy. Add, remove, or modify elements in an existing image with text guidance. (1 credit(s))
- recraft-v4: Recraft's latest image model. Strong prompt accuracy, art-directed composition, integrated text rendering. Fast and cost-efficient at standard resolution. (1 credit(s))
- recraft-v4-pro: Recraft V4 at ~2048px resolution. Same design taste and prompt accuracy as V4, with higher resolution for print-ready and large-scale work. (6 credit(s))
- recraft-v4-svg: Production-ready SVG vector images from text. Recraft V4's design taste applied to vector output: clean geometry, structured layers, editable paths. (2 credit(s))
- recraft-v4-pro-svg: Detailed SVG vector graphics from text. Recraft V4 Pro's design taste with more geometric detail and finer paths: clean layers, editable output, scalable to any size. (8 credit(s))
- redux-flux: Black Forest Labs Flux Redux image variations. Feed a source image, get stylistic riffs. (1 credit(s))
- runway-gen4-video: Runway Gen-4.5 video generation. Text-to-video or image-to-video, 5 or 10 seconds. (15 credit(s))
- runway-video: Canonical version-agnostic Runway video API ID. (15 credit(s))
- runway-gen4: Legacy alias for clients pinned to runway-gen4; maps to the current Runway model. (15 credit(s))
- seedance-1.5: ByteDance Seedance 1 video generation. Text-to-video or image-to-video with optional end frame. (8 credit(s))
- seedance-2-high: Higher-quality Seedance 2.0 video generation (supports 1080p) (4 credits/sec)
- seedance-2-reference: Seedance 2.0 multimodal reference-to-video. Combine up to 9 images, 3 video clips, and 3 audio tracks to guide characters, motion, and sound. (20 credit(s))
- seedance-video-edit: Edit source videos with Seedance 2.0 using prompted changes, optional reference images, and 480p, 720p, or 1080p output. (25 credit(s))
- seedream-3: ByteDance Seedream 3 text-to-image via Replicate. (1 credit(s))
- seedream-4: ByteDance Seedream 4.5. New-generation image creation with superior aesthetics, text rendering, and up to 4K resolution. (1 credit(s))
- seedream-5-lite: ByteDance Seedream 5.0 Lite. Fast, high-quality image generation and editing with strong aesthetics and text rendering. (1 credit(s))
- text-to-music: ElevenLabs Music via Replicate. Generate music from a text prompt. (2 credit(s))
- text-to-speech: MiniMax Speech 2.8 Turbo via Replicate. Convert text into natural-sounding speech. (0.1 credit(s))
- veo-3.1-fast: Faster generation at 3 credits per second (3 credits/sec)
- veo-3.1-standard: Higher quality at 8 credits per second (8 credits/sec)
- veo-3.1-lite: Runware-powered Lite variant at 1.5 credits/sec for 720p and 2 credits/sec for 1080p. No reference images, no audio generation, no 1:1 aspect ratio. (1.5 credits/sec)
- video-autocaption: TikTok-style auto-captioning via Replicate. (5 credit(s))
- video-reframe: Luma Reframe Video via Replicate. Change a video's aspect ratio intelligently. (8 credit(s))
- video-to-sound: ThinkSound via Replicate. Generate a sound effect track from a video. (2 credit(s))
- video-transform: Runway Gen4 Aleph via Replicate. Transform the first 5 seconds of a video with a prompt. (20 credit(s))
- video-upscaler: Topaz Labs Video Upscale via Replicate. Upscale video resolution and FPS. (10 credit(s))
- wan-2.2-standard: Premium quality with enhanced detail (3 credit(s))
- wan-2.2-plus: Official Alibaba model with 1080p support (10 credit(s))
- wan-2.2-extended: fal.ai WAN 2.2 with up to 10-second videos and dual LoRA support (1.2 credits/sec)
- wan-2.2-animate: WAN 2.2 video animation. Drive a character image with a motion reference video. (2 credit(s))
- wan-2.2-i2v-spicy: Image-to-video with WAN 2.2 Spicy. Animate a starting image. 480p or 720p, 5s or 8s clips. (15 credit(s))
- wan-2.2-replace: WAN 2.2 character replacement. Swap a character in a source video while preserving scene and motion. (2 credit(s))
- wan-2.6-standard: Higher quality, 720p/1080p support (2.5 credits/sec)
- wan-2.6-flash: Fast and affordable image-to-video (1 credit/sec)
- wan-2.6-image: Alibaba WAN 2.6 text-to-image with prompt enhancement and multi-image output. (1 credit(s))
- wan-2.6-image-edit: Alibaba WAN 2.6 image editing. Up to 4 reference images. (1 credit(s))
- wan-2.7-i2v-spicy: Image-to-video with WAN 2.7 Spicy. Animate a starting image with optional driving audio. 720p or 1080p, 2–15 second clips. (20 credit(s))
- wan-2.7-image: Faster Wan 2.7 image generation and editing (1 credit(s))
- wan-2.7-image-pro: Higher quality Wan 2.7 tier with 4K support for text-to-image (2 credit(s))
- wan-2.7-image-edit: Alibaba WAN 2.7 image editing. Standard and Pro tiers, supports 1-4 input images for fusion edits. (1 credit(s))
- wan-2.7-t2v: Text-to-video with audio sync, 720p/1080p output, and 2-15 second durations (2.5 credits/sec)
- wan-2.7-i2v: Image-to-video and video continuation with optional last-frame control and audio sync (2.5 credits/sec)
- wan-image: Fast cinematic image generation (3-6 seconds) with up to 4MP output and optional LoRA support. (1 credit(s))
- wan-reference-to-video: Alibaba WAN reference-to-video. Up to 5 image/video references with multi-shot support. (4 credit(s))
- wan-video-character-swap: Alibaba WAN character swap. Combine a character image with a reference video to produce a new clip. (20 credit(s))
- wan-video-edit: Alibaba WAN 2.7 video editing. Modify an existing clip via prompt with optional reference images. (6 credit(s))
- xai-image: xAI Grok Imagine. Fast tier for quick iteration, Quality tier for higher fidelity at 1k or 2k. (1 credit(s))
- xai-image-edit: xAI Grok image editing. Sync response (no polling). Provide an image URL and a text edit instruction. Optional quality tier for 1k/2k high-fidelity edits. (1 credit(s))
- xai-video: xAI Grok Imagine video. Text-to-video or image-to-video, 1-15 seconds at 480p or 720p. (10 credit(s))
- xai-video-edit: xAI Grok Imagine Video edit. Transform short clips via Replicate. (15 credit(s))
- z-image-spicy: Z Image Spicy text-to-image. Square / portrait / landscape compositions, 256–1536px on each side. (1 credit(s))
- z-image-turbo: Super-fast 6B parameter text-to-image with great text rendering and LoRA support. (0.5 credit(s))

## Documentation

- [Full API Docs (LLM-optimized)](https://pixeldojo.ai/llm.txt): Complete API reference in plain text
- [OpenAPI Spec](https://pixeldojo.ai/api/openapi): Machine-readable OpenAPI 3.1 specification
- [API Docs (HTML)](https://pixeldojo.ai/api-docs): Static HTML API reference
- [API Platform](https://pixeldojo.ai/api-platform): Interactive dashboard for keys, usage, and docs
- [AI Plugin Manifest](https://pixeldojo.ai/.well-known/ai-plugin.json): Agent plugin manifest

## Quick Start

1. Get an API key: https://pixeldojo.ai/api-platform/api-keys
2. Submit a job: POST https://pixeldojo.ai/api/v1/models/flux-1.1-pro/run
3. Poll for results: GET https://pixeldojo.ai/api/v1/jobs/{jobId}

## Links

- Website: https://pixeldojo.ai
- API Platform: https://pixeldojo.ai/api-platform
- Documentation: https://pixeldojo.ai/api-platform/documentation
- ComfyUI Plugin: https://github.com/blovett80/ComfyUI-PixelDojo
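
The Quick Start flow (submit a job with POST /models/{modelId}/run, then poll GET /jobs/{jobId}) might be sketched in Python roughly as follows. Only the base URL, the two endpoint paths, and Bearer auth come from this document. The helper names, the JSON body, and the `status` field with its `completed` / `failed` values are assumptions made for illustration; the OpenAPI spec linked under Documentation is the place to confirm the real schema.

```python
import json
import time
import urllib.request

BASE_URL = "https://pixeldojo.ai/api/v1"  # base URL from the API section


def build_run_request(model_id, payload, api_key):
    """Build the POST /models/{modelId}/run request with Bearer auth."""
    return urllib.request.Request(
        f"{BASE_URL}/models/{model_id}/run",
        data=json.dumps(payload).encode("utf-8"),  # payload shape is model-specific
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def build_job_request(job_id, api_key):
    """Build the GET /jobs/{jobId} request used for polling."""
    return urllib.request.Request(
        f"{BASE_URL}/jobs/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def send(request):
    """Send a request and decode the JSON response (performs network I/O)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def poll_job(fetch_job, job_id, interval=2.0, max_attempts=60, sleep=time.sleep):
    """Call fetch_job(job_id) until the job reports a terminal status.

    ASSUMPTION: the job JSON carries a "status" field whose terminal values
    are "completed" and "failed". These names are hypothetical.
    """
    for _ in range(max_attempts):
        job = fetch_job(job_id)
        if job.get("status") in ("completed", "failed"):
            return job
        sleep(interval)
    raise TimeoutError(f"job {job_id} not finished after {max_attempts} polls")
```

Against the live API, usage would look something like `job = send(build_run_request("flux-1.1-pro", {"prompt": "a red fox"}, api_key))` followed by `poll_job(lambda jid: send(build_job_request(jid, api_key)), job["jobId"])`, where the `prompt` and `jobId` field names are again assumed. Passing the fetch function into `poll_job` keeps the polling loop free of network code, so it can be exercised with a stub.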