
Image & Video tools
your AI agents can
actually call.

MCP install for Claude Code, Cursor, OpenClaw — or plain REST. 137+ image and video models, one auth, async by default.

# 1. Install in your Claude Code / Cursor / OpenClaw project
npx @pixeldojo/mcp init

# 2. Set your API key
export PIXELDOJO_API_KEY=pd_your_api_key

# 3. Restart your agent — it now has these tools:
#    pixeldojo:generate    Any prompt -> image or video
#    pixeldojo:character   Consistent characters across shots
#    pixeldojo:storyboard  Multi-shot scenes from one brief
#    pixeldojo:upscale     Enhance any image
#    pixeldojo:status      Poll long-running jobs

# Get your key: https://pixeldojo.ai/api-platform/api-keys

Install

Three paths to the same five tools. Pick whichever your editor or agent uses.

Cowork

Drag-and-drop plugin. Paste your key into one JSON file, then drop.

Download pixeldojo.plugin
  1. Open the .plugin archive, edit .mcp.json, and paste your PIXELDOJO_API_KEY.
  2. Drag the file into Cowork.
  3. All five tools appear. Done.

Claude Desktop

~/Library/Application Support/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "pixeldojo": {
      "command": "npx",
      "args": ["-y", "@pixeldojo/mcp"],
      "env": { "PIXELDOJO_API_KEY": "pd_..." }
    }
  }
}

Cursor / OpenClaw / Codex

Same config shape. Drop into ~/.cursor/mcp.json or your client's equivalent.

{
  "mcpServers": {
    "pixeldojo": {
      "command": "npx",
      "args": ["-y", "@pixeldojo/mcp"],
      "env": { "PIXELDOJO_API_KEY": "pd_..." }
    }
  }
}

Any client that speaks stdio-transport JSON-RPC works. Spawn npx -y @pixeldojo/mcp, pass your key in PIXELDOJO_API_KEY.
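For a custom client, the handshake is plain JSON-RPC 2.0 written to the spawned process's stdin, one message per line. A minimal sketch of the message construction — the clientInfo values are placeholders and the protocolVersion shown is illustrative; framing and method names follow the MCP specification, not anything PixelDojo-specific:

```python
import json

def jsonrpc(method, params=None, msg_id=1):
    """Build one newline-delimited JSON-RPC 2.0 message for an MCP server."""
    msg = {"jsonrpc": "2.0", "id": msg_id, "method": method}
    if params is not None:
        msg["params"] = params
    return json.dumps(msg)

# First message of the MCP handshake (values are placeholders).
init = jsonrpc("initialize", {
    "protocolVersion": "2024-11-05",
    "capabilities": {},
    "clientInfo": {"name": "my-agent", "version": "0.1.0"},
})

# After the server replies, ask for its tool catalog.
tools = jsonrpc("tools/list", msg_id=2)
```

Write each message plus a trailing newline to the spawned `npx -y @pixeldojo/mcp` process and read replies from its stdout; the tools/list response enumerates the five pixeldojo tools.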

Named Skills

One install, four skills. Your LLM picks the right one per task.

pixeldojo:generate

Any prompt.
Best model, automatically.

Your agent describes what it needs in plain English. PixelDojo routes to the right model — photorealism, text rendering, video — and hands back a URL.

  • 137+ models, one skill to call
  • Images, video, editing — same call shape
  • Credits deducted only on success
Terminal

>_ Generate a cinematic portrait, Tokyo rain, neon reflections

PixelDojo

Routing to flux-1.1-pro...

Job queued: job_k9mXpQ2r

output: https://pixeldojo.ai/r/…/portrait.png

1024×1024 PNG · 1 credit

>_ _

Terminal

>_ Alex presenting a new phone, marble desk, soft studio light

PixelDojo

Loading ref: alex_character.png...

Routing to flux-kontext...

Job queued: job_3vNaL8wK

output: https://pixeldojo.ai/r/…/alex-desk.png

Consistency preserved · 2 credits

>_ _

pixeldojo:character

Same character.
Any scene.

Pass a reference image once. Your agent reuses the character across any number of scenes — different backgrounds, poses, lighting — while preserving their face and features.

  • Prompt evolves the scene; the character stays locked
  • Works with Flux Kontext, Ideogram Character, and more
  • No LoRA training required — just a reference image URL
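Over the REST path, the same flow is a single run call carrying the reference image URL. A sketch of the request body — the field names ("prompt", "reference_image_url") are assumptions for illustration; fetch GET /api/v1/models/consistent-characters/schema for the model's actual request shape:

```python
import json

# Hypothetical request body for a character-consistent generation.
# Field names are illustrative — confirm them against the model's
# /schema endpoint before POSTing to /models/consistent-characters/run.
payload = {
    "prompt": "Alex presenting a new phone, marble desk, soft studio light",
    "reference_image_url": "https://example.com/alex_character.png",
}
body = json.dumps(payload).encode("utf-8")  # JSON body for the run endpoint
```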

pixeldojo:storyboard

One brief.
N shots, planned and generated.

Your agent writes the brief. PixelDojo breaks it into shots, decides which are images and which are video, generates them in parallel, and returns an ordered array of output URLs.

  • Mix image and video shots in the same storyboard
  • Shot planning included — no need to prompt each frame individually
  • Returns an ordered array your agent can pass to an editor or exporter
Terminal

>_ 60s product reveal: teaser, unboxing, close-up, lifestyle

PixelDojo

Planning 4 shots...

Shot 1/4 teaser (image)

Shot 2/4 unboxing (video)

Shot 3/4 close-up (image)

Shot 4/4 lifestyle (video)

outputs: [4 URLs] · 6 credits

>_ _

Terminal

>_ Upscale this product photo to 4K, enhance detail

PixelDojo

Analyzing: 1024×1024 → 4096×4096

Routing to magnific-upscaler...

Job queued: job_8tHjR5mN

output: https://pixeldojo.ai/r/…/upscaled.png

4096×4096 PNG · 2 credits

>_ _

pixeldojo:upscale

Any image.
Up to 16× sharper.

Pass any image URL. Your agent gets back a high-res version — no upload step, no format conversion. Conservative mode preserves the original; creative mode can enhance textures and fine detail.

  • 2× to 16× magnification depending on model
  • Works on any image URL — no upload required
  • Conservative and creative upscale tiers

Canvas

Chain models in Canvas.

Generate, edit, upscale, animate — all in one freeform session. Hand the chain off to your LLM, or drive it yourself in the browser.

PixelDojo Canvas — chain models in one freeform session

API Design

Built for automation

Every detail is designed for machines that call APIs, not humans clicking buttons.

137+ AI Models

Image generation, video generation, upscaling, and editing. One auth.

Async by Default

Submit a job, get a job ID. Poll the status URL or register a webhook.

Per-Model Schemas

Every model has a JSON schema endpoint. Your agent knows the request shape before calling.

No Browser Needed

Pure REST API. No UI scraping, no headless browser, no screenshots.

Credit-Based Pricing

Pay per generation with credits. No subscription required to use the API.

LLM Discovery

llm.txt, OpenAPI 3.1 spec, and AI plugin manifest for zero-config agent integration.

Endpoint reference

Method  Endpoint                          Description
GET     /api/v1/models                    List all available models
GET     /api/v1/models/{apiId}/schema     Get the JSON schema for a model
POST    /api/v1/models/{apiId}/run        Submit a generation job
GET     /api/v1/jobs/{jobId}              Check job status and get output URLs
POST    /api/v1/jobs/{jobId}/webhook      Register a webhook for completion

Full reference: API Documentation · OpenAPI Spec · llm.txt
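Putting the endpoints above together, the typical agent flow is submit-then-poll. A minimal Python sketch using only the standard library — the Bearer auth header and the response field names ("id", "status", "output") are assumptions here; check the API documentation for the exact shapes:

```python
import json
import time
import urllib.request

API = "https://pixeldojo.ai/api/v1"

def submit(api_id, payload, key, opener=urllib.request.urlopen):
    # POST /api/v1/models/{apiId}/run — queues the job and returns immediately.
    req = urllib.request.Request(
        f"{API}/models/{api_id}/run",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {key}",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with opener(req) as resp:
        return json.load(resp)

def wait_for(job_id, key, opener=urllib.request.urlopen, interval=2.0):
    # GET /api/v1/jobs/{jobId} — poll until the job settles.
    while True:
        req = urllib.request.Request(
            f"{API}/jobs/{job_id}",
            headers={"Authorization": f"Bearer {key}"},
        )
        with opener(req) as resp:
            job = json.load(resp)
        if job.get("status") in ("completed", "failed"):
            return job
        time.sleep(interval)
```

When your agent can receive callbacks, registering a webhook (POST /api/v1/jobs/{jobId}/webhook) replaces the polling loop entirely.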

137+ models, one API

Same endpoint pattern for every model. Your agent picks the model, we handle the rest.

Change Camera Angle
1 credit · Image

Camera-aware editing via fal.ai Qwen Image Edit 2511 with multi-angle LoRA — 360° orbit, tilt, and zoom.

/models/change-camera-angle/run

Consistent Characters
1 credit · Image

Generate consistent character variations with FLUX Kontext, Nano Banana Pro/2, Flux 2 Dev, or Qwen Image 2 Pro.

/models/consistent-characters/run

Creative Upscale
0.5 credits · Image · LoRA

Clarity Upscaler (creative upscale) via Replicate — boost detail with stable-diffusion refinement.

/models/creative-upscale/run

Dreamina 3.1
1 credit · Image

ByteDance Dreamina 3.1 — 4MP cinematic text-to-image with precise style control.

/models/dreamina/run

Ernie
1 credit · Image

Baidu Ernie text-to-image (fal.ai). Multilingual prompts and built-in prompt expansion.

/models/ernie/run

FLUX
1 credit · Image · LoRA

FLUX family on Replicate — Schnell, Dev, Pro, Kontext, Ultra, and LoRA remix variants in one entrypoint.

/models/flux/run

Flux 2 Flex
1.5 credits · Image · LoRA · Editing

Max-quality generation with up to 10 reference images.

/models/flux-2-flex/run

Flux 2 Klein 4B
0.1 credits · Image · LoRA · Editing

Very fast generation and editing with up to 5 reference images.

/models/flux-2-klein-4b/run

Grok Imagine R2V
10 credits · Video

xAI Grok Imagine reference-to-video via Replicate — 1–7 reference images plus prompt for 1–10s clips at 480p or 720p.

/models/grok-r2v/run

Grok Video Extend
12 credits · Video

xAI Grok Imagine video extension — continue an existing MP4 with a prompt-directed extension (2–10s).

/models/grok-video-extend/run

Hailuo Standard
8 credits · Video

Premium-quality text-to-video and image-to-video.

/models/hailuo-standard/run

Hailuo Fast
4 credits · Video

Fast image-to-video generation.

/models/hailuo-fast/run

Happy Horse 1.0 Text-to-Video
4 credits/sec · Video

Text-to-video with 720p/1080p output and 2–15 second durations.

/models/happyhorse-1.0-t2v/run

Heygen Avatar
2 credits/sec · Video · Audio

Heygen Avatar 4 via fal.ai — animate a portrait with prompt-driven speech or an audio track, with optional background and captions.

/models/heygen-avatar/run

Works with

Any agent that can make an HTTP request

Claude Code · Claude Desktop · Cursor · OpenClaw · Codex · Cowork · LangChain · AutoGPT · n8n · Zapier · Custom MCP servers · Any HTTP client

Start generating

Get an API key, install MCP (or grab the SDK), and your first generation runs in under a minute.

npx @pixeldojo/mcp init