Cowork
Drag-and-drop plugin. Paste your key into one JSON file, then drop.
1. Download pixeldojo.plugin, open the .plugin archive, edit .mcp.json, and paste your PIXELDOJO_API_KEY.
2. Drag the file into Cowork.
3. All five tools appear. Done.
MCP install for Claude Code, Cursor, OpenClaw — or REST. 137+ image and video models, one auth, async by default.
```shell
# 1. Install in your Claude Code / Cursor / OpenClaw project
npx @pixeldojo/mcp init

# 2. Set your API key
export PIXELDOJO_API_KEY=pd_your_api_key

# 3. Restart your agent — it now has these tools:
#   pixeldojo:generate     Any prompt -> image or video
#   pixeldojo:character    Consistent characters across shots
#   pixeldojo:storyboard   Multi-shot scenes from one brief
#   pixeldojo:upscale      Enhance any image
#   pixeldojo:status       Poll long-running jobs

# Get your key: https://pixeldojo.ai/api-platform/api-keys
```

Three paths to the same five tools. Pick whichever your editor or agent uses.
Claude Desktop: add the server to ~/Library/Application Support/Claude/claude_desktop_config.json:

```json
{
  "mcpServers": {
    "pixeldojo": {
      "command": "npx",
      "args": ["-y", "@pixeldojo/mcp"],
      "env": { "PIXELDOJO_API_KEY": "pd_..." }
    }
  }
}
```

Cursor and other clients: same config shape. Drop it into ~/.cursor/mcp.json or your client's equivalent:

```json
{
  "mcpServers": {
    "pixeldojo": {
      "command": "npx",
      "args": ["-y", "@pixeldojo/mcp"],
      "env": { "PIXELDOJO_API_KEY": "pd_..." }
    }
  }
}
```

Any client that speaks stdio-transport JSON-RPC works: spawn npx -y @pixeldojo/mcp and pass your key in PIXELDOJO_API_KEY.
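For clients not listed above, the stdio handshake can be sketched in Python. MCP's stdio transport exchanges newline-delimited JSON-RPC messages; the protocol version and capability fields below are illustrative placeholders, not the authoritative values, so check the MCP specification before shipping a client.

```python
import json

def initialize_request(request_id=1):
    """Build the JSON-RPC `initialize` message an MCP client sends first.

    Sent as a single newline-terminated line over the spawned server's
    stdin. The params here are a sketch; consult the MCP spec for the
    protocol version and capabilities your client should advertise.
    """
    msg = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",  # placeholder version
            "capabilities": {},
            "clientInfo": {"name": "my-agent", "version": "0.1"},
        },
    }
    return json.dumps(msg) + "\n"

# To talk to the server, spawn it and write the line to its stdin, e.g.:
# proc = subprocess.Popen(
#     ["npx", "-y", "@pixeldojo/mcp"],
#     stdin=subprocess.PIPE, stdout=subprocess.PIPE,
#     env={**os.environ, "PIXELDOJO_API_KEY": "pd_..."})
# proc.stdin.write(initialize_request().encode()); proc.stdin.flush()
```

The actual spawn is left commented out so the sketch runs without the server installed; the framing function is the part most clients get wrong.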
Named Skills
One install, four skills. Your LLM picks the right one per task.
pixeldojo:generate
Your agent describes what it needs in plain English. PixelDojo routes to the right model — photorealism, text rendering, video — and hands back a URL.
```
>_ Generate a cinematic portrait, Tokyo rain, neon reflections

PixelDojo
Routing to flux-1.1-pro...
Job queued: job_k9mXpQ2r
✓ output: https://pixeldojo.ai/r/…/portrait.png
✓ 1024×1024 PNG · 1 credit
```
pixeldojo:character
Pass a reference image once. Your agent reuses the character across any number of scenes — different backgrounds, poses, lighting — while preserving their face and features.

```
>_ Alex presenting a new phone, marble desk, soft studio light

PixelDojo
Loading ref: alex_character.png...
Routing to flux-kontext...
Job queued: job_3vNaL8wK
✓ output: https://pixeldojo.ai/r/…/alex-desk.png
✓ Consistency preserved · 2 credits
```
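In API terms, reusing a character is one generation request per scene with the same reference attached each time. The field names below (`reference_image_url`, `prompt`) are assumptions for illustration; the real request shape comes from the model's schema endpoint in the API reference further down.

```python
def character_scene_request(reference_url, prompt):
    """Build one generation request that reuses a fixed character reference.

    Payload keys are hypothetical; fetch the authoritative schema from
    GET /api/v1/models/{apiId}/schema before calling the run endpoint.
    """
    return {
        "reference_image_url": reference_url,  # identical for every scene
        "prompt": prompt,                      # varies per scene
    }

scenes = [
    "Alex presenting a new phone, marble desk, soft studio light",
    "Alex hiking at golden hour, backpack, mountain ridge",
]
payloads = [
    character_scene_request("https://example.com/alex.png", p) for p in scenes
]
```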
pixeldojo:storyboard
Your agent writes the brief. PixelDojo breaks it into shots, decides which are images and which are video, generates them in parallel, and returns an ordered array of output URLs.
```
>_ 60s product reveal: teaser, unboxing, close-up, lifestyle

PixelDojo
Planning 4 shots...
Shot 1/4 ✓ teaser (image)
Shot 2/4 ✓ unboxing (video)
Shot 3/4 ✓ close-up (image)
Shot 4/4 ✓ lifestyle (video)
✓ outputs: [4 URLs] · 6 credits
```
pixeldojo:upscale
Pass any image URL. Your agent gets back a high-res version — no upload step, no format conversion. Conservative mode preserves the original; creative mode can enhance textures and fine detail.

```
>_ Upscale this product photo to 4K, enhance detail

PixelDojo
Analyzing: 1024×1024 → 4096×4096
Routing to magnific-upscaler...
Job queued: job_8tHjR5mN
✓ output: https://pixeldojo.ai/r/…/upscaled.png
✓ 4096×4096 PNG · 2 credits
```
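The conservative/creative switch might surface in the request body roughly like this. The `mode` key is a guess at how the two modes are exposed, not a documented parameter; confirm the real field against the upscale model's schema endpoint.

```python
def upscale_request(image_url, creative=False):
    """Request body sketch for an upscale run.

    `image_url` matches the no-upload flow described above; the `mode`
    key is hypothetical and must be checked against the model schema.
    """
    return {
        "image_url": image_url,
        "mode": "creative" if creative else "conservative",
    }
```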
Preset Library
Curated starting points across 60 workflows. Refresh for a different mix.
Noir detective
Product spin
Social quote card
Architectural twilight
Studio fit
Fitness action
Reveal shot
Studio headshot
Real estate interior
Pet action
Flat icon
Creative 4x
Event poster
Cyberpunk street
Product explosion
Product rotation
Concert stage
Lifestyle cafe
Editorial coffee shop
Badge design
Casual tee
Food overhead
Character 3-up
Lifestyle vlog
API Design
Every detail is designed for machines that call APIs, not humans clicking buttons.
- Image generation, video generation, upscaling, and editing. One auth.
- Submit a job, get a job ID. Poll the status URL or register a webhook.
- Every model has a JSON schema endpoint. Your agent knows the request shape before calling.
- Pure REST API. No UI scraping, no headless browser, no screenshots.
- Pay per generation with credits. No subscription required to use the API.
- llm.txt, OpenAPI 3.1 spec, and AI plugin manifest for zero-config agent integration.
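The per-model schema endpoint lets an agent validate a request before submitting it. A minimal sketch, assuming the endpoint returns standard JSON Schema with `required` and `properties` keys (the stub schema below is invented for illustration):

```python
def missing_required_fields(schema, payload):
    """Return the required properties (per JSON Schema) absent from payload."""
    return [k for k in schema.get("required", []) if k not in payload]

# Stub of what GET /api/v1/models/{apiId}/schema might return;
# the real response shape may differ.
example_schema = {
    "type": "object",
    "required": ["prompt"],
    "properties": {
        "prompt": {"type": "string"},
        "seed": {"type": "integer"},
    },
}
```

An agent can call this check before the run endpoint and repair its payload instead of burning a request on a 400.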
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/v1/models | List all available models |
| GET | /api/v1/models/{apiId}/schema | Get the JSON schema for a model |
| POST | /api/v1/models/{apiId}/run | Submit a generation job |
| GET | /api/v1/jobs/{jobId} | Check job status and get output URLs |
| POST | /api/v1/jobs/{jobId}/webhook | Register a webhook for completion |
Full reference: API Documentation · OpenAPI Spec · llm.txt
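Using the endpoints in the table above, a submit-then-poll loop looks roughly like this. The HTTP layer is injected so the sketch runs without credentials, and the response field names (`jobId`, `status`, `outputs`) are assumptions; check the API documentation for the actual keys.

```python
import time

API = "https://pixeldojo.ai/api/v1"

def run_and_wait(http, api_id, payload, poll_interval=2.0, timeout=120.0):
    """Submit a job via POST /models/{apiId}/run, then poll GET /jobs/{jobId}.

    `http(method, url, body=None)` must return a parsed JSON dict.
    Field names in the responses are assumed, not documented here.
    """
    job = http("POST", f"{API}/models/{api_id}/run", payload)
    job_id = job["jobId"]  # assumed field name
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = http("GET", f"{API}/jobs/{job_id}")
        if status["status"] == "succeeded":
            return status["outputs"]  # assumed: list of output URLs
        if status["status"] == "failed":
            raise RuntimeError(status.get("error", "job failed"))
        time.sleep(poll_interval)
    raise TimeoutError(f"job {job_id} did not finish in {timeout}s")
```

For long videos, registering a webhook via POST /jobs/{jobId}/webhook avoids the polling loop entirely.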
Same endpoint pattern for every model. Your agent picks the model, we handle the rest.
- `/models/change-camera-angle/run` · Camera-aware editing via fal.ai Qwen Image Edit 2511 with multi-angle LoRA — 360° orbit, tilt, and zoom.
- `/models/consistent-characters/run` · Generate consistent character variations with FLUX Kontext, Nano Banana Pro/2, Flux 2 Dev, or Qwen Image 2 Pro.
- `/models/creative-upscale/run` · Clarity Upscaler (creative upscale) via Replicate — boost detail with stable-diffusion refinement.
- `/models/dreamina/run` · ByteDance Dreamina 3.1 — 4MP cinematic text-to-image with precise style control.
- `/models/ernie/run` · Baidu Ernie text-to-image (fal.ai). Multilingual prompts and built-in prompt expansion.
- `/models/flux/run` · FLUX family on Replicate — Schnell, Dev, Pro, Kontext, Ultra, and LoRA remix variants in one entrypoint.
- `/models/flux-2-flex/run` · Max-quality generation with up to 10 reference images.
- `/models/flux-2-klein-4b/run` · Very fast generation and editing with up to 5 reference images.
- `/models/grok-r2v/run` · xAI Grok Imagine reference-to-video via Replicate — 1–7 reference images plus prompt for 1–10s clips at 480p or 720p.
- `/models/grok-video-extend/run` · xAI Grok Imagine video extension — continue an existing MP4 with a prompt-directed extension (2–10s).
- `/models/hailuo-standard/run` · Premium-quality text-to-video and image-to-video.
- `/models/hailuo-fast/run` · Fast image-to-video generation.
- `/models/happyhorse-1.0-t2v/run` · Text-to-video with 720p/1080p output and 2–15 second durations.
- `/models/heygen-avatar/run` · Heygen Avatar 4 via fal.ai — animate a portrait with prompt-driven speech or an audio track, with optional background and captions.

Works with