# Image and Video tools your AI agents can actually call

PixelDojo lets any AI agent generate images and videos through a simple REST API or the official MCP server. 137+ models, one API key, async by default.

- **Install MCP**: `npx @pixeldojo/mcp init`
- **Landing page**: https://pixeldojo.ai/skills
- **This document**: https://pixeldojo.ai/skills.md
- **ClawHub Skill (alternative)**: https://clawhub.ai/blovett80/pixeldojo

## Quick Start (MCP)

```shell
npx @pixeldojo/mcp init
export PIXELDOJO_API_KEY=pd_your_api_key
# Then restart your agent.
```

## Named Skills

- **pixeldojo:generate** -- Any prompt to image or video. Auto-routes to the best model.
- **pixeldojo:character** -- Consistent characters across shots. Uses character-aware models.
- **pixeldojo:storyboard** -- Multi-shot scenes from one brief. Image or video shots.
- **pixeldojo:upscale** -- Enhance any image. Conservative or creative upscale tiers.

## How It Works

1. **Install MCP** with `npx @pixeldojo/mcp init` (or grab the SDK at https://www.npmjs.com/package/@pixeldojo/sdk).
2. **Add your API key** -- get one at https://pixeldojo.ai/api-platform/api-keys
3. **Your agent creates media** -- it calls the API, polls for results, and gets back image/video URLs.

## Authentication

```http
Authorization: Bearer pd_your_api_key
```

## API Endpoints

- `GET /api/v1/models` -- Public discovery endpoint for all enabled image and video models.
- `GET /api/v1/models/{apiId}` -- Fetch model capabilities, parameters, and the canonical request schema for one model.
- `GET /api/v1/models/{apiId}/schema` -- Return the model request schema as JSON so agents and SDKs can build valid payloads.
- `POST /api/v1/models/{apiId}/run` -- Submit an async job for any image or video model. Match the request body to the schema from `/models/{apiId}/schema` (see the discovery sketch below).
- `GET /api/v1/jobs` -- List recent jobs for the authenticated API key owner with optional filters.
- `GET /api/v1/jobs/{jobId}` -- Check job status and retrieve outputs, asset references, and webhook delivery state when complete.

Full model list: `GET https://pixeldojo.ai/api/v1/models`
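
Taken together, the discovery endpoints let an agent build a valid request without hard-coding model parameters. A minimal sketch of that flow in Python, reusing the `google-nano-banana` model and the `statusUrl` field from the examples further down; the exact shapes of the `/models` and `/schema` responses are model-dependent and not reproduced here:

```python
import requests

API_KEY = "pd_your_api_key"
BASE = "https://pixeldojo.ai/api/v1"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

# 1. Discover enabled models (public endpoint).
models = requests.get(f"{BASE}/models").json()

# 2. Fetch the request schema for one model so a valid payload
#    can be built before submitting a job.
schema = requests.get(
    f"{BASE}/models/google-nano-banana/schema", headers=HEADERS
).json()

# 3. Build a payload that satisfies the schema and submit an async job.
payload = {"prompt": "A fantasy castle on a cliff at sunset"}
job = requests.post(
    f"{BASE}/models/google-nano-banana/run", headers=HEADERS, json=payload
).json()

# 4. Check the job; "statusUrl" follows the Python polling example below.
status = requests.get(job["statusUrl"], headers=HEADERS).json()
print(status["status"])
```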

## Code Examples

### MCP

```shell
# 1. Install in your Claude Code / Cursor / OpenClaw project
npx @pixeldojo/mcp init

# 2. Set your API key
export PIXELDOJO_API_KEY=pd_your_api_key

# 3. Restart your agent — it now has these tools:
#    pixeldojo:generate    Any prompt -> image or video
#    pixeldojo:character   Consistent characters across shots
#    pixeldojo:storyboard  Multi-shot scenes from one brief
#    pixeldojo:upscale     Enhance any image
#    pixeldojo:status      Poll long-running jobs

# Get your key: https://pixeldojo.ai/api-platform/api-keys
```

### TypeScript SDK

```typescript
import { PixelDojoClient } from "@pixeldojo/sdk";

const pd = new PixelDojoClient({ apiKey: "pd_your_api_key" });

// Generate an image and wait for the result
const job = await pd.generate("google-nano-banana", {
  prompt: "A fantasy castle on a cliff at sunset",
  aspect_ratio: "16:9",
});

console.log(job.assets[0].url);
```

### Python

```python
import requests, time

API_KEY = "pd_your_api_key"

# Submit a generation job
job = requests.post(
    "https://pixeldojo.ai/api/v1/models/google-nano-banana/run",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={"prompt": "A fantasy castle on a cliff at sunset"}
).json()

# Poll until the image is ready
while True:
    result = requests.get(
        job["statusUrl"],
        headers={"Authorization": f"Bearer {API_KEY}"}
    ).json()
    if result["status"] == "completed":
        print("Image URL:", result["output"]["images"][0])
        break
    time.sleep(2)
```

### cURL

```bash
# Submit a job
curl -X POST "https://pixeldojo.ai/api/v1/models/google-nano-banana/run" \
  -H "Authorization: Bearer pd_your_api_key" \
  -H "Content-Type: application/json" \
  -d '{"prompt":"A fantasy castle on a cliff at sunset"}'

# Poll for results
curl "https://pixeldojo.ai/api/v1/jobs/job_abc123" \
  -H "Authorization: Bearer pd_your_api_key"
```

## Available Models (137 total; sample below)

### Image Models

- **Change Camera Angle** (`change-camera-angle`) -- Camera-aware editing via fal.ai Qwen Image Edit 2511 with multi-angle LoRA — 360° orbit, tilt, and zoom. (1 credit) [editing]
- **Consistent Characters** (`consistent-characters`) -- Generate consistent character variations with FLUX Kontext, Nano Banana Pro/2, Flux 2 Dev, or Qwen Image 2 Pro. (1 credit) [character]
- **Creative Upscale** (`creative-upscale`) -- Clarity Upscaler (creative upscale) via Replicate — boost detail with stable-diffusion refinement. (0.5 credits) [upscale]
- **Dreamina 3.1** (`dreamina`) -- ByteDance Dreamina 3.1 — 4MP cinematic text-to-image with precise style control. (1 credit)
- **Ernie** (`ernie`) -- Baidu Ernie text-to-image (fal.ai). Multilingual prompts and built-in prompt expansion. (1 credit) [photorealism, marketing]
- **FLUX** (`flux`) -- FLUX family on Replicate — Schnell, Dev, Pro, Kontext, Ultra, and LoRA remix variants in one entrypoint. (1 credit) [photorealism, character, cinematic]
- **Flux 2 Flex** (`flux-2-flex`) -- Max-quality with up to 10 reference images (1.5 credits) [photorealism, cinematic]
- **Flux 2 Klein 4B** (`flux-2-klein-4b`) -- Very fast generation and editing with up to 5 reference images (0.1 credits) [photorealism, cinematic]

### Video Models

- **Grok Imagine R2V** (`grok-r2v`) -- xAI Grok Imagine reference-to-video via Replicate — 1–7 reference images plus prompt for 1–10s clips at 480p or 720p. (10 credits)
- **Grok Video Extend** (`grok-video-extend`) -- xAI Grok Imagine video extension — continue an existing MP4 with a prompt-directed extension (2–10s). (12 credits)
- **Hailuo Standard** (`hailuo-standard`) -- Premium quality text-to-video and image-to-video (8 credits) [video, marketing]
- **Hailuo Fast** (`hailuo-fast`) -- Fast image-to-video generation (4 credits) [video, marketing]
- **Happy Horse 1.0 Text-to-Video** (`happyhorse-1.0-t2v`) -- Text-to-video with 720p/1080p output and 2-15 second durations (4 credits/sec)
- **Heygen Avatar** (`heygen-avatar`) -- Heygen Avatar 4 via fal.ai — animate a portrait with prompt-driven speech or an audio track, with optional background and captions. (2 credits/sec) [video, character]

Browse all models: https://pixeldojo.ai/api-platform

## Error Codes

| Code | Status | Message |
|------|--------|---------|
| `unauthorized` | 401 | Missing or invalid API key |
| `invalid_json` | 400 | Invalid JSON in request body |
| `validation_error` | 400 | Input validation failed |
| `not_found` | 404 | Model or job not found |
| `insufficient_credits` | 402 | Insufficient credits |
| `credit_error` | 500 | Failed to deduct credits |
| `submission_failed` | 500 | Failed to submit job |
| `expired` | 410 | Job has expired |
| `rate_limit_exceeded` | 429 | Rate limit exceeded |
| `internal_error` | 500 | Internal server error |
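
Most of these map cleanly onto retry decisions. A hedged sketch of client-side handling keyed off the HTTP status column above (the field layout inside the error body isn't shown here, so the sketch relies on status codes only):

```python
import time
import requests

API_KEY = "pd_your_api_key"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}
RUN_URL = "https://pixeldojo.ai/api/v1/models/google-nano-banana/run"

def submit_with_retries(payload, attempts=5):
    """Submit a job, backing off on rate limits and surfacing other errors."""
    for attempt in range(attempts):
        resp = requests.post(RUN_URL, headers=HEADERS, json=payload)
        if resp.ok:
            return resp.json()
        if resp.status_code == 429:
            # rate_limit_exceeded: back off and retry
            time.sleep(2 ** attempt)
            continue
        if resp.status_code == 402:
            # insufficient_credits: retrying will not help
            raise RuntimeError("Out of credits; top up before resubmitting")
        if resp.status_code in (400, 401, 404, 410):
            # invalid_json / validation_error / unauthorized / not_found / expired:
            # fix the request or credentials instead of retrying.
            raise RuntimeError(f"Request rejected: {resp.status_code} {resp.text}")
        # 500-range (credit_error, submission_failed, internal_error): retry with backoff.
        time.sleep(2 ** attempt)
    raise RuntimeError("Gave up after repeated failures")
```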

## Where It Works

- **OpenClaw / ClawHub** -- Install the skill and your agent gets structured tool access.
- **MCP & Chat Agents** -- Any MCP-compatible agent or assistant can call the same endpoints.
- **Automation & Workflows** -- Wire generation into pipelines with polling or webhooks (see the sketch after this list).
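
For webhook-driven pipelines, registration and payload details are covered in the API docs; the general pattern is a small HTTP endpoint that accepts the callback and re-fetches the authoritative job record. A sketch assuming Flask and a hypothetical `jobId` field in the callback body:

```python
import requests
from flask import Flask, request

app = Flask(__name__)
API_KEY = "pd_your_api_key"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

@app.post("/pixeldojo-webhook")
def pixeldojo_webhook():
    event = request.get_json(force=True)
    # "jobId" is a hypothetical field name; check the webhook payload docs.
    job_id = event.get("jobId")
    if job_id:
        # Re-fetch the job record rather than trusting the callback body.
        job = requests.get(
            f"https://pixeldojo.ai/api/v1/jobs/{job_id}", headers=HEADERS
        ).json()
        if job.get("status") == "completed":
            handle_assets(job)  # hand finished outputs to the rest of the pipeline
    return "", 204

def handle_assets(job):
    print(job)

if __name__ == "__main__":
    app.run(port=8080)
```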

## Resources

- Install skill: https://clawhub.ai/blovett80/pixeldojo
- API docs: https://pixeldojo.ai/api-platform/documentation
- API keys: https://pixeldojo.ai/api-platform/api-keys
- LLM docs: https://pixeldojo.ai/llm.txt
- OpenAPI spec: https://pixeldojo.ai/api/openapi
- AI plugin manifest: https://pixeldojo.ai/.well-known/ai-plugin.json
- ComfyUI plugin: https://github.com/blovett80/ComfyUI-PixelDojo
