Strong Prompt Fidelity
HiDream reads compositional language literally. "Three-quarter view from below, holding a single book, looking off-camera" actually produces that shot. Specifically directional prompts execute the way they read.
HiDream O1 is a versatile image model with notably strong prompt adherence and stylized output capability. Use the Full variant for hero shots and Dev (distilled) for fast iteration. Both cost a flat 1 credit per image. Image-conditioned editing is supported for subject-driven workflows.
HiDream O1 sits in a sweet spot between photoreal and stylized output. It handles literal prompt instructions reliably (composition, camera angle, props), and it ranges from documentary-photo realism to painterly illustration without changing the prompt language much.
Two generation variants: Full (default, slower but higher fidelity) and Dev (distilled, ~2-3× faster). Both cost 1 credit. Both support image-conditioned editing — pass an image_url and HiDream uses it as a structural reference for subject-driven personalization.
Generation variants (Full · Dev): 2
Credit per image (flat): 1
Image-conditioned editing: Yes
The same prompt language switches between documentary realism, painterly illustration, and stylized art simply by naming the style ("oil paint on canvas", "studio photography", "watercolor"). Few models cover that range cleanly.
Full is the default — slower (~10-20s), highest fidelity, best for hero shots. Dev is distilled (~3-5s), good for iteration. Same prompt language. Pricing identical at 1 credit each, so the choice is purely about wait time vs marginal quality gain.
Pass an image_url with your prompt and HiDream uses it as a structural reference — subject identity, composition, palette guidance — while applying your prompted changes. Useful for restyling, character-consistent variations, and brand-locked compositions.
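The character-consistent workflow described above can be sketched as a small helper. This is a minimal illustration, not the actual client: `generate` is a hypothetical callable you would supply (it takes a request body and returns the output image URL), and the `prompt` field name is an assumption. Only `image_url` comes from the text.

```python
from typing import Callable, Dict, List

# Hypothetical client signature: takes a request body, returns the output image URL.
Generate = Callable[[Dict[str, str]], str]


def consistent_series(
    generate: Generate, first_prompt: str, scene_prompts: List[str]
) -> List[str]:
    """Render one anchor image, then condition every later scene on it."""
    anchor_url = generate({"prompt": first_prompt})
    urls = [anchor_url]
    for prompt in scene_prompts:
        # image_url is the parameter named in the text; "prompt" is an assumed field name.
        urls.append(generate({"prompt": prompt, "image_url": anchor_url}))
    return urls


# Stand-in for a real client so the sketch runs offline.
calls: List[Dict[str, str]] = []


def fake_generate(body: Dict[str, str]) -> str:
    calls.append(body)
    return f"https://example.com/render-{len(calls)}.png"


urls = consistent_series(
    fake_generate,
    "Stylized character portrait of a woman in flowing emerald-green fabric",
    [
        "Same character walking through a night market",
        "Same character reading by a window",
    ],
)
```

Every follow-up request reuses the first render as its reference rather than the previous one, which keeps later scenes from drifting away from the original subject. Whether chaining from the latest image works better is something to test against the real service.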
Each example shows the exact prompt that produced the result.
HiDream O1 Full · 1:1 · 1 credit
Editorial portrait of a young woman with curly auburn hair, holding a brass-handled teacup at chest height, three-quarter view turned slightly camera-right, soft Rembrandt lighting from a tall north-facing window, melancholic gaze, vintage cream wool sweater, hyper-detailed skin texture, shot on 50mm medium format film, 8k
HiDream Full nails directional portrait prompts: name the prop, the angle, the lighting direction. The "three-quarter view turned slightly camera-right" instruction produces exactly that pose reliably.
HiDream O1 Full · 3:2 · 1 credit
Watercolor illustration of an enchanted forest clearing at twilight, glowing mushrooms scattered through ferns, a small deer drinking from a moonlit pond, soft pastel palette with deep purple shadows, hand-painted texture, storybook quality
HiDream's stylized register is strong on illustration. Name the medium ("watercolor", "gouache", "oil paint") and the tradition ("storybook", "editorial illustration") and it holds the look without prompt bloat.
HiDream O1 Full · 16:9 · 1 credit
Wide cinematic landscape of a misty pine forest at dawn, fog clinging to the lowest branches, a single shaft of golden light breaking through the canopy onto a moss-covered path, cool blue-green palette with warm amber highlight, photoreal 8k
Atmospheric landscapes benefit from a "single warm element" against a cool palette — the shaft of light, a distant lit window, a warm sunset on one peak. That contrast is what makes the shot feel cinematic.
HiDream O1 Dev · 4:5 · 1 credit
Stylized character portrait of a confident African woman in flowing emerald-green fabric, hands raised holding a glowing orb of soft blue light, dramatic three-point lighting, painterly digital art style, fantasy book cover quality, vibrant color palette
The Dev variant is fast enough to iterate on character compositions: try 3-4 generations to nail the pose, then switch to Full for the final render. Same 1 credit either way.
HiDream O1 Full · 1:1 · 1 credit
Architectural detail of a minimalist Japanese concrete cafe interior, single wooden chair against a textured concrete wall, soft directional window light casting a long shadow, hyper-detailed concrete texture, single ceramic vase with a sprig of pine, editorial photography
For architectural and interior work, specify the material (concrete, walnut, plaster), the light direction (north-facing, late afternoon side light), and one specific prop. HiDream reads these as composition signals rather than visual clutter.
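The example prompts above follow a common pattern: a named medium, a concrete subject, one specific prop, an explicit angle, and a lighting direction. A minimal sketch of that pattern as a prompt builder follows. The `ShotSpec` class and its field names are illustrative assumptions, not part of any HiDream API, and the ordering of the parts is just one reasonable choice.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ShotSpec:
    """Hypothetical container for the directional parts of a prompt."""

    medium: str
    subject: str
    prop: str
    angle: str
    lighting: str

    def to_prompt(self) -> str:
        # Lead with the medium so the style register is stated up front.
        lead = f"{self.medium} of {self.subject}"
        parts = [lead, self.prop, self.angle, self.lighting]
        # Skip any part left empty so the prompt has no stray commas.
        return ", ".join(part.strip() for part in parts if part.strip())


photo = ShotSpec(
    medium="Studio photography",
    subject="a young woman",
    prop="holding a single book at chest height",
    angle="three-quarter view from below",
    lighting="soft window light from camera-left",
)

# Same scene, different register: only the medium changes.
watercolor = replace(photo, medium="Watercolor illustration")

print(photo.to_prompt())
print(watercolor.to_prompt())
```

Keeping the scene description fixed and swapping only `medium` makes it easy to compare how the same composition renders across registers.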
"Three-quarter view from below", "holding a single book at chest height", "looking off-camera to the left" — HiDream reads these literally and produces the shot. Vague "interesting angle" does not.
"Studio photography" vs "watercolor illustration" vs "oil paint on canvas" — same scene, different register. HiDream switches between photoreal and stylized cleanly when you name the medium explicitly.
Both cost 1 credit, but Dev is ~3-5s and Full is ~10-20s. Iterate prompts on Dev to find the right composition/style, then run the final at Full for the marginal fidelity gain.
"Holding a brass-handled teacup" is a stronger anchor than "with something in hand". HiDream renders specific props faithfully, vague atmosphere often as mush.
When you need the same subject across multiple scenes, generate the first image, then pass that URL as image_url on subsequent prompts. HiDream uses it as a structural reference — identity, palette, composition carry through.
"Soft window light from camera-left", "warm key from above-right", "single rim light from behind" — HiDream's lighting follows these instructions reliably. Generic "good lighting" is a missed opportunity.
| Setting | Values | Notes |
|---|---|---|
| Generation variant | Full (default) · Dev (fast) | Both 1 credit. Dev is ~3-5s, Full is ~10-20s. |
| Image size | square_hd (default) + landscape / portrait presets | The model snaps to the closest supported resolution. |
| Edit variant | Full edit · Dev edit | When passing an input image, the edit-variant pair handles image-conditioned generation. |
| Outputs | 1 | HiDream returns 1 image per prediction; submit multiple times for variations. |
| Output format | PNG (default) | PNG is standard. Some variants support JPG via the public API. |
| Image-to-image | Pass image_url | Uses the source as structural / palette / identity reference. |
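As a rough illustration of how the settings in the table fit together, the sketch below collects them into one dictionary. All key names (`variant`, `mode`, `outputs`, and so on) are hypothetical and chosen for readability. The real request schema is not given in the text, so treat this as a checklist of defaults rather than a payload format.

```python
from typing import Dict, List, Union

Settings = Dict[str, Union[str, int]]

VARIANTS = ("full", "dev")


def build_settings(
    variant: str = "full",
    image_size: str = "square_hd",
    has_input_image: bool = False,
) -> Settings:
    """Collect the table's settings; key names are assumptions, not a real schema."""
    if variant not in VARIANTS:
        raise ValueError(f"variant must be one of {VARIANTS}, got {variant!r}")
    return {
        "variant": variant,
        # An input image routes the request to the edit variant of the pair.
        "mode": "edit" if has_input_image else "generate",
        "image_size": image_size,  # square_hd is the documented default
        "outputs": 1,  # one image per prediction
        "output_format": "png",
    }


def variation_batch(settings: Settings, count: int) -> List[Settings]:
    """One image per prediction, so variations mean repeated submissions."""
    if count < 1:
        raise ValueError("count must be at least 1")
    return [dict(settings) for _ in range(count)]


batch = variation_batch(build_settings(variant="dev"), 4)
```

Because each prediction returns a single image, asking for four variations means four separate submissions, which at the flat rate would cost 4 credits.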
Default to Full for hero shots and final renders — it handles fine detail (skin texture, hair strands, fabric weave) more cleanly. Switch to Dev when you're iterating on prompts or composition: ~3-5 second turnaround feels real-time. Both cost the same 1 credit, so the choice is wait time vs marginal fidelity gain.
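The trade-off can be made concrete with a little arithmetic. The sketch below uses the approximate latencies quoted above (Dev ~3-5s, Full ~10-20s) and the flat 1-credit price. It assumes runs are submitted one after another and ignores queueing and upload time, so the numbers are rough bounds only.

```python
from typing import Dict, Tuple

# Approximate per-image latency ranges in seconds, taken from the text.
LATENCY_S: Dict[str, Tuple[int, int]] = {"dev": (3, 5), "full": (10, 20)}
CREDITS_PER_IMAGE = 1  # flat price for both variants


def estimate(dev_runs: int, full_runs: int) -> Tuple[int, int, int]:
    """Return (credits, min_seconds, max_seconds) for a sequential session."""
    runs = {"dev": dev_runs, "full": full_runs}
    low = sum(LATENCY_S[name][0] * count for name, count in runs.items())
    high = sum(LATENCY_S[name][1] * count for name, count in runs.items())
    credits = (dev_runs + full_runs) * CREDITS_PER_IMAGE
    return credits, low, high


# Four Dev drafts plus one Full final, versus five Full renders.
print(estimate(4, 1))  # (5, 22, 40)
print(estimate(0, 5))  # (5, 50, 100)
```

Both sessions cost 5 credits, but drafting on Dev cuts the estimated wait from 50-100 seconds to 22-40 seconds, which is the whole case for iterating on Dev first.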