Ernie Image Prompting Guide

Baidu Ernie.
Multilingual prompts, native CJK output.

Ernie Image is Baidu's text-to-image model: multilingual prompts (English, Chinese, Japanese), strong CJK text rendering, and two model variants (ernie-image at 1 credit for HD or 3 credits for Square UHD, and ernie-image/turbo at 1 credit). It is particularly strong on East Asian subjects and CJK signage.

Ernie Image hero — Shanghai Pudong skyline at sunset

Overview

Ernie Image is Baidu's text-to-image model. Its defining traits: multilingual prompt support (you can write prompts in English, Chinese, or Japanese), native CJK text rendering (Chinese, Japanese, Korean characters render with proper letterforms), and strong photoreal output on East Asian subjects, architecture, and signage.

The model field selects one of two variants. ernie-image is the quality tier: 1 credit at HD (Square HD, Landscape HD, or Portrait HD) or 3 credits at Square UHD. ernie-image/turbo is the fast tier: 1 credit flat. Both support negative prompts and seed pinning for reproducibility.
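A request for either variant might be assembled as in the minimal sketch below. The model and image_size names come from this guide; the prompt, negative_prompt, and seed key names and the build_request helper are assumptions made for illustration, not confirmed API fields.

```python
from typing import Optional

# Variant names as written in this guide.
MODELS = {"ernie-image", "ernie-image/turbo"}


def build_request(
    prompt: str,
    model: str = "ernie-image",
    image_size: str = "square_hd",
    negative_prompt: Optional[str] = None,
    seed: Optional[int] = None,
) -> dict:
    """Build a request payload (hypothetical key names, see lead-in)."""
    if model not in MODELS:
        raise ValueError(f"unknown model: {model!r}")
    request = {"model": model, "prompt": prompt, "image_size": image_size}
    # Only include the optional settings when they are actually set.
    if negative_prompt:
        request["negative_prompt"] = negative_prompt
    if seed is not None:
        request["seed"] = seed
    return request


# Pinning the seed keeps the settings identical across re-runs.
request = build_request(
    "Traditional Chinese garden pavilion at dawn, soft mist rising",
    model="ernie-image/turbo",
    image_size="landscape_hd",
    seed=42,
)
```

Leaving optional keys out entirely, rather than sending empty values, keeps the payload minimal and avoids guessing how the service treats empty strings.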

Multi: EN / ZH / JA prompts
CJK: native text rendering
1–3 credits: per image

Key Features

Multilingual Prompts
EN · ZH · JA

Write prompts in English, Chinese, or Japanese — Ernie reads all three natively. Useful for non-English content workflows, and for prompts where the source-language nuance matters (cultural specificity, literary references, place names).

Native CJK Text Rendering
Chinese · Japanese · Korean

Ernie renders CJK characters with proper letterforms, kerning, and weight — one of the strongest models on this dimension. Wrap exact strings in double quotes. Restaurant signs, calligraphy, packaging, posters all read clean and culturally accurate.

East Asian Subjects
Architecture, ceremony, lifestyle

Particularly strong on East Asian subjects — traditional Chinese architecture, Japanese gardens, Korean cuisine, Asian fashion. Cultural detail (specific costume styles, regional architectural features, ceremonial elements) renders with respect and accuracy.

HD / UHD / Turbo Variants
Price/quality tradeoff

ernie-image HD = 1 credit (square / landscape / portrait HD). ernie-image Square UHD = 3 credits (higher resolution). ernie-image/turbo = 1 credit flat (fast tier, smaller variants). Pick by use case.
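The pricing above can be sketched as a small lookup, which makes the cost of a batch easy to estimate. The credits_for and batch_credits helpers are hypothetical names; rejecting square_uhd on Turbo is an assumption based on the guide listing UHD for the base model only.

```python
HD_SIZES = {"square_hd", "landscape_hd", "portrait_hd"}


def credits_for(model: str, image_size: str) -> int:
    """Return the credit cost of one image, per the pricing in this guide."""
    if model == "ernie-image/turbo":
        # Assumption: UHD is documented for the base model only.
        if image_size == "square_uhd":
            raise ValueError("square_uhd is listed for ernie-image only")
        return 1  # Turbo is 1 credit flat.
    if model == "ernie-image":
        if image_size in HD_SIZES:
            return 1
        if image_size == "square_uhd":
            return 3
        raise ValueError(f"unknown image_size: {image_size!r}")
    raise ValueError(f"unknown model: {model!r}")


def batch_credits(jobs) -> int:
    """Sum the cost of an iterable of (model, image_size) pairs."""
    return sum(credits_for(model, size) for model, size in jobs)


# Twenty HD production images plus two UHD hero shots.
jobs = [("ernie-image", "landscape_hd")] * 20 + [("ernie-image", "square_uhd")] * 2
total = batch_credits(jobs)  # 20 * 1 + 2 * 3 = 26
```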

Example Images

Each example shows the exact prompt that produced the result.

Editorial Portrait — Tea Master

HD portrait · 1 credit

Editorial portrait of a Chinese tea master in traditional silk hanfu, sitting at a wooden tea table with fine porcelain, soft window light from the left, hyper-detailed skin and fabric texture, photoreal

"Traditional silk hanfu" + cultural detail anchors. Ernie renders hanfu fabric weave, porcelain tea sets, and traditional wooden surfaces convincingly. Soft directional lighting + the Portrait HD size gives editorial framing without needing technical photo anchors.

Cinematic Shanghai

HD landscape · 1 credit

Cinematic wide shot of Shanghai's Pudong skyline at sunset, neon lights beginning to illuminate against a deep orange sky, the Bund visible in the foreground, painterly photoreal

Place-specific prompts work — name the city + landmark. Ernie composes Pudong's distinctive skyline cleanly. Twilight transition ("neon beginning to illuminate against deep orange sky") gives the right time-of-day lighting layer.

Chinese Calligraphy Sign

HD square · 1 credit

A handpainted Chinese restaurant sign reading "上海菜館" in elegant calligraphy on a deep red wooden panel, brass mounting, golden afternoon light, photoreal

Quote the exact CJK characters in double quotes — Ernie holds the letterforms cleanly. "Elegant calligraphy" + "deep red wooden panel" + "brass mounting" gives surface and typography direction. One of Ernie's standout strengths.

Traditional Garden Pavilion

HD landscape · 1 credit

Traditional Chinese garden pavilion with curved tile roof reflecting in a still pond at dawn, lotus flowers floating on the water, soft mist rising, painterly atmospheric

"Traditional Chinese garden pavilion" + "curved tile roof" give specific architectural detail. Ernie composes the reflection in the pond cleanly, with the floating lotus elements layered in. "Painterly atmospheric" closes the prompt with an aesthetic anchor.

Prompting Tips

Prompt in the target language for nuance

Prompts in Chinese or Japanese sometimes capture cultural nuance that English translation flattens. For specifically East Asian subjects (architecture, calligraphy, ceremony), writing the prompt in the target language can produce more authentic output.

Quote CJK text in double quotes

Same recipe as Latin text — wrap exact CJK strings in double quotes. Ernie renders the characters with proper letterforms and weight. "上海菜館", "東京", "한국" all hold up at HD tier when prompted this way.
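The quoting rule can be applied mechanically when prompts are generated from data. The sketch below uses hypothetical helpers: with_quoted_text wraps the exact string in straight double quotes, and contains_cjk is a rough check over a few common Unicode blocks, not a complete definition of CJK.

```python
# Rough ranges only: many CJK extension blocks are deliberately left out.
CJK_RANGES = (
    (0x4E00, 0x9FFF),  # CJK Unified Ideographs
    (0x3040, 0x30FF),  # Hiragana and Katakana
    (0xAC00, 0xD7A3),  # Hangul syllables
)


def contains_cjk(text: str) -> bool:
    """Return True if any character falls in one of the ranges above."""
    return any(lo <= ord(ch) <= hi for ch in text for lo, hi in CJK_RANGES)


def with_quoted_text(template: str, text: str) -> str:
    """Insert `text`, wrapped in double quotes, at the {text} placeholder."""
    if '"' in text:
        raise ValueError("text to render must not itself contain double quotes")
    return template.format(text=f'"{text}"')


prompt = with_quoted_text(
    "A handpainted Chinese restaurant sign reading {text} in elegant calligraphy",
    "上海菜館",
)
```

Keeping the rendered string separate from the template makes it harder to drop a quote by accident when the sign text changes.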

HD is the production default

Square HD, Landscape HD, Portrait HD all cost 1 credit. Use these for everyday production work. Reach for Square UHD (3 credits) only for hero shots where the extra resolution is necessary. Turbo is the speed-optimized 1-credit fast tier.

East Asian subjects shine

Cultural specificity unlocks Ernie's strengths. Hanfu, kimono, hanbok, traditional architecture, regional cuisine, calligraphy — all read with cultural accuracy. Vague "Asian-style" produces generic output; specific cultural detail produces convincing output.
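One way to keep prompts specific is to assemble them from a named subject, concrete details, and mood phrases. The compose_prompt helper below is a hypothetical sketch, not part of any Ernie tooling; it simply joins the parts with commas, reproducing the garden pavilion prompt from the examples above.

```python
def compose_prompt(subject: str, details=(), atmosphere=()) -> str:
    """Join a subject, concrete details, and mood phrases into one prompt."""
    parts = [subject, *details, *atmosphere]
    # Drop empty fragments so stray commas never reach the prompt.
    return ", ".join(p.strip() for p in parts if p.strip())


prompt = compose_prompt(
    "Traditional Chinese garden pavilion with curved tile roof "
    "reflecting in a still pond at dawn",
    details=["lotus flowers floating on the water"],
    atmosphere=["soft mist rising", "painterly atmospheric"],
)
```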

Atmospheric anchors over technical

"Painterly atmospheric", "soft mist", "golden afternoon light" work better than camera-stack anchors. Ernie reads aesthetic vocabulary literally and renders the atmospheric mood convincingly.

Pick size by aspect

square_hd for 1:1, landscape_hd for 16:9, portrait_hd for 9:16. Use square_uhd for hero shots at higher resolution. The image_size field is an enum, not a free aspect string, so match the values exactly.
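Because image_size is an enum, a small mapping from aspect ratio to enum value can catch unsupported requests before they are sent. This is a minimal sketch: the image_size_for helper and its hero flag are hypothetical, and the aspect pairings are taken from the tip above.

```python
ASPECT_TO_SIZE = {
    "1:1": "square_hd",
    "16:9": "landscape_hd",
    "9:16": "portrait_hd",
}


def image_size_for(aspect: str, hero: bool = False) -> str:
    """Map an aspect ratio string to an image_size enum value."""
    if hero:
        # The guide lists UHD only in the square format.
        if aspect != "1:1":
            raise ValueError("square_uhd is only available at 1:1")
        return "square_uhd"
    try:
        return ASPECT_TO_SIZE[aspect]
    except KeyError:
        supported = ", ".join(ASPECT_TO_SIZE)
        raise ValueError(
            f"unsupported aspect {aspect!r}; choose one of: {supported}"
        ) from None


size = image_size_for("16:9")
hero_size = image_size_for("1:1", hero=True)
```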

Settings Reference

| Setting | Values | Notes |
| --- | --- | --- |
| Model | fal-ai/ernie-image · fal-ai/ernie-image/turbo | Base for quality, Turbo for speed. Both 1 credit at HD. |
| Image size | square_hd · landscape_hd · portrait_hd · square_uhd | Enum, not free aspect. Choose explicitly. |
| Negative prompt | String (optional) | Optional. Short focused negatives work best. |
| Seed | Integer (optional) | Pin for reproducibility across re-runs. |
| Pricing: HD tiers | 1 credit | Square HD, Landscape HD, Portrait HD. |
| Pricing: UHD | 3 credits | Square UHD only. Base model. |
| Pricing: Turbo | 1 credit flat | fal-ai/ernie-image/turbo. Speed-optimized fast tier. |

FAQ

How does Ernie Image compare to Qwen Image?

Both render CJK characters cleanly with proper letterforms. Ernie has slightly stronger Chinese-cultural composition behaviors (traditional architecture, hanfu, and regional cuisine read with more authenticity). Qwen Image has broader aspect ratio coverage and the Plus/Max snapshot system. For specifically Chinese-cultural subjects, choose Ernie. For broader work with occasional CJK text, choose Qwen Image.