independent guide :: not affiliated with any AI image platform
Prompt framework · 1,400 words

Writing prompts for AI image generators.

Prompt engineering isn't magic; it's structure. Six elements, applied consistently, produce predictable improvements across any generator architecture.

The six elements, in depth.

1. Subject, who or what is in the image.

The subject is the foreground. Be specific about quantity, attributes, action, and relationship: "a young woman" is weak; "a young woman holding a leather-bound book, sitting cross-legged on a windowsill" is strong. Models trained on natural-language captions prefer connected sentences; models trained on tag-style data prefer comma-separated phrases. Most modern diffusion models handle either.

2. Style, the visual grammar.

Style is the most powerful single element. Strong style terms: "photograph", "oil painting", "watercolour illustration", "3D render", "technical line drawing", "art nouveau poster", "Pixar-style 3D", "Studio Ghibli illustration", "fashion editorial". Weaker terms: "artistic", "beautiful", "detailed" (these add nothing because everything the model produces is already "detailed"). Reference specific movements, illustrators, or production styles where appropriate; avoid living artists by name.

3. Composition, framing and arrangement.

Compositional terms: "close-up", "medium shot", "wide shot", "overhead view", "three-quarter view", "Dutch angle", "rule of thirds", "centred subject", "negative space on the left". The model has internalised cinematographic and photographic vocabulary; using it gets predictable results.

4. Lighting, the second-most-powerful element.

Lighting terms: "golden hour", "blue hour", "soft window light from the left", "harsh midday sun", "studio softbox", "rim light", "backlit", "moody chiaroscuro", "neon", "candlelit". Light controls mood, depth, and texture. Specifying it explicitly avoids the model defaulting to soft even diffusion.

5. Mood, the emotional register.

Mood terms: "serene", "dramatic", "contemplative", "playful", "melancholy", "tense", "ethereal", "raw". These steer the choices the model makes about colour grading, expression, and atmosphere. Mood is also where prompts overreach; one mood word usually suffices.

6. Technical, camera and post-processing.

Technical terms: "shallow depth of field", "f/1.4", "50mm lens", "85mm portrait lens", "Kodak Portra 400 film", "wide-angle", "tilt-shift", "high contrast", "muted tones", "4:3 aspect ratio", "cinematic colour grade". Use these sparingly; too many technical modifiers create stylistic noise.
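
Taken together, the elements compose mechanically. Here is a minimal Python sketch of that assembly in both natural-language and tag-style forms; the PromptElements structure and function names are this guide's framing for illustration, not any generator's API.

    from dataclasses import dataclass

    @dataclass
    class PromptElements:
        subject: str
        style: str
        composition: str
        lighting: str
        mood: str
        technical: str

    def natural_language(p: PromptElements) -> str:
        # Connected sentences, for models trained on descriptive captions.
        return (f"A {p.style} of {p.subject}, {p.composition}, "
                f"{p.lighting}, {p.mood}, {p.technical}.")

    def tag_style(p: PromptElements) -> str:
        # Comma-separated phrases, for models trained on tag-style data.
        return ", ".join([p.subject, p.style, p.composition,
                          p.lighting, p.mood, p.technical])

    elements = PromptElements(
        subject="a young woman holding a leather-bound book, sitting cross-legged on a windowsill",
        style="watercolour illustration",
        composition="medium shot, negative space on the left",
        lighting="soft window light from the left",
        mood="contemplative",
        technical="muted tones",
    )
    print(natural_language(elements))
    print(tag_style(elements))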

Negative prompts.

Negative prompts list what you want to avoid. They work well on most diffusion models (SD-family, Flux variants); less reliably on closed proprietary models that don't expose negative-prompt parameters. Standard quality negatives: "blurry", "deformed hands", "extra limbs", "text, watermark, signature". Compositional negatives: "multiple subjects" if you want one subject, "cluttered background" if you want a clean one.

Models that don't expose negatives respond to in-prompt restatement: prompt for "a single cat" rather than pairing "a cat" with "multiple cats" in the negative prompt.
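
For generators that do expose the parameter, the open-source diffusers library is one concrete example; the following sketch assumes an SDXL-class model, and the model ID and prompts are illustrative.

    import torch
    from diffusers import AutoPipelineForText2Image

    pipe = AutoPipelineForText2Image.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",  # illustrative model ID
        torch_dtype=torch.float16,
    ).to("cuda")

    image = pipe(
        prompt="a single cat on a windowsill, photograph, golden hour, serene",
        negative_prompt="blurry, deformed, extra limbs, text, watermark, "
                        "signature, cluttered background",
    ).images[0]
    image.save("cat.png")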

Why prompts vary by architecture.

Diffusion models trained on descriptive caption datasets respond well to detailed natural-language prompts. Older diffusion models trained on Danbooru-style tag data respond better to comma-separated tags. Autoregressive token-based models handle structured layouts (lists, ordered descriptions) better than diffusion models do. The architectural background on /how-it-works explains why; in practice, run two or three prompt variants on a new generator to learn which structure it prefers.
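
One way to run that test, sketched with diffusers and a fixed seed so that only prompt structure varies; on a hosted generator the equivalent is locking the seed in the UI and swapping structures by hand.

    import torch
    from diffusers import AutoPipelineForText2Image

    pipe = AutoPipelineForText2Image.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0",  # illustrative model ID
        torch_dtype=torch.float16,
    ).to("cuda")

    variants = {
        "natural": "A watercolour illustration of a young woman reading "
                   "on a windowsill, medium shot, soft window light, contemplative.",
        "tags": "young woman, reading, windowsill, watercolour illustration, "
                "medium shot, soft window light, contemplative",
    }
    for name, prompt in variants.items():
        # Same seed for every variant, so only the prompt structure differs.
        generator = torch.Generator("cuda").manual_seed(7)
        pipe(prompt=prompt, generator=generator).images[0].save(f"variant_{name}.png")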

Seed management and reproducibility.

The seed is a number that initialises the model's random state. Same prompt + same seed + same model + same parameters = same output. This is how you iterate on a successful generation: lock the seed, then change one prompt element to see what shifts. Generators that surface the seed (most do; check the metadata or generation parameters panel) let you do this; generators that hide the seed make iteration harder.
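
A minimal seed-locking sketch, again assuming diffusers; the same pattern applies in any UI that exposes a seed field.

    import torch
    from diffusers import AutoPipelineForText2Image

    pipe = AutoPipelineForText2Image.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
    ).to("cuda")

    SEED = 1234
    base = "a young woman reading on a windowsill, photograph, medium shot, {light}, contemplative"

    for light in ("golden hour", "blue hour"):
        # Re-create the generator each run: identical seed, one changed element.
        generator = torch.Generator("cuda").manual_seed(SEED)
        image = pipe(prompt=base.format(light=light), generator=generator).images[0]
        image.save(light.replace(" ", "_") + ".png")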

For workflow-critical prompts, archive the full set of parameters: prompt text, negative prompt, seed, model version, sampler, steps, guidance scale, resolution. A simple text file or spreadsheet is enough at small scale; at production scale, JSON sidecars or a metadata database.
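
At small scale, a JSON sidecar per image covers the list above; the field names here are illustrative, and the point is to mirror whatever your generator actually reports.

    import json

    record = {
        "prompt": "a young woman reading on a windowsill, photograph, golden hour",
        "negative_prompt": "blurry, text, watermark",
        "seed": 1234,
        "model": "stable-diffusion-xl-base-1.0",
        "sampler": "euler_a",
        "steps": 30,
        "guidance_scale": 7.0,
        "resolution": [1024, 1024],
    }

    # One sidecar per image, named to match: image_0001.png -> image_0001.json
    with open("image_0001.json", "w") as f:
        json.dump(record, f, indent=2)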

Prompt libraries and sharing.

Communities organise around shared prompts. Civitai, PromptHero, OpenArt, and various Discord and subreddit communities collect prompts that produce strong results. Treat shared prompts as starting points; few transfer cleanly across generators. The architectural differences mean a prompt tuned for SDXL behaves differently on Flux, Imagen, or DALL-E 3.

Build your own library starting with the prompts that work for your use case. After a few months of use, your library is a more valuable asset than any platform you happen to be on.

Background → Architectures
Test → Prompt adherence axis
Workflow → Designer use case

// updated 2026-04-28