"Can't you just type a few words?"

Yes, you can get an image with just a single word. Sometimes you even get amazing results. But if you want consistent results that match what you're looking for, you need to know how to write prompts properly.

At first I wrote prompts by feel, but when I analyzed prompts from skilled creators in Discord communities, I found common patterns. Following those patterns improved my results dramatically.

Paul DelSignore's article neatly organized prompt structure, so I've translated it. These principles apply to Midjourney, Stable Diffusion, and DALL-E alike.


Anatomy diagram of AI art prompt, showing four connected boxes labeled "Content Type", "Description", "Style", "Composition" in a flow chart, clean infographic design, educational illustration, blue and white color scheme


The 4 Components of a Prompt

A good prompt includes four elements, in this order. The order matters because image models tend to give more weight to words near the front of the prompt.

| Order | Element | Role |
|-------|---------|------|
| 1 | Content Type | What are you creating? (photo? painting? 3D?) |
| 2 | Description | Subject and environment description |
| 3 | Style | Lighting, detail, art style |
| 4 | Composition | Aspect ratio, camera angle, resolution |

Let's look at each one.


1. Content Type

"What kind of image are you creating?"

This is the starting point of your prompt. If you don't specify this, the AI will decide on its own.

Common Content Types

A photograph of...     (photo)
A painting of...       (painting)
A drawing of...        (drawing)
A sketch of...         (sketch)
A 3D render of...      (3D render)
An illustration of...  (illustration)
Digital art of...      (digital art)

Example

❌ wolf in the forest
✅ A photograph of a wolf in the forest

It seems simple, but this single phrase completely changes the result.


Same wolf subject shown in 4 different content types: photograph (realistic), painting (oil paint texture), sketch (pencil lines), 3D render (CGI look), four-panel comparison demonstrating content type impact


2. Description

"What, where, in what state?"

This is the most important part. The more specific you are, the closer you get to your desired result.

The 3 Elements of Description

| Element | Explanation | Example |
|---------|-------------|---------|
| Subject | What is the subject | wolf |
| Subject Attributes | State/characteristics of the subject | angry, full-bodied |
| Environment/Scene | Background, environment | in the foggy woods |

Progression Example

Level 1: A photograph of a wolf
→ Just a wolf photo

Level 2: A photograph of an angry wolf
→ A wolf with an angry expression

Level 3: A photograph of an angry full-bodied wolf in the foggy woods
→ An angry wolf with full body visible in a foggy forest

As the level increases, the results become more specific.
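The progression above can be sketched as a small helper that joins the three description elements. This is only an illustration: the function name `describe` and its parameters are hypothetical, and the a/an choice is a naive first-letter heuristic.

```python
def describe(subject, attributes=(), environment=""):
    """Join attributes, subject, and environment into one description phrase."""
    phrase = " ".join([*attributes, subject])
    # Naive article choice (assumption): "an" before a vowel letter, otherwise "a".
    article = "an" if phrase[0].lower() in "aeiou" else "a"
    parts = [f"{article} {phrase}"]
    if environment:
        parts.append(environment)
    return " ".join(parts)


levels = [
    describe("wolf"),
    describe("wolf", ["angry"]),
    describe("wolf", ["angry", "full-bodied"], "in the foggy woods"),
]
for level in levels:
    print(f"A photograph of {level}")
# A photograph of a wolf
# A photograph of an angry wolf
# A photograph of an angry full-bodied wolf in the foggy woods
```

Keeping subject, attributes, and environment as separate inputs makes it easy to change one of them and compare results.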

Adding Era/Period

If your image includes people or buildings, the time period matters too.

Primitive society
Antiquity
Middle ages
Renaissance
Modern world
Contemporary
Future

Example: A photograph of a knight in a medieval castle, middle ages

Prompt evolution demonstration: three wolves showing progression from basic "wolf" to "angry wolf" to "angry full-bodied wolf in foggy woods at dusk", showing how description detail improves output quality, before-after style


3. Style

"What mood and feel?"

Style can be divided into three subcategories.

3-1. Lighting

Lighting determines the mood. The same scene looks completely different depending on the lighting.

Natural light:
natural lighting, sunlight, moonlight, dusk, dawn, golden hour

Artificial light:
neon lamp, candlelight, spotlight, fluorescent, Edison bulb

Special effects:
backlight, rim lighting, dramatic lighting, soft lighting,
crepuscular rays, glowing, blacklight

3-2. Detail

This determines the precision of the image. You can mention rendering engines or camera techniques.

Rendering engines:
unreal engine, octane render, vray, houdini render, arnold render

Photo feel:
bokeh, depth of field, 8k uhd, film photography, DSLR, 100mm

Quality keywords:
highly detailed, ultra realistic, studio quality, raytracing

3-3. Art Style

You can specify historical art movements or techniques.

Historical styles:
Renaissance, Baroque, Impressionism, Cubism, Surrealism,
Art Deco, Pop Art, Abstract Expressionism

Techniques/Media:
Digital art, oil painting, watercolor, concept art,
character design, line-art, tarot card style

Style Application Example

Basic:
A photograph of an angry full-bodied wolf in the foggy woods

+ Adding style:
A photograph of an angry full-bodied wolf in the foggy woods,
dusk, unreal engine, 8k

Same wolf scene with three different style treatments: natural soft lighting (warm, gentle), neon cyberpunk lighting (blue/pink glow), dramatic noir lighting (high contrast shadows), demonstrating lighting impact on mood


Using Artist Names

Artist names are one of the style cues these models respond to best.

Examples:
...by Greg Rutkowski, by Artgerm
...in the style of Studio Ghibli
...by Alex Horley-Orlandelli, by Bastien Lecouffe-Deharme

Combining multiple artists mixes styles for unique results.

A photograph of an angry full-bodied wolf in the foggy woods,
by Alex Horley-Orlandelli, by Bastien Lecouffe-Deharme,
dusk, sepia

Note: Using living artists' names is ethically controversial. Use them only as a style reference, and be careful with commercial use.


4. Composition

"How will you compose the frame?"

4-1. Aspect Ratio

Choose the ratio based on your purpose.

| Purpose | Ratio |
|---------|-------|
| Instagram square | 1:1 |
| Instagram Stories/Reels | 9:16 |
| YouTube thumbnail | 16:9 |
| Vertical portrait | 2:3, 4:5 |
| Horizontal landscape | 3:2, 16:9 |
| Wide banner | 21:9 |
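If your tool asks for pixel dimensions rather than a ratio, the conversion is simple arithmetic. The sketch below is a rough illustration: the `ASPECT_RATIOS` lookup and the `size_for` function are hypothetical names, and snapping each side to a multiple of 8 is an assumption (check what your tool actually requires).

```python
# A few purposes from the table, stored as (width, height) ratios.
ASPECT_RATIOS = {
    "instagram square": (1, 1),
    "instagram stories/reels": (9, 16),
    "youtube thumbnail": (16, 9),
    "wide banner": (21, 9),
}


def size_for(purpose, long_side=1024, multiple=8):
    """Return (width, height) in pixels with the longer side near long_side."""
    w_ratio, h_ratio = ASPECT_RATIOS[purpose]
    scale = long_side / max(w_ratio, h_ratio)

    def snap(value):
        # Round to the nearest multiple (assumption: the tool wants this).
        return max(multiple, round(value / multiple) * multiple)

    return snap(w_ratio * scale), snap(h_ratio * scale)


print(size_for("youtube thumbnail"))        # (1024, 576)
print(size_for("instagram stories/reels"))  # (576, 1024)
```

Because of the rounding, some ratios come out only approximately; 21:9 at a long side of 1024, for example, gives 1024 x 440.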

4-2. Camera View

This determines the viewer's perspective.

Distance:
extreme close-up, close-up, medium shot, long shot, extreme long-shot

Angle:
low angle, high angle, aerial view, street level view, dutch angle

Lens:
wide-angle, ultra wide-angle, fisheye, panoramic, telephoto

4-3. Resolution

Specify quality and size.

4k, 8k uhd, highly detailed, studio quality, ultra realistic

Camera view comparison: same castle subject shown from 6 perspectives - extreme close-up (stone texture detail), medium shot (gate and wall), long shot (full castle), aerial view (from above), low angle (looking up), wide-angle (panoramic), labeled grid demonstration


Complete Prompt Structure

Combining all elements looks like this:

[Content Type] + [Description] + [Style] + [Composition]

Example:
A photograph of                          ← Content Type
an angry full-bodied wolf                ← Subject + Attributes
in the foggy woods,                      ← Environment
by Alex Horley-Orlandelli,               ← Artist Style
dusk, sepia, unreal engine, 8k,          ← Lighting, Detail
wide-angle, cinematic composition        ← Composition
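The same structure can be sketched in code, which helps if you generate many prompt variations. This is a minimal illustration, not part of the original article: the `Prompt` class and its field names are hypothetical, and it only builds the text string; it does not call any image tool.

```python
from dataclasses import dataclass, field


@dataclass
class Prompt:
    """The four prompt components, kept in the order described above."""
    content_type: str                                # e.g. "A photograph of"
    description: str                                 # subject, attributes, environment
    style: list = field(default_factory=list)        # artist, lighting, detail
    composition: list = field(default_factory=list)  # camera view, resolution

    def render(self) -> str:
        # Content type and description form one phrase;
        # style and composition follow as comma-separated keywords.
        head = f"{self.content_type} {self.description}".strip()
        return ", ".join([head, *self.style, *self.composition])


prompt = Prompt(
    content_type="A photograph of",
    description="an angry full-bodied wolf in the foggy woods",
    style=["by Alex Horley-Orlandelli", "dusk", "sepia", "unreal engine", "8k"],
    composition=["wide-angle", "cinematic composition"],
)
print(prompt.render())
```

Swapping a single field, such as the lighting keyword in `style`, then gives a controlled way to compare outputs.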

Keyword Cheat Sheet

Lighting Keywords

Natural: natural lighting, sunlight, moonlight, golden hour, dusk, dawn
Artificial: neon, candlelight, spotlight, Edison bulb, fluorescent
Mood: dramatic lighting, soft lighting, backlight, rim lighting
Special: crepuscular rays, glowing, blacklight, lava glow

Detail Keywords

Render: unreal engine, octane render, vray, cinema4d, houdini
Photo: bokeh, depth of field, DSLR, film photography, 100mm
Quality: 8k uhd, highly detailed, ultra realistic, raytracing

Art Style Keywords

Historical: Renaissance, Baroque, Impressionism, Surrealism, Pop Art
Modern: digital art, concept art, anime, manga, fantasy
Technique: oil painting, watercolor, pencil sketch, line-art

Composition Keywords

Distance: close-up, medium shot, long shot, extreme long-shot
Angle: low angle, high angle, aerial view, dutch angle
Lens: wide-angle, fisheye, panoramic, telephoto

Complete prompt anatomy infographic: visual breakdown showing Content Type → Description → Style → Composition flow, with example keywords under each category, cheat sheet style design, easy to reference layout


Common Mistakes

These are mistakes I often made at first.

1. Writing Too Short

❌ beautiful landscape
✅ A digital painting of a serene mountain lake at sunset, 
   golden hour lighting, highly detailed, 8k, wide-angle

2. Ignoring Order

❌ 8k, photograph, wolf, forest, angry
✅ A photograph of an angry wolf in a dark forest, 8k

Image models tend to give more weight to words near the front of the prompt, so lead with the content type and subject.
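If you collect keywords in whatever order they occur to you, a small sort can put them back into component order. A minimal sketch, assuming you tag each keyword with its component yourself; `ORDER` and `order_keywords` are hypothetical names.

```python
# Component order from the article: content type first, composition last.
ORDER = ["content_type", "description", "style", "composition"]


def order_keywords(tagged):
    """Sort (keyword, component) pairs by component, keeping ties in input order."""
    ranked = sorted(tagged, key=lambda pair: ORDER.index(pair[1]))
    return [keyword for keyword, _ in ranked]


tagged = [
    ("8k", "composition"),
    ("photograph", "content_type"),
    ("wolf", "description"),
    ("forest", "description"),
    ("angry", "description"),
]
print(", ".join(order_keywords(tagged)))
# photograph, wolf, forest, angry, 8k
```

This only fixes the order of the pieces; turning them into a natural phrase like the corrected example is still up to you.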

3. Conflicting Keywords

❌ realistic photograph, anime style, oil painting

Using opposing styles together produces strange results.
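A quick check for this mistake can be sketched as below. The keyword group is a hypothetical example rather than a list from the article, and the check only matches exact comma-separated keywords, so extend it to suit your own prompts.

```python
# Hypothetical group of keywords that should not appear together.
EXCLUSIVE_GROUPS = {
    "medium": {
        "realistic photograph", "photograph", "anime style",
        "oil painting", "watercolor", "pencil sketch",
    },
}


def find_conflicts(prompt):
    """Return groups in which the prompt uses more than one exclusive keyword."""
    keywords = {part.strip().lower() for part in prompt.split(",")}
    conflicts = {}
    for group, members in EXCLUSIVE_GROUPS.items():
        hits = sorted(keywords & members)
        if len(hits) > 1:
            conflicts[group] = hits
    return conflicts


print(find_conflicts("realistic photograph, anime style, oil painting"))
# {'medium': ['anime style', 'oil painting', 'realistic photograph']}
```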

4. Being Stingy with Adjectives

❌ a wolf
✅ a fierce, majestic, grey wolf with glowing amber eyes

Specific adjectives make results richer.


Practice: Evolving a Prompt

Let's progressively develop a prompt with the same subject.

Subject: Futuristic City

Level 1 (Basic):
A city

Level 2 (+ Content Type):
A photograph of a futuristic city

Level 3 (+ Description):
A photograph of a futuristic cyberpunk city at night
with flying cars and neon signs

Level 4 (+ Style):
A photograph of a futuristic cyberpunk city at night
with flying cars and neon signs,
rain-slicked streets, dramatic lighting, unreal engine, 8k

Level 5 (+ Composition):
A photograph of a futuristic cyberpunk city at night
with flying cars and neon signs,
rain-slicked streets, dramatic lighting, unreal engine, 8k,
wide-angle, street level view, cinematic composition

As the level increases, results become more specific and aligned with intent.


Prompt evolution showcase: 5 versions of futuristic city from basic to fully detailed, showing progressive improvement as more prompt elements are added, numbered 1-5, visual learning demonstration


Summary

| Order | Element | Question | Example |
|-------|---------|----------|---------|
| 1 | Content Type | What? | A photograph of... |
| 2 | Description | Who/Where/How? | angry wolf in foggy woods |
| 3 | Style | What mood? | dusk, 8k, by [artist] |
| 4 | Composition | What frame? | wide-angle, 16:9 |


Writing prompts is both a skill and an art. There's no single right answer; it comes down to continuous experimentation and refinement.

At first, follow this structure, and as you get comfortable, you'll find your own patterns. I'm still learning too.

The important thing is understanding why you use each keyword. That way, when results aren't what you wanted, you'll know what to adjust.