SNAPSHOT_CANDID · failure-aesthetic faux phone diary

sample · signatures/snapshot_candid/canonical_sample.png

Run Record

Thu Apr 30 2026 20:33:00 GMT+0800 (China Standard Time)

◐ partial · ¥0.5 · 38.6s wall · 2.3MB

provider · gpt_image_2 (apimart) · aspect_ratio=4:5 · size=4:5 · n=1

prompt · inline (no library entry · 1512 chars)

refs · none (pure T2I)

task_id · task_01KQFJFPJ2YMVCC6BM7JNYPST9

script · experiments/snapshot_candid_test/test_snapshot_candid.py

⚠ failure modes (4)

· tilt anchor not honored · output frame is ~1° not the requested 5-15°
· no visible compression artifacts / oversharpening / green cast despite explicit prompt
· lighting reads as 'soft cinematic' not 'cheap phone sensor'
· model prior for polished portrait dominated the failure-aesthetic anchors

sidecar yaml

date: 2026-04-30T12:33:00.000Z
result: partial
prompt:
  id: null
  text: >
    Phone diary aesthetic · slight handheld tilt 7° off-axis · imperfect crop with subject offset
    from center · casual snapshot framing as if grabbed mid-moment · cheap-camera color grading
    (slight green cast, mild oversharpening, mild compression artifacts). Composition feels
    accidental · not posed. Deliberately rough · unpolished is the point.

    Subject: A young woman sitting alone at a corner window seat in a small neighborhood cafe,
    holding a paper coffee cup with both hands, gazing out at rain-streaked glass. Soft afternoon
    light from the window catches her face from the side. She wears an oversized cream knit sweater.
    Her hair is pulled into a casual low bun with several strands escaping near her ears. Hand-held
    jitter feel · slight motion blur on the cup.

    Layout: subject is offset to the right of frame center (not rule-of-thirds composition · just
    genuinely off). Frame is tilted 7° clockwise from horizontal. Phone-camera grain visible.
    Reflections in the glass are slightly out of focus. The cafe interior is visible but blurred —
    wooden table edge in the foreground, blurred bokeh of cafe lights behind.

    Color: Dim warm tungsten interior tones with cool blue rain-light through window. Slight green
    cast from cheap phone sensor. Saturation pulled down · contrast boosted slightly · highlights
    blown out near the window.

    NEVER include: professional · cinematic · polished · editorial · studio · well-composed ·
    centered · balanced · high-quality · pristine · sharp · model · shoot · portrait.
  refs: []
provider:
  id: gpt_image_2
  relay: apimart
  config:
    aspect_ratio: '4:5'
    size: '4:5'
    'n': 1
output:
  path: ./snapshot_candid_v1.png
  bytes: 2447294
  wall_seconds: 38.6
task_id: task_01KQFJFPJ2YMVCC6BM7JNYPST9
script: experiments/snapshot_candid_test/test_snapshot_candid.py
cost_yuan: 0.5
failure_modes:
- tilt anchor not honored · output frame is ~1° not the requested 5-15°
- no visible compression artifacts / oversharpening / green cast despite explicit prompt
- lighting reads as 'soft cinematic' not 'cheap phone sensor'
- model prior for polished portrait dominated the failure-aesthetic anchors
notes: |
  First canonical sample for SNAPSHOT_CANDID signature (4th of 4 signatures).
  Captured the cafe context · low-key tungsten + cool blue color contrast · offset
  composition. But the model's polish prior pulled output toward cinematic portrait
  instead of true phone-diary failure aesthetic.

  Hypothesis for v2 (if attempted): text-anchor alone is insufficient to defeat
  gpt-image-2's polish prior. Future iterations may need:
  - reference image of an actual phone snapshot (visual anchor for lo-fi style)
  - post-process tilt + compression in ffmpeg / ImageMagick after generation
  - or accept the model has a polish ceiling here · use a different model
    (e.g. SDXL with a lo-fi LoRA)

  Documented in signatures/snapshot_candid/README.md §Limitations of v1 sample.
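
The post-process option in the v2 hypothesis (tilt + compression after generation) could look roughly like the sketch below. It is not part of test_snapshot_candid.py. The function names, the 1024x1280 pixel size, and the quality / green-gain / unsharp values are all assumptions chosen for illustration; the record only states a 4:5 aspect ratio. The script only builds an ImageMagick 7 `magick` command and prints it, so nothing is executed against the sample.

```python
import math
import shlex


def cover_scale(width: int, height: int, tilt_deg: float) -> float:
    """Upscale factor so that, after rotating by tilt_deg, a centered
    width x height crop contains no empty corner pixels."""
    theta = math.radians(abs(tilt_deg))
    long_over_short = max(width, height) / min(width, height)
    return math.cos(theta) + long_over_short * math.sin(theta)


def lofi_command(src, dst, width, height,
                 tilt_deg=7.0, jpeg_quality=35, green_gain=1.06):
    """Build (not run) an ImageMagick command that tilts, re-crops,
    adds a green cast, oversharpens, and recompresses as JPEG.
    All default values are illustrative guesses, not tuned settings."""
    scale_pct = math.ceil(cover_scale(width, height, tilt_deg) * 100)
    return [
        "magick", src,
        "-resize", f"{scale_pct}%",          # enlarge so the crop stays filled
        "-rotate", f"{tilt_deg:g}",          # positive angle = clockwise
        "-gravity", "center",
        "-crop", f"{width}x{height}+0+0", "+repage",
        "-channel", "G", "-evaluate", "multiply", f"{green_gain:g}", "+channel",
        "-unsharp", "0x1.2",                 # mild oversharpening
        "-quality", str(jpeg_quality),       # low JPEG quality for artifacts
        dst,
    ]


if __name__ == "__main__":
    # Hypothetical pixel size for a 4:5 frame; the sidecar does not record it.
    cmd = lofi_command("snapshot_candid_v1.png", "snapshot_candid_v1_lofi.jpg",
                       width=1024, height=1280)
    print(shlex.join(cmd))
```

The upscale step matters because rotating a frame exposes its corners: for a 4:5 frame tilted 7°, the image has to grow by roughly 14.5% before the center crop so that no background fill is visible. Post-processing of this kind would address the tilt, green cast, and compression failure modes, but not the 'soft cinematic' lighting, which is baked into the generation.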