Cross-cutting · verified · doc-vs-code correction

Content Moderation Pipeline

What moderation actually runs on user-generated prompts and assets — and why doesn't the published architecture match the code?

Last verified 2026-04-17 against code
Slug moderation-pipeline
Integration model vendored library (⚠ not shared service)
Live callers SoundBuddy · ImageBuddy · (ModerationBuddy admin-only)

§1 Claim vs reality

The existing architecture docs describe a shared moderation service. The running system doesn't use it. This diagram starts by surfacing that gap because every downstream conclusion depends on it.

What docs claim

repos.yaml:63-67: "Active for SoundBuddy; planned expansion to FuzzyCode"

docs/architecture/SERVICE_INTEGRATIONS.md:24-25: "SoundBuddy → ModerationBuddy POST /moderate"

Reading the docs, you'd expect an HTTP fan-out where each content-producing service calls ModerationBuddy before proceeding.

What code does

No service calls ModerationBuddy's POST /moderate at runtime.

Instead, each content service embeds the moderation_core/ library directly and calls OpenAI omni-moderation-latest in-process.

ModerationBuddy's HTTP endpoint exists, but is _require_admin-gated and only invoked by a batch offline tool (FuzzycodePagesFlaskServer/moderation_utility.py:38) for analyzing already-published pages.
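The in-process integration shape can be sketched as follows. This is illustrative only: `moderate_prompt` and the injected client are assumptions, not the real moderation_core API, and the categories field is simplified to a plain list. The point is the topology: the content service calls OpenAI directly, with no hop through ModerationBuddy.

```python
def moderate_prompt(client, prompt: str):
    """Run one in-process moderation call against omni-moderation-latest.

    Returns (flagged, categories). The client is injected so the same code
    path can be exercised with a stub; categories are simplified to a list
    here (the real SDK returns a structured categories object).
    """
    resp = client.moderations.create(model="omni-moderation-latest", input=prompt)
    result = resp.results[0]  # one result per input string
    return result.flagged, list(result.categories)
```

No network hop, no shared service: an OpenAI outage is felt inside each content service's own process, which is why the fail-modes in §3 matter.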

§2 The actual runtime moderation topology

```mermaid
flowchart LR
    classDef user fill:#fff460,stroke:#683c06,color:#111
    classDef svc fill:#edd9c0,stroke:#683c06,color:#111
    classDef lib fill:#c7d9e8,stroke:#1d58b1,color:#111
    classDef ext fill:#eaf3ff,stroke:#1d58b1,color:#111
    classDef absent fill:#f0f0f0,stroke:#b8432e,color:#888,stroke-dasharray: 5 5
    classDef pre fill:#fff4d6,stroke:#a47a3a,color:#111

    U([Browser]):::user

    subgraph SBFC[" "]
        FC["FuzzyCode<br/>⚠ NO content moderation<br/>only PII firewall"]
    end
    subgraph SB[" "]
        SBsvc[SoundBuddy]
        SBlib[("moderation_core<br/>embedded library")]
        SBpre["pre-check<br/>banned words"]
    end
    subgraph IB[" "]
        IBsvc[ImageBuddy]
        IBlib[("moderation_core<br/>embedded library")]
        IBpre["pre-check<br/>banned words"]
    end

    OAI["OpenAI<br/>omni-moderation-latest"]
    MB["ModerationBuddy<br/>admin-only HTTP<br/>NOT on user path"]
    OFFLINE["Pages batch job<br/>moderation_utility.py"]
    DB[("SoundBuddy<br/>moderation_history<br/>local DB")]

    U -->|/send, /sound_effect| SBsvc
    SBsvc --> SBpre
    SBpre -->|matched → FLAG| DB
    SBpre -->|clean| SBlib
    SBlib -->|HTTPS| OAI
    OAI -->|outcome| SBlib
    SBlib --> DB
    U -->|/image_*| IBsvc
    IBsvc --> IBpre
    IBpre -->|matched → FLAG| IBlib
    IBpre -->|clean| IBlib
    IBlib -->|HTTPS| OAI
    U -.->|no moderation| FC
    OFFLINE -.->|admin-key POST /moderate| MB
    MB -->|HTTPS| OAI

    class SBsvc,IBsvc,FC,MB svc
    class SBlib,IBlib lib
    class OAI ext
    class SBpre,IBpre pre
    class OFFLINE absent
    class DB lib
    class U user
```
FuzzyCode services (production)
Embedded moderation_core/ library + its local DB
External (OpenAI)
Pre-check — banned-word matcher
Admin-only / offline paths — not user-facing
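The pre-check node in the diagram is a local banned-word matcher that short-circuits before any OpenAI call. A minimal sketch, assuming a simple lowercase token match; the actual word list and matching rule live in each service's vendored moderation_core/ copy and may differ:

```python
import re

# Placeholder list for illustration only; the real list ships inside each
# service's vendored moderation_core/ copy.
BANNED_WORDS = {"forbiddenword", "anotherbadword"}

def precheck(prompt: str):
    """Return the first banned token hit, or None if the prompt is clean.

    A hit means the service records FLAGGED with categories
    ["PRECHECK_BANNED_WORD"] and never calls OpenAI for this prompt.
    """
    for token in re.findall(r"[a-z']+", prompt.lower()):
        if token in BANNED_WORDS:
            return token
    return None
```

Because the pre-check is purely local, it keeps working during an OpenAI outage; only prompts that pass it reach the adapter and hit the fail-mode behavior described in §3.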

§3 Per-service fail-mode

Same library, different fail-modes. The divergence is silent: neither service documents that it behaves differently from its sibling. An outage of OpenAI moderation therefore has different user-facing effects depending on which service was called.

| Service | Integration | Adapter init failure | Adapter runtime error | Pre-check banned word | Evidence |
|---|---|---|---|---|---|
| SoundBuddy | embedded moderation_core | FAIL-CLOSED (reject prompt) | FAIL-OPEN (allow, log TECHNICAL_FAILURE) | BLOCK | main.py:1005-1007, adapters/openai.py:140-152 |
| ImageBuddy | embedded moderation_core | FAIL-OPEN (allow) | FAIL-OPEN (allow, log TECHNICAL_FAILURE) | BLOCK | moderation.py:259-285 |
| FuzzyCode | none | — | — | — | no import of ModerationExecutor |
| SpriteBuddy | none (prompts go via ImageBuddy for content gen) | — | — | — | grep: no moderation_core import |
| ModerationBuddy | HTTP admin-only endpoint | n/a (not on user path) | FAIL-OPEN in adapter | BLOCK | main.py:689-837 |
Operational consequence. If OpenAI moderation goes down: SoundBuddy rejects new prompts (the adapter fails closed at init on cold start), ImageBuddy accepts everything, and FuzzyCode is unaffected (it never calls moderation). A single outage produces a different user experience per service, and any SLO or incident playbook needs to account for that per service.
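The table above reduces to one small sketch. Names and return shapes here are assumptions (the real logic is inline in each service's handlers, at the evidence lines cited above); what it captures is that only the init-failure branch differs between the two services:

```python
# Sketch of the §3 fail-mode policies. SoundBuddy fails closed only when the
# adapter never initialized (cold start); both services fail open on runtime
# adapter errors, recording TECHNICAL_FAILURE.

def handle_prompt(executor, prompt, fail_closed_on_init):
    if executor is None:
        # Adapter failed to initialize.
        if fail_closed_on_init:
            return {"status": "REJECTED"}            # SoundBuddy-style
        return {"status": "ALLOWED",                 # ImageBuddy-style
                "error_details": "TECHNICAL_FAILURE"}
    try:
        return executor(prompt)
    except Exception:
        # Both services fail open on runtime errors, logging the failure.
        return {"status": "ALLOWED", "error_details": "TECHNICAL_FAILURE"}
```

Unifying the two services would mean picking one value for `fail_closed_on_init`; today that choice is implicit and undocumented in both codebases.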

§4 Outcome taxonomy (what actually gets persisted)

ModerationOutcome is three independent booleans plus a free-text reason, not an enum. The persisted column (moderation_history.moderation_status) allows three values, but one of them is never written.

| Written value | When | Evidence |
|---|---|---|
| FLAGGED · categories = ["PRECHECK_BANNED_WORD"] | Pre-check matches local banned-word list; OpenAI never called | executor.py:42-58 |
| FLAGGED | OpenAI adapter returns flagged=true after violence-threshold post-processing | executor.py:73-90, adapters/openai.py:93-94 |
| ALLOWED | OpenAI adapter returns flagged=false | executor.py:73-90 |
| ALLOWED · error_details = "TECHNICAL_FAILURE" | Adapter raised; fail-open path | adapters/openai.py:140-152 |
| ERROR (dead code) | Schema CHECK allows it but no emitter writes it | 0002_moderation_history_local.sql:43 |
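The taxonomy implies a very small mapping from outcome to persisted status. A sketch under assumed field names (the real dataclass may name things differently):

```python
from dataclasses import dataclass, field

# Field names are assumptions. What matters: independent booleans plus free
# text, not an enum, and no code path ever produces the schema's ERROR value.

@dataclass
class ModerationOutcome:
    flagged: bool
    technical_failure: bool = False
    needs_review: bool = False   # exists on the dataclass, never set true anywhere
    categories: list = field(default_factory=list)

def persisted_status(outcome: ModerationOutcome) -> str:
    """Value written to moderation_history.moderation_status."""
    if outcome.flagged:
        return "FLAGGED"
    # The fail-open TECHNICAL_FAILURE path still persists as ALLOWED
    # (with error_details), which is why ERROR is dead code.
    return "ALLOWED"
```

Collapsing the booleans into an explicit status enum at write time would make the dead ERROR value either reachable or removable; today the ambiguity lives in the schema.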

§5 Gaps — verified, not speculative

No decision state machine. Moderation is a one-shot gate: a prompt arrives, an outcome is emitted, a row is written. There is no review queue (the needs_review flag exists on the dataclass but is never set to true in any code path), no appeal path, and no re-evaluation.
No moderation cache. Every prompt is re-evaluated. A repeat prompt pays the OpenAI round-trip each time.
moderation_core/ is vendored into 5 repos via install_update_moderation.sh with no version pinning. Drift between copies is expected over time. Any bug fix needs to be applied in 5 places.
FuzzyCode publishes with no content moderation. The HTML attestation path covers PII leakage (good), but user-supplied text visible on a published page — banner text, image prompts that got baked into HTML — is not screened for policy violations at publish time. Only prompts to the generative services see moderation.
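The missing moderation cache is straightforward to sketch. This is a hypothetical fix, not code that exists in any repo; it memoizes outcomes by prompt hash so a repeat prompt skips the OpenAI round-trip:

```python
import hashlib

class ModerationCache:
    """Hypothetical memoization layer over the expensive moderation call.

    Today every prompt is re-evaluated; this keys outcomes by a SHA-256 of
    the prompt text so identical prompts pay the round-trip once.
    """

    def __init__(self, moderate):
        self._moderate = moderate   # the underlying moderation callable
        self._outcomes = {}

    def check(self, prompt: str):
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key not in self._outcomes:
            self._outcomes[key] = self._moderate(prompt)  # miss: one round-trip
        return self._outcomes[key]
```

Caching a verdict assumes the banned-word list and model thresholds are stable; any real deployment would need invalidation when either changes, plus a TTL so policy updates eventually reach previously seen prompts.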

§6 Verification pointers