Content Moderation Pipeline
What moderation actually runs on user-generated prompts and assets — and why doesn't the published architecture match the code?
§1 Claim vs reality
The existing architecture docs describe a shared moderation service. The running system doesn't use it. This diagram starts by surfacing that gap because every downstream conclusion depends on it.
What docs claim
- repos.yaml:63-67: "Active for SoundBuddy; planned expansion to FuzzyCode"
- docs/architecture/SERVICE_INTEGRATIONS.md:24-25: "SoundBuddy → ModerationBuddy POST /moderate"
Reading the docs, you'd expect an HTTP fan-out where each content-producing service calls ModerationBuddy before proceeding.
What code does
No service calls ModerationBuddy's /moderate endpoint at runtime.
Instead, each content service embeds the moderation_core/ library directly and calls OpenAI omni-moderation-latest in-process.
ModerationBuddy's HTTP endpoint exists, but is _require_admin-gated and only invoked by a batch offline tool (FuzzycodePagesFlaskServer/moderation_utility.py:38) for analyzing already-published pages.
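The gap can be sketched in a few lines. This is a minimal illustration, not the real moderation_core API: the names `ModerationExecutor`, `evaluate`, and `FakeAdapter.classify` are assumptions made for the sketch; only the overall shape (in-process library call, no HTTP hop to ModerationBuddy on the user path) comes from the code described above.

```python
class FakeAdapter:
    """Stand-in for the OpenAI omni-moderation-latest adapter."""

    def classify(self, text: str) -> dict:
        # The real adapter makes an HTTPS call; this fake flags one word.
        return {"flagged": "forbidden" in text, "categories": []}


class ModerationExecutor:
    """Hypothetical in-process executor, as embedded by SoundBuddy/ImageBuddy."""

    def __init__(self, adapter):
        self.adapter = adapter

    def evaluate(self, prompt: str) -> str:
        # No HTTP fan-out to a shared service: the adapter runs in-process.
        result = self.adapter.classify(prompt)
        return "FLAGGED" if result["flagged"] else "ALLOWED"


executor = ModerationExecutor(FakeAdapter())
print(executor.evaluate("a friendly prompt"))   # ALLOWED
print(executor.evaluate("a forbidden prompt"))  # FLAGGED
```

The documented design would replace the `self.adapter.classify(...)` call with a `POST /moderate` to ModerationBuddy; nothing on the user path does that today.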
§2 The actual runtime moderation topology
```mermaid
flowchart LR
    U[User]
    subgraph FCG[" "]
        FC["FuzzyCode<br/>⚠ NO content moderation<br/>only PII firewall"]
    end
    subgraph SB[" "]
        SBsvc[SoundBuddy]
        SBlib[("moderation_core<br/>embedded library")]
        SBpre["pre-check<br/>banned words"]
    end
    subgraph IB[" "]
        IBsvc[ImageBuddy]
        IBlib[("moderation_core<br/>embedded library")]
        IBpre["pre-check<br/>banned words"]
    end
    OAI["OpenAI<br/>omni-moderation-latest"]
    MB["ModerationBuddy<br/>admin-only HTTP<br/>NOT on user path"]
    OFFLINE["Pages batch job<br/>moderation_utility.py"]
    DB[("SoundBuddy<br/>moderation_history<br/>local DB")]

    U -->|/send, /sound_effect| SBsvc
    SBsvc --> SBpre
    SBpre -->|matched → FLAG| DB
    SBpre -->|clean| SBlib
    SBlib -->|HTTPS| OAI
    OAI -->|outcome| SBlib
    SBlib --> DB
    U -->|/image_*| IBsvc
    IBsvc --> IBpre
    IBpre -->|matched → FLAG| IBlib
    IBpre -->|clean| IBlib
    IBlib -->|HTTPS| OAI
    U -.->|no moderation| FC
    OFFLINE -.->|admin-key POST /moderate| MB
    MB -->|HTTPS| OAI

    class SBsvc,IBsvc,FC,MB svc
    class SBlib,IBlib,DB lib
    class OAI ext
    class SBpre,IBpre pre
    class OFFLINE absent
    class U user
```

Legend: cylinder nodes = moderation_core/ library + its local DB

§3 Per-service fail-mode
Same library, different fail-modes. The divergence is silent — neither service documents that it behaves differently from its sibling. An outage of OpenAI moderation therefore has different user-facing effects depending on which service was called.
| Service | Integration | Adapter init failure | Adapter runtime error | Pre-check banned word | Evidence |
|---|---|---|---|---|---|
| SoundBuddy | embedded moderation_core | FAIL-CLOSED (reject prompt) | FAIL-OPEN (allow, log TECHNICAL_FAILURE) | BLOCK | main.py:1005-1007, adapters/openai.py:140-152 |
| ImageBuddy | embedded moderation_core | FAIL-OPEN (allow) | FAIL-OPEN (allow, log TECHNICAL_FAILURE) | BLOCK | moderation.py:259-285 |
| FuzzyCode | none | — | — | — | no import of ModerationExecutor |
| SpriteBuddy | none (prompts go via ImageBuddy for content gen) | — | — | — | grep: no moderation_core import |
| ModerationBuddy HTTP | admin-only endpoint | n/a (not on user path) | FAIL-OPEN in adapter | BLOCK | main.py:689-837 |
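The two fail-mode policies in the table can be contrasted in a short sketch. The function names and return strings here are illustrative, not the real wiring in main.py or moderation.py; only the policy split (SoundBuddy fail-closed on init, both fail-open at runtime) comes from the evidence above.

```python
class AdapterRuntimeError(Exception):
    """Stand-in for any exception the moderation adapter raises mid-call."""


def soundbuddy_style(init_ok: bool, call) -> str:
    """FAIL-CLOSED on init failure, FAIL-OPEN on runtime error."""
    if not init_ok:
        return "REJECT"  # adapter never came up: block the prompt
    try:
        return call()
    except AdapterRuntimeError:
        return "ALLOW_TECHNICAL_FAILURE"  # allow, but log the failure


def imagebuddy_style(init_ok: bool, call) -> str:
    """FAIL-OPEN on both init failure and runtime error."""
    if not init_ok:
        return "ALLOW"
    try:
        return call()
    except AdapterRuntimeError:
        return "ALLOW_TECHNICAL_FAILURE"


def outage():
    # Simulate an OpenAI moderation outage at call time.
    raise AdapterRuntimeError()


# Same failure, different user-facing result:
print(soundbuddy_style(False, outage))  # REJECT
print(imagebuddy_style(False, outage))  # ALLOW
```

Note that once the adapter has initialized, a runtime outage is handled identically by both services; the divergence is confined to the init-failure path.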
§4 Outcome taxonomy (what actually gets persisted)
ModerationOutcome is three independent booleans plus a free-text reason — not an enum. The persisted column (moderation_history.moderation_status) allows three values, but one of them is never written.
| Written value | When | Evidence |
|---|---|---|
| FLAGGED · categories = ["PRECHECK_BANNED_WORD"] | Pre-check matches local banned-word list; OpenAI never called | executor.py:42-58 |
| FLAGGED | OpenAI adapter returns flagged=true after violence-threshold post-processing | executor.py:73-90, adapters/openai.py:93-94 |
| ALLOWED | OpenAI adapter returns flagged=false | executor.py:73-90 |
| ALLOWED · error_details = "TECHNICAL_FAILURE" | Adapter raised; fail-open path | adapters/openai.py:140-152 |
| ERROR (dead code) | Schema CHECK allows it but no emitter writes it | 0002_moderation_history_local.sql:43 |
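The outcome shape and its mapping to the persisted column can be sketched as follows. The field names are assumptions for illustration (the source only states "three independent booleans + free-text reason"); the FLAGGED/ALLOWED mapping and the unreachable ERROR value mirror the table above.

```python
from dataclasses import dataclass, field


@dataclass
class ModerationOutcome:
    """Hypothetical shape: three independent booleans, not an enum."""
    flagged: bool = False
    technical_failure: bool = False
    needs_review: bool = False  # exists, but no code path ever sets it true
    reason: str = ""
    categories: list = field(default_factory=list)


def persisted_status(outcome: ModerationOutcome) -> str:
    """moderation_status value actually written: FLAGGED or ALLOWED.

    The schema's ERROR value has no emitter, so even a technical
    failure persists as ALLOWED (with error_details recorded).
    """
    return "FLAGGED" if outcome.flagged else "ALLOWED"
```

This makes the dead-code row concrete: because `persisted_status` (or its real-world equivalent) only ever returns two of the three CHECK-allowed values, `ERROR` can never appear in the table.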
§5 Gaps — verified, not speculative
- The needs_review flag exists on the dataclass but is never set to true in any code path. There is no appeal path and no re-evaluation.
- moderation_core/ is vendored into 5 repos via install_update_moderation.sh with no version pinning. Drift between the copies is expected over time, and any bug fix must be applied in 5 places.
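Vendoring drift of the kind described above is mechanically detectable: hash each vendored copy of the library tree and compare. This is a hypothetical helper, not part of the existing tooling; the repo layout it assumes (a moderation_core/ directory at each repo root) is an illustration.

```python
import hashlib
from pathlib import Path


def tree_digest(root: Path) -> str:
    """Stable digest over every file under root (relative path + contents)."""
    h = hashlib.sha256()
    for p in sorted(root.rglob("*")):
        if p.is_file():
            h.update(str(p.relative_to(root)).encode())
            h.update(p.read_bytes())
    return h.hexdigest()


def find_drift(repos: list, subdir: str = "moderation_core") -> dict:
    """Map each repo name to the digest of its vendored copy.

    More than one distinct digest value means the copies have drifted.
    """
    return {
        r.name: tree_digest(r / subdir)
        for r in repos
        if (r / subdir).is_dir()
    }
```

Run in CI after install_update_moderation.sh, a check like `len(set(find_drift(repos).values())) == 1` would catch a copy that missed an update.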
§6 Verification pointers
- SoundBuddy entry into moderation: SoundBuddyFastAPI2/main.py:1001-1043, 1385, 1594
- ImageBuddy entry into moderation: ImageBuddyRobustFastAPI/moderation.py:200-308
- Executor: moderation_core/pipelines/executor.py:34-92
- Pre-check: moderation_core/precheck.py:10-14
- OpenAI adapter (shared by all embedders): moderation_core/adapters/openai.py:28-87, 93-94, 140-152
- ModerationBuddy HTTP endpoint (admin-only): ModerationBuddyFastAPI/main.py:689-837, 703, 883
- Offline batch use of the HTTP endpoint: FuzzycodePagesFlaskServer/moderation_utility.py:36-50
- DB schema (live callers): SoundBuddyFastAPI2/db/migrations/0002_moderation_history_local.sql:43
- Vendoring script: scripts/install_update_moderation.sh (copies moderation_core/ into each target repo)
- FuzzyCode absence proof: `grep -rn "from moderation_core" FuzzyCode/` → only pii_firewall imports