Closes structural weakness #4 from the audit (single global key, no rotation,
no KMS path). Customer secrets now use envelope encryption with a real
rotation story.
Model:
KEK — Key Encryption Key, 32 bytes from env (SECRETS_ENCRYPTION_KEY). Never
stored in the DB. Root of trust.
DEK — Data Encryption Key, 32 random bytes we generate, stored in the new
encryption_keys table *wrapped* (AES-256-GCM encrypted) with the KEK.
Secrets are encrypted with the DEK.
Schema:
- encryption_keys (version, wrappedDek, active, rotatedBy, createdAt, retiredAt)
- secrets.keyId — which DEK encrypted this row. NULL = legacy (KEK-direct,
pre-envelope); decryptSecret handles both and the first rotation migrates
legacy rows onto a DEK.
crypto.ts (full rewrite):
- ensureActiveKey() — boot-time, loads keys + creates v1 if none. Fail-closed:
index.ts process.exit(1) if it throws — the API will not serve if encryption
can't initialize.
- encryptSecret() — encrypts with the active DEK, returns { value, keyId }.
- decryptSecret(value, keyId) — DEK path or legacy KEK-direct path.
- rotateKeys() — mints a fresh DEK, re-encrypts EVERY secret under it inside a
single transaction (decrypt-old / encrypt-new per row), retires the old key,
activates the new one. A partial failure is recoverable because every row
carries its own keyId.
- encryptionStatus() — active version, key history, secret + legacy counts.
Admin:
- GET /v1/admin/encryption — status
- POST /v1/admin/encryption/rotate — triggers rotateKeys, audit-logged as
admin.encryption.rotate with { newVersion, reEncrypted }.
- /admin/encryption page — active-key/secret/legacy cards, Rotate button with
confirm, key-history table, plain-English how-it-works. Added to admin nav.
Verified end-to-end:
- boot → encryption_keys v1 active, '[crypto] envelope encryption ready'
- created a server with secret MY_API_KEY → stored ciphertext, keyId = v1
- POST rotate → { newVersion: 2, reEncrypted: 1 }; ciphertext changed, keyId
now v2, v1 retired, v2 active. The decrypt-then-reencrypt round-trip
succeeded (rotation throws otherwise) — the secret is provably recoverable.
- admin UI renders the status + history correctly.
Deferred, named honestly (not built this iteration):
- worker reads secrets from the DB instead of the BullMQ job-data plaintext
copy — would also remove plaintext secrets from Redis. Separate change with
its own risk surface on the iterate/fork flows.
- per-server secret-value rotation UI
- audit_log hash-chaining (tamper-evidence)
- rate limiting on auth endpoints
Sovereign-audit follow-up. The audit's finding pass missed this: every
Iterate (version > 1) ran allocatePort -> a NEW port and deployContainer -> a
NEW container, then pointed the DB row at it — and never stopped the old
container. The previous version kept running forever, holding a host port,
with the old secrets baked into its env, untracked (its containerId was
overwritten in the DB by deployContainer). Same bug class as API-SERVERS-001
but on the iterate path.
Fix: the worker captures the server's current containerId before the build
mutates the row, and after the new container is confirmed live + the DB
updated, it stops the old one. This also makes the 'rolling deploy' the UI
promises actually true — the old version stays up until the new one is live,
then is retired.
deploy.ts stopContainer now returns { ok, detail } (was void) so the worker
can log the outcome.
Verified: generator typecheck clean.
Full reasoning-based audit of all 10 zones. 11 findings, all confirmed real,
zero false positives. 5 fixed now, 6 deferred to a justified backlog.
API-SERVERS-001 (HIGH) — DELETE /v1/servers/:id orphaned the container
The route deleted the DB row but never stopped the Docker container — it
kept running forever on its host port, still serving traffic with the
user's secrets baked into its env. The takedown path got stopContainer in
an earlier commit; this sibling path was missed. DELETE now tears the
container down first. Verified: deleted 'gfgfg' — container 23e0c55c gone,
:4110 connection-refused after.
INFRA-001 (HIGH) — SECRETS_ENCRYPTION_KEY zero-default usable in production
The AES-256-GCM key defaults to 64 zeros and passes the min(64) check. A
prod deploy that forgot to set it booted silently with every secret
encrypted under a public key. config.ts now throws on boot when
NODE_ENV=production and the key is still the placeholder. Verified: prod
boot with the zero key is REFUSED.
API-SERVERS-002 (MEDIUM) — WS build stream had no authorization
GET /v1/builds/:id/stream streamed build logs with no auth, while its REST
twin checks orgId. Now authenticates from the session cookie and rejects
builds outside the caller's org. Verified: no cookie -> 'unauthorized';
cross-org build -> 'not_found'; own build -> streams (no regression).
OAUTH-001 (MEDIUM) — authorization code consumption was not atomic
The 'already used?' check and the 'mark used' write were separate
statements — two requests racing the same code could both mint tokens.
Now a conditional UPDATE ... WHERE consumed_at IS NULL RETURNING; the
loser of the race gets zero rows and invalid_grant.
OAUTH-002 (MEDIUM) — 'plain' PKCE accepted, contradicting AS metadata
The AS metadata advertises code_challenge_methods_supported: ['S256'] but
/oauth/authorize accepted 'plain'. Authorize is now z.literal('S256') and
pkceVerify dropped the plain branch. Verified: authorize with plain -> 400.
Deferred to backlog (documented in TEMPLATE_SECURITY_AUDIT.md is template-only;
this audit's findings are in the commit + certification):
GENERATOR-001 — secrets via docker -e (visible in docker inspect); needs
--env-file rework
RUNNER-001 — generated containers run as root; needs USER node + build
re-test
AUTH-001 — no rate limit on magic-link / oauth register; needs
@fastify/rate-limit
GENERATOR-002— allocatePort check/bind race; low, self-heals on rebuild
AUTH-002 — expired magic_links/sessions/oauth rows never purged; needs
a cron
FEATURES-001 — tool-call metering not wired (metrics always 0); Sprint 4
by plan
Bug: forking a template POSTed /v1/servers with slug = trySlug(template.title),
a fixed value. If the user had any server with that slug already (e.g. forking
the same template twice, or a name collision), the create returned 409
slug_taken — and the fork wizard skips Step 1, so there was no slug field to
fix it with. The user was stuck (8 repeated 409s in the report).
Fix:
- On fork setup, after the fork call, GET /v1/servers and auto-unique the
default slug: echo-demo-template -> echo-demo-template-2 -> -3 ... against the
user's existing slugs. Lookup failure is non-fatal (slug field is editable).
- Fork step 2 now renders editable Name + Slug fields in the fork banner
('must be unique in your workspace' hint) — the normal wizard has these in
Step 1, which the fork flow skips, so they belong here.
- slug_taken build error now reads 'The slug "x" is already used by one of
your servers — change the Slug field above' instead of the raw code.
Note: the SES lockdown-install.js and content.js 'query' errors in the report
are browser extensions, not the app.
Verified: forked echo-demo-template (whose base slug was already taken) — slug
auto-filled echo-demo-template-2, build succeeded, container live on :4111,
template fork_count incremented to 2.
The logged-in user can now reach the marketplace and filter to their own
templates.
Dashboard nav:
- Added 'Marketplace' item (Overview · Servers · Marketplace · Audit · Settings).
/templates page — login-aware:
- Detects session via /v1/auth/me. Logged-in users get a 'Dashboard' + '+ New
server' header instead of 'Home' + 'Start building'.
- New [All templates | My templates] scope toggle, shown only when logged in.
- 'My templates' loads GET /v1/templates/mine and shows EVERY status the user
owns (public / hidden / draft / takedown) with a colored status badge on each
card — so a template you unshared doesn't appear to have vanished.
- Sort tabs (trending/top/newest) hide in 'mine' scope — meaningless for a
handful of own templates. Category filter + search still apply (client-side).
- Takedown cards link to the source server's Publish tab instead of the detail
route (which 410s); everything else opens the detail page.
Backend:
- GET /v1/templates/mine (requireAuth) — all own templates, any status,
registered before /:slug so the static route always wins the match.
- GET /v1/templates/:slug — now does an optional session check: the OWNER can
view their own hidden/draft template (so a 'My templates' card click never
dead-ends in a 404). takedown stays 410 for everyone, owner included — that's
an admin decision, not the owner's to reverse.
Detail page:
- Fork CTA is gated on status === 'public'. For a non-public template the owner
sees an amber 'not forkable — re-share from the Publish tab' notice plus a
'Manage in server' link, instead of a Fork button that would fail silently.
Verified:
- GET /v1/templates/mine → marco's 1 template; 401 without auth
- Owner GET of a hidden template → 200 status:hidden; anon → 404
- Dashboard nav shows Marketplace (screenshot)
- /templates 'My templates' toggle → only own template, public badge, sort tabs
hidden (screenshot)
Goal: maximize template volume without a dark pattern and without leaking data.
Wizard Done-page Share panel:
- 'Share as template in the marketplace (recommended)' checkbox, default ON,
rendered inline in the build-success flow where every user lands.
- Honest copy — corrected a draft that claimed 'only abstracted code pattern is
shared'. That is false: the FULL generated code becomes publicly viewable on
the template detail page (by design, for pre-fork audit). The panel now says:
'Your secrets stay private ... but your generated code becomes publicly
viewable so others can audit it before forking. Unshare anytime.'
- When checked: inline minimal form — short description (prefilled from the
spec), category select, optional per-secret credential hints. One 'Publish to
marketplace' click. Not auto-published silently — that would be a consent dark
pattern; one visible deliberate click keeps it clean.
- Forked servers don't show the panel (re-publishing a fork is an edge case).
Owner unshare/reshare:
- GET /v1/servers/:id/template — owner lookup, drives the Publish tab UI.
- PATCH /v1/templates/:slug/visibility { shared } — owner-only toggle between
public and hidden. 403 for non-owners, 409 if an admin took it down (owner
cannot resurrect an admin takedown). Audit-logged as template.unshare /
template.reshare.
- Server-detail Publish tab now detects an existing template and shows the
shared status (public/hidden/takedown badge), fork count, a marketplace link
and an Unshare/Re-share button — instead of the publish form.
Why this is safe to default ON:
- Secrets are architecturally bound to mcp_servers, never copied into templates.
Publish reads tools_schema + generated_code only; the secrets table is never
touched. Data leak is structurally impossible, not policy-dependent.
- Publish re-scans the generated code for banned patterns AND hardcoded
credentials (sovereign-audit hardening) before it can reach the marketplace.
- The user sees a visible, pre-ticked checkbox and reads one honest sentence
before publishing. Privacy-conscious users untick; everyone else contributes
volume. Informed consent, GDPR-clean.
Verified end-to-end via API:
GET server/:id/template -> null (unpublished)
POST /v1/templates -> published, slug share-test-server
GET server/:id/template -> status public
PATCH visibility {shared:false} -> hidden, drops out of public list
PATCH visibility {shared:true} -> public again
UI: Publish tab renders the shared-status panel with View + Unshare (screenshot
confirmed).
Also: hero badge date set to 2026-05-20. Changed 'MCP spec 2025-11-25' to
'updated 2026-05-20' — claiming an MCP spec dated today would be factually wrong
(no such spec release exists); 'updated' is accurate and gives the requested
fresh date. The real spec date is still cited correctly in /docs.
P0 — three critical issues found by tracing every attack vector on the template
publish + fork + render path. All three fixed and verified with attack tests.
FIX A — Takedown actually stops malicious containers
PATCH /v1/admin/templates with status=takedown previously only updated
mcp_servers.status to 'paused' in the DB. The Docker container kept running
and serving traffic on its allocated port — takedown was cosmetic. Now the
endpoint enumerates every fork's container, calls 'docker rm -f' on each,
clears container_id/public_url/host_port in the DB, and returns the
stoppedContainers count. New apps/api/src/lib/docker.ts owns the stop logic.
Verified: takedown stopped container f5632962, port 4109 connection refused.
FIX B — Reject specEdit on fork
A hand-crafted POST /v1/servers with {templateId, previewId, specEdit} would
enter the spec-edit branch, merge edits into the cached spec, but the worker
reads the pre-built template code (separate cache key), ignoring the merged
spec entirely. User thinks they changed something; deployed container behaves
as the original. Now returns 400 spec_edit_forbidden_on_fork with an explainer
pointing to the Iterate flow.
FIX C — templateId validation via Redis fork-ref
templateId on POST /v1/servers was user-controlled and unvalidated:
fork_count of any template could be pumped, mcp_servers got garbage
template_id rows, takedown cascade would miss the bogus rows. Fork endpoint
now writes a Redis key fork-ref:<previewId> -> templateId (5min TTL).
Server-create requires the ref to exist AND match the submitted templateId.
Verified attack: fake templateId without fork-ref returns 410 fork_ref_expired.
DEFENSE-IN-DEPTH — Hardened static checks
Banned patterns (added):
Function\s*\(['"`] — Function('code')() form, no 'new' needed
\bimport\s*\( — dynamic import escapes bundle scope
\bsetTimeout\s*\(['"`] — setTimeout('code', ms) eval form
\bsetInterval\s*\(['"`]
\bfs\s*\.\s*(unlink|rmdir|rm)\b
\bprocess\s*\.\s*kill\b
you are now in (developer|jailbreak|dan) mode — extra jailbreak markers
Hardcoded-credential patterns (new — scanForLeakedSecrets):
sk-ant-(api|sid)… — Anthropic
sk-… — OpenAI
sk_(live|test)_… — Stripe
ghp_… — GitHub PAT
github_pat_… — GitHub fine-grained
xox[bpoasr]-… — Slack
AKIA[0-9A-Z]{16} — AWS
-----BEGIN…PRIVATE KEY----- — RSA / SSH / GPG
Triggered when a publisher pasted their key into the prompt and Claude
embedded it literally in the generated code. Publish-blocking.
Verified attack: smuggled 'Function("return 1")' into a build's
generated_code, attempted publish → 422 publish_blocked.
Slug regex tightened — fork + detail routes now require
^[a-z0-9][a-z0-9-]{0,63}$ (was loose min(1).max(64) — letting through
'../admin', long strings, mixed case).
UI warning — Publish-as-template form now shows an amber callout listing
what's scanned and explicitly stating egress allowlisting is roadmap, not
enforced today (was misleading: the field was collected, never enforced).
TEMPLATE_SECURITY_AUDIT.md added — documents all 20 audited vectors with
severity, status, and rationale for what's deferred.
UI polish
globals.css — select/input/textarea/button get color-scheme: dark + custom
chevron + option styling so Chrome's native popdown stops rendering as a
white OS-themed widget on dark pages. The /templates category dropdown was
the immediate trigger; same rule applies system-wide.
What this enables:
- A user builds an MCP server. If others would benefit, they click 'Publish as
template' on their server detail page. The spec + pre-rendered TypeScript
snapshot is preserved.
- Visitors browse /templates, filter by category, sort by trending/top/newest.
Each template card shows fork count + active deployment count as natural
manipulation-resistant popularity signal.
- /templates/[slug] shows the full plan: tool list with input schemas,
required-credential explanations (with 'how to get one' deep links), and a
collapsible code preview so users can audit before forking.
- Fork is one click → /servers/new?template=slug. The wizard skips Step 1 and
pre-fills Step 2 with the template's parsed spec. Forker only fills in their
own credentials. mcp_servers.template_id is recorded; template.fork_count is
bumped atomically. Each fork gets its own isolated container with its own
port, its own AES-256 secrets — the template author has zero visibility into
the fork's traffic or data.
- Admin /admin/templates moderation: verify quality templates (shows shield
badge in marketplace), hide low-effort ones, takedown anything malicious.
Takedowns cascade-pause every fork container — owners must re-deploy.
Why template+fork instead of shared-container:
- Shared containers would mean the publisher's quota + their secrets + their
logs are exposed to forkers. Bad ergonomics, bad security, bad ownership.
- Templates/forks decouple the spec (shared, vouched-for) from the runtime
(isolated per user). Network-effect moat without the trust collapse.
Why no 5-star voting in v1:
- Manipulation-anfällig, empty lists without adoption. We use fork count +
active deploys + verified badge. Trending algorithm:
score = (activeDeploys * 3 + forks) / sqrt(ageDays + 1)
Real signal, no brigading attack surface.
Backend:
- New schema: templates table (16 cols incl. tools_schema, generated_code,
required_secrets, allowedDomains, status enum, verified, fork_count).
- mcp_servers.template_id FK + idx for fork lookup.
- @bmm/types: SpecEdit unchanged, CreateServerInput accepts optional templateId.
- preview-cache.ts: new cachePrebuiltCode/loadPrebuiltCode for storing the
template's full rendered server.ts alongside the spec. Generator worker
detects this and skips the render step — uses the audited pre-built code
verbatim. Banned-pattern re-scan at publish time.
- routes/templates.ts: 5 public/auth routes + 2 admin routes. Banned-pattern
re-scan before publish. Slug auto-uniqued. forkCount atomic-increment via
SQL.
UI:
- /templates marketplace with trending/top/newest tabs, category filter, search.
Cards show forks + live count + author + verified badge.
- /templates/[slug] full detail with tools, credentials-with-hints, expandable
code preview, fork CTA, ownership + stats sidebar, 'forking is safe' explainer.
- /servers/new?template=slug — wizard auto-jumps to Step 2 with template spec
pre-filled, fork banner at top with link back to template.
- /servers/[id] new Publish tab with title, category, descriptions, per-secret
hint fields (description + howToGetUrl per UPPER_SNAKE_CASE key).
- /admin/templates moderation with verify/hide/takedown actions.
- Marketing nav now includes /templates.
Verified end-to-end:
- Published Echo Demo Template from marco@test.local's live server
- Marketplace lists it correctly with stats
- Detail page renders with all sections
- Fork CTA navigates to wizard with ?template= param
- Wizard skips Step 1, shows fork banner, pre-fills spec
- Build succeeds in ~10s (cached spec + prebuilt code path skips Claude AND
render), container live on :4109 with proper OAuth 401 → token → 200 flow
- DB: templates.fork_count=1, activeDeployments=1, mcp_servers.template_id
populated on the fork
- /admin/templates shows the new template with verify/hide/takedown controls
The indigo filled square didn't match the in-page Logo component. The nav-bar
logo is a monochrome outlined rounded square + M path, currentColor on the
dark page background.
Favicon now follows the same design: outlined rect + M, stroke adapts:
- light browser tabs → #0A0A0B (near-black)
- dark browser tabs → #FAFAFA
Apple-icon stays as the indigo filled tile — iOS home-screen icons need solid
backgrounds, monochrome outlines disappear there.
app/icon.svg — 32px vector with the brand 'M' (logo-matching) on #6366F1 indigo
rounded square. Sharp at every browser size.
app/apple-icon.tsx — 180x180 PNG rendered at request time via next/og
ImageResponse with the same design scaled up. Covers iOS home-screen + iPadOS.
Next 15 auto-discovers both via the file-based metadata convention and injects:
<link rel='icon' href='/icon.svg' type='image/svg+xml' sizes='any'>
<link rel='apple-touch-icon' href='/apple-icon' type='image/png' sizes='180x180'>
Verified: both URLs return 200, both link tags appear in the rendered HTML head,
brand matches the in-page Logo component.
The wizard's confirm step is no longer read-only. Users can refine what Claude
parsed before committing to a build.
Backend:
- @bmm/types adds SpecEdit (tools[name,description,inputSchema] + requiredSecrets);
CreateServerInput accepts an optional specEdit alongside previewId.
- Servers create endpoint: when specEdit is provided, loads cached spec from Redis,
index-merges the edits in (keeping LLM-generated implementations untouched),
re-validates via GeneratorSpec, re-runs the banned-pattern scan, overwrites the
Redis cache so the worker reads the user's version. Refuses with
preview_expired/tool_count_mismatch/banned_pattern on safety failures.
- New overwriteSpec() helper in preview-cache.
Frontend:
- Step 2 renders each tool as an editable card: name input, description textarea,
JSON schema textarea with parse-on-keystroke validation (inline error if invalid).
- Required secrets list is editable: keys via uppercase-snake-case input, +Add /
remove buttons, secret values kept in sync when keys are renamed.
- Reset-to-AI-suggestion button appears when edits are dirty.
- Pre-submit validation: schema must parse, secret keys must match UPPER_SNAKE_CASE,
required secret values must be provided.
- Warning copy: 'Renaming parameters may require an Iterate after build — the
existing impl references the original names.'
Verified end-to-end via browser smoke test: edited description + renamed tool
landed correctly in mcp_servers.tools_schema and in the live container at :4107.
Implementation field preserved from the original cached spec.
Sprint 3.5: close every dead link and replace the single-step wizard with the
spec-mandated 3-step flow.
Wizard:
- Step 1 collects prompt + name + slug, calls /v1/servers/preview.
- Step 2 renders parsed tools (name, description, input schema as copyable JSON)
+ a credential field per requiredSecret Claude actually identified. Self-contained
servers see 'No credentials needed' instead of generic Notion placeholders.
- Step 3 streams the live build over WebSocket and shows install snippets.
New dashboard pages:
- /settings — org, plan/usage, members table, API keys + billing stubs (Sprint 4),
encryption status. Reads /v1/me/org.
- /audit — filterable table over /v1/audit with action pills, resource refs, IP,
metadata JSON.
Docs site (/docs + 6 sub-pages):
- Sticky 240px sidebar, max-w-prose article column, shared DocsTitle/H2/Code primitives.
- Quickstart, MCP concepts, OAuth 2.1 flow (full walkthrough with curl), Authoring
tools, Self-hosting, API reference, FAQ.
Marketing pages:
- /changelog with tagged release timeline.
- /security with 8 pillars + disclosure.
- /privacy with GDPR-aware sections.
- /terms (10 clauses).
- /pricing full page (nav now points here instead of /#pricing anchor).
- /status with live 10s probes against /api/health and /login.
Footer 'system status' badge now links to /status.
All 20 routes 200 OK in smoke crawl. Typecheck clean across packages.
- POST /v1/servers/preview runs Claude synchronously, validates output, caches spec
in Redis under preview:<id> with 5min TTL, returns previewId+spec+detectedSecrets.
- POST /v1/servers accepts optional previewId; worker reuses the cached spec if
the entry is still present, otherwise regenerates fresh. Skips the second
Claude round-trip (~30s saved on the demoable path).
- audit() helper writes auth.login, auth.logout, server.create, server.iterate,
server.delete to audit_log with ip, metadata, resourceId.
- GET /v1/me/org returns organization + members list for the settings page.
- GET /v1/audit?limit=&action=&resourceType= returns scoped audit entries.
- Bump @modelcontextprotocol/sdk from 1.0.4 to 1.29.0 in runner-template
(1.0.4 has no McpServer or StreamableHTTPServerTransport — file not found at runtime).
- Bump zod to 3.25.76 across workspace to satisfy modern SDK peer dep.
- Split OAUTH_ISSUER (canonical, host-reachable) from CONTROL_PLANE_URL (container-reachable for JWKS).
Runner verifies iss against OAUTH_ISSUER; fetches JWKS from CONTROL_PLANE_URL.
Both API and runner now agree on http://localhost:4000/oauth as the issuer in dev.
- Move postgres host port 5432 to 5440, redis 6379 to 6390 to avoid collisions with
native installs on the dev machine.
- Move web from 3000 to 3001 (3000 occupied by Gitea on dev machine).
- Drop pino-pretty transport from API to avoid runtime require of an unbundled dep.
- Cast build_logs.level (varchar) to BuildEvent's literal union in WS replay path.
- Remove unused reqBase helper in oauth.ts.