Commit Graph

24 Commits

Author SHA1 Message Date
Marco Sadjadi
147ba69968 fix(runner): alias params/input to args so tool implementations don't ReferenceError
Some checks failed
Deploy to Production / deploy (push) Has been cancelled
Auth chain finally landed but tool calls crashed in the wetter server
with "Error: params is not defined". The MCP SDK passes the validated
tool args as a single parameter; our template names that parameter
`args` but the model frequently writes `params.location` / `input.x`
because that's how OpenAPI and JSON-RPC reference docs read.

Two-sided fix:
- render.ts wraps every implementation with `const params = args; const
  input = args;` inside the try block. Whichever alias the model
  picked, the variable resolves to the same validated object.
- SYSTEM_PROMPT now states the variable name EXPLICITLY ("variable
  named EXACTLY `args`, e.g. args.location") so new generations stop
  drifting on that detail.

Existing wetter runner needs a rebuild to pick up the alias shim.
2026-05-28 21:39:11 +02:00
Marco Sadjadi
0c6d738a6b feat(preview): SSE-streamed generation, no CF 100s edge cap
All checks were successful
Deploy to Production / deploy (push) Successful in 1m27s
Architectural fix for "spec_too_large" / preview_timeout — the sync
endpoint had to fit the whole model run into Cloudflare's ~100s edge
window, which made the system fragile against any prompt that produced
a verbose spec. The new streaming path pipes Anthropic's token deltas
as Server-Sent Events; every chunk resets CF's idle timer and a 15s
keepalive comment guarantees activity even during slow first-token
windows.

@bmm/llm: new streamSpecFromAnthropic() exposes the SDK's .stream()
flow with the same typed-error contract as generateSpec — same
SpecTruncatedError / SpecValidationError / SpecTimeoutError raised from
the relevant moment.

API: POST /v1/servers/preview/stream returns text/event-stream with
events 'text' (deltas), 'spec' (final success payload, same shape as
the sync endpoint), 'error' (typed). Anthropic-only — GLM/hobby falls
back to the sync route via 409 streaming_unavailable.

Frontend: apiSseStream() handles the POST + ReadableStream + SSE
parser. The wizard's analyze() prefers the stream and only uses the
sync endpoint on the explicit 409 fallback.

nginx (api.buildmymcpserver.com): the /v1/builds/ location block (which
already had proxy_buffering off + 600s read timeout for the WS build
stream) now also matches /v1/servers/preview/stream so the SSE
response isn't buffered.
2026-05-28 21:11:05 +02:00
Marco Sadjadi
b930a454e8 fix(llm): tighter system prompt + 12288 max_tokens for paid tiers
All checks were successful
Deploy to Production / deploy (push) Successful in 1m33s
Sonnet 4.6 was still hitting max_tokens on ambitious prompts like
"WorldWeather MCP for any location" because the implementation bodies
ballooned with defensive scaffolding. Two changes:

1. SYSTEM_PROMPT now imposes hard limits the model can self-enforce:
   - at most 6 tools (combine related capabilities with a mode param)
   - implementation body <= 40 lines, no comments, no overengineering
   - descriptions <= 100 chars
   These keep a typical preview under ~7k output tokens.

2. team/enterprise maxTokens 8192 -> 12288. At ~130 tok/s that fits in
   ~94s, still under Cloudflare's 100s edge cap. Hobby (GLM) and pro
   (Haiku) keep their existing limits — they were not hitting the
   ceiling.

SpecTruncatedError still fires + surfaces 422 spec_too_large when even
12288 isn't enough, so the user gets actionable feedback instead of an
opaque zod error.
2026-05-28 21:01:50 +02:00
Marco Sadjadi
d2b19a5439 fix(preview): max_tokens 4096→8192 + detect truncation explicitly
All checks were successful
Deploy to Production / deploy (push) Successful in 1m24s
Root cause of repeat 422s: 4096 was too tight for ambitious prompts
(Marco's research-assistant prompt produces ~12kB of JSON before the
model gets cut off mid-string). The error then surfaced as an opaque
"Unterminated string in JSON" zod failure instead of pointing the user
at the real problem.

Two fixes:
- maxTokens back to 8192 (the original) for all Claude tiers, 4096 for
  GLM. Timeouts bumped to 95s — Sonnet 4.6 at ~130 tok/s does 8192 in
  ~63s, ~30s headroom for cold starts, still under Cloudflare's 100s
  edge cap.
- Detect stop_reason === 'max_tokens' on the Anthropic response BEFORE
  parsing and throw the new SpecTruncatedError. /preview catches it
  and returns 422 spec_too_large with a clear "split the prompt"
  message instead of leaking the zod parse failure.
2026-05-28 19:34:40 +02:00
Marco Sadjadi
979d1abfca feat(preview): log spec validation failures with raw output
All checks were successful
Deploy to Production / deploy (push) Successful in 1m25s
422s from /preview hid the actual reason: zod_message tells which field
was wrong and a 400-char preview of the model output reveals refusals
or non-JSON returns. Both stay in the api log only — never surfaced
to the client unchanged.
2026-05-28 19:19:57 +02:00
Marco Sadjadi
5a8e736113 fix(llm): preview timeout 60s→90s + maxTokens 8192→4096
All checks were successful
Deploy to Production / deploy (push) Successful in 1m21s
Enterprise plan was hitting SpecTimeoutError exactly at 60s because the
Sonnet 4.6 preview was budgeted for 8192 tokens at ~80 tok/s (≈102s
worst case) inside a 60s window. The frontend then rolled back to step
1 with no spec.

A real spec is small (<= ~10 tools, ~1.5–2.5k output tokens in practice)
so 4096 is plenty and lets even Sonnet finish in ~51s worst case. The
90s timeout buys headroom for cold starts while staying under
Cloudflare's 100s edge cap. Hobby/GLM bumped to 90s too — same
headroom argument.
2026-05-28 18:51:51 +02:00
Marco Sadjadi
3a05766f88 fix(oauth): allow generic RFC 7591 DCR + expand install snippets
All checks were successful
Deploy to Production / deploy (push) Successful in 1m28s
- /oauth/register: drop resource_required check, accept generic
  registrations (Claude Desktop omits resource in DCR body per spec).
  serverId stored as NULL; /authorize still enforces org-ownership
  + access-token aud claim still pinned to resource. Fixes Claude
  Desktop DCR failure (ofid_d7e39530c109fa7f).
- /oauth/authorize: skip strict server.id check when client.serverId
  is NULL (generic client); org check remains the security boundary.
- schema: oauth_clients.server_id no longer NOT NULL.
- migration 0002: ALTER COLUMN server_id DROP NOT NULL (already
  applied on prod).
- install-snippets: add Claude Code (CLI), VS Code, Codex, raw URL
  tabs. Claude Desktop now shows form-field values (Name / Remote MCP
  Server URL / OAuth Client ID / Secret) matching the new Custom
  Connector UI instead of the obsolete JSON config.
- types: InstallTarget enum extended.
- hero-video: clicking the audio toggle restarts the video from
  frame 0 so unmute aligns with the spoken opening.
- marketing: drop em-dashes from rendered copy.
2026-05-28 17:20:01 +02:00
Marco Sadjadi
8c6f04f034 feat: oauth refresh-token grant + per-runner subdomain TLS plumbing
All checks were successful
Deploy to Production / deploy (push) Successful in 52s
OAUTH REFRESH-TOKEN
- oauth_tokens.subject column added (migration applied to prod DB): stores
  the JWT sub claim from the original authorization so refreshes can
  re-mint with the same identity without re-walking the (consumed) code.
- Authorization-code branch now writes subject AND uses a 30-day
  expires_at for the row (was 1h — same as access token, which killed
  refresh after 1h).
- New refresh_token grant branch:
    * looks up token by refresh-hash + expiry
    * client_id must match, client_secret verified if confidential
    * RFC 8707: requested resource must equal stored resource
    * OAuth 2.1 rotation: atomic UPDATE WHERE old_hash → new access JWT,
      new refresh token, extended expiry; loser of a race sees invalid_grant
- Access TTL (1h) and refresh TTL (30d) extracted as constants.

Clients no longer have to re-authorize hourly. Closes Zb-001.

PER-RUNNER SUBDOMAIN TLS (Z1-002)
Code path:
- New MCP_DOMAIN env (e.g. "mcp.buildmymcpserver.com") + RUNNER_MAP_DIR
  (default /var/runner-map) in generator config.
- deployContainer: writes /var/runner-map/<slug>.conf with content
  "slug.MCP_DOMAIN port;" and computes publicUrl as
  https://<slug>.<MCP_DOMAIN>. Falls back to http://host:port when
  MCP_DOMAIN is unset (zero behaviour change until host is configured).
- stopContainer (both api/lib/docker.ts and generator/lib/deploy.ts) now
  accepts an optional slug arg and removes the map fragment. Callers
  (DELETE /v1/servers/:id, admin template takedown) updated.

Infra path (one-time host setup — Marco runs as root):
- scripts/setup-runner-tls.sh:
    1. nginx vhost matching *.mcp.buildmymcpserver.com via regex →
       reads slug→port from /opt/buildmymcpserver/runner-map.combined
    2. systemd inotify service watches the map dir, combines fragments
       on any change, reloads nginx
    3. installs inotify-tools if missing, idempotent
- Prereqs documented at top: Cloudflare wildcard DNS proxied, Origin CA
  cert for *.mcp.buildmymcpserver.com, SSL mode Full (strict).
- After running: edit docker-compose.prod.yml to mount the map dir into
  api + generator, set MCP_DOMAIN in env, recreate containers.

Closes Zb-001 fully. Closes Z1-002 on the code side; one Marco-on-host
action away from closing it on the infra side.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 22:09:06 +02:00
Marco Sadjadi
aa79a71357 security: sovereign-audit Pass-2 fixes — auth-lib, oauth, templates
All checks were successful
Deploy to Production / deploy (push) Successful in 54s
Six confirmed findings closed (3 MEDIUM, 3 LOW). Tier-1 surfaces from
Pass-1 re-verified non-regressed; this pass deepened the audit on the
auth library, OAuth issuer, and template marketplace.

Za-002 MEDIUM (scrypt cost) — bump SCRYPT_N from 2^14 → 2^17 (131072)
  matching current OWASP guidance for password hashing in 2026. Hash
  format embeds N (`scrypt$N$salt$hash`), so the existing admin
  password at the old cost still verifies — backward-compatible. Also
  added explicit maxmem ceilings since Node's default (~32MiB) is
  insufficient for the new N.

Za-003 MEDIUM (single-use race) — consumeMagicLink was SELECT-then-
  UPDATE; two parallel redemptions could both win and mint two
  sessions from the same token. Now uses the same atomic
  `UPDATE … WHERE id = ? AND consumedAt IS NULL RETURNING id` pattern
  /oauth/token already had — loser of the race gets
  invalid_or_expired_token.

Za-004 LOW (membership ordering) — `.orderBy(memberships.createdAt)`
  added so when org-invites eventually let a user belong to multiple
  orgs, the same one wins every login instead of insertion-order
  roulette. Latent-bug pre-empt.

Zb-002 LOW (OAuth register spam) — /oauth/register now per-IP daily
  rate-limited at 20/day (well above any legitimate MCP-client
  bootstrap pattern). Prevents DB-row spam.

Zc-001 MEDIUM (banned-pattern drift) — three separate copies of
  BANNED_PATTERNS had drifted apart. The publish-time scanner in
  templates.ts was MISSING the 7 new patterns added in Pass-1
  (process.binding, dlopen, .constructor.constructor, vm.runIn*,
  globalThis['..']). Single source of truth in @bmm/llm now exports
  SHARED_BANNED_PATTERNS; templates.ts composes PUBLISH_BANNED_PATTERNS
  = SHARED ∪ code-only-extras (dynamic import, fs.rm, setTimeout-with-
  string, process.kill, jailbreak markers).

Zc-002 LOW (N+1) — /v1/templates list was issuing one COUNT(*) per
  template (101 queries for a 100-row page). Now one grouped query
  with templateId GROUP BY, merged in JS. p95 doesn't degrade with
  marketplace growth.

DEFERRED (documented, scoped for next sprint):
  Za-001 HIGH — Account takeover via cross-provider email lookup.
    Requires schema change (users.primaryProvider). Mitigation in
    /settings/account banner planned.
  Zb-001 MEDIUM — /oauth/token refresh_token grant: advertised in
    AS metadata but unsupported_grant_type. Either implement (~40
    LOC) or strip from metadata.
  Zc-003 LOW — Admin takedown partial-failure consistency.
  Zd-001 IMPROVE — DEK cache invalidation across replicas (single-
    instance today).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 18:15:54 +02:00
Marco Sadjadi
f8af3fc0fd security: sovereign-audit Phase 2 fixes — trustProxy, Docker hardening, banned-pattern overhaul
All checks were successful
Deploy to Production / deploy (push) Successful in 55s
Five confirmed findings from the sovereign-audit pass, ordered by severity:

Z3-001 CRITICAL — Fastify now trustProxy:true so req.ip resolves to the
real visitor IP via X-Forwarded-For instead of always being the nginx /
docker-bridge peer. Every per-IP rate-limit in the codebase was silently
collapsed into one global counter; this restores them.

Z1-001 CRITICAL — runner container hardening flags (--read-only,
--cap-drop=ALL, --security-opt=no-new-privileges:true, --pids-limit=100,
--memory=512m, --cpus=0.5, tmpfs /tmp) were sitting commented-out as a
TODO despite /security promising them. Now applied unconditionally on
production/staging; opt-out flag RUNNER_DISABLE_HARDENING=1 for Win-dev.

Z2-001 + Z2-002 CRITICAL / MEDIUM — banned-pattern blacklist tightened
(Function(...) without `new`, process.binding, process.dlopen,
.constructor.constructor, _load, vm.runIn*Context, globalThis['..'],
"system prompt override"). scanForInjection now also walks tool.name and
every inputSchema property description, not only implementation +
description — closes the prompt-injection-into-AI-client surface that
downstream clients (Claude Desktop, Cursor) read verbatim. The duplicate
BANNED_PATTERNS in apps/api/src/routes/servers.ts deleted in favour of
the single shared scanForInjection export from @bmm/llm.

Z4-001 HIGH — /v1/auth/magic-link gained the two-axis daily rate-limit
the SMS endpoint already had: 10/IP/day + 5/email/day. Combined with the
trustProxy fix above these are now real per-visitor limits.

Z4-002 MEDIUM — magic-link callback URL no longer printed to stdout in
production. In dev it still prints (so devs can click the link); in
production we log only "issued, URL withheld" and a loud error if no
email sender is wired (Resend integration is the actual launch
blocker — left as a TODO).

Z6-001 MEDIUM — /v1/builds/:id/stream WebSocket now refuses cross-origin
upgrades. SameSite=Lax already mitigates in modern browsers; this is the
defense-in-depth against browser bugs and non-browser clients.

FALSE POSITIVES dismissed: slug path-traversal (schema regex
^[a-z][a-z0-9-]*$ in @bmm/types catches it); session-after-promote
(getSession re-fetches isAdmin from DB on every request).

DEFERRED (not blockers, tracked):
- Z1-002 generated-server HTTPS — needs nginx wildcard subdomain TLS
- Z1-003 docker image cleanup cron
- Z2-001 v2 — real sandbox runtime (multi-week refactor)
- Z3-002 rawBody-per-request memory — branch on webhook path only
- Z5-001 multi-user org RBAC for billing — gated on Team feature
- Email sender integration (Resend) — launch blocker

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 18:02:59 +02:00
Marco Sadjadi
ef30baf52a feat: Swiss-compliant launch — Impressum/AGB/Contact, support panel, DSG exports, cookie banner
All checks were successful
Deploy to Production / deploy (push) Successful in 57s
Legal (Swiss minimum, no individual named):
- Impressum page (UWG Art. 3 lit. s) — provider, contact via support panel,
  no email required, jurisdiction = Switzerland
- AGB page — subscription terms, payment, cancellation, suspension on payment
  fail, 14-day money-back, AI-processing-per-tier disclosure, Swiss law +
  Swiss venue, modeled after typical Schweizer SaaS terms
- Privacy: Stripe added as subprocessor with full data-flow disclosure

Support panel replaces email contact entirely:
- @bmm/db: support_status enum + support_tickets + support_messages tables,
  migration applied to prod DB
- @bmm/api: support routes (user create/list/view/reply, admin list/view/reply
  /set-status), public /v1/contact for logged-out visitors with per-IP rate
  limit of 3 submissions/day to prevent spam-flood
- Web: /settings/support (list + new), /settings/support/[id] (conversation),
  /admin/support, /admin/support/[id]
- Public /contact form with email collection for guest tickets

Data rights (DSG Art. 25 / GDPR Art. 15+20):
- /v1/account/export returns user-scoped JSON of profile, org, servers,
  builds, audit, support tickets and messages — excludes hashes, encrypted
  secrets, other-user data
- /settings/account: download button + deletion-via-ticket workflow

Production-readiness gaps closed:
- org.suspended now blocks /v1/servers POST and /v1/servers/preview (402);
  webhook flagged this state but enforcement was missing
- Cookie banner: minimal, essential-cookies-only disclosure (Swiss DSG +
  GDPR compliant without dark-pattern consent UI), mounts on both layouts

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 17:12:06 +02:00
Marco Sadjadi
bc174c1302 feat: tiered LLM (GLM free / Claude paid) + rate limits + quota enforcement
All checks were successful
Deploy to Production / deploy (push) Successful in 53s
The free tier was hemorrhaging Anthropic cost with no abuse cap (no rate
limit on /preview, Opus default in the build worker, 5-min cache TTL that
made cache-miss the common case). This switches free users to GLM, paid
users to Claude tiers, and tightens every leak found in the audit.

Backend:
- @bmm/llm: GLM provider via Zhipu's OpenAI-compatible endpoint, pickPreviewModel
  + pickBuildModel helpers, plan-aware ModelChoice
- preview-cache TTL 5min -> 24h (kills the cache-miss path)
- /v1/servers/preview: picks model from caller's plan, returns model name to UI
- /v1/servers POST: enforces SERVER_LIMITS per plan (402), rate-limits builds
- daily rate-limit on preview (5/40/150/1000) and build (3/20/100/500)
- /v1/auth/me returns plan so the wizard can show the right model name
- generator worker: GLM default, Anthropic Sonnet fallback if GLM errors

Frontend:
- Wizard fetches plan, shows "<model> is drafting the tool spec" pre-emptively,
  upgrade hint for hobby users, friendly errors for 402 / 429
- Pricing page: AI-model line per tier (Open-tier / Haiku / Sonnet / Opus),
  Team €149 -> €199, Enterprise €499 -> €999, daily-preview limit per tier
- Privacy + Security: explicit subprocessor disclosure for Anthropic (US) /
  Zhipu (CN) and which tier uses which

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 23:50:00 +02:00
Marco Sadjadi
e198d44e1e fix(preview): stop spec generation timing out behind the edge proxy
All checks were successful
Deploy to Production / deploy (push) Successful in 50s
The /v1/servers/preview route ran claude-opus-4-7 synchronously; full spec
generation routinely exceeded Cloudflare's ~100s proxy cap, so the browser
received a headerless 524 and reported it as a CORS failure.

- preview now uses claude-sonnet-4-6 with a 45s per-attempt timeout and one
  retry — comfortably inside the proxy budget
- generateSpec maps an exhausted timeout to SpecTimeoutError; the route
  returns a clean 504 (with CORS headers) instead of a stalled connection
- analyze step: live elapsed-seconds counter as freeze-proof, plus a
  reduced-motion exception so the loading spinner keeps spinning (a status
  indicator, which WCAG exempts from reduced-motion)
- textarea resize grip restyled to dark theme (light hatch on dark square)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 23:52:48 +02:00
Marco Sadjadi
cc3c5ad444 feat(auth): GitHub OAuth login + SMS one-time-code login
Some checks failed
Deploy to Production / deploy (push) Failing after 1m8s
GitHub: /v1/auth/github + /callback — authorization-code flow, fetches
the verified primary email via /user/emails, reuses upsertOAuthLogin.

SMS: phone is now a first-class login identity.
- schema: users.email nullable, users.phone added, new sms_codes table.
- @bmm/auth: issueSmsCode / consumeSmsCode — 6-digit code, hashed at
  rest, 10-min TTL, per-phone rate limit, 5-attempt cap, get-or-create
  user by phone.
- apps/api: /v1/auth/sms/request + /verify, Twilio REST send (no SDK),
  per-IP throttle. /v1/auth/providers now reports google/github/sms.
- login UI: Google + GitHub buttons, Email|Phone toggle, two-step SMS
  (number -> 6-digit code with one-time-code autofill).

SMS link was rejected in favour of an OTP code — carrier link-scanners
consume magic-link tokens before the user taps them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 22:59:58 +02:00
Marco Sadjadi
38aa5875d3 feat(auth): add "Continue with Google" OAuth 2.0 login
Server-side authorization-code flow: /v1/auth/google redirects to the
consent screen with a CSRF state cookie; /v1/auth/google/callback
exchanges the code, validates the ID token (iss/aud/exp/email_verified),
and mints a 30-day session via upsertOAuthLogin. /v1/auth/providers lets
the login UI hide the button until GOOGLE_OAUTH_ID/SECRET are set.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 00:26:44 +02:00
Marco Sadjadi
a68e882092 feat(crypto): envelope encryption + key rotation via admin panel
Closes structural weakness #4 from the audit (single global key, no rotation,
no KMS path). Customer secrets now use envelope encryption with a real
rotation story.

Model:
  KEK — Key Encryption Key, 32 bytes from env (SECRETS_ENCRYPTION_KEY). Never
        stored in the DB. Root of trust.
  DEK — Data Encryption Key, 32 random bytes we generate, stored in the new
        encryption_keys table *wrapped* (AES-256-GCM encrypted) with the KEK.
        Secrets are encrypted with the DEK.

Schema:
- encryption_keys (version, wrappedDek, active, rotatedBy, createdAt, retiredAt)
- secrets.keyId — which DEK encrypted this row. NULL = legacy (KEK-direct,
  pre-envelope); decryptSecret handles both and the first rotation migrates
  legacy rows onto a DEK.

crypto.ts (full rewrite):
- ensureActiveKey() — boot-time, loads keys + creates v1 if none. Fail-closed:
  index.ts process.exit(1) if it throws — the API will not serve if encryption
  can't initialize.
- encryptSecret() — encrypts with the active DEK, returns { value, keyId }.
- decryptSecret(value, keyId) — DEK path or legacy KEK-direct path.
- rotateKeys() — mints a fresh DEK, re-encrypts EVERY secret under it inside a
  single transaction (decrypt-old / encrypt-new per row), retires the old key,
  activates the new one. A partial failure is recoverable because every row
  carries its own keyId.
- encryptionStatus() — active version, key history, secret + legacy counts.

Admin:
- GET  /v1/admin/encryption        — status
- POST /v1/admin/encryption/rotate — triggers rotateKeys, audit-logged as
  admin.encryption.rotate with { newVersion, reEncrypted }.
- /admin/encryption page — active-key/secret/legacy cards, Rotate button with
  confirm, key-history table, plain-English how-it-works. Added to admin nav.

Verified end-to-end:
- boot → encryption_keys v1 active, '[crypto] envelope encryption ready'
- created a server with secret MY_API_KEY → stored ciphertext, keyId = v1
- POST rotate → { newVersion: 2, reEncrypted: 1 }; ciphertext changed, keyId
  now v2, v1 retired, v2 active. The decrypt-then-reencrypt round-trip
  succeeded (rotation throws otherwise) — the secret is provably recoverable.
- admin UI renders the status + history correctly.

Deferred, named honestly (not built this iteration):
- worker reads secrets from the DB instead of the BullMQ job-data plaintext
  copy — would also remove plaintext secrets from Redis. Separate change with
  its own risk surface on the iterate/fork flows.
- per-server secret-value rotation UI
- audit_log hash-chaining (tamper-evidence)
- rate limiting on auth endpoints
2026-05-20 22:36:08 +02:00
Marco Sadjadi
8334de13a8 feat(marketplace): template publish + fork + voting/ranking + admin moderation
What this enables:
- A user builds an MCP server. If others would benefit, they click 'Publish as
  template' on their server detail page. The spec + pre-rendered TypeScript
  snapshot is preserved.
- Visitors browse /templates, filter by category, sort by trending/top/newest.
  Each template card shows fork count + active deployment count as natural
  manipulation-resistant popularity signal.
- /templates/[slug] shows the full plan: tool list with input schemas,
  required-credential explanations (with 'how to get one' deep links), and a
  collapsible code preview so users can audit before forking.
- Fork is one click → /servers/new?template=slug. The wizard skips Step 1 and
  pre-fills Step 2 with the template's parsed spec. Forker only fills in their
  own credentials. mcp_servers.template_id is recorded; template.fork_count is
  bumped atomically. Each fork gets its own isolated container with its own
  port, its own AES-256 secrets — the template author has zero visibility into
  the fork's traffic or data.
- Admin /admin/templates moderation: verify quality templates (shows shield
  badge in marketplace), hide low-effort ones, takedown anything malicious.
  Takedowns cascade-pause every fork container — owners must re-deploy.

Why template+fork instead of shared-container:
- Shared containers would mean the publisher's quota + their secrets + their
  logs are exposed to forkers. Bad ergonomics, bad security, bad ownership.
- Templates/forks decouple the spec (shared, vouched-for) from the runtime
  (isolated per user). Network-effect moat without the trust collapse.

Why no 5-star voting in v1:
- Manipulation-anfällig, empty lists without adoption. We use fork count +
  active deploys + verified badge. Trending algorithm:
    score = (activeDeploys * 3 + forks) / sqrt(ageDays + 1)
  Real signal, no brigading attack surface.

Backend:
- New schema: templates table (16 cols incl. tools_schema, generated_code,
  required_secrets, allowedDomains, status enum, verified, fork_count).
- mcp_servers.template_id FK + idx for fork lookup.
- @bmm/types: SpecEdit unchanged, CreateServerInput accepts optional templateId.
- preview-cache.ts: new cachePrebuiltCode/loadPrebuiltCode for storing the
  template's full rendered server.ts alongside the spec. Generator worker
  detects this and skips the render step — uses the audited pre-built code
  verbatim. Banned-pattern re-scan at publish time.
- routes/templates.ts: 5 public/auth routes + 2 admin routes. Banned-pattern
  re-scan before publish. Slug auto-uniqued. forkCount atomic-increment via
  SQL.

UI:
- /templates marketplace with trending/top/newest tabs, category filter, search.
  Cards show forks + live count + author + verified badge.
- /templates/[slug] full detail with tools, credentials-with-hints, expandable
  code preview, fork CTA, ownership + stats sidebar, 'forking is safe' explainer.
- /servers/new?template=slug — wizard auto-jumps to Step 2 with template spec
  pre-filled, fork banner at top with link back to template.
- /servers/[id] new Publish tab with title, category, descriptions, per-secret
  hint fields (description + howToGetUrl per UPPER_SNAKE_CASE key).
- /admin/templates moderation with verify/hide/takedown actions.
- Marketing nav now includes /templates.

Verified end-to-end:
- Published Echo Demo Template from marco@test.local's live server
- Marketplace lists it correctly with stats
- Detail page renders with all sections
- Fork CTA navigates to wizard with ?template= param
- Wizard skips Step 1, shows fork banner, pre-fills spec
- Build succeeds in ~10s (cached spec + prebuilt code path skips Claude AND
  render), container live on :4109 with proper OAuth 401 → token → 200 flow
- DB: templates.fork_count=1, activeDeployments=1, mcp_servers.template_id
  populated on the fork
- /admin/templates shows the new template with verify/hide/takedown controls
2026-05-19 23:22:35 +02:00
Marco Sadjadi
c62fcd07ef feat(admin): password-auth admin panel with 8 pages + 15 API endpoints
Schema migrations:
- users.is_admin boolean
- users.password_hash text (scrypt N=16384, 16-byte salt)
- users.last_login_at timestamp
- organizations.suspended + suspended_reason
- admin_settings table (DB-stored prompt override + future settings)

Auth (@bmm/auth):
- hashPassword + verifyPassword via node:crypto scrypt (no extra dep)
- loginWithPassword: scrypt-verifies, issues 30-day session, updates last_login_at
- seedAdmin: idempotent upsert keyed on email; creates org + membership on first run
- AuthedUser now carries isAdmin flag

API:
- POST /v1/auth/admin/login (email + password) — 300ms throttle on failure
- requireAdmin preHandler — 401 if no session, 403 if non-admin
- Bootstrap: api on boot calls seedAdmin(ADMIN_EMAIL, ADMIN_PASSWORD, ADMIN_NAME)
  if env present. Idempotent.

Admin API routes (all gated by requireAdmin):
- GET /v1/admin/overview (totals, trends 7d, server-status breakdown, builds 24h, recent activity)
- GET /v1/admin/users (search, per-row org + plan + serverCount)
- PATCH /v1/admin/users/:id (isAdmin, name)
- DELETE /v1/admin/users/:id (self-delete blocked)
- GET /v1/admin/orgs (member + server counts)
- PATCH /v1/admin/orgs/:id (plan, quota, suspended; cascades to mcp_servers.status=paused on suspend)
- GET /v1/admin/servers (cross-org with status filter)
- POST /v1/admin/servers/:id/rebuild (re-queues build using last prompt)
- DELETE /v1/admin/servers/:id
- GET /v1/admin/builds (status filter, error messages, prompt previews)
- GET /v1/admin/builds/:id/logs
- GET /v1/admin/audit (system-wide with user email join)
- GET /v1/admin/system (DB ping, Redis ping, BullMQ queue depth, docker ps count)
- GET /v1/admin/prompt (builtin + override + updatedAt)
- PATCH /v1/admin/prompt (value: string | null) — saves DB override or drops it

UI (apps/web/app/admin/*):
- /admin/login — password form, separate from /login magic-link
- AdminLayout — Linear-style sidebar (8 nav items), bottom panel with user email +
  'user view' shortcut + logout, client-side requireAdmin guard with redirect
- /admin — overview dashboard with 4 metric cards, 2 panels (status + 24h builds),
  recent activity table linking to full audit
- /admin/users — search + admin toggle + delete (self-delete blocked)
- /admin/orgs — plan/quota/suspend actions via prompts
- /admin/servers — cross-org table with rebuild + delete actions, status filter
- /admin/builds — every build cross-fleet with error vs prompt preview
- /admin/audit — system-wide log + CSV export + filter dropdowns
- /admin/system — auto-refreshing 5s health probes for Postgres, Redis, queue, Docker
- /admin/prompt — live editor for the LLM system prompt with built-in baseline,
  override-state badge, drop-override action, diff preview, save-as-override

End-to-end verified: login as marco.frangiskatos@gmail.com + Melusa112233.*, every
admin page returns 200, admin login + overview tested via screenshot, docker probe
returns true count of running MCP containers.
2026-05-19 23:01:26 +02:00
Marco Sadjadi
dda8f94de4 feat(wizard): editable spec in step 2 — name, description, JSON schema, secrets
The wizard's confirm step is no longer read-only. Users can refine what Claude
parsed before committing to a build.

Backend:
- @bmm/types adds SpecEdit (tools[name,description,inputSchema] + requiredSecrets);
  CreateServerInput accepts an optional specEdit alongside previewId.
- Servers create endpoint: when specEdit is provided, loads cached spec from Redis,
  index-merges the edits in (keeping LLM-generated implementations untouched),
  re-validates via GeneratorSpec, re-runs the banned-pattern scan, overwrites the
  Redis cache so the worker reads the user's version. Refuses with
  preview_expired/tool_count_mismatch/banned_pattern on safety failures.
- New overwriteSpec() helper in preview-cache.

Frontend:
- Step 2 renders each tool as an editable card: name input, description textarea,
  JSON schema textarea with parse-on-keystroke validation (inline error if invalid).
- Required secrets list is editable: keys via uppercase-snake-case input, +Add /
  remove buttons, secret values kept in sync when keys are renamed.
- Reset-to-AI-suggestion button appears when edits are dirty.
- Pre-submit validation: schema must parse, secret keys must match UPPER_SNAKE_CASE,
  required secret values must be provided.
- Warning copy: 'Renaming parameters may require an Iterate after build — the
  existing impl references the original names.'

Verified end-to-end via browser smoke test: edited description + renamed tool
landed correctly in mcp_servers.tools_schema and in the live container at :4107.
Implementation field preserved from the original cached spec.
2026-05-19 22:10:26 +02:00
Marco Sadjadi
1c92964bbd feat(api,generator): preview endpoint + spec cache + audit-log writes
- POST /v1/servers/preview runs Claude synchronously, validates output, caches spec
  in Redis under preview:<id> with 5min TTL, returns previewId+spec+detectedSecrets.
- POST /v1/servers accepts optional previewId; worker reuses the cached spec if
  the entry is still present, otherwise regenerates fresh. Skips the second
  Claude round-trip (~30s saved on the demoable path).
- audit() helper writes auth.login, auth.logout, server.create, server.iterate,
  server.delete to audit_log with ip, metadata, resourceId.
- GET /v1/me/org returns organization + members list for the settings page.
- GET /v1/audit?limit=&action=&resourceType= returns scoped audit entries.
2026-05-19 18:08:29 +02:00
Marco Sadjadi
bb0d9c2cda feat(llm): extract Claude SYSTEM_PROMPT + generateSpec into shared @bmm/llm package 2026-05-19 18:05:31 +02:00
Marco Sadjadi
ab67203921 fix: live-run wiring (SDK 1.29, zod 3.25, OAUTH_ISSUER split, alt host ports, web on 3001, log level cast, pino transport)
- Bump @modelcontextprotocol/sdk from 1.0.4 to 1.29.0 in runner-template
  (1.0.4 has no McpServer or StreamableHTTPServerTransport — file not found at runtime).
- Bump zod to 3.25.76 across workspace to satisfy modern SDK peer dep.
- Split OAUTH_ISSUER (canonical, host-reachable) from CONTROL_PLANE_URL (container-reachable for JWKS).
  Runner verifies iss against OAUTH_ISSUER; fetches JWKS from CONTROL_PLANE_URL.
  Both API and runner now agree on http://localhost:4000/oauth as the issuer in dev.
- Move postgres host port 5432 to 5440, redis 6379 to 6390 to avoid collisions with
  native installs on the dev machine.
- Move web from 3000 to 3001 (3000 occupied by Gitea on dev machine).
- Drop pino-pretty transport from API to avoid runtime require of an unbundled dep.
- Cast build_logs.level (varchar) to BuildEvent's literal union in WS replay path.
- Remove unused reqBase helper in oauth.ts.
2026-05-19 00:57:23 +02:00
Marco Sadjadi
15697ba6dd feat(types,auth): zod contracts + magic-link session auth 2026-05-19 00:22:17 +02:00
Marco Sadjadi
439c91cbbf feat(db): drizzle schema and client (orgs, servers, builds, oauth, metrics, audit) 2026-05-19 00:21:18 +02:00