buildmymcpserver

Author	SHA1	Message	Date
Marco Sadjadi	1093dc40a7	fix(runner): correct PUBLIC_URL + mount runner-map volume	2026-05-28 17:54:56 +02:00

Two overlapping bugs were killing OAuth discovery for every external MCP client (Claude Desktop, Cursor, etc.):

1. worker.ts injected PUBLIC_URL=http://<RUNNER_HOST>:<port> into the runner container even when MCP_DOMAIN was set. Result: the runner's /.well-known/oauth-protected-resource advertised an unreachable URL and the WWW-Authenticate header pointed at a non-HTTPS loopback address. Claude Desktop refused to follow the discovery chain. Now derives PUBLIC_URL from the same computePublicUrl() helper that builds the user-visible URL stored in mcp_servers.public_url, so the container's self-reported resource matches its actual route.

2. docker-compose.prod.yml never mounted /opt/buildmymcpserver/runner-map into the api / generator containers. The .conf snippet written by the generator landed in an ephemeral container path; the host inotify watcher saw an empty directory and produced an empty runner-map.combined. Result: nginx 404'd every /<slug>/* request, the runner was unreachable from the public domain, and OAuth discovery couldn't even begin. Mount added to both services.

Existing weather server has the wrong PUBLIC_URL baked in and must be recreated after deploy. No customers yet.

Export computePublicUrl from deploy.ts so worker.ts can call it.

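The first fix above amounts to having a single function decide the URL shape, then feeding its result both to the database row and to the container environment. A minimal sketch of that idea follows; the input shape, the `runnerEnv` helper, and the example host and port are assumptions for illustration, since the log does not show the real `computePublicUrl()` signature.

```typescript
// Hypothetical input shape: the real computePublicUrl() signature in
// deploy.ts is not shown in this log.
interface PublicUrlInput {
  slug: string;
  port: number;
  runnerHost: string; // stands in for RUNNER_HOST
  mcpDomain?: string; // stands in for MCP_DOMAIN; may be unset
}

// One helper decides the URL shape, so the value stored for the user and the
// value injected into the container cannot drift apart.
function computePublicUrl(input: PublicUrlInput): string {
  if (input.mcpDomain) {
    // Path-routed form introduced in d0f3c202eb.
    return `https://${input.mcpDomain}/${input.slug}`;
  }
  // Fallback when no MCP_DOMAIN is configured.
  return `http://${input.runnerHost}:${input.port}`;
}

// Environment handed to the runner container (only PUBLIC_URL is modelled;
// runnerEnv is a hypothetical helper name).
function runnerEnv(input: PublicUrlInput): Record<string, string> {
  return { PUBLIC_URL: computePublicUrl(input) };
}
```

With `slug: "weather"` and `mcpDomain: "mcp.buildmymcpserver.com"`, both the stored URL and the injected `PUBLIC_URL` come out as `https://mcp.buildmymcpserver.com/weather`.
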
Marco Sadjadi	d0f3c202eb	fix(tls): pivot per-runner TLS to path-routing on single subdomain	2026-05-25 22:51:30 +02:00

The per-subdomain approach (*.mcp.buildmymcpserver.com) failed at the Cloudflare edge: Universal SSL only covers ONE-level wildcards, so the TLS handshake on slug.mcp.buildmymcpserver.com hits SSL alert 40 handshake_failure. The two paths to fix that (CF Advanced Cert Manager at $10/mo, or a Let's-Encrypt wildcard via DNS-01 with certbot) both trade either money or ops for the URL aesthetic.

Pivot to path-routing on the single subdomain mcp.buildmymcpserver.com, which IS covered by free Universal SSL. publicUrl format changes from

  https://<slug>.mcp.buildmymcpserver.com → https://mcp.buildmymcpserver.com/<slug>

No recurring cost, works with the existing CF setup, MCP clients don't care about the URL shape (it comes from the wizard's install snippet).

Code changes:
- generator/lib/deploy.ts: publicUrl computed as `${MCP_DOMAIN}/${slug}` instead of `${slug}.${MCP_DOMAIN}`
  * writeRunnerMapEntry writes one-line nginx snippet: if ($bmm_slug = "<slug>") { set $bmm_port <port>; } (was: a map-entry pair "<slug>.<MCP_DOMAIN> <port>;")
- setup-runner-tls.sh:
  * nginx vhost is now single server_name mcp.buildmymcpserver.com
  * regex location captures (?<bmm_slug>...)(?<bmm_path>/.*)? and includes runner-map.combined inside the location block so the generated if-snippets set $bmm_port; unknown slug → 404
  * proxy_pass strips the slug prefix: /<slug>/foo → 127.0.0.1:port/foo
  * Prereq docs updated: just A-record for mcp (no wildcard needed), same Origin CA cert reused
  * Added /health endpoint at vhost root for monitoring

Systemd watcher + map dir + volume mounts unchanged: same file paths, just different snippet content. Re-running setup-runner-tls.sh on the host overwrites the wildcard vhost with the new path-based one.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

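The path-routing scheme in this commit can be sketched in TypeScript as a pair of pure functions: one renders the nginx snippet in the format quoted above, the other mimics the routing decision (slug lookup, prefix stripping, 404 on unknown slug). This is an illustration only. The slug character class is a guess because the log elides it as `...`, `resolveUpstream` is a hypothetical name, and mapping a bare `/<slug>` to the runner root is also an assumption.

```typescript
// Renders the one-line snippet format quoted in the commit message.
function renderRunnerMapEntry(slug: string, port: number): string {
  return `if ($bmm_slug = "${slug}") { set $bmm_port ${port}; }`;
}

// ASSUMPTION: the slug character class is a guess; the log elides it as "...".
const ROUTE = /^\/([a-z][a-z0-9-]*)(\/.*)?$/;

// Mirrors the routing decision: returns the upstream URL, or null for a 404.
// resolveUpstream is a hypothetical name, not a function from the repo.
function resolveUpstream(requestPath: string, ports: Map<string, number>): string | null {
  const match = ROUTE.exec(requestPath);
  if (match === null) return null;
  const slug = match[1];
  if (slug === undefined) return null;
  const port = ports.get(slug);
  if (port === undefined) return null; // unknown slug
  // ASSUMPTION: a bare "/<slug>" is forwarded to the runner root.
  const rest = match[2] ?? "/";
  return `http://127.0.0.1:${port}${rest}`;
}
```

Given a map containing `weather → 4101`, the path `/weather/foo` resolves to `http://127.0.0.1:4101/foo`, while any slug missing from the map resolves to `null`.
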
Marco Sadjadi	8c6f04f034	feat: oauth refresh-token grant + per-runner subdomain TLS plumbing	2026-05-25 22:09:06 +02:00

OAUTH REFRESH-TOKEN
- oauth_tokens.subject column added (migration applied to prod DB): stores the JWT sub claim from the original authorization so refreshes can re-mint with the same identity without re-walking the (consumed) code.
- Authorization-code branch now writes subject AND uses a 30-day expires_at for the row (was 1h, the same as the access token, which killed refresh after 1h).
- New refresh_token grant branch:
  * looks up token by refresh-hash + expiry
  * client_id must match, client_secret verified if confidential
  * RFC 8707: requested resource must equal stored resource
  * OAuth 2.1 rotation: atomic UPDATE WHERE old_hash → new access JWT, new refresh token, extended expiry; loser of a race sees invalid_grant
- Access TTL (1h) and refresh TTL (30d) extracted as constants.

Clients no longer have to re-authorize hourly. Closes Zb-001.

PER-RUNNER SUBDOMAIN TLS (Z1-002)

Code path:
- New MCP_DOMAIN env (e.g. "mcp.buildmymcpserver.com") + RUNNER_MAP_DIR (default /var/runner-map) in generator config.
- deployContainer: writes /var/runner-map/<slug>.conf with content "slug.MCP_DOMAIN port;" and computes publicUrl as https://<slug>.<MCP_DOMAIN>. Falls back to http://host:port when MCP_DOMAIN is unset (zero behaviour change until host is configured).
- stopContainer (both api/lib/docker.ts and generator/lib/deploy.ts) now accepts an optional slug arg and removes the map fragment. Callers (DELETE /v1/servers/:id, admin template takedown) updated.

Infra path (one-time host setup, Marco runs as root):
- scripts/setup-runner-tls.sh:
  1. nginx vhost matching *.mcp.buildmymcpserver.com via regex → reads slug→port from /opt/buildmymcpserver/runner-map.combined
  2. systemd inotify service watches the map dir, combines fragments on any change, reloads nginx
  3. installs inotify-tools if missing, idempotent
- Prereqs documented at top: Cloudflare wildcard DNS proxied, Origin CA cert for *.mcp.buildmymcpserver.com, SSL mode Full (strict).
- After running: edit docker-compose.prod.yml to mount the map dir into api + generator, set MCP_DOMAIN in env, recreate containers.

Closes Zb-001 fully. Closes Z1-002 on the code side; one Marco-on-host action away from closing it on the infra side.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

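The refresh-token rotation described in this commit hinges on a compare-and-swap: only the request that still finds the old hash may replace it, and everyone else gets `invalid_grant`. The sketch below models that with an in-memory map standing in for the `oauth_tokens` table. All type and function names are hypothetical. Hashing, `client_secret` verification, and JWT minting are left out, and returning `invalid_target` for a resource mismatch follows RFC 8707 rather than anything stated in the log.

```typescript
type GrantError = "invalid_grant" | "invalid_target";

interface TokenRow {
  refreshHash: string;
  clientId: string;
  subject: string;
  resource: string;
  expiresAt: number; // epoch milliseconds
}

const REFRESH_TTL_MS = 30 * 24 * 60 * 60 * 1000; // 30 days, as in the commit

// In-memory stand-in for the oauth_tokens table (hypothetical).
class TokenStore {
  private rows = new Map<string, TokenRow>();

  insert(row: TokenRow): void {
    this.rows.set(row.refreshHash, row);
  }

  find(refreshHash: string): TokenRow | undefined {
    return this.rows.get(refreshHash);
  }

  // Stand-in for an atomic `UPDATE ... WHERE refresh_hash = old`:
  // only the caller that still finds the old hash wins.
  swap(oldHash: string, next: TokenRow): boolean {
    if (!this.rows.delete(oldHash)) return false;
    this.rows.set(next.refreshHash, next);
    return true;
  }
}

interface RefreshRequest {
  refreshHash: string;
  newRefreshHash: string;
  clientId: string;
  resource: string;
}

type RefreshResult = { ok: true; row: TokenRow } | { ok: false; error: GrantError };

function refreshGrant(store: TokenStore, req: RefreshRequest, now: number): RefreshResult {
  const row = store.find(req.refreshHash);
  if (!row || row.expiresAt <= now) return { ok: false, error: "invalid_grant" };
  if (row.clientId !== req.clientId) return { ok: false, error: "invalid_grant" };
  // ASSUMPTION: RFC 8707's invalid_target is used for a resource mismatch.
  if (row.resource !== req.resource) return { ok: false, error: "invalid_target" };
  // The subject is carried over so the new access token keeps the same identity.
  const next: TokenRow = { ...row, refreshHash: req.newRefreshHash, expiresAt: now + REFRESH_TTL_MS };
  if (!store.swap(req.refreshHash, next)) return { ok: false, error: "invalid_grant" };
  return { ok: true, row: next };
}
```

Replaying the old refresh token after a successful rotation fails at the lookup, and a concurrent request that passed the lookup would fail at `swap`, which is the "loser of a race" case.
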
Marco Sadjadi	f8af3fc0fd	security: sovereign-audit Phase 2 fixes - trustProxy, Docker hardening, banned-pattern overhaul	2026-05-25 18:02:59 +02:00

Five confirmed findings from the sovereign-audit pass, ordered by severity:

Z3-001 CRITICAL: Fastify now trustProxy:true so req.ip resolves to the real visitor IP via X-Forwarded-For instead of always being the nginx / docker-bridge peer. Every per-IP rate-limit in the codebase was silently collapsed into one global counter; this restores them.

Z1-001 CRITICAL: runner container hardening flags (--read-only, --cap-drop=ALL, --security-opt=no-new-privileges:true, --pids-limit=100, --memory=512m, --cpus=0.5, tmpfs /tmp) were sitting commented-out as a TODO despite /security promising them. Now applied unconditionally on production/staging; opt-out flag RUNNER_DISABLE_HARDENING=1 for Win-dev.

Z2-001 + Z2-002 CRITICAL / MEDIUM: banned-pattern blacklist tightened (Function(...) without `new`, process.binding, process.dlopen, .constructor.constructor, _load, vm.runInContext, globalThis['..'], "system prompt override"). scanForInjection now also walks tool.name and every inputSchema property description, not only implementation + description, which closes the prompt-injection-into-AI-client surface that downstream clients (Claude Desktop, Cursor) read verbatim. The duplicate BANNED_PATTERNS in apps/api/src/routes/servers.ts deleted in favour of the single shared scanForInjection export from @bmm/llm.

Z4-001 HIGH: /v1/auth/magic-link gained the two-axis daily rate-limit the SMS endpoint already had: 10/IP/day + 5/email/day. Combined with the trustProxy fix above these are now real per-visitor limits.

Z4-002 MEDIUM: magic-link callback URL no longer printed to stdout in production. In dev it still prints (so devs can click the link); in production we log only "issued, URL withheld" and a loud error if no email sender is wired (Resend integration is the actual launch blocker, left as a TODO).

Z6-001 MEDIUM: /v1/builds/:id/stream WebSocket now refuses cross-origin upgrades. SameSite=Lax already mitigates in modern browsers; this is the defense-in-depth against browser bugs and non-browser clients.

FALSE POSITIVES dismissed: slug path-traversal (schema regex ^[a-z][a-z0-9-]$ in @bmm/types catches it); session-after-promote (getSession re-fetches isAdmin from DB on every request).

DEFERRED (not blockers, tracked):
- Z1-002 generated-server HTTPS: needs nginx wildcard subdomain TLS
- Z1-003 docker image cleanup cron
- Z2-001 v2: real sandbox runtime (multi-week refactor)
- Z3-002 rawBody-per-request memory: branch on webhook path only
- Z5-001 multi-user org RBAC for billing: gated on Team feature
- Email sender integration (Resend): launch blocker

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

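The Z2-002 change is about which fields get scanned, not only which patterns are banned. A rough sketch of that wider walk is below. The `ToolSpec` shape, the function names, and the case-insensitive match on the prompt phrase are assumptions, and the pattern list holds only a few of the entries named in the commit message, not the real `scanForInjection` list.

```typescript
// Hypothetical tool shape; the real type in @bmm/llm is not shown in the log.
interface ToolSpec {
  name: string;
  description: string;
  implementation: string;
  inputSchema: { properties?: Record<string, { description?: string }> };
}

// A few of the patterns named in the commit message; the real list is longer.
// ASSUMPTION: the prompt phrase is matched case-insensitively.
const BANNED_PATTERNS: RegExp[] = [
  /process\.binding/,
  /process\.dlopen/,
  /\.constructor\.constructor/,
  /vm\.runInContext/,
  /system prompt override/i,
];

// Gathers every text field an AI client might read verbatim, labelled by path.
function collectFields(tool: ToolSpec): Array<[string, string]> {
  const fields: Array<[string, string]> = [
    ["name", tool.name],
    ["description", tool.description],
    ["implementation", tool.implementation],
  ];
  for (const [prop, schema] of Object.entries(tool.inputSchema.properties ?? {})) {
    if (schema.description !== undefined) {
      fields.push([`inputSchema.properties.${prop}.description`, schema.description]);
    }
  }
  return fields;
}

// Returns the labels of all fields that match at least one banned pattern.
function scanTool(tool: ToolSpec): string[] {
  return collectFields(tool)
    .filter(([, text]) => BANNED_PATTERNS.some((pattern) => pattern.test(text)))
    .map(([label]) => label);
}
```

Returning the offending field paths rather than a bare boolean is a design choice of this sketch; it makes it easier to report where an injected string was hiding.
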
Marco Sadjadi	bc174c1302	feat: tiered LLM (GLM free / Claude paid) + rate limits + quota enforcement	2026-05-23 23:50:00 +02:00

The free tier was hemorrhaging Anthropic cost with no abuse cap (no rate limit on /preview, Opus default in the build worker, 5-min cache TTL that made cache-miss the common case). This switches free users to GLM, paid users to Claude tiers, and tightens every leak found in the audit.

Backend:
- @bmm/llm: GLM provider via Zhipu's OpenAI-compatible endpoint, pickPreviewModel + pickBuildModel helpers, plan-aware ModelChoice
- preview-cache TTL 5min -> 24h (kills the cache-miss path)
- /v1/servers/preview: picks model from caller's plan, returns model name to UI
- /v1/servers POST: enforces SERVER_LIMITS per plan (402), rate-limits builds
- daily rate-limit on preview (5/40/150/1000) and build (3/20/100/500)
- /v1/auth/me returns plan so the wizard can show the right model name
- generator worker: GLM default, Anthropic Sonnet fallback if GLM errors

Frontend:
- Wizard fetches plan, shows "<model> is drafting the tool spec" pre-emptively, upgrade hint for hobby users, friendly errors for 402 / 429
- Pricing page: AI-model line per tier (Open-tier / Haiku / Sonnet / Opus), Team €149 -> €199, Enterprise €499 -> €999, daily-preview limit per tier
- Privacy + Security: explicit subprocessor disclosure for Anthropic (US) / Zhipu (CN) and which tier uses which

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Marco Sadjadi	8d47b20ae5	fix(generator): iterate orphaned the previous container - rolling deploy	2026-05-20 20:58:30 +02:00

Sovereign-audit follow-up. The audit's finding pass missed this: every Iterate (version > 1) ran allocatePort -> a NEW port and deployContainer -> a NEW container, then pointed the DB row at it, and never stopped the old container. The previous version kept running forever, holding a host port, with the old secrets baked into its env, untracked (its containerId was overwritten in the DB by deployContainer). Same bug class as API-SERVERS-001 but on the iterate path.

Fix: the worker captures the server's current containerId before the build mutates the row, and after the new container is confirmed live + the DB updated, it stops the old one. This also makes the 'rolling deploy' the UI promises actually true: the old version stays up until the new one is live, then is retired.

deploy.ts stopContainer now returns { ok, detail } (was void) so the worker can log the outcome.

Verified: generator typecheck clean.

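The ordering in this fix (capture the old id, deploy, confirm live, update the row, then stop the old container) can be sketched with injected dependencies. Every name in the `IterateDeps` interface is hypothetical, apart from `stopContainer` returning `{ ok, detail }`, which the commit message states; the real worker code is not shown in the log.

```typescript
interface StopResult {
  ok: boolean;
  detail: string;
}

// Hypothetical dependency surface; only the { ok, detail } return shape of
// stopContainer comes from the commit message.
interface IterateDeps {
  getContainerId(serverId: string): Promise<string | null>;
  deployContainer(serverId: string): Promise<string>; // resolves to the new containerId
  waitUntilLive(containerId: string): Promise<boolean>;
  setContainerId(serverId: string, containerId: string): Promise<void>;
  stopContainer(containerId: string): Promise<StopResult>;
}

// Returns the stop outcome for the retired container, or null if there was none.
async function iterateServer(serverId: string, deps: IterateDeps): Promise<StopResult | null> {
  // Capture the old id first: the deploy path overwrites it in the DB.
  const previous = await deps.getContainerId(serverId);

  const next = await deps.deployContainer(serverId);
  if (!(await deps.waitUntilLive(next))) {
    // The old version keeps serving traffic if the new one never comes up.
    throw new Error(`container ${next} did not become live`);
  }
  await deps.setContainerId(serverId, next);

  if (previous === null || previous === next) return null;
  return deps.stopContainer(previous);
}
```

Stopping the old container last is what keeps the previous version reachable for the whole build; a failed deploy leaves the DB row and the running container untouched.
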
Marco Sadjadi	bb0d9c2cda	feat(llm): extract Claude SYSTEM_PROMPT + generateSpec into shared @bmm/llm package	2026-05-19 18:05:31 +02:00
Marco Sadjadi	ab67203921	fix: live-run wiring (SDK 1.29, zod 3.25, OAUTH_ISSUER split, alt host ports, web on 3001, log level cast, pino transport)	2026-05-19 00:57:23 +02:00

- Bump @modelcontextprotocol/sdk from 1.0.4 to 1.29.0 in runner-template (1.0.4 has no McpServer or StreamableHTTPServerTransport; file not found at runtime).
- Bump zod to 3.25.76 across workspace to satisfy modern SDK peer dep.
- Split OAUTH_ISSUER (canonical, host-reachable) from CONTROL_PLANE_URL (container-reachable for JWKS). Runner verifies iss against OAUTH_ISSUER; fetches JWKS from CONTROL_PLANE_URL. Both API and runner now agree on http://localhost:4000/oauth as the issuer in dev.
- Move postgres host port 5432 to 5440, redis 6379 to 6390 to avoid collisions with native installs on the dev machine.
- Move web from 3000 to 3001 (3000 occupied by Gitea on dev machine).
- Drop pino-pretty transport from API to avoid runtime require of an unbundled dep.
- Cast build_logs.level (varchar) to BuildEvent's literal union in WS replay path.
- Remove unused reqBase helper in oauth.ts.

Marco Sadjadi	cc24dd4a63	feat(generator): BullMQ worker (Claude API + spec render + docker build + local deploy)	2026-05-19 00:26:53 +02:00

9 Commits