feat: oauth refresh-token grant + per-runner subdomain TLS plumbing
OAUTH REFRESH-TOKEN
- oauth_tokens.subject column added (migration applied to prod DB): stores
the JWT sub claim from the original authorization so refreshes can
re-mint with the same identity without re-walking the (consumed) code.
- Authorization-code branch now writes subject AND uses a 30-day
expires_at for the row (was 1h — same as access token, which killed
refresh after 1h).
- New refresh_token grant branch:
* looks up token by refresh-hash + expiry
* client_id must match, client_secret verified if confidential
* RFC 8707: requested resource must equal stored resource
* OAuth 2.1 rotation: atomic UPDATE WHERE old_hash → new access JWT,
new refresh token, extended expiry; loser of a race sees invalid_grant
- Access TTL (1h) and refresh TTL (30d) extracted as constants.
Clients no longer have to re-authorize hourly. Closes Zb-001.
PER-RUNNER SUBDOMAIN TLS (Z1-002)
Code path:
- New MCP_DOMAIN env (e.g. "mcp.buildmymcpserver.com") + RUNNER_MAP_DIR
(default /var/runner-map) in generator config.
- deployContainer: writes /var/runner-map/<slug>.conf with content
"slug.MCP_DOMAIN port;" and computes publicUrl as
https://<slug>.<MCP_DOMAIN>. Falls back to http://host:port when
MCP_DOMAIN is unset (zero behaviour change until host is configured).
- stopContainer (both api/lib/docker.ts and generator/lib/deploy.ts) now
accepts an optional slug arg and removes the map fragment. Callers
(DELETE /v1/servers/:id, admin template takedown) updated.
Infra path (one-time host setup — Marco runs as root):
- scripts/setup-runner-tls.sh:
1. nginx vhost matching *.mcp.buildmymcpserver.com via regex →
reads slug→port from /opt/buildmymcpserver/runner-map.combined
2. systemd inotify service watches the map dir, combines fragments
on any change, reloads nginx
3. installs inotify-tools if missing, idempotent
- Prereqs documented at top: Cloudflare wildcard DNS proxied, Origin CA
cert for *.mcp.buildmymcpserver.com, SSL mode Full (strict).
- After running: edit docker-compose.prod.yml to mount the map dir into
api + generator, set MCP_DOMAIN in env, recreate containers.
Closes Zb-001 fully. Closes Z1-002 on the code side; one Marco-on-host
action away from closing it on the infra side.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 22:09:06 +02:00
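The rotation rule in the commit message above (look up by refresh hash and expiry, check client and resource, swap the hash atomically, loser sees invalid_grant) can be sketched with an in-memory map standing in for the oauth_tokens table. This is only an illustration under stated assumptions: `createRefreshStore`, `TokenRow`, the injected `hash` function and the counter-based token values are all hypothetical and not taken from the repository, and a real store would use random tokens and a database-level conditional UPDATE.

```typescript
// Hypothetical row shape; the real oauth_tokens table is not shown in this file.
interface TokenRow {
  subject: string;
  clientId: string;
  resource: string;
  expiresAt: number; // epoch milliseconds
}

const REFRESH_TTL_MS = 30 * 24 * 60 * 60 * 1000; // 30 days, per the commit message

// `hash` is injected so the sketch stays dependency-free (assumption).
function createRefreshStore(hash: (token: string) => string) {
  const rows = new Map<string, TokenRow>();
  let counter = 0;

  function issue(subject: string, clientId: string, resource: string, now: number): string {
    counter += 1;
    const token = `rt_${counter}`; // placeholder only; real tokens must be random
    rows.set(hash(token), { subject, clientId, resource, expiresAt: now + REFRESH_TTL_MS });
    return token;
  }

  function rotate(token: string, clientId: string, resource: string, now: number): string {
    const key = hash(token);
    const row = rows.get(key);
    if (!row || row.expiresAt <= now || row.clientId !== clientId || row.resource !== resource) {
      throw new Error('invalid_grant');
    }
    // Stand-in for `UPDATE ... WHERE refresh_hash = old_hash`: only the caller
    // that actually removes the old hash is allowed to mint a replacement.
    if (!rows.delete(key)) throw new Error('invalid_grant');
    return issue(row.subject, clientId, resource, now);
  }

  return { issue, rotate };
}
```

Replaying an already-rotated refresh token fails because its hash is no longer present, which is the same outcome the losing side of a concurrent refresh would see.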
import fs from 'node:fs/promises';
2026-05-19 00:26:53 +02:00
import net from 'node:net';
import path from 'node:path';
import { createDb, eq, isNotNull, mcpServers } from '@bmm/db';
import { config } from '../config.js';
/**
2026-05-25 22:51:30 +02:00
 * Per-runner TLS via path-routing on mcp.buildmymcpserver.com. When
 * MCP_DOMAIN is set, the generator publishes each container at
 *   https://<MCP_DOMAIN>/<slug>
 * and writes a one-line nginx snippet per server into RUNNER_MAP_DIR.
 * A host-side systemd inotify watcher combines the snippets into a single
 * file that the nginx vhost includes inside its location block, mapping
 * the captured slug to its local runner port.
 *
 * Path-routing (instead of per-subdomain) is the bootstrap-friendly choice:
 * mcp.buildmymcpserver.com is covered by Cloudflare's free Universal SSL,
 * whereas *.mcp.buildmymcpserver.com would need CF Advanced Cert Manager
 * ($10/mo) or a custom Let's-Encrypt wildcard via DNS-01 (free but more
 * ops). See scripts/setup-runner-tls.sh for the one-time host setup.
 *
 * If MCP_DOMAIN is unset, the map writer no-ops and the URL formatter
 * falls back to the legacy http://host:port URL, so behaviour is unchanged
 * until the host-side infra is in place.
 */
function runnerMapPath(slug: string): string {
  return path.join(config.RUNNER_MAP_DIR, `${slug}.conf`);
}

async function writeRunnerMapEntry(slug: string, port: number): Promise<void> {
  if (!config.MCP_DOMAIN) return;
  // nginx snippet, included inside a `location ~` block that captures
  // $bmm_slug. Each runner contributes one line; the systemd watcher
  // concatenates them into /opt/buildmymcpserver/runner-map.combined.
  const line = `if ($bmm_slug = "${slug}") { set $bmm_port ${port}; }\n`;
  try {
    await fs.mkdir(config.RUNNER_MAP_DIR, { recursive: true });
    await fs.writeFile(runnerMapPath(slug), line, 'utf8');
  } catch (err) {
    // Don't fail the deploy if the map dir isn't mounted yet: the runner still
    // serves on http://host:port and the user can manually proxy.
    console.warn(`[runner-tls] could not write map entry for ${slug}:`, err);
  }
}
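Because the slug and port are interpolated straight into nginx configuration, it may help to see the snippet rendering as a pure function with its inputs checked first. The following is a sketch under assumptions, not the module's code: `renderRunnerMapLine` and `combineFragments` are hypothetical names, the slug pattern is the schema regex quoted in the sovereign-audit commit message, and the combine step only imitates what the host-side watcher is described as doing.

```typescript
// Slug pattern quoted in the audit commit message (schema regex in @bmm/types).
const SLUG_PATTERN = /^[a-z][a-z0-9-]*$/;

// Hypothetical pure renderer for the one-line nginx snippet.
function renderRunnerMapLine(slug: string, port: number): string {
  if (!SLUG_PATTERN.test(slug)) throw new Error(`invalid_slug: ${slug}`);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`invalid_port: ${port}`);
  }
  return `if ($bmm_slug = "${slug}") { set $bmm_port ${port}; }\n`;
}

// Hypothetical imitation of the watcher's combine step: concatenate all
// fragments in a stable (sorted by file name) order.
function combineFragments(fragments: Map<string, string>): string {
  return Array.from(fragments.keys())
    .sort()
    .map((name) => fragments.get(name) ?? '')
    .join('');
}
```

Keeping the renderer separate from the file write makes the exact text handed to nginx easy to assert on without touching RUNNER_MAP_DIR.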

async function removeRunnerMapEntry(slug: string): Promise<void> {
  if (!config.MCP_DOMAIN) return;
  try {
    await fs.rm(runnerMapPath(slug), { force: true });
  } catch {
    // Idempotent: a missing file is fine.
  }
}

2026-05-28 17:54:56 +02:00
export function computePublicUrl(slug: string, port: number): string {
  if (config.MCP_DOMAIN) return `https://${config.MCP_DOMAIN}/${slug}`;
  return `http://${config.RUNNER_HOST}:${port}`;
}
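To make the two URL shapes concrete, here is a minimal sketch of the same branching with the configuration passed in explicitly, so both outcomes can be checked without environment variables. `formatPublicUrl` and `UrlConfig` are hypothetical names introduced for illustration, and the host values in the usage lines are examples only.

```typescript
// Hypothetical config shape; the real values come from ../config.js.
interface UrlConfig {
  mcpDomain?: string;
  runnerHost: string;
}

// Same branching as computePublicUrl, with config injected (assumption).
function formatPublicUrl(cfg: UrlConfig, slug: string, port: number): string {
  if (cfg.mcpDomain) return `https://${cfg.mcpDomain}/${slug}`;
  return `http://${cfg.runnerHost}:${port}`;
}

// Path-routed HTTPS URL when a domain is configured.
const withDomain = formatPublicUrl(
  { mcpDomain: 'mcp.buildmymcpserver.com', runnerHost: 'localhost' },
  'demo',
  4001,
);

// Legacy host:port URL when it is not.
const legacy = formatPublicUrl({ runnerHost: 'localhost' }, 'demo', 4001);
```

Note that the port appears only in the legacy form; in the path-routed form the slug-to-port mapping lives in the nginx map fragment instead.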
security: sovereign-audit Phase 2 fixes — trustProxy, Docker hardening, banned-pattern overhaul
Five confirmed findings from the sovereign-audit pass, ordered by severity:
Z3-001 CRITICAL — Fastify now trustProxy:true so req.ip resolves to the
real visitor IP via X-Forwarded-For instead of always being the nginx /
docker-bridge peer. Every per-IP rate-limit in the codebase was silently
collapsed into one global counter; this restores them.
Z1-001 CRITICAL — runner container hardening flags (--read-only,
--cap-drop=ALL, --security-opt=no-new-privileges:true, --pids-limit=100,
--memory=512m, --cpus=0.5, tmpfs /tmp) were sitting commented-out as a
TODO despite /security promising them. Now applied unconditionally on
production/staging; opt-out flag RUNNER_DISABLE_HARDENING=1 for Win-dev.
Z2-001 + Z2-002 CRITICAL / MEDIUM — banned-pattern blacklist tightened
(Function(...) without `new`, process.binding, process.dlopen,
.constructor.constructor, _load, vm.runIn*Context, globalThis['..'],
"system prompt override"). scanForInjection now also walks tool.name and
every inputSchema property description, not only implementation +
description — closes the prompt-injection-into-AI-client surface that
downstream clients (Claude Desktop, Cursor) read verbatim. The duplicate
BANNED_PATTERNS in apps/api/src/routes/servers.ts deleted in favour of
the single shared scanForInjection export from @bmm/llm.
Z4-001 HIGH — /v1/auth/magic-link gained the two-axis daily rate-limit
the SMS endpoint already had: 10/IP/day + 5/email/day. Combined with the
trustProxy fix above these are now real per-visitor limits.
Z4-002 MEDIUM — magic-link callback URL no longer printed to stdout in
production. In dev it still prints (so devs can click the link); in
production we log only "issued, URL withheld" and a loud error if no
email sender is wired (Resend integration is the actual launch
blocker — left as a TODO).
Z6-001 MEDIUM — /v1/builds/:id/stream WebSocket now refuses cross-origin
upgrades. SameSite=Lax already mitigates in modern browsers; this is the
defense-in-depth against browser bugs and non-browser clients.
FALSE POSITIVES dismissed: slug path-traversal (schema regex
^[a-z][a-z0-9-]*$ in @bmm/types catches it); session-after-promote
(getSession re-fetches isAdmin from DB on every request).
DEFERRED (not blockers, tracked):
- Z1-002 generated-server HTTPS — needs nginx wildcard subdomain TLS
- Z1-003 docker image cleanup cron
- Z2-001 v2 — real sandbox runtime (multi-week refactor)
- Z3-002 rawBody-per-request memory — branch on webhook path only
- Z5-001 multi-user org RBAC for billing — gated on Team feature
- Email sender integration (Resend) — launch blocker
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 18:02:59 +02:00
/**
 * Container hardening flags applied to every runner deployment when
 * NODE_ENV is production or staging (see shouldHarden). Skipped elsewhere,
 * and whenever RUNNER_DISABLE_HARDENING=1 is set, an opt-out meant for dev
 * on Windows Docker Desktop, which doesn't fully honour --read-only on
 * bind mounts.
 *
 * Without these, a tenant container runs as root with full capabilities on
 * the shared host. Combined with the LLM static-check being a regex
 * blacklist (Z2-001), this would let a malicious tenant execute arbitrary
 * code on the host. With them, the blast radius collapses to "within the
 * container", which holds only that tenant's own decrypted secrets.
 */
const HARDENING_FLAGS = [
  '--read-only',
  '--cap-drop=ALL',
  '--security-opt=no-new-privileges:true',
  '--pids-limit=100',
  '--memory=512m',
  '--memory-swap=512m',
  '--cpus=0.5',
  // /tmp needs writable space: runner-template uses it for build/cache.
  '--tmpfs=/tmp:rw,nosuid,nodev,size=64m',
];

function shouldHarden(): boolean {
  // Explicit opt-out for local dev on Windows where --read-only conflicts
  // with how Docker Desktop binds volumes. The flag is checked first, so it
  // also disables hardening on production/staging; never set it there.
  if (process.env.RUNNER_DISABLE_HARDENING === '1') return false;
  const env = process.env.NODE_ENV;
  return env === 'production' || env === 'staging';
}
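The decision above depends on two environment variables, and the order of the checks matters. A small sketch with the environment passed in as a plain object shows the resulting truth table; `shouldHardenFor` and `EnvLike` are hypothetical names used only for this illustration.

```typescript
type EnvLike = Record<string, string | undefined>;

// Same decision order as shouldHarden, with the environment injected (assumption).
function shouldHardenFor(env: EnvLike): boolean {
  if (env['RUNNER_DISABLE_HARDENING'] === '1') return false;
  const nodeEnv = env['NODE_ENV'];
  return nodeEnv === 'production' || nodeEnv === 'staging';
}

const hardeningCases: Array<{ env: EnvLike; harden: boolean }> = [
  { env: { NODE_ENV: 'production' }, harden: true },
  { env: { NODE_ENV: 'staging' }, harden: true },
  { env: { NODE_ENV: 'development' }, harden: false },
  { env: {}, harden: false },
  // The opt-out wins even on production, because it is checked first.
  { env: { NODE_ENV: 'production', RUNNER_DISABLE_HARDENING: '1' }, harden: false },
];
```

The last case is the one worth guarding operationally: a stray RUNNER_DISABLE_HARDENING=1 in a production environment file would silently drop every flag.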
const db = createDb();

async function portFree(port: number, host = '127.0.0.1'): Promise<boolean> {
  return new Promise((resolve) => {
    const tester = net
      .createServer()
      .once('error', () => resolve(false))
      .once('listening', () => tester.close(() => resolve(true)))
      .listen(port, host);
  });
}

export async function allocatePort(): Promise<number> {
  const used = new Set(
    (
      await db
        .select({ port: mcpServers.hostPort })
        .from(mcpServers)
        .where(isNotNull(mcpServers.hostPort))
    )
      .map((r) => r.port)
      .filter((p): p is number => typeof p === 'number'),
  );
  for (let port = config.RUNNER_PORT_RANGE_START; port <= config.RUNNER_PORT_RANGE_END; port++) {
    if (used.has(port)) continue;
    if (await portFree(port)) return port;
  }
  throw new Error('no_free_port');
}
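The allocation above has two filters: ports already recorded in the database are skipped, and the remaining candidates are probed in ascending order. The first filter can be sketched as a synchronous pure function; `candidatePorts` is a hypothetical helper for illustration, and the range values in the usage line are made up.

```typescript
// Hypothetical helper: ports in [start, end] that are not already recorded
// as in use, in the same ascending order allocatePort would probe them.
function candidatePorts(start: number, end: number, used: ReadonlySet<number>): number[] {
  const candidates: number[] = [];
  for (let port = start; port <= end; port++) {
    if (!used.has(port)) candidates.push(port);
  }
  return candidates;
}

// Example: 4000 and 4002 are taken, so 4001 would be probed first.
const exampleCandidates = candidatePorts(4000, 4004, new Set([4000, 4002]));
```

An empty candidate list corresponds to the `no_free_port` error. The socket probe still matters for non-empty lists, since a port can be free in the database yet bound by another process on the host.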

export interface DeployHandle {
  containerId: string;
  publicUrl: string;
  hostPort: number;
}

export interface DeployInput {
  serverId: string;
  slug: string;
  hostPort: number;
  imageTag: string;
  envVars: Record<string, string>;
}

export async function deployContainer(input: DeployInput): Promise<DeployHandle> {
  // Docker CLI is portable across linux/mac/win and sufficient for now; a future
  // iteration will switch to the engine API via UNIX socket.
  const { spawn } = await import('node:child_process');
  const containerName = `bmm-mcp-${input.slug}-${Date.now().toString(36)}`;
  const args = [
    'run',
    '-d',
    '--name',
    containerName,
    '-p',
    `${input.hostPort}:3000`,
  ];
  if (shouldHarden()) {
    args.push(...HARDENING_FLAGS);
  }
  for (const [k, v] of Object.entries(input.envVars)) {
    args.push('-e', `${k}=${v}`);
  }
  args.push('--restart=unless-stopped', input.imageTag);

  return await new Promise<DeployHandle>((resolve, reject) => {
    const child = spawn('docker', args, { stdio: ['ignore', 'pipe', 'pipe'] });
    let out = '';
    let err = '';
    child.stdout.on('data', (d) => {
      out += d.toString();
    });
    child.stderr.on('data', (d) => {
      err += d.toString();
    });
    child.on('error', (e) => reject(e));
child.on('close', async (code) => {
|
|
|
|
|
if (code !== 0) {
|
|
|
|
|
reject(new Error(`docker_run_failed (exit ${code}): ${err.trim() || out.trim()}`));
|
|
|
|
|
return;
|
|
|
|
|
}
|
|
|
|
|
const containerId = out.trim().slice(0, 64);
|
feat: oauth refresh-token grant + per-runner subdomain TLS plumbing
OAUTH REFRESH-TOKEN
- oauth_tokens.subject column added (migration applied to prod DB): stores
the JWT sub claim from the original authorization so refreshes can
re-mint with the same identity without re-walking the (consumed) code.
- Authorization-code branch now writes subject AND uses a 30-day
expires_at for the row (was 1h — same as access token, which killed
refresh after 1h).
- New refresh_token grant branch:
* looks up token by refresh-hash + expiry
* client_id must match, client_secret verified if confidential
* RFC 8707: requested resource must equal stored resource
* OAuth 2.1 rotation: atomic UPDATE WHERE old_hash → new access JWT,
new refresh token, extended expiry; loser of a race sees invalid_grant
- Access TTL (1h) and refresh TTL (30d) extracted as constants.
Clients no longer have to re-authorize hourly. Closes Zb-001.
PER-RUNNER SUBDOMAIN TLS (Z1-002)
Code path:
- New MCP_DOMAIN env (e.g. "mcp.buildmymcpserver.com") + RUNNER_MAP_DIR
(default /var/runner-map) in generator config.
- deployContainer: writes /var/runner-map/<slug>.conf with content
"slug.MCP_DOMAIN port;" and computes publicUrl as
https://<slug>.<MCP_DOMAIN>. Falls back to http://host:port when
MCP_DOMAIN is unset (zero behaviour change until host is configured).
- stopContainer (both api/lib/docker.ts and generator/lib/deploy.ts) now
accepts an optional slug arg and removes the map fragment. Callers
(DELETE /v1/servers/:id, admin template takedown) updated.
Infra path (one-time host setup — Marco runs as root):
- scripts/setup-runner-tls.sh:
1. nginx vhost matching *.mcp.buildmymcpserver.com via regex →
reads slug→port from /opt/buildmymcpserver/runner-map.combined
2. systemd inotify service watches the map dir, combines fragments
on any change, reloads nginx
3. installs inotify-tools if missing, idempotent
- Prereqs documented at top: Cloudflare wildcard DNS proxied, Origin CA
cert for *.mcp.buildmymcpserver.com, SSL mode Full (strict).
- After running: edit docker-compose.prod.yml to mount the map dir into
api + generator, set MCP_DOMAIN in env, recreate containers.
Closes Zb-001 fully. Closes Z1-002 on the code side; one Marco-on-host
action away from closing it on the infra side.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-25 22:09:06 +02:00
      const publicUrl = computePublicUrl(input.slug, input.hostPort);
      // Write the nginx map fragment BEFORE persisting publicUrl so the
      // user-visible URL is reachable by the time the wizard polls "live".
      await writeRunnerMapEntry(input.slug, input.hostPort);
2026-05-19 00:26:53 +02:00
      await db
        .update(mcpServers)
        .set({
          containerId,
          hostPort: input.hostPort,
          publicUrl,
          status: 'live',
          updatedAt: new Date(),
        })
        .where(eq(mcpServers.id, input.serverId));
      resolve({ containerId, publicUrl, hostPort: input.hostPort });
    });
  });
}
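
The helpers `computePublicUrl` and `writeRunnerMapEntry` called above are not shown in this section. Going by the commit description (a `<slug>.conf` fragment containing `slug.MCP_DOMAIN port;`, and an `http://host:port` fallback when `MCP_DOMAIN` is unset), a minimal sketch might look like the following. The explicit `RunnerConfig` parameter, its `fallbackHost` field, the `runnerMapLine` helper, and the choice to skip the write when no domain is set are assumptions made for illustration, not the repository's actual signatures or behaviour.

```typescript
import { mkdir, writeFile } from 'node:fs/promises';
import { join } from 'node:path';

// Hypothetical config shape; the real code reads MCP_DOMAIN and
// RUNNER_MAP_DIR from the generator config.
export interface RunnerConfig {
  mcpDomain?: string;   // e.g. "mcp.buildmymcpserver.com"
  runnerMapDir: string; // e.g. "/var/runner-map"
  fallbackHost: string; // assumed host used for the http://host:port fallback
}

// https://<slug>.<MCP_DOMAIN> when a domain is configured, else http://host:port.
export function computePublicUrl(slug: string, hostPort: number, cfg: RunnerConfig): string {
  return cfg.mcpDomain
    ? `https://${slug}.${cfg.mcpDomain}`
    : `http://${cfg.fallbackHost}:${hostPort}`;
}

// One nginx map line in the form "<slug>.<MCP_DOMAIN> <port>;".
export function runnerMapLine(slug: string, hostPort: number, mcpDomain: string): string {
  return `${slug}.${mcpDomain} ${hostPort};\n`;
}

// Writes <runnerMapDir>/<slug>.conf. Assumed to be a no-op when no domain
// is configured, since nothing would consume the fragment in that case.
export async function writeRunnerMapEntry(
  slug: string,
  hostPort: number,
  cfg: RunnerConfig,
): Promise<void> {
  if (!cfg.mcpDomain) return;
  await mkdir(cfg.runnerMapDir, { recursive: true });
  const file = join(cfg.runnerMapDir, `${slug}.conf`);
  await writeFile(file, runnerMapLine(slug, hostPort, cfg.mcpDomain), 'utf8');
}
```

Keeping the URL and map-line formatting in pure functions makes the naming rule easy to unit-test without touching the filesystem.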
2026-05-20 20:58:30 +02:00
export async function stopContainer(
  containerId: string,
  slug?: string,
2026-05-20 20:58:30 +02:00
): Promise<{ ok: boolean; detail: string }> {
  if (!containerId || containerId.length < 4) {
    return { ok: false, detail: 'invalid_container_id' };
  }
  // Remove the nginx map fragment first so the proxy does not keep routing
  // the slug to a dead upstream (502) once the container goes down.
  // Idempotent: calling it multiple times with the same slug is fine.
  if (slug) await removeRunnerMapEntry(slug);
2026-05-19 00:26:53 +02:00
  const { spawn } = await import('node:child_process');
2026-05-20 20:58:30 +02:00
  return await new Promise<{ ok: boolean; detail: string }>((resolve) => {
    const child = spawn('docker', ['rm', '-f', containerId], {
      stdio: ['ignore', 'pipe', 'pipe'],
    });
    let err = '';
    child.stderr?.on('data', (d: Buffer) => {
      err += d.toString();
    });
    child.on('error', () => resolve({ ok: false, detail: 'spawn_failed' }));
    child.on('close', (code) =>
      resolve(code === 0 ? { ok: true, detail: '' } : { ok: false, detail: err.trim() || `exit ${code}` }),
    );
2026-05-19 00:26:53 +02:00
  });
}
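
`removeRunnerMapEntry` is likewise not shown here. The comment in `stopContainer` requires it to be idempotent; one way to get that, assuming fragments live at `<dir>/<slug>.conf` as the commit message describes, is `rm` with `force: true`, which ignores a missing file. The `mapDir` parameter and the `isSafeSlug` check are assumptions of this sketch; the real slug rules are not visible in this section.

```typescript
import { rm } from 'node:fs/promises';
import { join } from 'node:path';

// Assumed slug shape: lowercase letters, digits, and inner hyphens.
// Rejecting anything else keeps input such as "../etc" out of the path.
export function isSafeSlug(slug: string): boolean {
  return /^[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$/.test(slug);
}

// Deletes <mapDir>/<slug>.conf if it exists.
export async function removeRunnerMapEntry(slug: string, mapDir: string): Promise<void> {
  if (!isSafeSlug(slug)) return;
  // force: true turns "file not found" into a no-op, so repeated calls
  // with the same slug are safe.
  await rm(join(mapDir, `${slug}.conf`), { force: true });
}
```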

// Resolves true only if `docker version` exits with code 0; never rejects.
export async function dockerAvailable(): Promise<boolean> {
  const { spawn } = await import('node:child_process');
  return await new Promise<boolean>((resolve) => {
    const child = spawn('docker', ['version'], { stdio: 'ignore' });
    child.on('error', () => resolve(false));
    child.on('close', (code) => resolve(code === 0));
  });
}
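
Both `stopContainer` and `dockerAvailable` follow the same pattern: spawn a command, never reject, and resolve with a result the caller can inspect. If that pattern were factored out, it could look roughly like the sketch below. `CommandResult`, `describeExit`, and `runCommand` are hypothetical names, and unlike `stopContainer` this version discards stdout instead of piping it.

```typescript
import { spawn } from 'node:child_process';

export interface CommandResult {
  ok: boolean;
  detail: string;
}

// Maps an exit code plus captured stderr to the { ok, detail } shape
// that stopContainer returns.
export function describeExit(code: number | null, stderr: string): CommandResult {
  return code === 0
    ? { ok: true, detail: '' }
    : { ok: false, detail: stderr.trim() || `exit ${code}` };
}

// Runs a command and always resolves; spawn failures become
// { ok: false, detail: 'spawn_failed' } rather than a rejection.
export function runCommand(cmd: string, args: string[]): Promise<CommandResult> {
  return new Promise<CommandResult>((resolve) => {
    const child = spawn(cmd, args, { stdio: ['ignore', 'ignore', 'pipe'] });
    let err = '';
    child.stderr?.on('data', (d: Buffer) => {
      err += d.toString();
    });
    child.on('error', () => resolve({ ok: false, detail: 'spawn_failed' }));
    child.on('close', (code) => resolve(describeExit(code, err)));
  });
}
```

With such a helper, a call like `runCommand('docker', ['rm', '-f', containerId])` would yield the same result shape as `stopContainer`, and callers would not need a `try`/`catch` around it.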