Commit Graph

5 Commits

Author SHA1 Message Date
Marco Sadjadi
38aa5875d3 feat(auth): add "Continue with Google" OAuth 2.0 login
Server-side authorization-code flow: /v1/auth/google redirects to the
consent screen with a CSRF state cookie; /v1/auth/google/callback
exchanges the code, validates the ID token (iss/aud/exp/email_verified),
and mints a 30-day session via upsertOAuthLogin. /v1/auth/providers lets
the login UI hide the button until GOOGLE_OAUTH_ID/SECRET are set.
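
A minimal sketch of the claim checks named above (iss/aud/exp/email_verified), assuming the ID token's signature has already been verified elsewhere. The `GoogleIdTokenClaims` shape and the `checkIdTokenClaims` name are illustrative and are not taken from this repository.

```typescript
// Hypothetical claim shape: only the fields named in the commit message.
interface GoogleIdTokenClaims {
  iss: string;
  aud: string;
  exp: number; // seconds since the Unix epoch
  email_verified?: boolean;
}

// Google documents both spellings of the issuer for its ID tokens.
const GOOGLE_ISSUERS = ["https://accounts.google.com", "accounts.google.com"];

// Returns null when every check passes, otherwise the name of the failing claim.
function checkIdTokenClaims(
  claims: GoogleIdTokenClaims,
  clientId: string,
  nowSeconds: number = Math.floor(Date.now() / 1000),
): string | null {
  if (!GOOGLE_ISSUERS.includes(claims.iss)) return "iss";
  if (claims.aud !== clientId) return "aud";
  if (claims.exp <= nowSeconds) return "exp";
  if (claims.email_verified !== true) return "email_verified";
  return null;
}
```

Returning the failing claim name rather than a boolean keeps the rejection reason available for logging; the real callback may well report errors differently.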

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-21 00:26:44 +02:00
Marco Sadjadi
9cce4a94c2 fix(security): sovereign-audit — close 2 HIGH + 3 MEDIUM findings
Full reasoning-based audit of all 10 zones. 11 findings, all confirmed real,
zero false positives. 5 fixed now, 6 deferred to a justified backlog.

API-SERVERS-001 (HIGH) — DELETE /v1/servers/:id orphaned the container
  The route deleted the DB row but never stopped the Docker container — it
  kept running forever on its host port, still serving traffic with the
  user's secrets baked into its env. The takedown path got stopContainer in
  an earlier commit; this sibling path was missed. DELETE now tears the
  container down first. Verified: deleted 'gfgfg' — container 23e0c55c gone,
  :4110 connection-refused after.

INFRA-001 (HIGH) — SECRETS_ENCRYPTION_KEY zero-default usable in production
  The AES-256-GCM key defaults to 64 zeros and passes the min(64) check. A
  prod deploy that forgot to set it booted silently with every secret
  encrypted under a public key. config.ts now throws on boot when
  NODE_ENV=production and the key is still the placeholder. Verified: prod
  boot with the zero key is REFUSED.
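
  The boot-time guard could look roughly like the sketch below. It assumes the key arrives as a hex string in an env-like object; `assertEncryptionKey` and `PLACEHOLDER_KEY` are hypothetical names, not the actual contents of config.ts.

```typescript
// The placeholder described above: 64 zeros, which passes a min(64) length check.
const PLACEHOLDER_KEY = "0".repeat(64);

// Hypothetical guard: returns the key, or throws when production runs on the placeholder.
function assertEncryptionKey(env: {
  NODE_ENV?: string;
  SECRETS_ENCRYPTION_KEY?: string;
}): string {
  const key = env.SECRETS_ENCRYPTION_KEY ?? PLACEHOLDER_KEY;
  if (key.length < 64) {
    throw new Error("SECRETS_ENCRYPTION_KEY must be at least 64 characters");
  }
  if (env.NODE_ENV === "production" && key === PLACEHOLDER_KEY) {
    throw new Error("SECRETS_ENCRYPTION_KEY is still the zero placeholder; refusing to boot");
  }
  return key;
}
```

  Keeping the zero default for development while refusing it in production preserves the easy local setup and removes the silent failure mode.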

API-SERVERS-002 (MEDIUM) — WS build stream had no authorization
  GET /v1/builds/:id/stream streamed build logs with no auth, while its REST
  twin checks orgId. Now authenticates from the session cookie and rejects
  builds outside the caller's org. Verified: no cookie -> 'unauthorized';
  cross-org build -> 'not_found'; own build -> streams (no regression).
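
  The three verified outcomes can be expressed as one pure decision function. This is a sketch under assumptions: the `Session` and `Build` shapes and the `authorizeBuildStream` name are invented for illustration, and the real handler resolves the session from the cookie before reaching this point.

```typescript
// Hypothetical minimal shapes; the real records carry more fields.
interface Session { orgId: string }
interface Build { orgId: string }

type StreamDecision = "unauthorized" | "not_found" | "stream";

// Mirrors the verified behavior: no session -> unauthorized,
// missing or cross-org build -> not_found, own build -> stream.
function authorizeBuildStream(
  session: Session | null,
  build: Build | null,
): StreamDecision {
  if (session === null) return "unauthorized";
  if (build === null || build.orgId !== session.orgId) return "not_found";
  return "stream";
}
```

  Answering 'not_found' for a cross-org build, rather than a forbidden-style error, avoids confirming that a build ID exists in another org.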

OAUTH-001 (MEDIUM) — authorization code consumption was not atomic
  The 'already used?' check and the 'mark used' write were separate
  statements — two requests racing the same code could both mint tokens.
  Now a conditional UPDATE ... WHERE consumed_at IS NULL RETURNING; the
  loser of the race gets zero rows and invalid_grant.
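
  A sketch of the single-statement consumption, assuming a Postgres-style client behind a minimal `Queryable` interface. The `oauth_codes` table name, the `code` column, and the function name are assumptions; only the `consumed_at IS NULL ... RETURNING` pattern comes from the commit message.

```typescript
// Minimal client interface so the sketch does not depend on a specific driver.
interface Queryable {
  query(sql: string, params: unknown[]): Promise<{ rows: unknown[] }>;
}

// Table and column names other than consumed_at are hypothetical.
const CONSUME_SQL = `
  UPDATE oauth_codes
     SET consumed_at = now()
   WHERE code = $1
     AND consumed_at IS NULL
  RETURNING code`;

// The database decides the race: exactly one caller gets a row back.
async function consumeAuthCode(
  db: Queryable,
  code: string,
): Promise<"ok" | "invalid_grant"> {
  const { rows } = await db.query(CONSUME_SQL, [code]);
  return rows.length === 1 ? "ok" : "invalid_grant";
}
```

  Because the check and the write are one statement, there is no window between them for a second request to slip through.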

OAUTH-002 (MEDIUM) — 'plain' PKCE accepted, contradicting AS metadata
  The AS metadata advertises code_challenge_methods_supported: ['S256'] but
  /oauth/authorize accepted 'plain'. Authorize is now z.literal('S256') and
  pkceVerify dropped the plain branch. Verified: authorize with plain -> 400.
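
  For reference, an S256-only verifier can be sketched as follows. The signature is an assumption (the real pkceVerify may take different arguments); the transform itself, base64url of the SHA-256 of the verifier, is the one defined by RFC 7636.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Sketch: any method other than S256 is rejected outright, so no 'plain' branch exists.
function pkceVerify(verifier: string, challenge: string, method: string): boolean {
  if (method !== "S256") return false;
  // RFC 7636: challenge = BASE64URL(SHA256(ASCII(verifier))), without padding.
  const computed = createHash("sha256").update(verifier, "ascii").digest("base64url");
  const a = Buffer.from(computed);
  const b = Buffer.from(challenge);
  // timingSafeEqual throws on unequal lengths, so compare lengths first.
  return a.length === b.length && timingSafeEqual(a, b);
}
```

  The test vector in RFC 7636 Appendix B is a convenient sanity check for any implementation of this transform.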

Deferred to backlog (TEMPLATE_SECURITY_AUDIT.md is template-only; this audit's
findings are documented in the commit + certification):
  GENERATOR-001 — secrets via docker -e (visible in docker inspect); needs
    --env-file rework
  RUNNER-001   — generated containers run as root; needs USER node + build
    re-test
  AUTH-001     — no rate limit on magic-link / oauth register; needs
    @fastify/rate-limit
  GENERATOR-002— allocatePort check/bind race; low, self-heals on rebuild
  AUTH-002     — expired magic_links/sessions/oauth rows never purged; needs
    a cron
  FEATURES-001 — tool-call metering not wired (metrics always 0); Sprint 4
    by plan
2026-05-20 18:15:03 +02:00
Marco Sadjadi
c62fcd07ef feat(admin): password-auth admin panel with 8 pages + 15 API endpoints
Schema migrations:
- users.is_admin boolean
- users.password_hash text (scrypt N=16384, 16-byte salt)
- users.last_login_at timestamp
- organizations.suspended + suspended_reason
- admin_settings table (DB-stored prompt override + future settings)

Auth (@bmm/auth):
- hashPassword + verifyPassword via node:crypto scrypt (no extra dep)
- loginWithPassword: scrypt-verifies, issues 30-day session, updates last_login_at
- seedAdmin: idempotent upsert keyed on email; creates org + membership on first run
- AuthedUser now carries isAdmin flag
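
A rough sketch of hashPassword and verifyPassword on node:crypto scrypt with the parameters listed above (N=16384, 16-byte salt). The 64-byte derived key and the `salt:hash` hex storage format are assumptions; the commit message does not state how @bmm/auth encodes password_hash.

```typescript
import { randomBytes, scryptSync, timingSafeEqual } from "node:crypto";

const SCRYPT_N = 16384;  // cost parameter from the migration note
const SALT_BYTES = 16;   // salt length from the migration note
const KEY_BYTES = 64;    // assumption: derived-key length is not stated in the source

// Assumed storage format: "<salt hex>:<derived key hex>".
function hashPassword(password: string): string {
  const salt = randomBytes(SALT_BYTES);
  const derived = scryptSync(password, salt, KEY_BYTES, { N: SCRYPT_N });
  return `${salt.toString("hex")}:${derived.toString("hex")}`;
}

function verifyPassword(password: string, stored: string): boolean {
  const [saltHex, hashHex] = stored.split(":");
  if (!saltHex || !hashHex) return false;
  const expected = Buffer.from(hashHex, "hex");
  // Reject malformed hashes before comparing; two empty buffers would compare equal.
  if (expected.length !== KEY_BYTES) return false;
  const actual = scryptSync(password, Buffer.from(saltHex, "hex"), KEY_BYTES, { N: SCRYPT_N });
  return timingSafeEqual(actual, expected);
}
```

The explicit length check matters because a truncated or non-hex hash would otherwise decode to a short buffer.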

API:
- POST /v1/auth/admin/login (email + password) — 300ms throttle on failure
- requireAdmin preHandler — 401 if no session, 403 if non-admin
- Bootstrap: api on boot calls seedAdmin(ADMIN_EMAIL, ADMIN_PASSWORD, ADMIN_NAME)
  if env present. Idempotent.
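
The requireAdmin status mapping above (401 without a session, 403 for a non-admin) reduces to a small decision function. This sketch leaves out the Fastify wiring; `adminGateStatus` is a hypothetical name, and the `AuthedUser` shape shows only the isAdmin flag mentioned in this commit.

```typescript
// Only the flag relevant to the gate; the real AuthedUser has more fields.
interface AuthedUser { isAdmin: boolean }

// Returns the HTTP status to reject with, or null to let the request through.
function adminGateStatus(user: AuthedUser | null): 401 | 403 | null {
  if (user === null) return 401; // no session
  if (!user.isAdmin) return 403; // authenticated but not an admin
  return null;
}
```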

Admin API routes (all gated by requireAdmin):
- GET /v1/admin/overview (totals, trends 7d, server-status breakdown, builds 24h, recent activity)
- GET /v1/admin/users (search, per-row org + plan + serverCount)
- PATCH /v1/admin/users/:id (isAdmin, name)
- DELETE /v1/admin/users/:id (self-delete blocked)
- GET /v1/admin/orgs (member + server counts)
- PATCH /v1/admin/orgs/:id (plan, quota, suspended; cascades to mcp_servers.status=paused on suspend)
- GET /v1/admin/servers (cross-org with status filter)
- POST /v1/admin/servers/:id/rebuild (re-queues build using last prompt)
- DELETE /v1/admin/servers/:id
- GET /v1/admin/builds (status filter, error messages, prompt previews)
- GET /v1/admin/builds/:id/logs
- GET /v1/admin/audit (system-wide with user email join)
- GET /v1/admin/system (DB ping, Redis ping, BullMQ queue depth, docker ps count)
- GET /v1/admin/prompt (builtin + override + updatedAt)
- PATCH /v1/admin/prompt (value: string | null) — saves DB override or drops it

UI (apps/web/app/admin/*):
- /admin/login — password form, separate from /login magic-link
- AdminLayout — Linear-style sidebar (8 nav items), bottom panel with user email +
  'user view' shortcut + logout, client-side requireAdmin guard with redirect
- /admin — overview dashboard with 4 metric cards, 2 panels (status + 24h builds),
  recent activity table linking to full audit
- /admin/users — search + admin toggle + delete (self-delete blocked)
- /admin/orgs — plan/quota/suspend actions via prompts
- /admin/servers — cross-org table with rebuild + delete actions, status filter
- /admin/builds — every build cross-fleet with error vs prompt preview
- /admin/audit — system-wide log + CSV export + filter dropdowns
- /admin/system — auto-refreshing 5s health probes for Postgres, Redis, queue, Docker
- /admin/prompt — live editor for the LLM system prompt with built-in baseline,
  override-state badge, drop-override action, diff preview, save-as-override

End-to-end verified: login as marco.frangiskatos@gmail.com + Melusa112233.*, every
admin page returns 200, admin login + overview tested via screenshot, docker probe
returns true count of running MCP containers.
2026-05-19 23:01:26 +02:00
Marco Sadjadi
ab67203921 fix: live-run wiring (SDK 1.29, zod 3.25, OAUTH_ISSUER split, alt host ports, web on 3001, log level cast, pino transport)
- Bump @modelcontextprotocol/sdk from 1.0.4 to 1.29.0 in runner-template
  (1.0.4 has no McpServer or StreamableHTTPServerTransport — file not found at runtime).
- Bump zod to 3.25.76 across workspace to satisfy modern SDK peer dep.
- Split OAUTH_ISSUER (canonical, host-reachable) from CONTROL_PLANE_URL (container-reachable for JWKS).
  Runner verifies iss against OAUTH_ISSUER; fetches JWKS from CONTROL_PLANE_URL.
  Both API and runner now agree on http://localhost:4000/oauth as the issuer in dev.
- Move postgres host port 5432 to 5440, redis 6379 to 6390 to avoid collisions with
  native installs on the dev machine.
- Move web from 3000 to 3001 (3000 occupied by Gitea on dev machine).
- Drop pino-pretty transport from API to avoid runtime require of an unbundled dep.
- Cast build_logs.level (varchar) to BuildEvent's literal union in WS replay path.
- Remove unused reqBase helper in oauth.ts.
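
The OAUTH_ISSUER / CONTROL_PLANE_URL split described above can be sketched as follows. The `/oauth/jwks` path, the `http://api:4000` example host, and the `resolveTokenEndpoints` name are assumptions for illustration; the source only states that the runner checks iss against OAUTH_ISSUER and fetches JWKS from CONTROL_PLANE_URL.

```typescript
interface RunnerEnv {
  OAUTH_ISSUER: string;      // canonical, host-reachable issuer
  CONTROL_PLANE_URL: string; // container-reachable base URL
}

// Hypothetical helper: one value is compared against the token, the other is fetched from.
function resolveTokenEndpoints(env: RunnerEnv): { expectedIssuer: string; jwksUrl: string } {
  return {
    expectedIssuer: env.OAUTH_ISSUER,
    // The JWKS path is an assumption; substitute the path the AS actually serves.
    jwksUrl: new URL("/oauth/jwks", env.CONTROL_PLANE_URL).toString(),
  };
}

// The runner accepts a token only if its iss claim equals the canonical issuer.
function issuerMatches(tokenIss: string, env: RunnerEnv): boolean {
  return tokenIss === env.OAUTH_ISSUER;
}
```

Separating the two values lets the issuer string stay stable for clients on the host while the runner reaches the control plane over the container network.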
2026-05-19 00:57:23 +02:00
Marco Sadjadi
9658e843df feat(api): Fastify control plane (auth, servers, WS build stream, OAuth 2.1 AS, JWKS)
2026-05-19 00:24:47 +02:00