The free tier was hemorrhaging Anthropic cost with no abuse cap (no rate
limit on /preview, Opus default in the build worker, 5-min cache TTL that
made cache-miss the common case). This switches free users to GLM, paid
users to Claude tiers, and tightens every leak found in the audit.
Backend:
- @bmm/llm: GLM provider via Zhipu's OpenAI-compatible endpoint, pickPreviewModel
+ pickBuildModel helpers, plan-aware ModelChoice
- preview-cache TTL 5min -> 24h (kills the cache-miss path)
- /v1/servers/preview: picks model from caller's plan, returns model name to UI
- /v1/servers POST: enforces SERVER_LIMITS per plan (402), rate-limits builds
- daily rate-limit on preview (5/40/150/1000) and build (3/20/100/500)
- /v1/auth/me returns plan so the wizard can show the right model name
- generator worker: GLM default, Anthropic Sonnet fallback if GLM errors
Frontend:
- Wizard fetches plan, shows "<model> is drafting the tool spec" pre-emptively,
upgrade hint for hobby users, friendly errors for 402 / 429
- Pricing page: AI-model line per tier (Open-tier / Haiku / Sonnet / Opus),
Team €149 -> €199, Enterprise €499 -> €999, daily-preview limit per tier
- Privacy + Security: explicit subprocessor disclosure for Anthropic (US) /
Zhipu (CN) and which tier uses which
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sovereign-audit follow-up. The audit's finding pass missed this: every
Iterate (version > 1) ran allocatePort -> a NEW port and deployContainer -> a
NEW container, then pointed the DB row at it — and never stopped the old
container. The previous version kept running forever, holding a host port,
with the old secrets baked into its env, untracked (its containerId was
overwritten in the DB by deployContainer). Same bug class as API-SERVERS-001
but on the iterate path.
Fix: the worker captures the server's current containerId before the build
mutates the row, and after the new container is confirmed live + the DB
updated, it stops the old one. This also makes the 'rolling deploy' the UI
promises actually true — the old version stays up until the new one is live,
then is retired.
deploy.ts stopContainer now returns { ok, detail } (was void) so the worker
can log the outcome.
Verified: generator typecheck clean.
- Bump @modelcontextprotocol/sdk from 1.0.4 to 1.29.0 in runner-template
(1.0.4 has no McpServer or StreamableHTTPServerTransport — file not found at runtime).
- Bump zod to 3.25.76 across workspace to satisfy modern SDK peer dep.
- Split OAUTH_ISSUER (canonical, host-reachable) from CONTROL_PLANE_URL (container-reachable for JWKS).
Runner verifies iss against OAUTH_ISSUER; fetches JWKS from CONTROL_PLANE_URL.
Both API and runner now agree on http://localhost:4000/oauth as the issuer in dev.
- Move postgres host port 5432 to 5440, redis 6379 to 6390 to avoid collisions with
native installs on the dev machine.
- Move web from 3000 to 3001 (3000 occupied by Gitea on dev machine).
- Drop pino-pretty transport from API to avoid runtime require of an unbundled dep.
- Cast build_logs.level (varchar) to BuildEvent's literal union in WS replay path.
- Remove unused reqBase helper in oauth.ts.