Commit Graph

5 Commits

Author SHA1 Message Date
Marco Sadjadi
bc174c1302 feat: tiered LLM (GLM free / Claude paid) + rate limits + quota enforcement
All checks were successful
Deploy to Production / deploy (push) Successful in 53s
The free tier was hemorrhaging Anthropic cost with no abuse cap (no rate
limit on /preview, Opus default in the build worker, 5-min cache TTL that
made cache-miss the common case). This switches free users to GLM, paid
users to Claude tiers, and tightens every leak found in the audit.

Backend:
- @bmm/llm: GLM provider via Zhipu's OpenAI-compatible endpoint, pickPreviewModel
  + pickBuildModel helpers, plan-aware ModelChoice
- preview-cache TTL 5min -> 24h (kills the cache-miss path)
- /v1/servers/preview: picks model from caller's plan, returns model name to UI
- /v1/servers POST: enforces SERVER_LIMITS per plan (402), rate-limits builds
- daily rate-limit on preview (5/40/150/1000) and build (3/20/100/500)
- /v1/auth/me returns plan so the wizard can show the right model name
- generator worker: GLM default, Anthropic Sonnet fallback if GLM errors

Frontend:
- Wizard fetches plan, shows "<model> is drafting the tool spec" pre-emptively,
  upgrade hint for hobby users, friendly errors for 402 / 429
- Pricing page: AI-model line per tier (Open-tier / Haiku / Sonnet / Opus),
  Team €149 -> €199, Enterprise €499 -> €999, daily-preview limit per tier
- Privacy + Security: explicit subprocessor disclosure for Anthropic (US) /
  Zhipu (CN) and which tier uses which

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-23 23:50:00 +02:00
Marco Sadjadi
8d47b20ae5 fix(generator): iterate orphaned the previous container — rolling deploy
Sovereign-audit follow-up. The audit's finding pass missed this: every
Iterate (version > 1) ran allocatePort -> a NEW port and deployContainer -> a
NEW container, then pointed the DB row at it — and never stopped the old
container. The previous version kept running forever, holding a host port,
with the old secrets baked into its env, untracked (its containerId was
overwritten in the DB by deployContainer). Same bug class as API-SERVERS-001
but on the iterate path.

Fix: the worker captures the server's current containerId before the build
mutates the row, and after the new container is confirmed live + the DB
updated, it stops the old one. This also makes the 'rolling deploy' the UI
promises actually true — the old version stays up until the new one is live,
then is retired.

deploy.ts stopContainer now returns { ok, detail } (was void) so the worker
can log the outcome.

Verified: generator typecheck clean.
2026-05-20 20:58:30 +02:00
Marco Sadjadi
bb0d9c2cda feat(llm): extract Claude SYSTEM_PROMPT + generateSpec into shared @bmm/llm package 2026-05-19 18:05:31 +02:00
Marco Sadjadi
ab67203921 fix: live-run wiring (SDK 1.29, zod 3.25, OAUTH_ISSUER split, alt host ports, web on 3001, log level cast, pino transport)
- Bump @modelcontextprotocol/sdk from 1.0.4 to 1.29.0 in runner-template
  (1.0.4 has no McpServer or StreamableHTTPServerTransport — file not found at runtime).
- Bump zod to 3.25.76 across workspace to satisfy modern SDK peer dep.
- Split OAUTH_ISSUER (canonical, host-reachable) from CONTROL_PLANE_URL (container-reachable for JWKS).
  Runner verifies iss against OAUTH_ISSUER; fetches JWKS from CONTROL_PLANE_URL.
  Both API and runner now agree on http://localhost:4000/oauth as the issuer in dev.
- Move postgres host port 5432 to 5440, redis 6379 to 6390 to avoid collisions with
  native installs on the dev machine.
- Move web from 3000 to 3001 (3000 occupied by Gitea on dev machine).
- Drop pino-pretty transport from API to avoid runtime require of an unbundled dep.
- Cast build_logs.level (varchar) to BuildEvent's literal union in WS replay path.
- Remove unused reqBase helper in oauth.ts.
2026-05-19 00:57:23 +02:00
Marco Sadjadi
cc24dd4a63 feat(generator): BullMQ worker (Claude API + spec render + docker build + local deploy) 2026-05-19 00:26:53 +02:00