|
All checks were successful
Deploy to Production / deploy (push) Successful in 1m24s
Root cause of repeat 422s: 4096 was too tight for ambitious prompts (Marco's research-assistant prompt produces ~12kB of JSON before the model gets cut off mid-string). The error then surfaced as an opaque "Unterminated string in JSON" zod failure instead of pointing the user at the real problem. Two fixes: - maxTokens back to 8192 (the original) for all Claude tiers, 4096 for GLM. Timeouts bumped to 95s — Sonnet 4.6 at ~130 tok/s does 8192 in ~63s, ~30s headroom for cold starts, still under Cloudflare's 100s edge cap. - Detect stop_reason === 'max_tokens' on the Anthropic response BEFORE parsing and throw the new SpecTruncatedError. /preview catches it and returns 422 spec_too_large with a clear "split the prompt" message instead of leaking the zod parse failure. |
||
|---|---|---|
| .. | ||
| src | ||
| package.json | ||
| tsconfig.json | ||