fix(llm): preview timeout 60s→90s + maxTokens 8192→4096

Enterprise plan was hitting SpecTimeoutError exactly at 60s because the
Sonnet 4.6 preview was budgeted for 8192 tokens at ~80 tok/s (≈102s
worst case) inside a 60s window. The frontend then rolled back to step
1 with no spec.

A real spec is small (<= ~10 tools, ~1.5–2.5k output tokens in practice)
so 4096 is plenty and lets even Sonnet finish in ~51s worst case. The
90s timeout buys headroom for cold starts while staying under
Cloudflare's 100s edge cap. Hobby/GLM goes from 65s to 90s as well, for
the same headroom reason; its maxTokens stays at 3500.
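
The budget arithmetic in the message can be checked with a short sketch. The helper name `worstCaseSeconds` and the variable names are hypothetical and not part of the commit; the rate is the approximate ~80 tok/s Sonnet figure quoted above, and the sketch assumes the model generates all the way to maxTokens at that steady rate.

```typescript
// Hypothetical helper, not part of the commit: worst-case generation time
// if the model emits its full maxTokens budget at a steady token rate.
function worstCaseSeconds(maxTokens: number, tokensPerSecond: number): number {
  return maxTokens / tokensPerSecond;
}

// Approximate measured rate for Claude Sonnet 4.6 quoted in the message.
const SONNET_TOK_PER_SEC = 80;

const oldBudgetSeconds = worstCaseSeconds(8192, SONNET_TOK_PER_SEC);
const newBudgetSeconds = worstCaseSeconds(4096, SONNET_TOK_PER_SEC);

// Old config: full budget cannot finish inside the 60s timeout.
console.log(`8192 tokens: ${oldBudgetSeconds.toFixed(1)}s, fits in 60s: ${oldBudgetSeconds <= 60}`);
// New config: full budget fits inside the 90s timeout with room to spare.
console.log(`4096 tokens: ${newBudgetSeconds.toFixed(1)}s, fits in 90s: ${newBudgetSeconds <= 90}`);
```

Under these assumptions the old budget needs about 102s against a 60s timeout, while the new one needs about 51s against a 90s timeout.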
Marco Sadjadi 2026-05-28 18:51:51 +02:00
parent 1093dc40a7
commit 5a8e736113

@@ -87,39 +87,44 @@ export interface ModelChoice {
  * fit. Hobby uses GLM as the cost lever; paid tiers escalate to Claude - the
  * visible quality/speed jump *is* the upgrade pitch.
  *
- * Measured token rates: glm-4-plus ~58 tok/s (3500 tok ≈ 60s) ·
- * Claude Haiku 4.5 ~200 tok/s (8192 tok ≈ 41s) · Claude Sonnet 4.6 ~80 tok/s.
+ * Measured token rates: glm-4-plus ~58 tok/s · Claude Haiku 4.5 ~200 tok/s ·
+ * Claude Sonnet 4.6 ~80 tok/s. A spec is small (<= ~10 tools with short
+ * descriptions, ~1.5–2.5k output tokens in practice) so we cap maxTokens at
+ * 4096 - well under the model's hard ceiling and tight enough that even
+ * Sonnet finishes inside 60s in the worst case (4096 / 80 ≈ 51s). The
+ * timeouts above (90s) buy headroom for cold starts / slow API responses
+ * while staying clear of Cloudflare's 100s edge cap.
  */
 const PREVIEW_MODELS: Record<Plan, ModelChoice> = {
   hobby: {
     provider: 'glm',
     model: 'glm-4-plus',
     maxTokens: 3500,
-    timeoutMs: 65_000,
+    timeoutMs: 90_000,
     displayName: 'Open-tier AI',
     displayBadge: 'open-tier',
   },
   pro: {
     provider: 'anthropic',
     model: 'claude-haiku-4-5-20251001',
-    maxTokens: 8192,
-    timeoutMs: 60_000,
+    maxTokens: 4096,
+    timeoutMs: 90_000,
     displayName: 'Claude Haiku 4.5',
     displayBadge: 'claude-haiku',
   },
   team: {
     provider: 'anthropic',
     model: 'claude-sonnet-4-6',
-    maxTokens: 8192,
-    timeoutMs: 60_000,
+    maxTokens: 4096,
+    timeoutMs: 90_000,
     displayName: 'Claude Sonnet 4.6',
     displayBadge: 'claude-sonnet',
   },
   enterprise: {
     provider: 'anthropic',
     model: 'claude-sonnet-4-6',
-    maxTokens: 8192,
-    timeoutMs: 60_000,
+    maxTokens: 4096,
+    timeoutMs: 90_000,
     displayName: 'Claude Sonnet 4.6',
     displayBadge: 'claude-sonnet',
   },
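
The invariant this change restores (worst-case generation time below the per-plan timeout, and the timeout below Cloudflare's 100s edge cap) could be guarded with a check along these lines. This is a minimal sketch, not code from the commit: `PreviewBudget`, `checkBudget`, and `CLOUDFLARE_EDGE_CAP_MS` are hypothetical names, and the token rates are the approximate measured figures from the comment in the diff.

```typescript
// Illustrative sketch only; names below are hypothetical, not from the commit.
interface PreviewBudget {
  maxTokens: number;
  timeoutMs: number;
  tokensPerSecond: number; // approximate measured rate from the code comment
}

// The 100s edge cap is taken from the commit message.
const CLOUDFLARE_EDGE_CAP_MS = 100_000;

// Post-commit values from the diff, paired with the quoted token rates.
const budgets: Record<string, PreviewBudget> = {
  hobby: { maxTokens: 3500, timeoutMs: 90_000, tokensPerSecond: 58 },
  pro: { maxTokens: 4096, timeoutMs: 90_000, tokensPerSecond: 200 },
  team: { maxTokens: 4096, timeoutMs: 90_000, tokensPerSecond: 80 },
  enterprise: { maxTokens: 4096, timeoutMs: 90_000, tokensPerSecond: 80 },
};

// Returns a list of human-readable problems; empty means the budget is sound.
function checkBudget(plan: string, b: PreviewBudget): string[] {
  const problems: string[] = [];
  const worstCaseMs = (b.maxTokens / b.tokensPerSecond) * 1000;
  if (worstCaseMs >= b.timeoutMs) {
    problems.push(`${plan}: worst case ${Math.round(worstCaseMs)}ms exceeds timeout ${b.timeoutMs}ms`);
  }
  if (b.timeoutMs >= CLOUDFLARE_EDGE_CAP_MS) {
    problems.push(`${plan}: timeout ${b.timeoutMs}ms is not below the edge cap`);
  }
  return problems;
}

const allProblems: string[] = [];
for (const plan of Object.keys(budgets)) {
  for (const problem of checkBudget(plan, budgets[plan])) {
    allProblems.push(problem);
  }
}
console.log(allProblems.length === 0 ? "all preview budgets fit" : allProblems.join("\n"));
```

Running the same check against the pre-commit enterprise values (8192 tokens, 60_000 ms, ~80 tok/s) reports the worst-case overrun that surfaced as SpecTimeoutError.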