fix(preview): switch spec generation to Haiku 4.5 to fit the proxy window

Sonnet still overran Cloudflare's edge timeout: the route's 504 fired at 90s
(two 45s attempts), but the proxy had already cut the connection, so the
browser saw a headerless 524 reported as a CORS error.

Measured against the live API: Haiku 4.5 generates the spec at ~200 tok/s,
so a full 8k-token spec completes in ~40s. With a hard 60s timeout and no
retries the route is guaranteed to answer well inside the proxy window.
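
The arithmetic behind that claim can be sketched as below. This is an illustration only: the helper names (`RouteBudget`, `estimateGenerationSec`, `worstCaseMs`) are hypothetical and not part of the codebase, the ~100s proxy cap is the approximate figure quoted in the old code comment, and the worst case assumes attempts run back to back with no backoff between them.

```typescript
// Hypothetical helpers for illustration; only the numbers come from the commit.
interface RouteBudget {
  timeoutMs: number;  // per-attempt timeout passed to generateSpec
  maxRetries: number; // retries allowed after the first attempt
}

// Estimated generation time in seconds for `tokens` output tokens
// at a measured rate of `tokensPerSec`.
function estimateGenerationSec(tokens: number, tokensPerSec: number): number {
  return tokens / tokensPerSec;
}

// Worst-case time before the route answers, assuming every attempt runs to
// its full timeout and there is no backoff between attempts (an assumption).
function worstCaseMs(budget: RouteBudget): number {
  return budget.timeoutMs * (budget.maxRetries + 1);
}

// Approximate proxy cap, taken from the "~100s" in the old code comment.
const PROXY_CAP_MS = 100_000;

const oldConfig: RouteBudget = { timeoutMs: 45_000, maxRetries: 1 };
const newConfig: RouteBudget = { timeoutMs: 60_000, maxRetries: 0 };

console.log(estimateGenerationSec(8_000, 200));     // 40
console.log(worstCaseMs(oldConfig));                // 90000
console.log(worstCaseMs(newConfig));                // 60000
console.log(PROXY_CAP_MS - worstCaseMs(newConfig)); // 40000
```

Under these assumptions the old settings could take 90s to answer, close to the proxy cap, while the new settings leave roughly 40s of headroom.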

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Marco Sadjadi 2026-05-22 00:03:12 +02:00
parent e198d44e1e
commit 083b6e5d41
2 changed files with 9 additions and 6 deletions


@@ -53,11 +53,14 @@ export async function serverRoutes(app: FastifyInstance): Promise<void> {
     try {
       const { spec, source } = await generateSpec(parsed.data.prompt, {
         apiKey: config.ANTHROPIC_API_KEY,
-        // Sonnet 4.6 drafts the spec well inside Cloudflare's ~100s proxy cap;
-        // Opus routinely exceeded it, which reached the browser as a CORS error.
-        model: 'claude-sonnet-4-6',
-        timeoutMs: 45_000,
-        maxRetries: 1,
+        // Preview generates the spec synchronously inside an HTTP request that
+        // sits behind Cloudflare's edge timeout. Haiku 4.5 (~200 tok/s — a full
+        // 8k-token spec in ~40s) is the only model fast enough; Sonnet and Opus
+        // overran the proxy cap, which reached the browser as a CORS error. The
+        // hard 60s timeout guarantees a clean 504 before the proxy gives up.
+        model: 'claude-haiku-4-5-20251001',
+        timeoutMs: 60_000,
+        maxRetries: 0,
       });
       const previewId = await cacheSpec(spec);
       return reply.send({


@@ -457,7 +457,7 @@ function NewServerPageInner() {
 <Loader2 className="mx-auto animate-spin text-[--color-accent]" size={22} />
 <p className="mt-4 text-[13px]">Analyzing your prompt</p>
 <p className="mt-1 text-[12px] text-[--color-fg-subtle]">
-  Claude Sonnet 4.6 is drafting the tool spec. Usually 20–40 seconds.
+  Claude is drafting the tool spec. Usually 15–40 seconds.
 </p>
 <p className="mono mt-3 text-[11px] tabular-nums text-[--color-fg-muted]">
   {elapsedSec}s elapsed