fix(generator): stop the previous container on iterate (rolling deploy)

Sovereign-audit follow-up. The audit's finding pass missed this: every
Iterate (version > 1) called allocatePort (a NEW port) and deployContainer (a
NEW container), then pointed the DB row at the new container, and never
stopped the old one. The previous version kept running indefinitely, holding a
host port, with the old secrets baked into its env, and untracked (its
containerId was overwritten in the DB by deployContainer). Same bug class as
API-SERVERS-001, but on the iterate path.

Fix: the worker captures the server's current containerId before the build
mutates the row. Once the new container is confirmed live and the DB is
updated, it stops the old one. This also makes the 'rolling deploy' the UI
promises actually true: the old version stays up until the new one is live,
then is retired.
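
The ordering described above can be sketched with in-memory stand-ins. This is a minimal illustration under stated assumptions, not the worker's code: `FakeRuntime`, `ServerRow`, `iterate`, and the container names are hypothetical, and the real worker talks to Docker and the database asynchronously.

```typescript
// Hypothetical in-memory stand-ins for Docker and the server's DB row.
type ServerRow = { containerId: string | null };

class FakeRuntime {
  running = new Set<string>();
  private next = 1;

  // Start a new container and return its id.
  deploy(): string {
    const id = `container-${this.next++}`;
    this.running.add(id);
    return id;
  }

  // Stop a container and report the outcome instead of throwing.
  stop(id: string): { ok: boolean; detail: string } {
    return this.running.delete(id)
      ? { ok: true, detail: '' }
      : { ok: false, detail: 'no_such_container' };
  }
}

// Rolling deploy: capture the old id, bring up the new container, repoint the
// row, and only then retire the old container.
function iterate(runtime: FakeRuntime, row: ServerRow): void {
  const oldContainerId = row.containerId; // captured before the row changes
  const newContainerId = runtime.deploy(); // the old container is still serving
  row.containerId = newContainerId;
  if (oldContainerId && oldContainerId !== newContainerId) {
    runtime.stop(oldContainerId);
  }
}

const runtime = new FakeRuntime();
const row: ServerRow = { containerId: null };
iterate(runtime, row); // version 1: nothing to retire
iterate(runtime, row); // version 2: container-1 is retired once container-2 is up
```

Capturing the id first matters because the row is overwritten during the deploy; reading it afterwards would return the new container's id and leave nothing to stop.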

deploy.ts stopContainer now returns { ok, detail } (was void) so the worker
can log the outcome.
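
As a rough illustration of the new return shape, the mapping from a process exit to `{ ok, detail }` can be written as a pure function. `toStopResult` and `logLevel` are hypothetical helpers, not part of deploy.ts; they only mirror the `close` handler and the worker's logging in this commit.

```typescript
type StopResult = { ok: boolean; detail: string };

// Hypothetical helper: turn an exit code plus captured stderr into a StopResult.
function toStopResult(code: number | null, stderr: string): StopResult {
  if (code === 0) return { ok: true, detail: '' };
  // Prefer the stderr text; fall back to the exit code when stderr is empty.
  return { ok: false, detail: stderr.trim() || `exit ${code}` };
}

// A caller can pick a log level from the outcome instead of failing the build.
function logLevel(result: StopResult): 'info' | 'warn' {
  return result.ok ? 'info' : 'warn';
}
```

Returning a result object rather than throwing keeps a failed cleanup from turning a successful deploy into a failed build.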

Verified: generator typecheck clean.
Marco Sadjadi 2026-05-20 20:58:30 +02:00
parent 9cce4a94c2
commit 8d47b20ae5
2 changed files with 41 additions and 6 deletions


@@ -104,12 +104,25 @@ export async function deployContainer(input: DeployInput): Promise<DeployHandle>
   });
 }
-export async function stopContainer(containerId: string): Promise<void> {
+export async function stopContainer(
+  containerId: string,
+): Promise<{ ok: boolean; detail: string }> {
+  if (!containerId || containerId.length < 4) {
+    return { ok: false, detail: 'invalid_container_id' };
+  }
   const { spawn } = await import('node:child_process');
-  await new Promise<void>((resolve) => {
-    const child = spawn('docker', ['rm', '-f', containerId], { stdio: 'ignore' });
-    child.on('close', () => resolve());
-    child.on('error', () => resolve());
+  return await new Promise<{ ok: boolean; detail: string }>((resolve) => {
+    const child = spawn('docker', ['rm', '-f', containerId], {
+      stdio: ['ignore', 'pipe', 'pipe'],
+    });
+    let err = '';
+    child.stderr?.on('data', (d: Buffer) => {
+      err += d.toString();
+    });
+    child.on('error', () => resolve({ ok: false, detail: 'spawn_failed' }));
+    child.on('close', (code) =>
+      resolve(code === 0 ? { ok: true, detail: '' } : { ok: false, detail: err.trim() || `exit ${code}` }),
+    );
   });
 }


@@ -6,7 +6,7 @@ import { config } from './config.js';
 import { generateSpec } from './lib/claude.js';
 import { renderServerCode } from './lib/render.js';
 import { dockerBuild, prepareBuildContext, staticCheck } from './lib/build.js';
-import { allocatePort, deployContainer, dockerAvailable } from './lib/deploy.js';
+import { allocatePort, deployContainer, dockerAvailable, stopContainer } from './lib/deploy.js';
 import { emitDone, emitError, emitLog, emitStatus } from './lib/emit.js';
 const db = createDb();
@@ -46,6 +46,16 @@ export const worker = new Worker<JobData>(
     const { buildId, serverId, prompt, version, slug, secrets, previewId } = job.data;
     const log = (level: 'info' | 'warn' | 'error', msg: string) => emitLog(buildId, level, msg);
+    // Capture the container currently serving this server (if any) BEFORE the
+    // build mutates the row. On an iterate (version > 1) we deploy the new
+    // container, then tear this old one down: rolling deploy, no orphan.
+    const [priorState] = await db
+      .select({ containerId: mcpServers.containerId })
+      .from(mcpServers)
+      .where(eq(mcpServers.id, serverId))
+      .limit(1);
+    const oldContainerId = priorState?.containerId ?? null;
     try {
       await db.update(builds).set({ status: 'generating', startedAt: new Date() }).where(eq(builds.id, buildId));
       await db.update(mcpServers).set({ status: 'generating', updatedAt: new Date() }).where(eq(mcpServers.id, serverId));
@@ -141,6 +151,18 @@ export const worker = new Worker<JobData>(
         .set({ status: 'live', currentVersion: version, publicUrl: handle.publicUrl, updatedAt: new Date() })
         .where(eq(mcpServers.id, serverId));
+      // Rolling deploy: the new container is live, so now retire the previous one.
+      // Without this every iterate would leave an orphan holding a host port.
+      if (oldContainerId && oldContainerId !== handle.containerId) {
+        const stopped = await stopContainer(oldContainerId);
+        await log(
+          stopped.ok ? 'info' : 'warn',
+          stopped.ok
+            ? `Retired previous container ${oldContainerId.slice(0, 12)}`
+            : `Could not stop previous container ${oldContainerId.slice(0, 12)}: ${stopped.detail}`,
+        );
+      }
       await emitStatus(buildId, 'success');
       await emitDone(buildId, 'success', serverId, handle.publicUrl);
     } catch (err) {