@gemstack/ai-autopilot

Orchestration for @gemstack/ai-sdk agents: the "director" layer that coordinates multiple agent runs under a control policy.

ai-sdk owns the single-agent loop and the handoff / subagent primitives. ai-autopilot owns the orchestration of multiple runs: which agents run, in what order, how their results combine, and when to stop. Anything that is just a call to an ai-sdk primitive belongs in ai-sdk; this package adds value as the topology and control-policy layer on top.

```bash
pnpm add @gemstack/ai-autopilot @gemstack/ai-sdk
```

Supervisor (plan, dispatch, synthesize)

The supervisor/worker topology is the first orchestration shape this package ships:

Plan - a planner decomposes the task into subtasks.
Dispatch - each subtask runs on a worker agent, with bounded concurrency, an optional token budget, and per-subtask error isolation.
Synthesize - a synthesizer combines the results into the final answer.

```ts
import { Supervisor, agentPlanner, agentSynthesizer } from '@gemstack/ai-autopilot'

// plannerAgent, researchAgent, writerAgent, and editorAgent are
// @gemstack/ai-sdk agents defined elsewhere in your application.
const supervisor = new Supervisor({
  plan: agentPlanner(plannerAgent),                          // LLM decomposition
  workers: { research: researchAgent, write: writerAgent },  // routed by subtask.worker
  synthesize: agentSynthesizer(editorAgent),                 // LLM synthesis
  concurrency: 3,
  maxSubtasks: 8,
  budget: { maxTotalTokens: 200_000 },
  onEvent: (e) => console.log(e.type),
})

const run = await supervisor.run('Draft a launch brief for product X')
console.log(run.text)          // synthesized answer
console.log(run.results)       // per-subtask outcomes (ok / error / usage)
console.log(run.usage)         // aggregate token usage across dispatched subtasks
console.log(run.stoppedEarly)  // true if a guardrail trimmed or halted work
```

Supervisor validates its options at construction (plan, workers, positive concurrency / maxSubtasks), and run() rejects an empty task, so misconfiguration fails fast with a clear message.
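
To make the fail-fast behavior concrete, here is a rough sketch of what such checks could look like. It is not the package's source: `validateSupervisorOptions`, `assertTask`, the `OptionsLike` shape, and every error message are hypothetical. Treating a whitespace-only task as empty and requiring `maxSubtasks` to be an integer are also assumptions.

```ts
// Hypothetical shape: only the fields the checks below need.
interface OptionsLike {
  plan?: unknown
  workers?: unknown
  concurrency?: number
  maxSubtasks?: number
}

// True for 1, 2, 3, ...; false for 0, negatives, fractions, NaN, Infinity.
function isPositiveInteger(n: number): boolean {
  return isFinite(n) && Math.floor(n) === n && n > 0
}

// Throws on the first invalid option so misconfiguration surfaces at construction.
function validateSupervisorOptions(opts: OptionsLike): void {
  if (typeof opts.plan !== 'function') {
    throw new TypeError('Supervisor: `plan` must be a function')
  }
  if (opts.workers === undefined || opts.workers === null) {
    throw new TypeError('Supervisor: `workers` is required')
  }
  if (opts.concurrency !== undefined && !isPositiveInteger(opts.concurrency)) {
    throw new RangeError('Supervisor: `concurrency` must be a positive integer')
  }
  if (opts.maxSubtasks !== undefined && !isPositiveInteger(opts.maxSubtasks)) {
    throw new RangeError('Supervisor: `maxSubtasks` must be a positive integer')
  }
}

// Assumption: a task made only of whitespace counts as empty.
function assertTask(task: string): void {
  if (task.trim().length === 0) {
    throw new TypeError('Supervisor.run: `task` must be a non-empty string')
  }
}
```

The point of checking at construction rather than at dispatch time is that a bad `concurrency` value is reported before any tokens are spent.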

Pieces are pluggable

Each stage is a plain function, so you mix LLM and deterministic logic freely:

plan - a Planner: (task) => Subtask[]. Use agentPlanner(agent) for LLM decomposition, or return a static list (or any hand-rolled logic).
workers - a single Agent (every subtask runs on it), a Record<string, Agent> (routed by subtask.worker), or a WorkerRouter function for full control.
synthesize - a Synthesizer: (task, results) => string. Defaults to defaultSynthesize (concatenate the successful results, no LLM call); pass agentSynthesizer(agent) for an LLM pass.

agentPlanner and agentSynthesizer are the two adapters that turn an ai-sdk agent into a Planner / Synthesizer; everything else can be ordinary code.
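
As an illustration of the "ordinary code" case, the sketch below writes a deterministic planner and a no-LLM synthesizer as plain functions. The types are local stand-ins, not imports from the package: `worker` is the only subtask field named in this README, while `id`, `prompt`, the string type of `error`, and the blank-line separator are assumptions. The real `Planner` / `Synthesizer` contracts may also allow async functions.

```ts
// Local stand-ins for the package's types (field names partly assumed).
interface SubtaskLike {
  id: string       // assumed
  prompt: string   // assumed
  worker?: string  // routing key, as in `subtask.worker`
}

interface ResultLike {
  text: string
  ok: boolean
  error?: string
}

type PlannerLike = (task: string) => SubtaskLike[]
type SynthesizerLike = (task: string, results: ResultLike[]) => string

// Deterministic planner: always research first, then write.
const staticPlanner: PlannerLike = (task) => [
  { id: 'research', prompt: `Collect facts for: ${task}`, worker: 'research' },
  { id: 'write', prompt: `Write up: ${task}`, worker: 'write' },
]

// No-LLM synthesizer in the spirit of defaultSynthesize:
// keep successful results, drop failures, join what is left.
const joinSuccessful: SynthesizerLike = (_task, results) =>
  results
    .filter((r) => r.ok)
    .map((r) => r.text)
    .join('\n\n')
```

A static plan like this is useful when the decomposition is known in advance and only the worker steps need a model.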

Guardrails

| Guardrail | Default | Effect |
| --- | --- | --- |
| `concurrency` | `4` | Max workers in flight; positive integer. |
| `maxSubtasks` | none | Hard cap. A longer plan is trimmed and `stoppedEarly` is set. Omit for no cap. |
| `budget.maxTotalTokens` | none | Stop dispatching once aggregate dispatch usage crosses the limit. In-flight workers finish (usage can overshoot slightly); remaining subtasks are skipped. Omit for no limit. |
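
The way these three guardrails interact can be sketched with a small dispatch loop. This is a simplified model under stated assumptions, not the package's implementation: `Job`, `GuardrailOptions`, and `dispatchWithGuardrails` are hypothetical names, "crosses the limit" is read as `>=`, and jobs are assumed not to throw (error isolation is left out to keep the sketch short).

```ts
// Hypothetical stand-in: a job resolves to its output plus the tokens it spent.
type Job = () => Promise<{ text: string; tokens: number }>

interface GuardrailOptions {
  concurrency: number      // max jobs in flight
  maxSubtasks?: number     // hard cap on plan length
  maxTotalTokens?: number  // aggregate token budget
}

interface DispatchOutcome {
  results: { text: string; tokens: number }[]
  usedTokens: number
  stoppedEarly: boolean
}

async function dispatchWithGuardrails(
  plan: Job[],
  opts: GuardrailOptions,
): Promise<DispatchOutcome> {
  // Subtask cap: trim an over-long plan before any work starts.
  const trimmed = opts.maxSubtasks !== undefined && plan.length > opts.maxSubtasks
  const jobs = trimmed ? plan.slice(0, opts.maxSubtasks) : plan

  const results: { text: string; tokens: number }[] = []
  let usedTokens = 0
  let budgetHit = false
  let next = 0

  // Each lane repeatedly claims the next undispatched job, in plan order.
  const lane = async (): Promise<void> => {
    while (next < jobs.length) {
      // Token budget: checked before claiming a job, never mid-job,
      // so work already in flight always finishes.
      if (opts.maxTotalTokens !== undefined && usedTokens >= opts.maxTotalTokens) {
        budgetHit = true
        return
      }
      const index = next++
      const out = await jobs[index]()
      usedTokens += out.tokens
      results[index] = out
    }
  }

  // Concurrency: at most `concurrency` lanes means at most that many jobs in flight.
  const lanes: Promise<void>[] = []
  for (let i = 0; i < Math.min(opts.concurrency, jobs.length); i++) {
    lanes.push(lane())
  }
  await Promise.all(lanes)

  return { results, usedTokens, stoppedEarly: trimmed || budgetHit }
}
```

In this model, four 60-token jobs with `concurrency: 2` and `maxTotalTokens: 100` end with three completed jobs and 180 tokens used: the third job was claimed while usage was still 60, which is the overshoot the table describes.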

Two further safety properties hold without configuration:

Error isolation - a worker that throws becomes an ok: false result; siblings continue.
Observer safety - an onEvent callback that throws is logged and swallowed; it never aborts the run.
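
Both properties come down to a try/catch at the right boundary. The sketch below shows one way to express them; `runIsolated` and `safeEmit` are hypothetical helpers, not package exports, and the `log` parameter stands in for whatever logger the package actually uses.

```ts
interface IsolatedResult {
  ok: boolean
  text: string
  error?: string
}

// Error isolation: a throw or rejection becomes an ok: false result
// instead of propagating to sibling subtasks.
async function runIsolated(work: () => Promise<string>): Promise<IsolatedResult> {
  try {
    return { ok: true, text: await work() }
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err)
    return { ok: false, text: '', error: message }
  }
}

// Observer safety: a throwing callback is reported to `log` and ignored.
function safeEmit<E>(
  onEvent: ((event: E) => void) | undefined,
  event: E,
  log: (message: string) => void,
): void {
  if (!onEvent) return
  try {
    onEvent(event)
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err)
    log(`onEvent threw: ${message}`)
  }
}
```

Because `runIsolated` never rejects, passing several of its promises to `Promise.all` lets every sibling finish even when one worker fails.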

Progress is reported through onEvent as typed SupervisorEvents (plan, plan-trimmed, dispatch-start, dispatch-result, budget-exceeded, synthesize).
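
A handler can switch on the event type to produce progress output. In the sketch below the six type names are taken from the list above, but everything else is an assumption: the union is redeclared locally rather than imported, event payloads are omitted because their fields are not documented here, and the descriptions are one reading of what each name means.

```ts
// Local copy of the documented event names (payload fields omitted).
type SupervisorEventType =
  | 'plan'
  | 'plan-trimmed'
  | 'dispatch-start'
  | 'dispatch-result'
  | 'budget-exceeded'
  | 'synthesize'

function describeEvent(type: SupervisorEventType): string {
  switch (type) {
    case 'plan':
      return 'plan ready'
    case 'plan-trimmed':
      return 'plan trimmed by maxSubtasks'
    case 'dispatch-start':
      return 'subtask started'
    case 'dispatch-result':
      return 'subtask finished'
    case 'budget-exceeded':
      return 'token budget exceeded; remaining subtasks skipped'
    case 'synthesize':
      return 'synthesizing final answer'
    default: {
      // Compile-time exhaustiveness check: fails to type-check
      // if a new event name is added to the union but not handled.
      const unhandled: never = type
      return unhandled
    }
  }
}

// Builds an onEvent-style callback that writes one line per event.
function makeProgressHandler(
  write: (line: string) => void,
): (event: { type: SupervisorEventType }) => void {
  return (event) => write(`[supervisor] ${describeEvent(event.type)}`)
}
```

Something like `makeProgressHandler((line) => console.log(line))` could then be passed as `onEvent`, assuming the real event objects carry a compatible `type` field.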

The run result

supervisor.run(task) resolves to a SupervisorRun:

| Field | Type | Meaning |
| --- | --- | --- |
| `text` | `string` | The synthesized final answer. |
| `plan` | `PlannedSubtask[]` | The plan that was executed (after any guardrail trimming). |
| `results` | `SubtaskResult[]` | One result per dispatched subtask, in plan order. Each carries `text`, `ok`, optional `error`, and `usage`. |
| `usage` | `TokenUsage` | Aggregate token usage across the dispatched subtasks (planning and synthesis spend are not included, since the `Planner` / `Synthesizer` contracts return data, not usage). |
| `stoppedEarly` | `boolean` | True when a guardrail (subtask cap or token budget) stopped work early. |
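
A caller will usually want to check `results` and `stoppedEarly` before trusting `text`. The sketch below shows one way to summarize a run. The interfaces are local mirrors of the table, not imports: `totalTokens` is an assumed field name for `TokenUsage`, `error` is assumed to be a string, and `summarizeRun` is a hypothetical helper.

```ts
// Local mirrors of the documented fields (see assumptions above).
interface TokenUsageLike {
  totalTokens: number // assumed field name
}

interface SubtaskResultLike {
  text: string
  ok: boolean
  error?: string // assumed to be a string
  usage: TokenUsageLike
}

interface SupervisorRunLike {
  text: string
  results: SubtaskResultLike[]
  usage: TokenUsageLike
  stoppedEarly: boolean
}

// Produces a short, human-readable health report for one run.
function summarizeRun(run: SupervisorRunLike): string {
  const failed = run.results.filter((r) => !r.ok)
  const succeeded = run.results.length - failed.length

  const lines = [
    `${succeeded}/${run.results.length} subtasks succeeded`,
    `${run.usage.totalTokens} tokens used by workers`,
  ]
  if (run.stoppedEarly) {
    lines.push('a guardrail stopped work early; the answer may be partial')
  }
  for (const f of failed) {
    lines.push(`failed: ${f.error ?? 'unknown error'}`)
  }
  return lines.join('\n')
}
```

Note that the token line covers worker spend only, since planning and synthesis usage are not part of `usage`.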

Scope (what's deferred)

The supervisor dispatches autonomous workers via agent.prompt(). A worker that pauses for a client-tool or approval round-trip is reported as a failed subtask. Durable pause/resume across a supervised run (building on ai-sdk's SubAgentRunStore and resume primitives) is a deferred adapter, as are other topologies (pipelines, debate) and queue-backed long-running execution. Those land on demand, behind optional seams, not in the core.
