Add Copilot SDK-powered spec compiler with autopilot mode #7
basiclines wants to merge 27 commits into main
Adds a new `build` subcommand to the spec compiler that uses `@github/copilot-sdk` to launch a Copilot agent session, feed it the generated compilation prompt, and stream the agent's work to the terminal.

The build command:
- Validates Copilot auth before starting (with clear error messages)
- Prompts for model selection via `gum choose` (falls back to defaults)
- Prompts for reasoning effort (skipped if model doesn't support it)
- Creates an autopilot session (`approveAll`: no permission prompts)
- Streams output in two modes:
  - Normal: compact phase-level status with tool activity
  - Verbose (`--verbose`): raw agent transcript with streaming deltas
- Tracks metrics: tokens, tool calls, files written
- Prints a compilation summary (time, files, LOC, tokens)
- Auto-locks specs on success (skip with `--no-lock`)
- Handles SIGINT for graceful cancellation

New flags: `--model`, `--effort`, `--verbose`, `--no-lock`

Existing commands (status, prompt, lock, clean) are unchanged.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
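The reasoning-effort prompt is skipped when the model does not support it, and the PR description notes that `reasoningEffort` is only passed when `ModelInfo.supportedReasoningEfforts` allows it. A minimal sketch of that gate follows. The `ModelInfoLike` and `SessionOptionsLike` shapes are hypothetical stand-ins, not the SDK's real types; only the `supportedReasoningEfforts` field name and the effort levels come from this PR.

```typescript
// Effort levels as listed in the CLI help text quoted in the review.
type Effort = "low" | "medium" | "high" | "xhigh";

// Hypothetical subset of model metadata (assumption, not the SDK type).
interface ModelInfoLike {
  id: string;
  supportedReasoningEfforts?: Effort[];
}

// Hypothetical subset of session options (assumption, not the SDK type).
interface SessionOptionsLike {
  model: string;
  reasoningEffort?: Effort;
}

// Only attach reasoningEffort when the model advertises support for it.
function buildSessionOptions(model: ModelInfoLike, requested: Effort): SessionOptionsLike {
  const supported = model.supportedReasoningEfforts ?? [];
  const options: SessionOptionsLike = { model: model.id };
  if (supported.includes(requested)) {
    options.reasoningEffort = requested;
  }
  return options;
}
```

Omitting the key entirely, rather than sending `undefined` or a fallback level, keeps the request valid for models that would reject the field.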
Flags (--model, --effort, --out) now pre-select the default value in the interactive prompt rather than skipping it entirely. Adds a new output directory picker via `gum input`. Flow order changed: auth runs first (fail fast), then interactive config, then prompt generation uses the user-chosen distDir. Non-TTY / no gum: falls back to sensible defaults silently. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
After each compilation pass, the user is prompted to run another pass. Subsequent passes send the agent a focused improvement prompt that re-reads specs and fixes missed implementations, failing tests, and inconsistencies.
- `printSummary` now shows pass number and cumulative pass count
- Session stays alive between passes (only disconnects on exit)
- `gum confirm` with readline fallback for pass prompt
- Auto-lock deferred until all passes complete

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace scanOutput() filesystem walk with countAgentOutput() that uses the tracked filesWritten set from tool execution events. This excludes node_modules and other non-agent files from the count. Also track file deletions separately so the summary shows 'N written, M deleted' when files are removed. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
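A rough sketch of the tracking idea in this commit: count only files the agent touched through tool events, and report deletions separately. The `FileEvent` shape and the `OutputTracker` class are hypothetical; the real `countAgentOutput()` and the SDK's event types are not shown in this PR.

```typescript
// Hypothetical event shape derived from tool execution events (assumption).
type FileEvent = { kind: "write" | "delete"; path: string };

class OutputTracker {
  readonly written = new Set<string>();
  readonly deleted = new Set<string>();

  record(event: FileEvent): void {
    if (event.kind === "write") {
      this.written.add(event.path);
      this.deleted.delete(event.path); // file was re-created after a delete
    } else {
      this.deleted.add(event.path);
      this.written.delete(event.path); // file is no longer part of the output
    }
  }

  // Produces "N written" or "N written, M deleted".
  summary(): string {
    const written = `${this.written.size} written`;
    return this.deleted.size > 0 ? `${written}, ${this.deleted.size} deleted` : written;
  }
}
```

Because both collections are sets, repeated edits to the same file count once, and `node_modules` never appears unless the agent wrote there itself.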
Helps users understand which lock file is keeping specs clean. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Use chalk-based console.log consistently for compilation status messages instead of shelling out to gum log. Removes gumLog helper. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The Copilot SDK agent may use varying tool names (edit, create, write_to_file, str_replace_editor, etc.) depending on the model. Use substring matching on normalized tool names instead of exact string comparisons so file writes are always tracked. Also broadens the path argument lookup to check path, file_path, filePath, and file argument names. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
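The matching described here can be sketched as below. The hint list and the function names are assumptions for illustration; only the example tool names and the four argument names (`path`, `file_path`, `filePath`, `file`) come from the commit message.

```typescript
// Substrings that suggest a tool writes to disk (assumed list, not from the PR).
const WRITE_HINTS = ["edit", "create", "write", "replace"];

// Lowercase and strip separators so "write_to_file" and "writeToFile" match alike.
function normalizeToolName(name: string): string {
  return name.toLowerCase().replace(/[^a-z0-9]/g, "");
}

function isFileWriteTool(name: string): boolean {
  const normalized = normalizeToolName(name);
  return WRITE_HINTS.some((hint) => normalized.includes(hint));
}

// Argument names checked in order, as listed in the commit message.
const PATH_KEYS = ["path", "file_path", "filePath", "file"] as const;

function extractPath(args: Record<string, unknown>): string | undefined {
  for (const key of PATH_KEYS) {
    const value = args[key];
    if (typeof value === "string" && value.length > 0) return value;
  }
  return undefined;
}
```

Substring matching trades precision for coverage: a tool with an unrelated name that happens to contain one of the hints would also match, so the hint list should stay short.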
The output dir picker default is now dist/<target>/ (e.g. dist/bun/). If the user changes it to dist/bun-claude/, that's used as-is — no extra /<target> suffix appended. The prompt command still correctly joins distDir + target since it receives the base dist/ dir. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
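A small sketch of the resolution rule described here, with hypothetical function names: the picker's default is `dist/<target>`, and whatever the user types is used verbatim apart from trimming.

```typescript
// Default shown in the picker: <distDir>/<target>, e.g. dist/bun.
function defaultOutDir(distDir: string, target: string): string {
  return `${distDir.replace(/\/+$/, "")}/${target}`;
}

// A user-supplied directory is used as-is; no /<target> suffix is appended.
function resolveOutDir(distDir: string, target: string, userInput?: string): string {
  const trimmed = userInput?.trim().replace(/\/+$/, "");
  return trimmed ? trimmed : defaultOutDir(distDir, target);
}
```

For example, `resolveOutDir("dist", "bun", "dist/bun-claude/")` yields `dist/bun-claude`, while leaving the input empty yields `dist/bun`.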
Use indentation and dim gray text only for cleaner output. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remove gum (external Go binary) dependency entirely. All interactive prompts (model picker, effort picker, output dir, multi-pass confirm) now use `@clack/prompts`, which runs in-process with zero external deps.
- Single code path instead of gum-vs-chalk branching
- No install prerequisites beyond Bun
- Add Copilot subscription to README prerequisites

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Displays the agent's final message between the phase checkmarks and the summary stats. Gives visibility into what the agent accomplished in each pass without needing verbose mode. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Restructure the compilation prompt to prioritize a working interactive playground over surface coverage. Key changes:
- Add 'depth over breadth' philosophy section
- Require components to be fully complete (impl + tests + demo) before moving to the next one
- Make `--interactive` the primary output, not a stub
- Verification happens per-component, not just at the end
- Multi-pass improvement prompt reinforces interactive demo priority

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Use marked + marked-terminal to render the agent's final message with proper markdown formatting (headings, bold, code, lists) and wrap it in a boxen container with dim border and 'Agent summary' title. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Tell the agent to end each pass with a structured summary of what was accomplished and explicit next-pass priorities. This makes the boxed agent summary actionable and helps the user decide whether to run another pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add rule to system prompt telling the agent to check npm/GitHub for libraries before assuming they don't exist. Prevents knowledge cutoff issues where the agent falls back to polyfills for packages that are actually published. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Previously the lock only ran once after all passes finished. Now each successful pass (no errors) locks immediately, so completed components are banked incrementally. A bad subsequent pass won't lose the progress from earlier passes. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Remove auto-lock after passes. Instead, instruct the agent in the system prompt to run 'bun run compile lock --target <t> --component <Name>' after fully completing each component (tests pass + demo wired). This ensures only genuinely completed components get locked. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Auto-continues passes without prompting, up to a max of (dirty specs + 5) passes. Shows pass count as 'Pass N/max' in the log. Useful for running full compilations unattended. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
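The capped loop could look roughly like this. `runPass` and `shouldContinue` are hypothetical callbacks standing in for the real session and prompt code; only the `dirty specs + 5` cap and the `Pass N/max` log format come from the commit message.

```typescript
// Cap from the commit message: number of dirty specs plus five.
function maxPasses(dirtySpecCount: number): number {
  return dirtySpecCount + 5;
}

// Runs passes until the cap is reached or shouldContinue() declines.
// In autopilot, shouldContinue would simply resolve to true.
async function runPasses(
  dirtySpecCount: number,
  runPass: (pass: number) => Promise<void>,
  shouldContinue: (pass: number) => Promise<boolean>,
  log: (line: string) => void = console.log,
): Promise<number> {
  const max = maxPasses(dirtySpecCount);
  let pass = 0;
  while (pass < max) {
    pass += 1;
    log(`Pass ${pass}/${max}`);
    await runPass(pass);
    // Skip the continuation check once the cap has been reached.
    if (pass < max && !(await shouldContinue(pass))) break;
  }
  return pass; // number of passes actually executed
}
```

Keeping the continuation decision behind a callback lets interactive mode and autopilot share one loop.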
Rewrite the 'Compiling to a target' section to document:
- All CLI commands in a table
- Full build flags reference (including `--autopilot`)
- Interactive vs autopilot workflow examples
- Agent-driven per-component locking
- Multi-pass philosophy (depth over breadth)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Document the compile prompt command as an alternative to compile build, for users who want to feed prompts to external agents manually. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
Adds an automated `compile build` command that uses the GitHub Copilot SDK to compile TUIKit specs into target implementations, including an interactive CLI configuration flow, multi-pass refinement (with optional autopilot), and terminal rendering of agent summaries.
Changes:
- Implement Copilot SDK–driven build orchestration (session creation, streaming output, pass loop, metrics, summary rendering).
- Add interactive terminal prompts (model/effort/output dir) and autopilot multi-pass behavior.
- Update docs and dependencies to support the new build workflow and UI output.
| File | Description |
|---|---|
| `scripts/compile.ts` | Adds `compile build` command, Copilot SDK session orchestration, multi-pass loop, clack prompts, and agent summary rendering. |
| `package.json` | Adds dependencies for prompts, markdown rendering, and Copilot SDK. |
| `bun.lock` | Updates lockfile for the newly added dependencies. |
| `README.md` | Documents the new `compile build` workflow, flags, multi-pass behavior, and manual prompt usage. |
Copilot's findings
Comments suppressed due to low confidence (4)
scripts/compile.ts:1200
- The help text says `--no-lock` "Skip auto-lock after successful build", but the build flow doesn't perform any auto-locking in this script (locking only happens if `compile lock` is run). Update the flag description (or implement auto-locking) so CLI help matches actual behavior.

      --effort <level>  Reasoning effort: low | medium | high | xhigh (default: high)
      --verbose         Show full agent transcript (raw streaming output)
      --no-lock         Skip auto-lock after successful build
      --autopilot       Auto-run passes without confirmation (max: components + 5)
README.md:292
- The README says the prompt is written to `<out>/_compile-prompt.md`, but `compile prompt` currently writes to `<out>/<target>/_compile-prompt.md` (e.g. `dist/go/_compile-prompt.md`). Please update this path (and related examples) to match the actual output location, or change the command to write directly into `<out>/`.

      The prompt is written to `<out>/_compile-prompt.md`. It contains:
README.md:301
- This note says the agent session should set its working directory to the dist target folder so relative spec paths resolve, but the generated prompt references specs like `docs/schema.md`, `tokens/...`, and `components/...` (relative to the repo root). Running from `dist/<target>` would make those paths incorrect. Either update the prompt to emit paths relative to the dist folder, or change this guidance to say the working directory must be the repo root (`SPECS_DIR`).

      > **Important:** Any coding session that uses this prompt should set its working
      > directory to the dist target folder (where `_compile-prompt.md` lives). The
      > prompt references spec files using relative paths that resolve correctly only
      > from that location.
scripts/compile.ts:768
- `--no-lock` is passed into `cmdBuild()`, but the system prompt still instructs the agent to run `compile lock` after finishing components/tokens. If `--no-lock` is meant to prevent locking, conditionally omit these instructions (or block lock writes via the permission handler) when the flag is set.

      LOCKING COMPLETED COMPONENTS:
      After you fully complete a component (implementation + tests passing + demo wired),
      lock it by running:
        bun run compile lock --target ${target} --component <Name>
      This records the component as compiled so it won't be recompiled in future runs.
- Files reviewed: 3/4 changed files
- Comments generated: 4
    model: config.model,
    onPermissionRequest: approveAll,
    streaming: true,
Using approveAll for onPermissionRequest means the agent can perform arbitrary file and shell operations once the session starts, including outside the intended output directory. Consider using a custom permission handler that restricts writes/deletes/commands to outDir (and target-specific test/demo commands) to reduce the blast radius, especially for --autopilot/CI use.
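One building block for such a handler is a path-containment check. The sketch below is a standalone helper, not a drop-in replacement for `approveAll`: the SDK's actual permission request and response shapes are not shown in this PR, so `FileRequest` and `decide` are hypothetical. It assumes relative paths from the agent are resolved against an explicit working directory.

```typescript
import { isAbsolute, relative, resolve, sep } from "node:path";

// True when `candidate` resolves to `rootDir` itself or to a path beneath it.
// Both paths are resolved against `cwd` (e.g. the repo root).
function isInsideDir(rootDir: string, candidate: string, cwd: string): boolean {
  const root = resolve(cwd, rootDir);
  const target = resolve(cwd, candidate);
  const rel = relative(root, target);
  if (rel === "") return true;
  if (isAbsolute(rel)) return false; // e.g. a different drive on Windows
  return rel !== ".." && !rel.startsWith(".." + sep);
}

// Hypothetical request shape (assumption, not the SDK type).
type FileRequest = { kind: "write" | "delete"; path: string };

function decide(outDir: string, request: FileRequest, cwd: string): "approve" | "deny" {
  return isInsideDir(outDir, request.path, cwd) ? "approve" : "deny";
}
```

This check is purely lexical and does not resolve symlinks, so a symlink inside `outDir` pointing elsewhere would still pass; shell commands would need a separate allow-list.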
    ` Output: ${relative(SPECS_DIR, outDir)}/`,
    noLock ? ` Lock: skipped (--no-lock)` : ` Lock: ${relative(SPECS_DIR, lockPath(target))} updated`,
    ].join("\n");
printSummary() reports the lock file as "updated" whenever --no-lock is not set, but cmdBuild() never calls writeLock()/cmdLock() (locking is only done if the agent happens to run compile lock). This makes the summary misleading and the --no-lock flag effectively a no-op. Either implement an actual auto-lock step (e.g., only when the session completed without errors) or change the summary/flag semantics to reflect agent-driven locking.
This issue also appears in the following locations of the same file:
- line 764
- line 1197
…prompt path docs
- `--no-lock` now conditionally omits lock instructions from system prompt
- Updated help text to reflect actual behavior
- Fixed README prompt path: `<out>/<target>/_compile-prompt.md`
- Fixed working directory guidance: repo root, not dist folder

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
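A sketch of how lock instructions might be omitted under `--no-lock`. The function names and the `base` prompt parameter are hypothetical; the instruction text is taken from the snippet quoted in the review.

```typescript
// Lock instructions as quoted in the review comment, parameterized by target.
function lockInstructions(target: string): string {
  return [
    "LOCKING COMPLETED COMPONENTS:",
    "After you fully complete a component (implementation + tests passing + demo wired),",
    "lock it by running:",
    `  bun run compile lock --target ${target} --component <Name>`,
    "This records the component as compiled so it won't be recompiled in future runs.",
  ].join("\n");
}

// With noLock set, the agent never sees the locking section at all.
function buildSystemPrompt(base: string, target: string, noLock: boolean): string {
  return noLock ? base : `${base}\n\n${lockInstructions(target)}`;
}
```

Dropping the section from the prompt only removes the instruction; it does not stop an agent from running `compile lock` on its own, which is why the review also suggests blocking lock writes in the permission handler.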
Set `session.rpc.mode.set()` based on `--autopilot` flag:
- Default: 'interactive' mode, user confirms each pass
- `--autopilot`: 'autopilot' mode, agent runs autonomously

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…behavior Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…m loop) Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Motivation
The TUIKit spec compiler previously only generated prompt files for manual copy-paste into external agents. This PR adds a fully automated `compile build` command that uses the GitHub Copilot SDK to compile component specs into target code, including multi-pass refinement, interactive CLI prompts, and an unattended autopilot mode.

Approach
The core addition is `compile build`, which orchestrates a Copilot SDK session to transform component specs into working code for any target (Bun, Node, Go, Rust). Key design decisions:
- Agent-driven locking: the agent runs `compile lock --component <Name>` to bank completed work. Subsequent passes only recompile dirty (unlocked) specs.
- Autopilot (`--autopilot`): auto-continues passes without user confirmation, capped at `dirty_count + 5` max attempts. Useful for CI or unattended builds.
- Interactive prompts: replaces the `gum` dependency with a pure JS solution; model picker, reasoning effort, output dir, and pass confirmation all work in-terminal with no external binaries.
- Agent summary: rendered with `marked` + `marked-terminal` + `boxen`.

Notable details
- `reasoningEffort` is only passed to models that support it (checked via `ModelInfo.supportedReasoningEfforts`).

Files changed
- `scripts/compile.ts` -- Main compiler: SDK integration, prompt generation, build loop, autopilot, agent locking, summary rendering
- `package.json` -- Added `@clack/prompts`, `marked`, `marked-terminal`, `boxen`
- `README.md` -- Comprehensive docs: commands table, build flags, autopilot, multi-pass workflow, manual prompt generation
- `bun.lock` -- Updated lockfile