opencode: A Deep Dive into Harness Techniques

A detailed analysis of the agent-loop scaffolding in the anomalyco/opencode repo (aka sst/opencode). It covers 28 techniques, each with real code from the repo, a generic code example, pros and cons, and reference articles.

Date: 2026-04-20 · Author: Nghĩa & Cowork · Repo: anomalyco/opencode · Commit analyzed: dev branch, ~15k LOC in the core harness

1. opencode Overview

opencode is an open-source AI coding agent that runs in the terminal, in IDEs, and on the desktop. The anomalyco/opencode repo is the main fork of sst/opencode and contains many packages: the core agent (packages/opencode), the desktop app (Tauri + Electron), a web console, the plugin SDK, infrastructure (SST), and more.

This report focuses only on the harness of the core agent: the scaffolding around the LLM call that turns a bare model into a working coding agent. The scope covers 6 main subsystems:

session/: agent loop, streaming processor, compaction, retry
tool/: tool registry, tool implementations, truncation
provider/: abstraction over 20+ LLM providers, message transformation, error classification
permission/: permission engine with wildcard + arity
agent/: agent definitions (build, plan, custom subagents)
mcp/, plugin/: extension points

What is a harness?

The coding-agent community increasingly uses the term "harness" as shorthand for everything that is not the model, that is: Agent = Model + Harness. Harness engineering is a subset of context engineering, centered on managing the context window, tool orchestration, state persistence, error recovery, verification, safety, and lifecycle.

References on the harness concept: Martin Fowler, Harness engineering for coding agent users · OpenAI, Harness engineering: leveraging Codex in an agent-first world · HumanLayer, Skill Issue: Harness Engineering for Coding Agents · Avi Chawla, The Anatomy of an Agent Harness.

Overall architecture

┌─────────────────────────────────────────────────────────────────────┐
│                       User input / CLI / TUI                        │
└────────────────────────────────┬────────────────────────────────────┘
                                 │
                    ┌────────────▼────────────┐
                    │   session/prompt.ts     │  ← Main agent loop (runLoop)
                    │  (ReAct-style, step++)  │    Detect finish reason, retry, compact
                    └────┬──────────────┬─────┘
                         │              │
         ┌───────────────▼────┐      ┌──▼─────────────────┐
         │  session/llm.ts    │      │ session/processor  │  ← Handles stream events:
         │  streamText() wrap │      │ Effect.Stream pipe │    reasoning/text/tool/finish
         └────────┬───────────┘      └────────┬───────────┘
                  │                           │
         ┌────────▼───────────┐      ┌────────▼───────────┐
         │  provider/         │      │  tool/registry     │  ← Resolves available tools
         │  SDK lazy-loaded   │      │  + permission gate │    per-agent, with truncation
         └────────┬───────────┘      └────────┬───────────┘
                  │                           │
         ┌────────▼───────────┐      ┌────────▼───────────┐
         │ provider/transform │      │  tool/*.ts         │  ← bash, edit, read, task,
         │ per-provider quirk │      │  (each with .txt)  │    skill, plan, webfetch, ...
         └────────┬───────────┘      └────────┬───────────┘
                  │                           │
         ┌────────▼───────────────────────────▼────────────┐
         │  permission/index.ts    (wildcard + arity)      │
         │  session/compaction.ts  (overflow → prune)      │
         │  session/retry.ts       (exp backoff + headers) │
         └─────────────────────────────────────────────────┘

Tech stack

Layer	Technology	Notes
Runtime	Bun + Node	Bun primary, Node compatible
Language	TypeScript	Turborepo monorepo
Async/Effect	Effect v4 (beta)	Core pattern throughout
AI SDK	Vercel AI SDK (`streamText`)	Unified LLM interface
Validation	Zod + Effect Schema	Tool inputs + config
DB	Drizzle ORM + SQLite	Session persistence
Parsing	tree-sitter WASM	Bash + PowerShell for permission scan
Observability	OpenTelemetry	Native span instrumentation
MCP	Model Context Protocol	External tool servers

Summary table of the 28 techniques

ID	Technique	Theme
T1	ReAct loop with finish-reason-aware exit	Loop
T2	Streaming event demultiplexer	Loop
T3	Deferred tool-call coordination	Loop
T4	Interruption-safe scope cleanup	Loop
T5	Synthetic system reminder	Loop
T6	Doom loop detection	Loop
T7	Token overflow with reserved buffer (cache-aware)	Context
T8	Tail-turn-preserving compaction	Context
T9	Tool output pruning with protected tools	Context
T10	Cache-aware 2-part system prompt	Context
T11	Auto compaction continuation (replay/continue)	Context
T12	Structured compaction template	Context
T13	.txt tool description pattern	Tool
T14	Output truncation + file spill	Tool
T15	Zod schema validation + custom errors	Tool
T16	Fuzzy edit matching 3-tier	Tool
T17	Bash tree-sitter scan for permission	Tool
T18	Sub-agent via Task tool	Tool
T19	Plugin tool dynamic discovery	Tool
T20	OTel tracing + metadata pipeline	Tool
T21	Multi-provider SDK lazy loading	Provider
T22	Provider-specific message transformation	Provider
T23	Retry + overflow pattern detection	Provider
T24	Wildcard + session permission state	Permission
T25	Arity-based command normalization	Permission
T26	Per-tool permission context extraction	Permission
T27	Dynamic env + skill injection	Prompt
T28	AGENTS.md / CLAUDE.md cascading	Prompt

A. Agent Loop & Streaming (6 techniques)

This is the heart of the harness: the agent loop and the streaming event pipeline. opencode pays particular attention to edge cases that naive harnesses tend to miss: the model returning finish_reason=stop while tool calls are still pending, the stream being interrupted midway, and tool calls repeating indefinitely.

T1. ReAct loop with finish-reason-aware exit

A.1

File: session/prompt.ts · Function: runLoop() · Lines: 1305–1530

Code from opencode

while (true) {
  yield* status.set(sessionID, { type: "busy" })
  yield* slog.info("loop", { step })
  let msgs = yield* MessageV2.filterCompactedEffect(sessionID)

  // Detect finish condition: check AFTER tool execution, not before
  if (
    lastAssistant?.finish &&
    !["tool-calls"].includes(lastAssistant.finish) &&
    !hasToolCalls &&
    lastUser.id < lastAssistant.id
  ) {
    yield* slog.info("exiting loop")
    break
  }

  step++
  const result = yield* handle.process(streamInput)
  if (result === "stop") return "break"
  if (result === "compact") yield* compaction.create(...)
}

Why it matters: Many naive harnesses exit the loop as soon as they see finish_reason === "stop". opencode checks the finish reason after tools have executed, and only exits when !hasToolCalls. This survives models that sometimes return stop while tool calls are still pending (a common quirk of some providers) and ensures tool results are always sent back.

Code example (generic)

// Naive (wrong):
while (true) {
  const resp = await llm.call(messages)
  if (resp.finishReason === "stop") break // may exit with unfinished tool_calls!
  for (const tc of resp.toolCalls) messages.push(await executeToolCall(tc))
}

// Opencode-style (correct):
while (true) {
  const resp = await llm.call(messages)
  const hasToolCalls = resp.toolCalls.length > 0
  if (hasToolCalls) {
    for (const tc of resp.toolCalls) messages.push(await executeToolCall(tc))
    continue // always loop again after executing tools
  }
  if (["stop", "end_turn"].includes(resp.finishReason)) break
  // Other finish reasons (e.g. "length") loop again here; add a step cap or explicit handling
}
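The difference between the two loops can be checked without a real model. The following is a minimal, synchronous simulation under stated assumptions: ModelResponse, scriptedModel, runNaive, and runAware are hypothetical names invented for this sketch (they are not opencode APIs), and the "quirky provider" is simply a canned response that reports stop while still carrying a tool call.

```typescript
// Hypothetical types for the simulation; not opencode's real types.
type SimToolCall = { name: string; input: string }
type ModelResponse = { finishReason: "stop" | "tool-calls"; toolCalls: SimToolCall[] }

// A scripted "model": returns the next canned response on every call
// and keeps returning the last one once the script is exhausted.
function scriptedModel(responses: ModelResponse[]): () => ModelResponse {
  let i = 0
  return () => {
    const next = responses[Math.min(i++, responses.length - 1)]
    if (!next) throw new Error("script is empty")
    return next
  }
}

// Naive loop: trusts finishReason and may drop pending tool calls.
function runNaive(call: () => ModelResponse): string[] {
  const executed: string[] = []
  for (let step = 0; step < 10; step++) {
    const resp = call()
    if (resp.finishReason === "stop") break
    for (const tc of resp.toolCalls) executed.push(tc.name)
  }
  return executed
}

// Finish-reason-aware loop: pending tool calls always win over the finish reason.
function runAware(call: () => ModelResponse): string[] {
  const executed: string[] = []
  for (let step = 0; step < 10; step++) { // step cap as a safety net
    const resp = call()
    if (resp.toolCalls.length > 0) {
      for (const tc of resp.toolCalls) executed.push(tc.name)
      continue
    }
    break
  }
  return executed
}

// Quirky provider: the first response says "stop" but still carries a tool call.
const script: ModelResponse[] = [
  { finishReason: "stop", toolCalls: [{ name: "read", input: "a.ts" }] },
  { finishReason: "stop", toolCalls: [] },
]

const naiveExecuted = runNaive(scriptedModel(script))
const awareExecuted = runAware(scriptedModel(script))
console.log(naiveExecuted.length) // 0: the tool call was dropped
console.log(awareExecuted.length) // 1: "read" was executed before exiting
```

The step cap in both loops stands in for the explicit termination guard that a real harness needs alongside this exit rule.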

Pros

Tolerates provider quirks (stop instead of tool-calls)
Tool results are never dropped
Explicit logic, easy to debug

Cons

Needs explicit termination (risk of an infinite loop; see T6)
One extra, redundant LLM call when the model already intended to stop
Hard to test: state grows complex across steps

References

T1 deep dive: code anatomy, failure modes, harness comparison, implementation recipe

T2. Streaming event demultiplexer

A.2

File: session/processor.ts · Function: handleEvent() · Lines: 216–461 · LLM wrap: session/llm.ts L54–94

Code from opencode

const handleEvent = Effect.fnUntraced(function* (value: StreamEvent) {
  switch (value.type) {
    case "reasoning-start":
      ctx.reasoningMap[value.id] = { id: PartID.ascending(), text: "", time: { start: Date.now() } }
      yield* session.updatePart(ctx.reasoningMap[value.id])
      return
    case "reasoning-delta":
      ctx.reasoningMap[value.id].text += value.text
      yield* session.updatePartDelta({ .../* delta */, delta: value.text })
      return
    case "tool-input-start":
      const part = yield* session.updatePart({ status: "pending", input: {} })
      ctx.toolcalls[value.id] = { done: yield* Deferred.make<void>(), partID: part.id }
      return
    case "tool-call":
      yield* updateToolCall(value.toolCallId, (m) => ({ ...m, status: "running", input: value.input }))
      // Doom loop check: see T6
      return
  }
})

The stream is wrapped in Effect.Stream (session/llm.ts):

// session/processor.ts:544-554
const stream = llm.stream(streamInput)
yield* stream.pipe(
  Stream.tap((event) => handleEvent(event)),
  Stream.takeUntil(() => ctx.needsCompaction),
  Stream.runDrain,
)

Why it matters: The AI SDK returns an async iterable; opencode wraps it in Effect.Stream to get: (1) interruption via Stream.takeUntil, (2) per-event transactional handling, (3) composable retry, (4) cancellation through the scope. Each event type has its own handler, so the UI updates in real time (tool transitions pending→running→completed) instead of waiting for the full response.

Code example (generic)

// Demultiplex stream events by type:
async function processStream(stream: AsyncIterable<Event>, ctx: Context) {
  for await (const ev of stream) {
    switch (ev.type) {
      case "text-delta":
        ctx.onText?.(ev.delta)
        break
      case "tool-input-start":
        ctx.pendingTools.set(ev.id, new Deferred())
        break
      case "tool-call":
        ctx.onToolCall?.(ev)
        break
      case "finish":
        ctx.finish = ev.finishReason
        break
    }
    if (ctx.shouldStop) break // cooperative cancel
  }
}

Pros

Real-time UI updates (tool transitions, reasoning deltas)
Cooperative cancellation is trivial
Explicit per-event state mutation
Composable with retry, timeout, tracing

Cons

More complex state machine (many partial states)
Coupled to the provider SDK's event format
Partial text must be tracked carefully so it can be reassembled

References

T2 deep dive: event types, Effect.Stream, real-time UI updates, cancel semantics

T3. Deferred tool-call coordination

A.3

File: session/processor.ts · Lines: 134–195, 273–278, 544–582

Code from opencode

type ToolCall = {
  partID: MessageV2.ToolPart["id"]
  messageID: MessageV2.ToolPart["messageID"]
  sessionID: MessageV2.ToolPart["sessionID"]
  done: Deferred.Deferred<void> // ← sync point
}
ctx.toolcalls: Record<string, ToolCall> = {}

// On tool-input-start event:
ctx.toolcalls[value.id] = { done: yield* Deferred.make<void>(), partID: part.id }

// Tool executor completes it:
yield* Deferred.succeed(done, undefined)

// Cleanup waits for all in-flight tools (but with timeout):
yield* Effect.forEach(
  Object.values(ctx.toolcalls),
  (call) => Deferred.await(call.done).pipe(Effect.timeout("250 millis"), Effect.ignore),
  { concurrency: "unbounded" },
)

Why it matters: AI SDK tool handlers are async, while LLM stream events are consumed sequentially, one event at a time. A sync point is needed between "stream finished" and "all tools done". Deferred is that bridge. The 250 ms timeout ensures a stuck tool cannot block the loop forever.

Code example (generic)

// Using native Promise as deferred:
const pendingTools = new Map<string, { resolve: () => void; promise: Promise<void> }>()

function trackToolCall(id: string) {
  let resolve!: () => void
  const promise = new Promise<void>((r) => { resolve = r })
  pendingTools.set(id, { resolve, promise })
}

async function waitForAllTools(timeoutMs = 250) {
  const promises = [...pendingTools.values()].map((p) =>
    Promise.race([p.promise, new Promise((res) => setTimeout(res, timeoutMs))]),
  )
  await Promise.all(promises)
  pendingTools.clear()
}

Pros

Does not block the stream consumer while tools run asynchronously
Timeout protects against stuck tools
A clear "nothing pending" point before the loop continues

Cons

Manual Deferred mapping can leak if cleanup is forgotten
The 250 ms timeout is arbitrary and may be too short for heavy tools
Unbounded concurrency can overload the system

References

T3 deep dive: Deferred pattern, concurrency, timeout tuning, native Promise alternative

T4. Interruption-safe scope cleanup

A.4

File: session/processor.ts · Lines: 544–582

Code from opencode

return yield* Effect.gen(function* () {
  yield* Effect.gen(function* () {
    ctx.currentText = undefined
    ctx.reasoningMap = {}
    const stream = llm.stream(streamInput)
    yield* stream.pipe(
      Stream.tap((event) => handleEvent(event)),
      Stream.takeUntil(() => ctx.needsCompaction),
      Stream.runDrain,
    )
  }).pipe(
    Effect.onInterrupt(() =>
      Effect.gen(function* () {
        aborted = true
        if (!ctx.assistantMessage.error) {
          yield* halt(new DOMException("Aborted", "AbortError"))
        }
      }),
    ),
    Effect.catchCauseIf(
      (cause) => !Cause.hasInterruptsOnly(cause),
      (cause) => Effect.fail(Cause.squash(cause)),
    ),
    Effect.retry(...),
    Effect.catch(halt),
    Effect.ensuring(cleanup()), // ← always runs
  )
})

Why it matters: Whether the user cancels mid-stream (Ctrl+C), a retry fires, or an unexpected error occurs, cleanup always runs. Cleanup closes pending tool calls with status "aborted", finalizes partial text, and saves state. Effect makes these semantics transactional, which is rare in other agent harnesses (they usually have to manage AbortController + try/finally by hand).

Code example (generic)

// Vanilla JS with try/finally:
async function streamWithCleanup(signal: AbortSignal) {
  const ctx = createContext()
  try {
    for await (const ev of llm.stream({ signal })) {
      if (signal.aborted) throw new DOMException("Aborted", "AbortError")
      await handleEvent(ev, ctx)
    }
  } catch (err) {
    if ((err as Error).name === "AbortError") ctx.markAborted()
    else throw err
  } finally {
    await ctx.finalizeAllPendingTools() // always runs
  }
}

Pros

Safe against resource leaks (file handles, DB connections, tool child processes)
The UI is not left with a "stuck tool pending" after cancel
Effect.ensuring avoids try/catch hell

Cons

Requires Effect or an equivalent abstraction
Interrupt semantics are unfamiliar to Promise-based developers
Cleanup logic is easy to miss when there are many exit paths

References

T4 deep dive: Effect.ensuring, resource leaks, AbortController pattern, async cleanup

T5. Synthetic system reminder for mid-turn messages

A.5

File: session/prompt.ts · Lines: 1453–1468

Code from opencode

if (step > 1 && lastFinished) {
  for (const m of msgs) {
    if (m.info.role !== "user" || m.info.id <= lastFinished.id) continue
    for (const p of m.parts) {
      if (p.type !== "text" || p.ignored || p.synthetic) continue
      if (!p.text.trim()) continue
      p.text = [
        "<system-reminder>",
        "The user sent the following message:",
        p.text,
        "",
        "Please address this message and continue with your tasks.",
        "</system-reminder>",
      ].join("\n")
    }
  }
}

Why it matters: When the user types another message while the agent is working (a multi-step loop is running), the model can easily overlook the new input if it arrives mixed into ordinary history. Wrapping it in <system-reminder> stresses that the new input needs to be addressed. The wrap is only applied when step > 1 (step 1 does not need it, because the user message is the original context).

Code example (generic)

function markMidTurnUserMessages(messages: Message[], lastAssistantId: string, step: number) {
  if (step <= 1) return messages
  return messages.map((m) => {
    if (m.role !== "user" || m.id <= lastAssistantId) return m
    return {
      ...m,
      content: `<system-reminder>\nNew message: ${m.content}\nPlease address.\n</system-reminder>`,
    }
  })
}

Pros

Ensures new user input is not missed
Does not require the model to support real interrupts
Cheap (just prompt wrapping)

Cons

Token overhead for the wrapper tags
The model may reply "I acknowledge" and then actually ignore the message
Wrapping too much dilutes the main system prompt

References

Anthropic — Claude Code auto mode (discusses system reminder patterns)
HumanLayer — Harness Engineering (context re-injection)

T5 deep dive: XML wrapping, step guard, model attention, prompt engineering

T6. Doom loop detection

A.6

File: session/processor.ts (in the handleEvent tool-call case) · PR: #3445 fix: add doom loop detection

Code from opencode (concept)

// On a tool-call event:
case "tool-call": {
  yield* updateToolCall(value.toolCallId, (m) => ({ ...m, status: "running", input: value.input }))
  // Doom loop: the last 3 tool calls are identical (name + input)
  const recentParts = parts.slice(-3)
  if (
    recentParts.every(
      (p) => p.tool === value.toolName && JSON.stringify(p.state.input) === JSON.stringify(value.input),
    )
  ) {
    yield* permission.ask({
      permission: "doom_loop",
      patterns: [value.toolName],
      always: [],
      metadata: { input: value.input },
    })
  }
  return
}

Why it matters: LLMs sometimes get stuck calling the same tool with the same arguments (for example, repeatedly reading a file that does not exist). Without a guard, the agent keeps burning tokens and money. opencode detects 3 repeats and prompts the user for confirmation (permission type doom_loop), so the user can abort or allow the run to continue.

Code example (generic)

const MAX_REPEATS = 3

function isDoomLoop(history: ToolCall[], next: ToolCall): boolean {
  const recent = history.slice(-MAX_REPEATS)
  if (recent.length < MAX_REPEATS) return false
  const signature = (tc: ToolCall) => `${tc.name}:${JSON.stringify(tc.input)}`
  const target = signature(next)
  return recent.every((tc) => signature(tc) === target)
}

// Usage:
if (isDoomLoop(history, nextCall)) {
  const ok = await askUser("Detected repeated tool call. Continue?")
  if (!ok) throw new Error("Doom loop aborted")
}
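One caveat of comparing JSON.stringify output is that it depends on key insertion order, so the same arguments emitted with keys in a different order would produce different signatures and the repeat would go unnoticed. A minimal sketch of an order-insensitive signature follows; stableStringify and toolSignature are hypothetical helpers written for this illustration, not opencode functions, and whether a given model actually reorders keys is an assumption.

```typescript
// Hypothetical helper: serialize a JSON-like value with object keys sorted,
// so that key order does not affect the resulting string.
function stableStringify(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map((item) => stableStringify(item)).join(",")}]`
  }
  if (value !== null && typeof value === "object") {
    const obj = value as Record<string, unknown>
    const entries = Object.keys(obj)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${stableStringify(obj[key])}`)
    return `{${entries.join(",")}}`
  }
  // JSON.stringify returns undefined for undefined/functions; fall back to "null".
  return JSON.stringify(value) ?? "null"
}

// Signature used to compare tool calls: tool name plus canonicalized input.
function toolSignature(name: string, input: unknown): string {
  return `${name}:${stableStringify(input)}`
}

const sigA = toolSignature("read", { path: "a.ts", limit: 10 })
const sigB = toolSignature("read", { limit: 10, path: "a.ts" })
const plainA = JSON.stringify({ path: "a.ts", limit: 10 })
const plainB = JSON.stringify({ limit: 10, path: "a.ts" })

console.log(sigA === sigB)     // true: key order is ignored
console.log(plainA === plainB) // false: plain JSON.stringify is order-sensitive
```

This only addresses reordered keys; genuinely different arguments (the near-duplicate case) still produce different signatures.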

Pros

Avoids wasted cost and time
Triggers only when there is a real problem (identical tool + argument signature)
The user can still continue if they want

Cons

False positives on legitimate retries (e.g. an LSP tool called several times on the same file)
Threshold = 3 is arbitrary and may be too tight or too loose
Does not detect near-duplicates (slightly changed arguments)


B. Context Management: Keeping the context window from burning out

Every model has a limited context window. A session lasting several hours, with hundreds of tool calls, can accumulate hundreds of thousands of tokens. opencode takes an "active defense" approach: it counts tokens proactively, compacts when a threshold is reached, and keeps the most important parts (the final turns plus tools whose output must not be compacted).

T7. Overflow detection with reserved buffer (cache-aware)

B.1

File: session/overflow.ts · Lines: 1-26

Code from opencode

export const DEFAULT_RESERVED_OUTPUT_TOKENS = 20_000

export function computeUsable(modelLimit: number, reserved = DEFAULT_RESERVED_OUTPUT_TOKENS) {
  return Math.max(0, modelLimit - reserved)
}

/**
 * Total context tokens = input + cache.read + cache.write.
 * With Anthropic, cache writes/reads still occupy space in the context window,
 * so all of them must be summed to assess overflow accurately.
 */
export function totalContextTokens(usage: Usage): number {
  return (usage.input ?? 0) + (usage.cache?.read ?? 0) + (usage.cache?.write ?? 0)
}

export function shouldCompact(usage: Usage, modelLimit: number, reserved?: number) {
  const usable = computeUsable(modelLimit, reserved)
  const used = totalContextTokens(usage)
  return used >= usable
}

Why it matters: Many harnesses count only the input_tokens returned by the API and forget cached tokens, which leads to unexpected overflow. opencode counts cache reads and writes toward the context when comparing against the limit, and reserves 20k tokens for output so the model has room to generate its next response.

Code example (generic)

interface Usage {
  input: number
  output: number
  cache?: { read: number; write: number }
}

const RESERVED_OUTPUT = 20_000 // always reserved for the completion

function needsCompaction(usage: Usage, contextWindow: number): boolean {
  const total = usage.input + (usage.cache?.read ?? 0) + (usage.cache?.write ?? 0)
  const threshold = contextWindow - RESERVED_OUTPUT
  return total >= threshold
}

// Use after every model call
if (needsCompaction(lastUsage, model.contextWindow)) {
  await compactConversation(session)
}
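A worked example makes the gap between input-only and cache-aware counting concrete. The numbers below are invented for illustration: a 200k-token window is assumed, and the usage object assumes the provider reports input separately from cached tokens (as the comment in the opencode excerpt describes for Anthropic). CachedUsage and contextTokens are names made up for this sketch.

```typescript
type CachedUsage = { input: number; cache?: { read: number; write: number } }

const CONTEXT_WINDOW = 200_000 // assumed model limit for this example
const RESERVED_FOR_OUTPUT = 20_000 // same reserve as in the text

// Sum everything that occupies the context window.
function contextTokens(usage: CachedUsage): number {
  return usage.input + (usage.cache?.read ?? 0) + (usage.cache?.write ?? 0)
}

// A long session where most of the prompt is served from cache (made-up numbers).
const sampleUsage: CachedUsage = { input: 40_000, cache: { read: 130_000, write: 12_000 } }

const usableTokens = CONTEXT_WINDOW - RESERVED_FOR_OUTPUT // 200_000 - 20_000 = 180_000
const usedTokens = contextTokens(sampleUsage) // 40_000 + 130_000 + 12_000 = 182_000

const inputOnlyOverflow = sampleUsage.input >= usableTokens // 40_000 >= 180_000 is false
const cacheAwareOverflow = usedTokens >= usableTokens // 182_000 >= 180_000 is true

console.log({ usableTokens, usedTokens, inputOnlyOverflow, cacheAwareOverflow })
```

With these numbers an input-only check reports plenty of headroom while the cache-aware check already calls for compaction.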

Pros

Counts the real context (including cache) and avoids confusing "billing tokens" with "context tokens"
The reserved output buffer gives the model room to answer in full instead of being truncated midway
Simple logic that is easy to tune per provider (with some providers, cache does not occupy context)

Cons

A fixed 20k reserve may be too small for long responses (large code generation) or too large for short Q&A
It does not account separately for tool schemas / the system prompt already inside the input, so it can miscount when those parts change dynamically
When the provider API reports inaccurate usage (Bedrock, some gateways), the count will be off

References

T7 deep dive: cache-aware counting, reserved buffer, per-provider differences, failure modes

T8. Tail-turn preserving compaction

B.2

File: session/compaction.ts · Lines: 33-170

Code from opencode

// Budget for the "tail" that is kept verbatim, not compacted
const tailBudget = Math.min(
  Math.max(Math.floor(usable * 0.25), 2_000),
  8_000,
)

// Walk backward from the end, keeping turns until the budget is exceeded
const tail: Message[] = []
let tailTokens = 0
for (let i = messages.length - 1; i >= 0; i--) {
  const turnTokens = countTokens(messages[i])
  if (tailTokens + turnTokens > tailBudget && tail.length >= MIN_TAIL) break
  tail.unshift(messages[i])
  tailTokens += turnTokens
}

// Compact the head into a summary
const head = messages.slice(0, messages.length - tail.length)
const summary = await model.complete({
  system: COMPACTION_PROMPT,
  messages: head,
})

return [
  { role: "system", content: `<prior-conversation-summary>\n${summary}\n</prior-conversation-summary>` },
  ...tail,
]

Why it matters: If everything is compacted wholesale, the model loses the context of its most recent instructions (which file it is reading, what the user just asked). Keeping a tail (25% of the usable budget, min 2k, max 8k tokens) preserves this immediate context while the older history is compressed.

Code example (generic)

async function compactWithTail(messages: Msg[], usableTokens: number) {
  const tailBudget = Math.min(Math.max(usableTokens * 0.25, 2000), 8000)
  let tailTokens = 0
  const tail: Msg[] = []
  for (let i = messages.length - 1; i >= 0; i--) {
    const t = countTokens(messages[i])
    if (tailTokens + t > tailBudget && tail.length >= 2) break
    tail.unshift(messages[i])
    tailTokens += t
  }
  const head = messages.slice(0, messages.length - tail.length)
  const summary = await summarize(head) // one LLM call for the head
  return [systemSummary(summary), ...tail] // stitch back together
}
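To see how the clamp and the backward walk behave, the budget formula can be evaluated on a few window sizes and on made-up per-message token counts. This is a sketch under assumptions: tailBudgetFor and tailLength are hypothetical helpers that mirror the formula in the text, and the token counts are invented.

```typescript
// Same clamp as in the text: 25% of the usable window, floor 2k, cap 8k.
function tailBudgetFor(usableTokens: number): number {
  return Math.min(Math.max(Math.floor(usableTokens * 0.25), 2_000), 8_000)
}

// Backward walk over per-message token counts (oldest first).
// Returns how many messages end up in the verbatim tail.
function tailLength(tokenCounts: number[], budget: number, minTail = 2): number {
  let used = 0
  let kept = 0
  for (let i = tokenCounts.length - 1; i >= 0; i--) {
    const t = tokenCounts[i] ?? 0
    if (used + t > budget && kept >= minTail) break
    used += t
    kept++
  }
  return kept
}

console.log(tailBudgetFor(4_000))   // 2000: floor applies (25% would be 1000)
console.log(tailBudgetFor(16_000))  // 4000: proportional range
console.log(tailBudgetFor(180_000)) // 8000: cap applies (25% would be 45000)

// Hypothetical token counts, oldest first.
const typical = [3_000, 1_500, 4_000, 2_500, 1_000]
console.log(tailLength(typical, 8_000)) // 3: 1000 + 2500 + 4000 = 7500 fits, adding 1500 would not

// Two oversized messages: minTail wins over the budget.
const oversized = [9_000, 9_000]
console.log(tailLength(oversized, 8_000)) // 2: both kept even though the tail exceeds 8000
```

Two things stand out: any usable window above 32k tokens hits the 8k cap, and the minimum tail length can push the tail past its nominal budget.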

Pros

Preserves "short-term memory": the most recent tool outputs and instructions are not lost
The proportional budget (25%) scales with the usable window, but only between the 2k floor and the 8k cap (usable windows of 8k to 32k tokens); larger windows all get the same 8k tail
MIN_TAIL = 2 keeps at least the last two messages verbatim, which normally includes the latest user message

Cons

The summary can hallucinate or omit important details from early in the conversation
25% is a magic number; code-gen-heavy workloads may need 40-50%
No mechanism to pin "important messages" (e.g. the user message that defines the original requirement)

References

T8 deep dive: tail budget algorithm, backward walk, MIN_TAIL, compaction pipeline

T9. Tool output pruning with protected tools

B.3

File: session/compaction.ts · Lines: 171-219

Code from opencode

const PROTECTED_TOOLS = new Set(["skill"]) // skill output is expensive → keep it intact

function pruneToolOutputs(messages: Message[]): Message[] {
  // Do not prune the last 2 user turns → the model needs the most recent tool output to act
  const recentUserTurnIndices = findLastUserTurnIndices(messages, 2)
  const skipBoundary = recentUserTurnIndices[0] ?? messages.length
  return messages.map((msg, idx) => {
    if (idx >= skipBoundary) return msg
    if (msg.role !== "tool") return msg
    if (PROTECTED_TOOLS.has(msg.toolName)) return msg
    // Replace the output with a placeholder + metadata marking it as compacted
    return {
      ...msg,
      content: `<tool-output-compacted tool="${msg.toolName}" />`,
      metadata: { ...msg.metadata, time: { compacted: Date.now() } },
    }
  })
}

Why it matters: Tool output (bash, grep, file reads...) is usually the longest part of a conversation. Pruning old output while keeping the tool calls and chat messages lets the model still see its "action history" without paying the token cost of the output. The protected tool list avoids deleting skill prompts (which often act as runtime instructions).

Code example (generic)

const PROTECTED = new Set(["skill", "memory"])

function pruneOldToolOutputs(msgs: Msg[]): Msg[] {
  const recentBoundary = lastNUserTurns(msgs, 2) // keep the last 2 turns intact
  return msgs.map((m, i) => {
    if (i >= recentBoundary) return m
    if (m.role !== "tool" || PROTECTED.has(m.toolName)) return m
    return {
      ...m,
      content: `[${m.toolName} output compacted]`,
      metadata: { compactedAt: Date.now() },
    }
  })
}
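The generic example leaves the boundary helper lastNUserTurns undefined. One possible sketch is shown below; the TurnMsg type and the choice to return 0 (prune nothing) when there are fewer than n user messages are assumptions of this illustration, and opencode's own findLastUserTurnIndices may behave differently.

```typescript
type TurnMsg = { role: "user" | "assistant" | "tool"; content: string }

// Returns the index of the n-th user message counted from the end.
// Everything at or after that index belongs to the last n user turns.
// Assumption: with fewer than n user messages nothing is old enough to prune, so return 0.
function lastNUserTurns(msgs: TurnMsg[], n: number): number {
  if (n <= 0) return msgs.length // no protected turns: everything is prunable
  let seen = 0
  for (let i = msgs.length - 1; i >= 0; i--) {
    if (msgs[i]?.role === "user") {
      seen++
      if (seen === n) return i
    }
  }
  return 0
}

const conversation: TurnMsg[] = [
  { role: "user", content: "find the bug" },      // 0
  { role: "tool", content: "...grep output..." }, // 1 (old: prunable)
  { role: "assistant", content: "found it" },     // 2
  { role: "user", content: "fix it" },            // 3 (boundary for n = 2)
  { role: "tool", content: "...edit output..." }, // 4 (recent: kept)
  { role: "user", content: "run the tests" },     // 5
  { role: "tool", content: "...test output..." }, // 6 (recent: kept)
]

console.log(lastNUserTurns(conversation, 2)) // 3
console.log(lastNUserTurns(conversation, 1)) // 5
console.log(lastNUserTurns(conversation, 5)) // 0: fewer than 5 user turns
```

With a boundary of 3, only the tool output at index 1 is eligible for pruning in this conversation.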

Pros

Cuts context significantly (tool output is usually the largest part) while keeping the "shape" of the conversation
The protected list lets you mark "high-value" outputs that must not be removed
The compactedAt metadata allows debugging or restoring later if needed

Cons

The model may get confused by [tool output compacted] if a nearby message refers to the result
Hard-coding PROTECTED_TOOLS = ["skill"] is not flexible enough for varied workloads
No partial prune mechanism (keep the first and last 200 lines); it is all-or-nothing

References

T9 deep dive: protected set, boundary detection, partial vs all-or-nothing, plugin config

T10. Cache-aware 2-part system prompt

B.4

File: session/system.ts + session/llm.ts:99-160

Code from opencode

// The system prompt is split into [header, rest]:
// - header: static part (identity, rules, tool descriptions) → cacheable
// - rest: dynamic part (env block, git status, skills list) → changes per request, not cached
function buildSystem(model: Model, ctx: Ctx): [string, string] {
  const header = renderTemplate(getModelTemplate(model)) // reusable
  const rest = [
    buildEnvBlock(ctx), // <env>directory, platform...</env>
    buildSkillsList(ctx.skills),
    buildProjectInstructions(ctx), // AGENTS.md
  ].join("\n\n")
  return [header, rest]
}

// When calling the API, put cache_control at the end of the header
const systemBlocks = [
  { type: "text", text: header, cache_control: { type: "ephemeral" } },
  { type: "text", text: rest }, // not cached
]

Why it matters: Anthropic allows at most 4 cache breakpoints per request, and the cache is invalidated as soon as a single byte before the breakpoint changes. Separating static from dynamic content keeps the cache hit rate high (an unchanged header is billed at the cache-read rate, roughly a 90% reduction on those tokens). Many other harnesses concatenate the whole system prompt into one string, so every message invalidates the cache.

Code example (generic)

// Anthropic message format
await anthropic.messages.create({
  model: "claude-sonnet-4",
  max_tokens: 4096, // required by the Messages API
  system: [
    {
      type: "text",
      text: STATIC_SYSTEM_PROMPT, // needs >~1024 tokens to be cacheable
      cache_control: { type: "ephemeral" }, // breakpoint
    },
    { type: "text", text: dynamicEnvBlock }, // not cached
  ],
  messages,
  tools: [...TOOLS, { /* last tool */ cache_control: { type: "ephemeral" } }],
})

Pros

Can cut input-token cost by 80-90% in long sessions with Anthropic
Lower latency (cache read ~100 ms vs. generation ~500 ms)
A clean pattern that is easy to apply to other providers (OpenAI also has prompt caching)

Cons

Only effective when the header is long enough (> 1024 tokens with Anthropic); anything shorter is not cached
A plugin transform hook can inadvertently change the header and cause a cache miss, so plugins must be careful
Not every provider supports this style of prompt caching (many self-hosted models have none, and Gemini uses its own context-caching mechanism)

References

T10 deep dive: static/dynamic split, cache breakpoints, plugin rejoin, cost savings

T11. Auto compaction continuation (replay/continue)

B.5

File: session/compaction.ts · Lines: 372-451

Code from opencode

// After compaction, the model must be "reminded" to continue the unfinished task
async function continueAfterCompaction(session: Session, lastUserMsg: Message) {
  if (lastUserMsg.hasMedia()) {
    // Media cannot be replayed → inject a text description
    return injectSyntheticUser(session, describeMedia(lastUserMsg))
  }
  if (isUserMessage(lastUserMsg) && !lastUserMsg.answered) {
    // The user message has not been answered yet → replay it verbatim
    return replayMessage(session, lastUserMsg)
  }
  // The model is mid-task but there is no new user message → prompt it to continue
  return injectSyntheticUser(session, "continue", {
    metadata: { compaction_continue: true },
  })
}

Why it matters: If compaction finishes without any further prompt, the model sees a conversation that looks "done" and stops. Replaying the user message or injecting a synthetic "continue" prompt keeps the agent on task. Handling media separately avoids resending images/PDFs (which wastes bandwidth) or failing because the media has expired.

Code example (generic)

async function resumeAfterCompaction(session: Session) {
  const last = session.lastMessage()
  if (last?.role === "user" && !last.responded) {
    // The user just asked and the model has not replied yet → replay
    await session.sendMessage(last)
    return
  }
  // The model is mid-task → inject "continue"
  await session.sendSystem({
    content: "continue",
    metadata: { kind: "compaction-continue" },
  })
}

Pros

The agent does not hang after compaction; there is always a next prompt
The compaction_continue metadata lets the UI hide it or distinguish it from real user messages
Handling media separately ensures large files are not sent again

Cons

The synthetic "continue" prompt sometimes makes the model repeat the summary instead of carrying on
The branching logic (has media / answered / else) is bug-prone when new message types are added
If the summary is missing information, the model will continue in the wrong direction

References

T11 deep dive: the 3 branches media/replay/continue, synthetic message metadata, edge cases

T12. Structured compaction template

B.6

File: session/compaction.ts · Lines: 277-299

Code from opencode

const DEFAULT_COMPACTION_TEMPLATE = `
You are compacting a coding agent conversation. Produce a dense summary with these sections:

## Goal
One sentence describing what the user ultimately wants.

## Instructions
Bullet list of rules the user gave (tone, constraints, tech choices).

## Discoveries
Facts learned about the codebase: file paths, symbols, patterns, gotchas.

## Accomplished
What has been done so far (edits, commands, decisions).

## Relevant files
List of files with 1-line description each.

Be factual. No filler. Max ${MAX_SUMMARY_TOKENS} tokens.
`

// Plugins can override it:
const template = plugins.compactionTemplate?.() ?? DEFAULT_COMPACTION_TEMPLATE

Why it matters: Free-form summaries tend to ramble and lack structure, so after compaction the model struggles to find information again. The template makes the LLM fill in 5 fixed sections (Goal / Instructions / Discoveries / Accomplished / Relevant files), so the model still has a "mental map" of the task afterward. Plugin override allows per-workflow customization.

Code example (generic)

const COMPACTION_TEMPLATE = `
Summarize the conversation above. Use EXACTLY these sections:
### Goal (1 sentence)
### Constraints (bullet list: user's rules/preferences)
### Relevant files (file path + 1 line)
### Done (what actions have been performed)
### Next (what remains)
`.trim()

async function compact(history: Msg[]) {
  return llm.complete({
    system: COMPACTION_TEMPLATE,
    messages: history,
    maxTokens: 2000,
  })
}
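Because a model may skip or rename sections, a harness can check the returned summary before accepting it. The following is a minimal sketch of such a check, assuming the "### " heading convention of the generic template above; REQUIRED_SECTIONS and missingSections are hypothetical names for this illustration and are not part of opencode.

```typescript
// Section names taken from the generic template above.
const REQUIRED_SECTIONS = ["Goal", "Constraints", "Relevant files", "Done", "Next"]

// Returns the required section names that have no matching "### <name>" heading.
// A heading matches if it starts with the section name (case-insensitive),
// so "### Goal (1 sentence)" still counts as "Goal".
function missingSections(summary: string, required: string[] = REQUIRED_SECTIONS): string[] {
  const headings = summary
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.startsWith("### "))
    .map((line) => line.slice(4).toLowerCase())
  return required.filter(
    (name) => !headings.some((heading) => heading.startsWith(name.toLowerCase())),
  )
}

// A made-up summary that omits two sections.
const partialSummary = [
  "### Goal",
  "Add retry logic to the HTTP client.",
  "### Constraints",
  "- No new dependencies",
  "### Done",
  "- Wrote the backoff helper",
].join("\n")

const missing = missingSections(partialSummary)
console.log(missing.join(", ")) // Relevant files, Next
```

What to do on a non-empty result (retry the compaction, fall back to the previous history, or accept the partial summary) is a separate design decision that this sketch leaves open.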

Pros

The summary always has a predictable shape, so it is easy to parse or regenerate
A clear template helps the model not forget important details (relevant files)
Plugin override allows customization for other domains (e.g. writing, research)

Cons

A rigid template may not suit non-coding conversations
Extra tokens for structure (headers like "### Goal", ...) compared with a free-form summary
Weaker models may skip sections or misname them

References

T12 deep dive: 5-section structure, plugin override, template quality, free-form comparison

C. Tool Design — Cách định nghĩa, truncate, match & thực thi tool

Tool là "tay chân" của agent. opencode đặc biệt đầu tư vào: description được tách thành file .txt riêng, output luôn được truncate, edit tool có 3-tier fuzzy matching, bash command parse bằng tree-sitter để extract permission context. Đây là nhóm kỹ thuật nhiều nhất (8 kỹ thuật).

T13. Tool description .txt pattern

C.1

File: tool/*.txt (one file per tool)

Code from opencode

// tool/bash.ts
import DESCRIPTION from "./bash.txt"
import { Tool } from "./tool"

export const BashTool = Tool.define("bash", {
  description: renderTemplate(DESCRIPTION, {
    PLATFORM: process.platform,
    SHELL: process.env.SHELL,
    MAX_LINES: 2000,
  }),
  parameters: z.object({
    command: z.string(),
    timeout: z.number().optional(),
  }),
  // ...
})

# tool/bash.txt (excerpt)
Execute shell commands in a persistent {SHELL} session on {PLATFORM}.
<important-rules>
- DO NOT use `find` or `grep`; use the dedicated Grep/Glob tools instead.
- Quote paths with spaces.
- Output is truncated to {MAX_LINES} lines.
</important-rules>

Why it matters: Tool descriptions are often long (up to a few hundred lines) and act as a "mini system prompt" for the tool. Moving them into a .txt file with template variables gives (1) cleaner version control, (2) no escaping inside code, (3) easy A/B testing of prompts by swapping files, and (4) a file that non-developers can edit.

Code example (generic)

// edit.ts
import DESCRIPTION from "./edit.txt"

export const EditTool = defineTool({
  name: "edit",
  description: DESCRIPTION
    .replace("{MAX_FILE_SIZE}", String(MAX_FILE_SIZE))
    .replace("{READ_REQUIREMENT}", "You MUST Read the file first"),
  schema: editSchema,
  handler: editHandler,
})
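Chained `.replace` calls leave a misspelled placeholder in the prompt without any error. A stricter renderer can make that failure loud. The sketch below is an assumption about how a `renderTemplate` helper could work, using the `{NAME}` placeholder syntax from the bash.txt excerpt; it is not taken from the repo.

```typescript
// Hypothetical strict renderer: replaces every {NAME} placeholder and
// throws if the template references a variable that was not supplied.
function renderTemplate(
  template: string,
  vars: Record<string, string | number>,
): string {
  return template.replace(/\{([A-Z_][A-Z0-9_]*)\}/g, (_match, name: string) => {
    if (!(name in vars)) {
      throw new Error(`Unknown template variable: {${name}}`)
    }
    return String(vars[name])
  })
}

const rendered = renderTemplate(
  "Run in {SHELL} on {PLATFORM}. Output is truncated to {MAX_LINES} lines.",
  { SHELL: "bash", PLATFORM: "linux", MAX_LINES: 2000 },
)
```

Rendering every description once at startup, or in a unit test, would surface a typo such as `{MAX_LINE}` before the prompt reaches a model.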

Pros

Long descriptions do not bloat the .ts file, so the code stays clean
Prompt changes are easy to grep, diff, and review (each change shows up clearly in a PR)
Template interpolation allows dynamic descriptions (platform, feature flags)

Cons

Tool descriptions have no type safety: a typo in a template variable only surfaces at runtime
Bundling .txt into the runtime needs build setup (Bun supports it out of the box; Node needs a loader)
Testing is non-trivial: the template must be rendered and then compared as a string

References

Deep dive T13 →

T14. Effect-based lazy tool init with service injection

C.2

File: tool/tool.ts · Lines: 119-142

Code from opencode

export namespace Tool {
  export const define = <P extends z.ZodType, R>(
    id: string,
    init: Effect.Effect<Definition<P, R>, never, Truncate | Agent>,
  ) => {
    return Effect.gen(function* () {
      const def = yield* init
      // resolve services ONCE at definition time
      const truncate = yield* Truncate
      const agent = yield* Agent
      // Return a wrapped version that intercepts execute for truncation + tracing
      return wrap(id, def, truncate, agent)
    })
  }

  function wrap<P extends z.ZodType, R>(id: string, def: Definition<P, R>, truncate: Truncate, agent: Agent) {
    return {
      ...def,
      execute: (args: z.infer<P>, ctx: ExecuteCtx) =>
        Effect.withSpan(`Tool.execute`, { attributes: { "tool.name": id } })(
          def.execute(args, ctx).pipe(
            // flatMap rather than tap, so the truncated result replaces the raw one
            // (assumes truncate.apply returns an Effect)
            Effect.flatMap((result) => truncate.apply(id, result)),
          ),
        ),
    }
  }
}

Why it matters: Tools need the Truncate service, the Agent config, a Logger, and so on. Resolving them on every execute call hurts performance and makes testing harder. The Effect init resolves services once at define time, then wraps execute to add truncation and an OTel span automatically. Every tool gets observability "for free".

Code example (generic)

function defineTool<A>(id: string, factory: (deps: Deps) => ToolDef<A>) {
  const deps = resolveDeps() // inject once
  const base = factory(deps)
  return {
    ...base,
    execute: async (args: A, ctx: Ctx) => {
      const span = tracer.startSpan(`tool.${id}`)
      try {
        const raw = await base.execute(args, ctx)
        return deps.truncate(raw) // auto truncate
      } finally {
        span.end()
      }
    },
  }
}

Pros

Tool authors do not need to remember truncation or tracing; the wrapper handles both
Service injection is explicit at the type level (Effect's R channel)
Easy to test: mocking Truncate/Agent is enough

Cons

Effect-TS has a steep learning curve, so new teams onboard slowly
Wrapping adds a small overhead (each call creates a span, even when nothing is logged)
A tool that needs to skip truncation (e.g. streaming) requires a complicated escape hatch

References

Deep dive T14 →

T15. Output truncation with tail-keep + file spill

C.3

File: tool/truncate.ts · Lines: 64-126

Code from opencode

import fs from "node:fs"

const MAX_LINES = 2_000
const MAX_BYTES = 50_000

export function truncateOutput(
  output: string,
  opts: { direction?: "head" | "tail"; toolName: string; callId: string },
) {
  const { direction = "tail", toolName, callId } = opts
  const lines = output.split("\n")
  // Compare UTF-8 bytes, not UTF-16 code units
  if (lines.length <= MAX_LINES && Buffer.byteLength(output) <= MAX_BYTES) return output

  // Keep the tail (or the head), depending on direction
  const keep = direction === "tail"
    ? lines.slice(-MAX_LINES).join("\n")
    : lines.slice(0, MAX_LINES).join("\n")

  // Write the full output to a file so the agent can re-read it with Read/Grep
  const spillPath = `/tmp/opencode-overflow/${toolName}-${callId}.txt`
  fs.mkdirSync("/tmp/opencode-overflow", { recursive: true })
  fs.writeFileSync(spillPath, output)

  return `${keep}
<truncated-notice>
Output truncated to ${direction === "tail" ? "last" : "first"} ${MAX_LINES} lines. Full output saved to ${spillPath}.
Use the Read or Grep tool to access specific parts.
</truncated-notice>`
}

Why it matters: Tool output such as bash ls -R / or grep -r foo can return megabytes of data and fill the entire context. Plain truncation loses information. opencode both truncates (returning the last lines, usually the most important part of bash output) and spills the full output to a file that the agent can revisit with the Read/Grep tools: "lossy on model, lossless on disk".

Code example (generic)

import fs from "node:fs/promises"

async function truncateAndSpill(
  output: string,
  meta: { tool: string; callId: string },
): Promise<string> {
  if (output.length <= 50_000) return output
  const tail = output.slice(-50_000)
  const path = `/tmp/agent-overflow/${meta.tool}-${meta.callId}.log`
  await fs.writeFile(path, output)
  return tail +
    `\n\n[Output was ${output.length} chars, truncated to last 50k. ` +
    `Full content: ${path}. Use Read/Grep to inspect.]`
}
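The examples above cut by lines or by characters, but not both. The sketch below shows one way to apply a line budget and a byte budget together while keeping whole lines. `keepWithinLimits` is a hypothetical helper, and the default limits simply mirror the 2000-line / 50 KB figures quoted above.

```typescript
// Assumed limits, mirroring the 2000-line / 50 KB figures quoted above.
const MAX_LINES = 2_000
const MAX_BYTES = 50_000

interface Truncated {
  text: string // what the model sees
  truncated: boolean // true if any line was dropped
}

// Hypothetical helper: keeps whole lines from the head or the tail until
// either the line budget or the UTF-8 byte budget is exhausted.
function keepWithinLimits(
  output: string,
  direction: "head" | "tail" = "tail",
  maxLines: number = MAX_LINES,
  maxBytes: number = MAX_BYTES,
): Truncated {
  const encoder = new TextEncoder()
  const lines = output.split("\n")
  const ordered = direction === "tail" ? [...lines].reverse() : lines
  const kept: string[] = []
  let bytes = 0
  for (const line of ordered) {
    const cost = encoder.encode(line).length + 1 // +1 for the newline
    if (kept.length >= maxLines || bytes + cost > maxBytes) break
    kept.push(line)
    bytes += cost
  }
  if (direction === "tail") kept.reverse()
  return { text: kept.join("\n"), truncated: kept.length < lines.length }
}
```

One limitation of this sketch: a single line larger than the byte budget yields empty text, so a real implementation would probably also need to cut inside a line.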

Pros

The context is protected: no tool can flood the conversation
The agent can still reach the full output through the file, so nothing is lost
The head/tail direction is flexible (head for compile errors, tail for logs)

Cons

The file spill only helps if the agent "remembers" the path; it sometimes forgets and asks again
Disk I/O on every truncation (rare as it is) adds latency
The 2000-line / 50 KB limits are magic numbers; different workloads need different values

References

Deep dive T15 →

T16. Zod schema validation with custom error format

C.4

File: tool/tool.ts · Lines: 86-96

Code from opencode

function validateArgs<P extends z.ZodType>(parameters: P, raw: unknown): z.infer<P> {
  const parsed = parameters.safeParse(raw)
  if (!parsed.success) {
    const issues = parsed.error.issues
      .map((i) => `${i.path.join(".")}: ${i.message}`)
      .join("; ")
    throw new ToolArgError(
      `Please rewrite the input with valid arguments. Errors: ${issues}`,
    )
  }
  return parsed.data
}

Why it matters: Models occasionally send arguments that violate the schema (wrong type, missing field). A bare "invalid input" error gives the model no hint about what to fix. A message such as "Please rewrite the input with valid arguments. Errors: file_path: Required; limit: expected number, got string" gives the model enough information to retry in the correct format.

Code example (generic)

import { z } from "zod"

const EditArgs = z.object({
  file_path: z.string().min(1),
  old_string: z.string(),
  new_string: z.string(),
})

function validateEdit(raw: unknown) {
  const r = EditArgs.safeParse(raw)
  if (!r.success) {
    const details = r.error.issues
      .map((i) => `${i.path.join(".")}: ${i.message}`)
      .join("; ")
    throw new Error(`Please rewrite the input. Errors: ${details}`)
  }
  return r.data
}

Pros

The error message doubles as a retry instruction, so the model immediately knows what to fix
The Zod schema serves two purposes: validation and TypeScript type inference
The field path ("options.timeout") points the model at the exact field to correct

Cons

Zod errors are verbose; with deeply nested schemas the message can become too long
Models sometimes keep the arguments unchanged and merely turn the error message into a comment, which makes the retry useless
There is no automatic retry: tool authors must wrap the call in a loop if they want one

References

Deep dive T16 →

T17. Fuzzy edit matching 3-tier (Simple → LineTrimmed → BlockAnchor)

C.5

File: tool/edit.ts · Lines: 182-350

Code from opencode

// 3 tiers, tried from strict to loose
const REPLACERS = [
  SimpleReplacer,      // exact match
  LineTrimmedReplacer, // ignore leading/trailing whitespace per line
  BlockAnchorReplacer, // match first + last line, fuzzy middle
] as const

export async function applyEdit(content: string, oldStr: string, newStr: string) {
  for (const replacer of REPLACERS) {
    const match = replacer.findMatch(content, oldStr)
    if (match) {
      return replacer.replace(content, match, newStr)
    }
  }
  throw new NoMatchError("old_string not found in file")
}

// BlockAnchorReplacer: match the first and last lines of the block,
// then score the candidates by Levenshtein similarity against a threshold
class BlockAnchorReplacer {
  static findMatch(content: string, oldStr: string) {
    const oldLines = oldStr.split("\n")
    if (oldLines.length < 3) return null // only applies to blocks of >= 3 lines
    const first = oldLines[0].trim()
    const last = oldLines[oldLines.length - 1].trim()
    const candidates = findAllAnchorPairs(content, first, last)
    const scored = candidates.map((c) => ({
      ...c,
      score: levenshteinRatio(oldStr, content.slice(c.start, c.end)),
    }))
    scored.sort((a, b) => b.score - a.score)
    const threshold = scored.length > 1 ? 0.3 : 0.0
    const best = scored[0]
    return best && best.score >= threshold ? best : null
  }
}

Why it matters: The edit tool fails when the model copies indentation incorrectly or recalls the code only approximately; this is one of the most common failure modes of coding agents. The three tiers are: (1) exact match, which is fast; (2) whitespace-trimmed match, which catches indentation errors; (3) block anchor + Levenshtein, which catches cases where the model's memory of the middle of a block is hazy. The 0.3 threshold for multiple candidates avoids matching the wrong block when several blocks look alike.

Code example (generic)

const replacers = [exactMatch, trimmedMatch, anchorMatch]

function replaceFlex(content: string, oldStr: string, newStr: string) {
  for (const r of replacers) {
    const m = r.find(content, oldStr)
    if (m) return content.slice(0, m.start) + newStr + content.slice(m.end)
  }
  throw new Error("No match found. Re-read the file and retry.")
}

function anchorMatch(content: string, oldStr: string) {
  const lines = oldStr.split("\n")
  if (lines.length < 3) return null
  const [first, ...rest] = lines
  const last = rest.pop()!
  // Find all anchor pairs and pick the one with the highest Levenshtein ratio
  // ...
}
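The second tier (whitespace-trimmed matching) can be sketched in a few lines. This is an illustrative version under the assumption that a match is reported as character offsets into the file; `lineTrimmedMatch` and `replaceTrimmed` are hypothetical names, not opencode's `LineTrimmedReplacer`.

```typescript
interface Match {
  start: number // character offset where the matched block begins
  end: number // character offset just past the matched block
}

// Hypothetical tier-2 matcher: compares line by line after trim(), so
// indentation and trailing-whitespace differences do not block a match.
function lineTrimmedMatch(content: string, oldStr: string): Match | null {
  const contentLines = content.split("\n")
  const wanted = oldStr.split("\n").map((l) => l.trim())
  for (let i = 0; i + wanted.length <= contentLines.length; i++) {
    const block = contentLines.slice(i, i + wanted.length)
    if (block.every((line, k) => line.trim() === wanted[k])) {
      // Convert the line range back to character offsets.
      const start = contentLines
        .slice(0, i)
        .reduce((n, l) => n + l.length + 1, 0)
      return { start, end: start + block.join("\n").length }
    }
  }
  return null
}

function replaceTrimmed(content: string, oldStr: string, newStr: string): string {
  const m = lineTrimmedMatch(content, oldStr)
  if (!m) throw new Error("No match found. Re-read the file and retry.")
  return content.slice(0, m.start) + newStr + content.slice(m.end)
}
```

The match spans whole lines, including their original indentation, so in this sketch `newStr` has to carry the indentation it should end up with.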

Pros

Noticeably lowers the agent's "edit failed" rate, which means fewer retries and a smoother experience
Ordering tiers from strict to loose ensures an exact match is never replaced by a fuzzy one
The adaptive threshold (0.0 for a single candidate, 0.3 for multiple) guards against false positives

Cons

Fuzzy matching can pick the wrong block if the 0.3 threshold is too loose for near-identical code
Levenshtein is O(n*m), so it is slow on large files (>5k lines)
Models may get lazy and send increasingly vague old_string values, which risks silent wrong edits

References

Deep dive T17 →

T18. Bash command parsing with tree-sitter WASM

C.6

File: tool/bash.ts · Lines: 150-320

Code from opencode

import TreeSitter from "web-tree-sitter"
import BashWasm from "tree-sitter-bash/tree-sitter-bash.wasm" with { type: "file" }

// Lazy-load the parser once
let parserPromise: Promise<TreeSitter.Parser> | null = null
function getParser() {
  if (!parserPromise) {
    parserPromise = (async () => {
      await TreeSitter.init()
      const parser = new TreeSitter()
      parser.setLanguage(await TreeSitter.Language.load(BashWasm))
      return parser
    })()
  }
  return parserPromise
}

const DESTRUCTIVE_COMMANDS = new Set(["rm", "mv", "cp", "dd", "chmod", "chown"])

export async function analyzeCommand(cmd: string) {
  const parser = await getParser()
  const tree = parser.parse(cmd)
  const commands: Array<{ name: string; args: string[]; paths: string[] }> = []
  walkTree(tree.rootNode, (node) => {
    if (node.type === "command") {
      const name = node.childForFieldName("name")?.text ?? ""
      const args = node.childrenForFieldName("argument").map((a) => a.text)
      const paths = args.filter((a) => looksLikePath(a))
      commands.push({ name, args, paths })
    }
  })
  return {
    commands,
    modifiesFiles: commands.some((c) => DESTRUCTIVE_COMMANDS.has(c.name)),
  }
}

Why it matters: Analyzing bash commands with regex is a bug farm: "rm -rf /", "echo hello | rm", and "sh -c 'rm x'" all need to be detected correctly. Tree-sitter is a full grammar parser, so it is accurate and handles quoting, subshells, pipes, and redirection. The parse result feeds the permission system, which extracts the paths that need approval; this is much safer than regex hacks.

Code example (generic)

// Simplified: detect commands that modify files
async function extractDangerousPaths(cmd: string): Promise<string[]> {
  const tree = await parseBash(cmd)
  const paths: string[] = []
  walk(tree.rootNode, (node) => {
    if (node.type !== "command") return
    const name = node.childForFieldName("name")?.text
    if (!name || !DESTRUCTIVE.has(name)) return
    for (const arg of node.namedChildren) {
      if (arg.type === "word" && looksLikePath(arg.text)) paths.push(arg.text)
    }
  })
  return paths
}

Pros

Accurate parsing: subshells $(), pipes, and redirection cannot fool it
WASM is portable: it runs in Bun, Node, and the browser
Lazy-loaded and initialized once, so the cost is low after the first request

Cons

The WASM asset is ~2MB, which increases bundle size and cold-start time
tree-sitter-bash does not cover exotic shells (fish, zsh-specific syntax)
Deny-rule bypass risk: if the parser misses a single pattern, permissions can leak, so thorough unit tests are needed

References

Deep dive T18 →

T19. Sub-agent delegation via Task tool

C.7

File: tool/task.ts · Lines: 69-145

Code from opencode

export const TaskTool = Tool.define("task", Effect.gen(function* () {
  const ops = yield* Session.Ops
  return {
    description: TASK_DESCRIPTION,
    parameters: z.object({
      description: z.string().describe("3-5 word task summary"),
      prompt: z.string(),
      subagent_type: z.enum(["general-purpose", "explore", "plan"]).default("general-purpose"),
    }),
    // execute returns an Effect: yield* is only valid inside a generator, not an async function
    execute: (args, ctx) => Effect.gen(function* () {
      // The child session inherits the permission rules, with extra restrictions
      const childSession = yield* ops.createChildSession(ctx.sessionId, {
        agent: args.subagent_type,
        inheritPermissions: true,
        deniedTools: ["task", "todowrite"], // prevent recursion / scope creep
      })
      // Recursive: call ops.prompt() again on the child session
      const result = yield* ops.prompt(childSession.id, {
        message: args.prompt,
        description: args.description,
      })
      return {
        summary: result.finalMessage.text,
        toolCalls: result.toolCalls.length,
        tokensUsed: result.usage.total,
      }
    }),
  }
}))

Why it matters: The parent agent's context is already full of tool output; using it for search or exploration floods it further. A sub-agent is an "ephemeral thread": it is spawned with a clean context, performs one specific task (search, plan), and returns a compact summary. The parent context stays unpolluted. Denying task and todowrite to the child prevents infinite recursion and scope creep.

Code example (generic)

async function spawnSubAgent(opts: {
  prompt: string
  type: "explore" | "plan"
  parent: Session
}) {
  const child = await createSession({
    agent: opts.type,
    system: getAgentSystem(opts.type),
    permissions: inheritFrom(opts.parent).deny(["task", "todowrite"]),
    parent: opts.parent.id,
  })
  const result = await runLoop(child, opts.prompt)
  return {
    summary: result.finalText, // ONLY the summary goes back to the parent
    cost: result.cost,
  }
}

Pros

The parent context is not polluted by the sub-task's tool output
Enables parallelism: several sub-agents can be spawned at once (search, plan, verify)
Specialized agents (explore/plan) have system prompts tuned for their task

Cons

The sub-agent is "blind" to the parent's context, so the prompt must be self-contained and can easily miss information
Roughly double the cost: the parent pays to frame the task and the sub-agent pays to carry it out
Hard to debug: when a sub-agent goes wrong, the parent sees only the summary, not the rationale

References

Deep dive T19 →

T20. Plugin tool dynamic discovery

C.8

File: tool/registry.ts · Lines: 152-173

Code from opencode

export async function discoverPluginTools(configDirs: string[]) {
  const tools: Tool[] = []
  for (const dir of configDirs) {
    const files = await glob("{tool,tools}/*.{js,ts,mjs}", { cwd: dir, absolute: true })
    for (const file of files) {
      try {
        const mod = await import(file)
        const toolDef = mod.default ?? mod.tool
        if (!toolDef) continue
        // Wrap with fromPlugin: auto-truncate + sandboxed permission
        tools.push(Tool.fromPlugin(toolDef, { source: file }))
      } catch (err) {
        log.warn(`Plugin tool load failed: ${file}`, err)
      }
    }
  }
  return tools
}

Why it matters: Hard-coding the tool list in core means every new tool needs a core PR. The discovery pattern scans convention folders (tool/, tools/), imports dynamically, and wraps each tool in a safety wrapper. Users and organizations can inject their own tools (Slack, Jira, internal DB) without forking the repo.

Code example (generic)

import fg from "fast-glob"
import { pathToFileURL } from "node:url"

async function loadUserTools(pluginDirs: string[]): Promise<Tool[]> {
  const out: Tool[] = []
  for (const dir of pluginDirs) {
    // absolute: true, so the resulting paths do not depend on process.cwd()
    const files = await fg(["tool/*.{js,ts}", "tools/*.{js,ts}"], { cwd: dir, absolute: true })
    for (const f of files) {
      const mod = await import(pathToFileURL(f).href)
      const def = mod.default
      if (def?.name && def?.execute) {
        out.push(wrapTool(def)) // auto-apply truncate / perm / trace
      }
    }
  }
  return out
}

Pros

Extensibility: users add tools without a core PR
A familiar discovery pattern (VSCode and Next.js both use it)
Wrapping ensures plugin tools cannot bypass truncation or permissions

Cons

Security: plugins run with the process's privileges, so a malicious plugin can exfiltrate data
Weak type safety: whether a plugin has the right shape is only known at runtime
Plugin load errors are silent (warn only), so users may not notice a missing tool

References

Deep dive T20 →

D. Provider Abstraction: multiple models in one code path

opencode supports 20+ providers (Anthropic, OpenAI, Google, Bedrock, Mistral, Groq, Together, ...). Rather than writing to the lowest common denominator, it uses an "abstract + per-provider quirk layer" approach: a shared format, with a transform layer that handles each provider separately.

T21. Multi-provider SDK lazy loading

D.1

File: provider/provider.ts · Lines: 92-117

Code from opencode

// Each provider = one lazy dynamic import
const BUNDLED_PROVIDERS: Record<string, () => Promise<Loader>> = {
  anthropic: () => import("./loaders/anthropic").then((m) => m.default),
  openai: () => import("./loaders/openai").then((m) => m.default),
  google: () => import("./loaders/google").then((m) => m.default),
  bedrock: () => import("./loaders/bedrock").then((m) => m.default),
  mistral: () => import("./loaders/mistral").then((m) => m.default),
  groq: () => import("./loaders/groq").then((m) => m.default),
  // ... 15+ more
}

export async function loadProvider(id: string): Promise<Loader> {
  const loader = BUNDLED_PROVIDERS[id]
  if (!loader) throw new UnknownProviderError(id)
  return await loader() // imported only when the user actually uses it
}

Why it matters: Eagerly importing 20 SDKs (@anthropic-ai/sdk, openai, @google/generative-ai, ...) makes startup very slow and pushes bundle size into the tens of MB. With lazy imports only the SDK of the provider in use is loaded: if the user picks Anthropic, only the Anthropic SDK is loaded.

Code example (generic)

const LOADERS = {
  anthropic: () => import("@anthropic-ai/sdk").then((m) => m.default),
  openai: () => import("openai").then((m) => m.default),
  // @google/generative-ai has no default export; the client class is a named export
  google: () => import("@google/generative-ai").then((m) => m.GoogleGenerativeAI),
} as const

async function getClient(provider: keyof typeof LOADERS) {
  return LOADERS[provider]() // resolves to the SDK's client class
}

Pros

Fast startup: only what is needed gets imported
Natural bundle splitting with Bun/esbuild
Adding a provider = adding one line to the map

Cons

The first call pays the import latency (~50-200ms)
Types must still be imported eagerly (for the Loader interface), which complicates setup
Loader errors only appear at runtime and cannot be caught at build time

References

Vercel AI SDK: provider abstraction reference

Deep dive T21 →

T22. Provider-specific message transformation

D.2

File: provider/transform.ts · Lines: 48-150

Code from opencode

// Anthropic: tool_use must be split from text content
function anthropicTransform(msgs: Message[]): AnthropicMessage[] {
  return msgs.flatMap((m) => {
    if (m.role !== "assistant") return [m as AnthropicMessage]
    const toolUses = m.content.filter((c) => c.type === "tool_use")
    const nonTool = m.content.filter((c) => c.type !== "tool_use")
    // Split when both are present
    if (toolUses.length && nonTool.length) {
      return [
        { role: "assistant", content: nonTool },
        { role: "assistant", content: toolUses },
      ]
    }
    return [m as AnthropicMessage]
  })
}

// Mistral: toolCallId max 9 chars, alphanumeric
function mistralTransform(msgs: Message[]) {
  return msgs.map((m) => ({
    ...m,
    content: m.content.map((c) => {
      if ("toolCallId" in c) {
        return { ...c, toolCallId: padToolCallId(c.toolCallId) } // truncate + pad
      }
      return c
    }),
  }))
}

// Claude sanitize: only alphanumerics, "_" and "-" in tool names
function sanitizeClaudeToolName(name: string): string {
  return name.replace(/[^a-zA-Z0-9_-]/g, "_")
}

Why it matters: Every provider has its own quirks: Anthropic needs tool_use split out, Mistral limits toolCallId to 9 characters, Gemini dislikes a system role in the middle of a conversation, and so on. Centralizing the quirks in a transform layer keeps the core logic shared; adding a provider means adding one transform function without touching core.

Code example (generic)

const TRANSFORMS: Record<Provider, Transform> = {
  anthropic: splitToolUse,
  openai: identity,
  mistral: truncateToolCallIds,
  google: moveSystemToStart,
}

function callModel(provider: Provider, msgs: Msg[]) {
  const transform = TRANSFORMS[provider] ?? identity
  const prepared = transform(msgs)
  return getClient(provider).chat.completions.create({ messages: prepared })
}
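The Mistral quirk relies on a `padToolCallId` helper that the excerpt does not show. A plausible shape, assuming only the "max 9 chars, alphanumeric" and "truncate + pad" hints above, is the following sketch; the padding character and the constant name are assumptions.

```typescript
const MISTRAL_ID_LENGTH = 9 // assumed from the "max 9 chars, alphanumeric" rule above

// Hypothetical normalizer: strip non-alphanumerics, truncate, then pad
// with "0" so every id is exactly MISTRAL_ID_LENGTH characters long.
function padToolCallId(id: string): string {
  return id
    .replace(/[^a-zA-Z0-9]/g, "")
    .slice(0, MISTRAL_ID_LENGTH)
    .padEnd(MISTRAL_ID_LENGTH, "0")
}

const longId = padToolCallId("call_abc-123456789")
const shortId = padToolCallId("x1")
```

Two ids that share their first nine alphanumeric characters collapse to the same value, so the same function has to be applied to both the tool call and its result, and collisions remain possible. This is the fidelity loss listed under the cons.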

Pros

Core logic stays clean, without per-provider if/else chains
Provider quirks are documented in one place (transform.ts)
Per-provider testing is easy: just verify the transform output

Cons

When a provider changes its spec the transform must be updated, which is easy to miss
Transforms can lose fidelity (a truncated toolCallId is harder to trace)
Debugging a request needs an extra "inspect after transform" step

References

Deep dive T22 →

T23. Overflow pattern detection & retry with server headers

D.3

File: provider/error.ts:8-193 + session/retry.ts:17-52

Code from opencode

// 25+ regex patterns that detect overflow across providers
const OVERFLOW_PATTERNS: RegExp[] = [
  /prompt is too long/i,
  /exceeds the context window/i,
  /max.*context.*length/i,
  /context_length_exceeded/i,
  /no body.*(400|413)/i, // Bedrock silent truncation
  /token.*limit.*exceeded/i,
  // ... 20+ more
]

export function isContextOverflow(err: unknown): boolean {
  const msg = extractErrorMessage(err)
  return OVERFLOW_PATTERNS.some((r) => r.test(msg))
}

// Retry with server-directed delay
export async function withRetry<T>(fn: () => Promise<T>, maxAttempts = 5): Promise<T> {
  let attempt = 0
  while (true) {
    try {
      return await fn()
    } catch (err) {
      attempt++
      if (isContextOverflow(err)) throw err // do not retry overflow
      if (attempt >= maxAttempts) throw err
      // Honor the server directive
      const serverMs = extractRetryAfter(err) // retry-after-ms | retry-after
      const delayMs = serverMs ?? Math.min(2 ** (attempt - 1) * 2000, 30_000)
      await sleep(delayMs)
    }
  }
}

Why it matters: HTTP has no standard overflow error; each provider returns a different message. The 25+ regexes encode production experience. Retry must skip overflow errors, because retrying is pointless there and compaction should be triggered instead. Respecting the server's retry-after-ms avoids a thundering herd and handles provider rate limits effectively.

Code example (generic)

const OVERFLOW = [
  /prompt is too long/i,
  /context.?window/i,
  /token.?limit/i,
]

async function retryable<T>(fn: () => Promise<T>, max = 5): Promise<T> {
  for (let i = 0; i < max; i++) {
    try {
      return await fn()
    } catch (e) {
      const msg = String((e as Error).message ?? e)
      if (OVERFLOW.some((r) => r.test(msg))) throw new OverflowError(e) // compact, don't retry
      if (i === max - 1) throw e
      const serverDelay = (e as any).headers?.["retry-after-ms"]
      const delay = serverDelay ?? Math.min(2000 * 2 ** i, 30_000)
      await new Promise((r) => setTimeout(r, +delay))
    }
  }
  throw new Error("unreachable")
}
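The delay rule above (server directive first, otherwise capped exponential backoff) can be isolated into a pure function, which also makes it easy to add jitter. The sketch below is not opencode's implementation: `retryDelayMs` is a hypothetical helper, the 2000 ms base and 30 s cap are taken from the code above, and the "full jitter" step is an addition that the original policy does not have.

```typescript
const BASE_MS = 2_000
const CAP_MS = 30_000

// Hypothetical helper. `attempt` is 1-based; `serverMs` is the parsed
// retry-after value when the provider sent one. `random` is injectable
// so the function can be tested deterministically.
function retryDelayMs(
  attempt: number,
  serverMs?: number,
  random: () => number = Math.random,
): number {
  // A valid server directive always wins over the local policy.
  if (serverMs !== undefined && Number.isFinite(serverMs) && serverMs >= 0) {
    return serverMs
  }
  const exp = Math.min(BASE_MS * 2 ** (attempt - 1), CAP_MS)
  // "Full jitter": pick uniformly in [0, exp) so that many clients
  // hitting the same failure do not retry in lockstep.
  return Math.floor(random() * exp)
}
```

With jitter, clients that fail at the same moment spread their retries across the whole backoff window instead of all returning after exactly 2 s, 4 s, 8 s, and so on.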

Pros

Unified error handling across 20+ providers with a single retry policy
Respecting retry-after keeps providers happy and avoids bans
Skipping retry on overflow triggers the compaction path immediately, with no wasted calls

Cons

The regex list needs maintenance; a provider changing its message causes a miss
Exponential backoff has no jitter, so multiple clients can cause a mild thundering herd
There is no explicit distinction between transient (5xx) and permanent (4xx) errors

References

Deep dive T23 →

E. Permission Model — First-class safety

opencode treats permissions as a core feature rather than an add-on. Wildcard matching, arity normalization, and per-session state together give users fine-grained control without a constant stream of dialogs.

T24. Wildcard last-match-wins evaluation

E.1

File: permission/evaluate.ts · Lines: 1-15

Code from opencode

import { Wildcard } from "./wildcard"

export function evaluate(
  rules: PermissionRule[],
  permission: string, // "bash" | "edit" | "write" | ...
  pattern: string,    // "rm -rf *" | "/tmp/**" | ...
): "allow" | "deny" | "ask" {
  const match = rules.findLast((rule) =>
    Wildcard.match(permission, rule.permission) &&
    Wildcard.match(pattern, rule.pattern)
  )
  return match?.action ?? "ask" // default to "ask" when nothing matches
}

Why it matters: Permission rules need to "allow broadly" and "deny specifically" at the same time. Last-match-wins plus wildcards lets a user write allow: bash:* followed by deny: bash:rm -rf *, which permits bash except rm -rf. Using findLast (rather than find) ensures that later rules override earlier ones, which keeps evaluation predictable.

Code example (generic)

interface Rule { permission: string; pattern: string; action: "allow" | "deny" | "ask" }

function matchWildcard(value: string, pattern: string): boolean {
  // Escape regex metacharacters first, then turn "*" into ".*"
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&")
  const re = new RegExp("^" + escaped.replace(/\*/g, ".*") + "$")
  return re.test(value)
}

function evaluate(rules: Rule[], perm: string, ptn: string) {
  const hit = [...rules].reverse().find((r) =>
    matchWildcard(perm, r.permission) && matchWildcard(ptn, r.pattern)
  )
  return hit?.action ?? "ask"
}
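To make the "allow broadly, deny specifically" rule set concrete, here is a self-contained worked example. It is a sketch with a simplified glob matcher (only `*` is special), not opencode's `Wildcard` module, and the rule list is illustrative.

```typescript
type Action = "allow" | "deny" | "ask"
interface Rule { permission: string; pattern: string; action: Action }

// Glob-style match: "*" matches any run of characters, everything else is literal.
function matchWildcard(value: string, pattern: string): boolean {
  const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, "\\$&")
  return new RegExp("^" + escaped.replace(/\*/g, ".*") + "$").test(value)
}

// Scan from the end so that the last matching rule decides.
function evaluate(rules: Rule[], permission: string, pattern: string): Action {
  for (let i = rules.length - 1; i >= 0; i--) {
    const r = rules[i]
    if (matchWildcard(permission, r.permission) && matchWildcard(pattern, r.pattern)) {
      return r.action
    }
  }
  return "ask"
}

const rules: Rule[] = [
  { permission: "bash", pattern: "*", action: "allow" },
  { permission: "bash", pattern: "rm -rf *", action: "deny" },
]

// Same rules in the opposite order: the broad allow now comes last and wins.
const reordered: Rule[] = [rules[1], rules[0]]
```

With `rules`, `ls -la` is allowed, `rm -rf /tmp/x` is denied, and anything under the `edit` permission falls through to "ask". With `reordered`, the same `rm -rf` command is allowed, which is the order-sensitivity hazard listed under the cons.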

Pros

Simple mental model: "the later rule wins", as in .gitignore
Wildcards are expressive enough (glob syntax that users already know from the shell)
The "ask" default is safe: when unsure, ask

Cons

Order-sensitive: moving a rule up or down changes behavior
Wildcards are not full regex, so complex patterns cannot be expressed
There is no "deny wins" mode, so a wrong rule order leaves the user exposed

References

Deep dive T24 →

T25. Session-scoped permission state (once/always/reject)

E.2

File: permission/index.ts · Lines: 130-282

Code from opencode

interface PermissionState {
  sessionId: string
  alwaysAllow: Array<{ permission: string; pattern: string }> // session-scoped
  // Keep the request next to its deferred so that reply() can read req.permission
  pending: Map<string, { req: PermissionRequest; deferred: Deferred<"allow" | "deny"> }>
}

export class Permission {
  async ask(req: PermissionRequest): Promise<"allow" | "deny"> {
    // 1. Check static rules
    const staticResult = evaluate(this.rules, req.permission, req.pattern)
    if (staticResult !== "ask") return staticResult

    // 2. Check session alwaysAllow
    const always = this.state.alwaysAllow.some((r) =>
      Wildcard.match(req.permission, r.permission) &&
      Wildcard.match(req.pattern, r.pattern)
    )
    if (always) return "allow"

    // 3. Ask user (UI emits event)
    const deferred = makeDeferred()
    this.state.pending.set(req.id, { req, deferred })
    emit("permission.request", req)
    return await deferred.promise
  }

  reply(reqId: string, reply: "once" | "always" | "reject", pattern?: string) {
    const entry = this.state.pending.get(reqId)
    if (!entry) return
    this.state.pending.delete(reqId)
    if (reply === "always" && pattern) {
      this.state.alwaysAllow.push({ permission: entry.req.permission, pattern })
    }
    entry.deferred.resolve(reply === "reject" ? "deny" : "allow")
  }
}

Why it matters: If every bash command pops up a dialog, users start mashing Enter and permissions lose their meaning. "always" lets the user approve a pattern once (e.g. ls *) so that the agent can run it freely for the rest of the session. "reject" cancels immediately. State is scoped to the session, so nothing leaks between sessions.

Code example (generic)

class PermissionGate {
  private alwaysAllow: Array<{ perm: string; ptn: string }> = []

  async ask(req: Request): Promise<"allow" | "deny"> {
    if (this.alwaysAllow.some((r) => match(req, r))) return "allow"
    const answer = await ui.ask(req) // UI returns {mode, pattern?}
    if (answer.mode === "always") {
      this.alwaysAllow.push({ perm: req.permission, ptn: answer.pattern ?? req.pattern })
    }
    return answer.mode === "reject" ? "deny" : "allow"
  }
}

Pros

Balances safety and velocity: "always" reduces friction, "once" covers rare cases
Session scope ensures that "always" does not leak into other sessions, which lowers risk
UI and core are decoupled through events, so the UI is easy to swap (TUI, web, desktop)

Cons

"always" patterns are not persisted, so each new session must re-approve them (a deliberate trade-off)
A hurried user may approve an overly broad pattern and open the door to the agent
The deferred pattern needs lifecycle management (timeout, interrupt), which adds complexity

References

Deep dive T25 →

T26. Arity-based command normalization

E.3

File: permission/arity.ts · Lines: 1-161

Code from opencode

// ARITY map: command prefix → number of words to keep to identify the "human command"
const ARITY: Record<string, number> = {
  touch: 1,                 // touch file.txt → "touch"
  ls: 1,                    // ls -la → "ls"
  npm: 2,                   // npm install → "npm install"
  "npm run": 3,             // npm run build → "npm run build"
  "git config": 3,
  "pnpm dlx": 3,
  "npx create-next-app": 4,
  // ... 450+ entries
}

export function normalizeCommand(raw: string): string {
  const tokens = tokenize(raw) // respect quotes
  // Try the longest matching prefix; its arity says how many tokens to keep
  for (let n = Math.min(tokens.length, 4); n >= 1; n--) {
    const arity = ARITY[tokens.slice(0, n).join(" ")]
    if (typeof arity === "number") return tokens.slice(0, arity).join(" ")
  }
  return tokens[0] ?? "" // fallback: first token
}

// normalizeCommand("npm run build --watch") === "npm run build"
// normalizeCommand("git config --global user.email <email>") yields "git config --global"
// with this naive slicing, because tokens[2] is the flag "--global"
// → opencode handles this more carefully and skips flags

Why it matters: Users want to approve "git status" once, not once per distinct argument list. Arity normalization extracts the "human-readable command" (dropping flags and paths) from raw bash. The 450+ entries cover ecosystem tools (npm, pnpm, yarn, docker, kubectl, git subcommands). The pattern suggested to the user is the normalized command, which makes granting "always" quick.

Code example (generic)

const ARITY: Record<string, number> = {
  ls: 1, cat: 1, touch: 1, mkdir: 1,
  "npm install": 2, "npm run": 3,
  "git status": 2, "git add": 2, "git commit": 2,
  "docker run": 2, "docker compose": 2,
}

function normalize(cmd: string): string {
  // Drop flags first, then look up the longest known prefix
  const parts = cmd.split(/\s+/).filter((p) => p && !p.startsWith("-"))
  for (let n = Math.min(parts.length, 3); n >= 1; n--) {
    const arity = ARITY[parts.slice(0, n).join(" ")]
    if (typeof arity === "number") return parts.slice(0, arity).join(" ")
  }
  return parts[0] ?? ""
}

Pros

Strong UX: the user approves a "human" pattern instead of a regex
Ecosystem coverage: 450+ entries mean most commands are handled before the user ever runs into them
Falling back to the first token is safe for unknown commands

Cons

Maintenance cost: new tools (bun, deno subcommands) have to be added by hand
Edge cases: sudo npm install or env X=1 git status may be missed
Users with a customized shell (aliases, functions) do not see the command that actually runs
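The sudo / env edge case listed above can be narrowed with a pre-pass that strips wrapper words before the arity lookup. This is a minimal sketch, not opencode's implementation: WRAPPERS, isEnvAssignment and stripWrappers are hypothetical names, and flags that belong to a wrapper (for example sudo -u root) are deliberately not handled.

```typescript
// Hypothetical wrapper list for illustration; not taken from opencode.
const WRAPPERS = new Set(["sudo", "env", "time", "nohup"])

// Matches shell-style assignments such as X=1 or NODE_ENV=production
const isEnvAssignment = (token: string): boolean =>
  /^[A-Za-z_][A-Za-z0-9_]*=/.test(token)

// Drops leading wrapper words and VAR=value assignments, keeps the rest untouched
function stripWrappers(tokens: string[]): string[] {
  let start = 0
  while (start < tokens.length) {
    const token = tokens[start] ?? ""
    if (!WRAPPERS.has(token) && !isEnvAssignment(token)) break
    start++
  }
  return tokens.slice(start)
}

console.log(stripWrappers(["sudo", "npm", "install"]).join(" "))
console.log(stripWrappers(["env", "X=1", "git", "status"]).join(" "))
```

One design caveat: stripping sudo means an approval for "npm install" would also cover "sudo npm install". A stricter variant could keep the wrapper as part of the suggested pattern instead of discarding it.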

References

Deep dive on T26 →

F. System Prompt & Instructions: Layers of Context

The system prompt is not a single fixed string. It is built from several layers: a model-specific template, a dynamic env block, and cascading project AGENTS.md/CLAUDE.md files. opencode dispatches the prompt by model family, injects a fresh env block on every call, and looks for instruction files by walking up the directory tree.

T27. Model-specific system prompt dispatch + dynamic env

F.1

File: session/system.ts · Lines: 19-77

Code from opencode

import PROMPT_ANTHROPIC from "./prompts/anthropic.txt"
import PROMPT_GPT from "./prompts/gpt.txt"
import PROMPT_BEAST from "./prompts/beast.txt" // gpt-4 | o1 | o3
import PROMPT_GEMINI from "./prompts/gemini.txt"
import PROMPT_DEFAULT from "./prompts/default.txt"

function pickTemplate(modelId: string): string {
  // Order matters: the narrower gpt-4/o1/o3 check runs before the generic gpt check
  if (/^(gpt-4|o1|o3)/i.test(modelId)) return PROMPT_BEAST
  if (/^gpt/i.test(modelId)) return PROMPT_GPT
  if (/^gemini/i.test(modelId)) return PROMPT_GEMINI
  if (/claude/i.test(modelId)) return PROMPT_ANTHROPIC
  return PROMPT_DEFAULT
}

export function buildSystem(model: Model, ctx: Ctx): string[] {
  const template = pickTemplate(model.id)
  const envBlock = `
<env>
directory: ${ctx.cwd}
worktree: ${ctx.worktree}
git: ${ctx.gitStatusShort}
platform: ${process.platform}
</env>
`
  const skillsBlock = renderSkills(ctx.skills, { permissionGate: ctx.perm })
  const projectBlock = ctx.agentsFile ?? ""
  return [template, envBlock, skillsBlock, projectBlock]
}

Why it matters: Claude prefers a different format than GPT-4, which in turn differs from Gemini. Using one prompt for all of them pulls performance down to the lowest common denominator. Dispatching by model family, plus an env block rebuilt on every call (git status, platform), keeps the agent aware of the current context (which files are uncommitted, where the working directory is).

Code example (generic)

const PROMPTS: Record<string, string> = {
  anthropic: ANTHROPIC_TEMPLATE,
  openai: OPENAI_TEMPLATE,
  gemini: GEMINI_TEMPLATE,
}

// detectFamily, Ctx and the *_TEMPLATE constants are assumed to be defined elsewhere
function systemFor(model: string, ctx: Ctx): string {
  const template = PROMPTS[detectFamily(model)] ?? DEFAULT_TEMPLATE
  const env = `<env>cwd=${ctx.cwd}; git=${ctx.git}; os=${process.platform}</env>`
  return [template, env, ctx.projectInstructions ?? ""].join("\n\n")
}
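The generic example calls detectFamily without defining it. One possible shape is sketched below. The Family type and FAMILY_RULES table are assumptions made for illustration; the patterns loosely mirror the regexes in pickTemplate, and stripping a "provider/" prefix is an added guess at how exotic model ids might look.

```typescript
type Family = "anthropic" | "openai" | "gemini" | "default"

// Hypothetical rule table; first matching pattern wins.
const FAMILY_RULES: Array<[RegExp, Family]> = [
  [/claude/i, "anthropic"],
  [/^(gpt|o1|o3)/i, "openai"],
  [/^gemini/i, "gemini"],
]

function detectFamily(modelId: string): Family {
  // Strip an optional "provider/" prefix, e.g. "openrouter/gpt-4o" -> "gpt-4o"
  const bare = modelId.slice(modelId.lastIndexOf("/") + 1).trim()
  for (const [pattern, family] of FAMILY_RULES) {
    if (pattern.test(bare)) return family
  }
  return "default" // unknown ids fall through to the default template
}

console.log(detectFamily("claude-sonnet-4"), detectFamily("openrouter/gpt-4o"))
```

Returning "default" for unknown ids keeps the `?? DEFAULT_TEMPLATE` fallback in systemFor meaningful, since no "default" key exists in PROMPTS.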

Pros

Each model family gets a tuned prompt, which improves output quality
The env block is refreshed on every call, so the agent never relies on stale environment data
The skills list is permission-gated, so the model only sees skills it is allowed to use

Cons

4-5 templates mean 4-5 places to edit when a policy changes, which risks drift
Regex-based family detection is brittle with exotic model names
Rebuilding the env block on every call invalidates the prompt cache (a trade-off with T10)
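The cache trade-off in the last item can be made concrete. The sketch below assumes the provider caches by prompt prefix (as Anthropic-style prompt caching does) and that system blocks are sent in array order. cacheablePrefixLength is a hypothetical helper, not an opencode function; it only counts how many leading blocks stay identical between two calls.

```typescript
// Counts how many leading system blocks are identical between two calls.
// Under prefix-based caching, only these blocks can be served from cache.
function cacheablePrefixLength(prev: string[], next: string[]): number {
  let i = 0
  while (i < prev.length && i < next.length && prev[i] === next[i]) i++
  return i
}

const template = "You are a coding agent."
const project = "<project-instruction>use pnpm</project-instruction>"
const envAt = (git: string): string => `<env>git: ${git}</env>`

// Volatile env block in second position: everything after it misses the cache
const envSecond = cacheablePrefixLength(
  [template, envAt("clean"), project],
  [template, envAt("1 modified"), project],
)

// Volatile env block last: template and project instructions stay reusable
const envLast = cacheablePrefixLength(
  [template, project, envAt("clean")],
  [template, project, envAt("1 modified")],
)

console.log(envSecond, envLast) // 1 2
```

Under these assumptions, placing the most volatile block last keeps the longest possible prefix stable across calls.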

References

Deep dive on T27 →

T28. AGENTS.md / CLAUDE.md cascading (findUp)

F.2

File: session/instruction.ts · Lines: 52-62, 120-228

Code from opencode

const INSTRUCTION_FILES = ["AGENTS.md", "CLAUDE.md", "CONTEXT.md"]

export async function findProjectInstructions(cwd: string, worktree: string) {
  let dir = path.resolve(cwd)
  const stop = path.resolve(worktree)
  while (dir.startsWith(stop)) {
    for (const name of INSTRUCTION_FILES) {
      const candidate = path.join(dir, name)
      if (await fs.pathExists(candidate)) {
        return { path: candidate, content: await fs.readFile(candidate, "utf8") }
      }
    }
    const parent = path.dirname(dir)
    if (parent === dir) break // reached root
    dir = parent
  }
  return null
}

// Per-message "claims tracking" prevents duplicate attachment
const attachedFiles = new Set<string>()

export function attachInstructionIfNew(file: InstructionFile) {
  if (attachedFiles.has(file.path)) return null // already in conversation
  attachedFiles.add(file.path)
  return `<project-instruction src="${file.path}">\n${file.content}\n</project-instruction>`
}

Why it matters: Consider a multi-project monorepo plus a user-level config in the home directory. Reading every AGENTS.md from cwd up to the filesystem root would cause conflicts and burn a large number of tokens. First-match-wins, walking from cwd up to the worktree root (stopping before home), means the nearest rule wins. Claims tracking avoids re-attaching the same file when the user sends several messages in one session.

Code example (generic)

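import path from "node:path"
import { access, readFile } from "node:fs/promises"

const exists = (p: string) => access(p).then(() => true, () => false)

// True when dir is root itself or nested under it (a sibling like "/repo-old" is rejected)
function isInside(dir: string, root: string): boolean {
  const rel = path.relative(root, dir)
  return rel !== ".." && !rel.startsWith(".." + path.sep) && !path.isAbsolute(rel)
}

async function findInstructions(cwd: string, repoRoot: string) {
  let dir = path.resolve(cwd)
  const root = path.resolve(repoRoot)
  while (isInside(dir, root)) {
    for (const name of ["AGENTS.md", "CLAUDE.md"]) {
      const p = path.join(dir, name)
      if (await exists(p)) return { path: p, body: await readFile(p, "utf8") }
    }
    const parent = path.dirname(dir)
    if (parent === dir) break // reached the filesystem root
    dir = parent
  }
  return null
}

const attached = new Set<string>()

function buildInstructionBlock(f: { path: string; body: string }) {
  if (attached.has(f.path)) return "" // already attached in this session
  attached.add(f.path)
  return `<project-instructions src="${f.path}">\n${f.body}\n</project-instructions>`
}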

Pros

Convention over configuration: the user never has to declare a path
First-match-wins: the rule closest in scope wins (a package-level file in a monorepo overrides the root)
No re-attachment, which saves tokens in long sessions

Cons

No stacking: a user who wants to combine root and subdirectory rules has to duplicate them
The findUp I/O runs every session; the cost is small but repetitive
Three filenames are tried in order, so a misnamed file is silently missed
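The "no stacking" limitation suggests an alternative design: collect every instruction file between the repo root and cwd, outermost first. The sketch below shows that variant and is not how opencode behaves. dirsFromRootToCwd and stackInstructions are hypothetical names, paths are assumed to be POSIX-style and absolute, and a Map stands in for the filesystem so the example stays self-contained.

```typescript
// Lists directories from repoRoot down to cwd; returns [] when cwd is outside repoRoot.
function dirsFromRootToCwd(cwd: string, repoRoot: string): string[] {
  const root = repoRoot.replace(/\/+$/, "")
  const target = cwd.replace(/\/+$/, "")
  if (target !== root && !target.startsWith(root + "/")) return []
  const segments = target.slice(root.length).split("/").filter(Boolean)
  const dirs = [root]
  for (const segment of segments) dirs.push(`${dirs[dirs.length - 1]}/${segment}`)
  return dirs
}

// Stacks one instruction file per directory, outermost first, so nearer rules come last.
function stackInstructions(
  cwd: string,
  repoRoot: string,
  files: Map<string, string>, // stand-in for the filesystem: path -> content
  names: string[] = ["AGENTS.md", "CLAUDE.md"],
): string {
  const blocks: string[] = []
  for (const dir of dirsFromRootToCwd(cwd, repoRoot)) {
    const hit = names.map(n => `${dir}/${n}`).find(p => files.has(p))
    if (hit !== undefined) {
      blocks.push(`<project-instructions src="${hit}">\n${files.get(hit) ?? ""}\n</project-instructions>`)
    }
  }
  return blocks.join("\n\n")
}

const demoFiles = new Map<string, string>([
  ["/repo/AGENTS.md", "root rules"],
  ["/repo/packages/api/CLAUDE.md", "api rules"],
])
console.log(stackInstructions("/repo/packages/api/src", "/repo", demoFiles))
```

Stacking trades the token savings of first-match-wins for composability, and conflicting rules across levels then have to be resolved by the model from ordering alone.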

References

Deep dive on T28 →

Conclusion & Comparison with Other Harnesses

What Stands Out in opencode

After going through the 28 techniques, three traits most clearly set opencode apart from popular coding agent harnesses (Claude Code, Aider, Cursor, Cline, OpenAI Codex CLI):

Effect-TS as the foundation. Most harnesses use Promise / async-await directly. opencode chooses Effect to get correct interruption semantics, composable retry and timeout, type-safe service injection, and OTel tracing almost for free. This is a rare choice in the coding agent ecosystem (only effect-ai and a few frameworks use it).
Permission model as a first-class concern. Wildcard last-match-wins, arity normalization (450+ entries), session state (once/always/reject), and tree-sitter bash parsing together give arguably the best control UX among current agents. Claude Code has a fairly similar permission system but without arity normalization.
Deep provider abstraction with quirk handling. Instead of settling for the lowest common denominator, opencode has a transform layer per provider (Anthropic split tool_use, Mistral truncate toolCallId, ...). It can therefore support 20+ providers while still drawing on each provider's own strengths (Anthropic's cache_control, o1's reasoning, ...).

Quick comparison

Aspect	opencode	Claude Code	Aider	Cursor	Cline
Language / Runtime	TypeScript / Bun	TypeScript / Node	Python	TypeScript (VSCode ext)	TypeScript (VSCode ext)
Multi-provider	20+ built-in, lazy	Anthropic only	Most via LiteLLM	OpenAI + Anthropic	OpenAI + Anthropic + few
Permission model	Wildcard + arity + state	Wildcard + state	Minimal (yes/no prompt)	IDE-level (user trust)	UI approval prompt
Compaction	Tail-preserve + template + protected tools	Auto + manual /compact	Sliding window	N/A (short session)	Basic summarization
Sub-agent	Task tool (restricted perms)	Task tool (very similar)	N/A	N/A	N/A
Fuzzy edit	3-tier (Simple/Line/Anchor)	2-tier	SEARCH/REPLACE blocks	Model-dependent	2-tier
Bash safety	tree-sitter WASM parse	Regex-based (has bypass vulns)	User confirm	N/A (no bash)	User approval
Plugin system	Glob discovery + SDK	MCP + plugins + hooks	N/A	VSCode extensions	MCP
Instructions file	AGENTS.md / CLAUDE.md (findUp)	CLAUDE.md (global + project)	.aider.conf.yml	.cursorrules	.clinerules

Common Weaknesses

Several trade-offs and weak spots recur throughout the design:

Effect-TS onboarding cost. New contributors to opencode have to learn Effect, a higher barrier than plain Promises.
Magic numbers. 20k reserved tokens, a 25% tail budget, a 3-iteration doom-loop limit, a 0.3 fuzzy threshold: all come from experience rather than formal research.
Maintenance-heavy mappings. ARITY (450+), OVERFLOW_PATTERNS (25+), and BUNDLED_PROVIDERS (20+) all need updating as the ecosystem changes.
Prompt caching fragility. A plugin transform can accidentally break the header and cause widespread cache misses. Observability for the cache hit rate is not yet clear.
Sub-agent summary bottleneck. The parent sees only a summary, so when a sub-agent goes astray the parent struggles to debug it. Tooling is needed to expose the child trace on demand.
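The prompt-caching item above notes that cache hit rate lacks clear observability. One way to track it is sketched below, assuming the provider reports per-request usage with the field names of the Anthropic Messages API (input_tokens, cache_creation_input_tokens, cache_read_input_tokens). cacheHitRate is a hypothetical helper, not part of opencode.

```typescript
// Field names follow the usage object of the Anthropic Messages API.
interface Usage {
  input_tokens: number // uncached input tokens
  cache_creation_input_tokens?: number | null // tokens written to the cache
  cache_read_input_tokens?: number | null // tokens served from the cache
}

// Share of all input tokens that were served from the prompt cache.
function cacheHitRate(usages: Usage[]): number {
  let read = 0
  let total = 0
  for (const u of usages) {
    const cached = u.cache_read_input_tokens ?? 0
    read += cached
    total += cached + (u.cache_creation_input_tokens ?? 0) + u.input_tokens
  }
  return total === 0 ? 0 : read / total
}

const rate = cacheHitRate([
  { input_tokens: 100, cache_creation_input_tokens: 900, cache_read_input_tokens: 0 },
  { input_tokens: 100, cache_creation_input_tokens: 0, cache_read_input_tokens: 900 },
])
console.log(rate) // 0.45
```

A rate that falls toward zero right after a plugin is enabled would be a hint that the plugin is altering the cached prefix.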

Recommendations for Reuse

If you are building your own harness, these are the techniques to copy first (high ROI, not too complex to implement):

T15: Output truncation + file spill. Practically mandatory; overflow is far too easy without it.
T10: Cache-aware 2-part system prompt. With Anthropic, an 80% reduction in input cost.
T7, T8: Context overflow detection + tail-preserve compaction. Long sessions will need it.
T16: Zod validation with user-friendly errors. Noticeably reduces tool argument errors.
T25: Session-scoped permission state (once/always). Turns the UX from approval spam into something smooth.

Techniques to copy with care (high complexity, provider-specific, or bug-prone when misconfigured): T17 (fuzzy edit: risk of false matches), T18 (tree-sitter WASM: infra cost), T22 (provider transform: needs per-provider testing), T26 (arity map: maintenance burden).

Closing

opencode is a good illustration of thorough "harness engineering": it does not merely wrap the model call, but treats each aspect (context, tool, permission, provider) as a module with a spec, error modes, and clear trade-offs. The code reads more like an Effect library than a script gluing APIs together. The most instructive part of the design is how it handles "an agent in a real environment": bash can delete the wrong thing, the model can loop, a tool can overflow, providers can differ in their error messages, and each of these has a specific countermeasure rather than "try-catch and hope".

For anyone who wants to build their own agent, opencode is a good "toy" for reading source: each file is under 500 lines, services are decoupled, and comments are just enough. Suggested reading order: session/prompt.ts → session/processor.ts → tool/tool.ts → tool/edit.ts → permission/ → provider/.

Consolidated References

All links cited in the report, grouped by topic. Detailed file references: sources/opencode-harness/references.md.