T15: Output truncation with tail-keep + file spill

MAX_LINES=2000, MAX_BYTES=50KB. Keep the tail of the output, which is usually the most important part, and spill the full output to a file on disk so the agent can come back to it. "Lossy on model, lossless on disk."

Group: C (Tool Design) · File: tool/truncate.ts · Lines 64–126 · ID: C.3 / T15 · Status: Stable

Overview

Why it matters. Commands such as bash ls -R /, grep -r "" ., or a verbose npm install can return megabytes of output. Feeding that raw output into the context fills the context window immediately and leaves the model no room for anything else. Naive truncation (just cutting the output off) loses the most important information, which in bash output usually sits at the end (the final result, the last error). opencode handles this with two actions at once: keep the tail in context, and spill the full output to a file that the agent can read back if needed.

The "lossy on model, lossless on disk" principle: the model sees only the tail (enough to decide the next step), but no information is lost for good, because the full output is always on disk at a path that the notice states explicitly.

Detailed code analysis

Two limit parameters and a direction

tool/truncate.ts: core logic

{`
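import { mkdirSync, writeFileSync } from "node:fs"
import { dirname } from "node:path"

const MAX_LINES = 2_000
const MAX_BYTES = 50_000   // ~50KB, counted as UTF-8 bytes

export function truncateOutput(
  output: string,
  opts: {
    direction?: "head" | "tail"   // head for compile errors, tail for bash logs
    toolName: string
    callId:   string
  },
) {
  const { direction = "tail", toolName, callId } = opts
  const lines = output.split("\n")

  // No truncation needed when the output is within both limits
  if (lines.length <= MAX_LINES && Buffer.byteLength(output, "utf8") <= MAX_BYTES) return output

  // Keep the "tail" or the "head" according to direction (line cap)
  let kept = direction === "tail"
    ? lines.slice(-MAX_LINES).join("\n")
    : lines.slice(0,  MAX_LINES).join("\n")

  // Byte cap: 2,000 long lines can still exceed MAX_BYTES.
  // A multi-byte character split at the cut decodes as U+FFFD.
  const buf = Buffer.from(kept, "utf8")
  if (buf.length > MAX_BYTES) {
    kept = (direction === "tail" ? buf.subarray(-MAX_BYTES) : buf.subarray(0, MAX_BYTES)).toString("utf8")
  }

  // Spill the full output to a file so the agent can read it back
  const spillPath = `/tmp/opencode-overflow/${toolName}-${callId}.txt`
  mkdirSync(dirname(spillPath), { recursive: true })
  writeFileSync(spillPath, output)

  return `${kept}
<truncated-notice>
Output truncated to ${direction === "tail" ? "last" : "first"} ${MAX_LINES} lines / ${MAX_BYTES} bytes.
Full output saved to: ${spillPath}
Use the Read or Grep tool to access specific parts.
</truncated-notice>`
}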
`}

bash output: 50,000 lines (5MB)
        │
        ▼
┌─────────────────────────────┐
│  truncateOutput()           │
│                             │
│  lines.slice(-2000) ───────►│──► context (tail 2000 lines)
│                             │
│  fs.writeFileSync() ───────►│──► /tmp/opencode-overflow/bash-abc.txt
│                             │
└─────────────────────────────┘
        │
        ▼
Model receives:
  [2000 lines of tail]
  <truncated-notice>
  Full output: /tmp/opencode-overflow/bash-abc.txt
  Use Read/Grep to inspect.
  </truncated-notice>

When to use head vs tail

Tail (default): most bash output, such as runtime errors and build results, where the final lines matter more than the verbose beginning.

Head: compile errors usually show up at the start. Example: when the TypeScript compiler emits 10,000 lines of errors, the first error is the root cause and the later ones are cascades.
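The head/tail choice above could be made per tool. The following is a minimal sketch, assuming a hypothetical set of "head-first" tool names (`HEAD_FIRST_TOOLS`, `directionFor`, and `keepLines` are illustrative names, not part of opencode).

```typescript
type Direction = "head" | "tail"

// Hypothetical tool names whose first lines matter most (assumption for illustration).
const HEAD_FIRST_TOOLS = new Set(["tsc", "typecheck", "compile"])

// Picks the side of the output to keep; everything not listed defaults to "tail".
export function directionFor(toolName: string): Direction {
  return HEAD_FIRST_TOOLS.has(toolName) ? "head" : "tail"
}

// Keeps at most `maxLines` lines from the chosen side of the output.
export function keepLines(output: string, maxLines: number, direction: Direction): string {
  const lines = output.split("\n")
  if (lines.length <= maxLines) return output
  const kept = direction === "tail" ? lines.slice(-maxLines) : lines.slice(0, maxLines)
  return kept.join("\n")
}
```

With a four-line input and a limit of two, a head-first tool keeps the first two lines and any other tool keeps the last two.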

Generic implementation with spill

{`
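import { mkdir, writeFile } from "node:fs/promises"
import { join } from "node:path"

const LIMIT_BYTES = 50_000
const SPILL_DIR = "/tmp/agent-overflow"

async function truncateAndSpill(
  output: string,
  meta: { tool: string; callId: string },
): Promise<string> {
  // Measure UTF-8 bytes; string.length counts UTF-16 code units instead
  const bytes = Buffer.from(output, "utf8")
  if (bytes.length <= LIMIT_BYTES) return output

  // Keep the last 50KB (a multi-byte character split at the cut decodes as U+FFFD)
  const tail = bytes.subarray(-LIMIT_BYTES).toString("utf8")

  // Spill to disk, creating the directory on first use
  const path = join(SPILL_DIR, `${meta.tool}-${meta.callId}.log`)
  await mkdir(SPILL_DIR, { recursive: true })
  await writeFile(path, output)

  return tail + `\n\n[Output was ${bytes.length} bytes, truncated to last ${LIMIT_BYTES}. ` +
    `Full content: ${path}. Use Read/Grep to inspect.]`
}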
`}

Interaction with other techniques

T14 (Effect lazy tool init): truncate.apply(id, result) inside wrap() is the call site for T15, so every tool in the registry is protected automatically.
T7 (Token overflow detection): T15 truncation is a preventive measure against overflow. T7 detects when the context is nearly full and triggers compaction; T15 prevents the problem up front, at the tool level.
T19 (Sub-agent Task tool): sub-agents mostly do search and exploration, and search output is often very large. T15 matters most here: the sub-agent returns a compact summary, and bulk output does not pollute the parent context.
T18 (Bash tree-sitter): tree-sitter parses the bash command before execution. If the command will obviously produce large output (ls -R, find...), the tool can warn or route the output straight to T15.
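The T14 interaction can be pictured as a wrapper that every tool passes through. This is a rough sketch only: the `Tool` and `Truncate` types, the shape of `wrap`, and the stand-in truncator `keepLast3` are assumptions for illustration and do not reproduce opencode's actual registry code.

```typescript
// Hypothetical minimal tool shape (assumption, not opencode's real type).
type Tool = {
  id: string
  execute: (args: unknown) => Promise<string>
}

type Truncate = (toolId: string, result: string) => string

// Returns a tool whose result always passes through `truncate` after execution.
export function wrap(tool: Tool, truncate: Truncate): Tool {
  return {
    id: tool.id,
    async execute(args) {
      const result = await tool.execute(args) // the tool itself runs to completion
      return truncate(tool.id, result)        // only the model-facing copy is limited
    },
  }
}

// Stand-in truncator for the demo: keeps the last 3 lines.
const keepLast3: Truncate = (_id, result) => result.split("\n").slice(-3).join("\n")

// A fake tool that produces five lines of output.
const noisy: Tool = {
  id: "bash",
  execute: async () => ["a", "b", "c", "d", "e"].join("\n"),
}

export const wrapped = wrap(noisy, keepLast3)
```

Because the limit is applied in one place, a newly registered tool gets the same protection without any truncation logic of its own.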

Failure modes

1. Agent forgets the spill path

The model receives the truncation notice but does not use Read/Grep to open the file, and carries on with incomplete information. This is a limitation of the approach: it depends on the model "remembering" and "wanting" to read the full output.

Mitigation: the notice must be clear and actionable. It should not just say "truncated"; it should also say "Use Read tool to inspect /path/to/file". opencode phrases the notice as an instruction, not as a plain announcement.
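One way to keep the notice instruction-shaped is to build it from a small helper that always names the path and the next action. This is a minimal sketch; `buildNotice` and its input fields are hypothetical names, and the wording is an illustration, not opencode's exact text.

```typescript
type NoticeInput = {
  spillPath: string
  totalLines: number
  keptLines: number
  direction: "head" | "tail"
}

// Builds a notice that tells the model what was cut and what to do next.
export function buildNotice(n: NoticeInput): string {
  const which = n.direction === "tail" ? "last" : "first"
  return [
    "<truncated-notice>",
    `Only the ${which} ${n.keptLines} of ${n.totalLines} lines are shown above.`,
    `Full output saved to: ${n.spillPath}`,
    `Next step: use the Read tool on ${n.spillPath}, or Grep it for the text you need.`,
    "</truncated-notice>",
  ].join("\n")
}
```

Repeating the path inside the instruction line means the model can act on that one line without having to combine it with the line before it.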

2. Disk full / temp cleanup

If /tmp/opencode-overflow/ is never cleaned up, the disk fills over time (especially with an agent that runs many verbose commands). A cleanup job or TTL-based deletion is needed.
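A TTL-based cleanup could look roughly like the sketch below. It is an assumption-laden example: `cleanupSpillDir` is a hypothetical helper, the 24-hour TTL is arbitrary, and the demo runs in a throwaway temp directory instead of the real spill directory.

```typescript
import { mkdtempSync, readdirSync, statSync, unlinkSync, utimesSync, writeFileSync } from "node:fs"
import { tmpdir } from "node:os"
import { join } from "node:path"

// Deletes regular files in `dir` whose mtime is older than `ttlMs`; returns how many were removed.
export function cleanupSpillDir(dir: string, ttlMs: number, now: number = Date.now()): number {
  let removed = 0
  for (const name of readdirSync(dir)) {
    const file = join(dir, name)
    const stat = statSync(file)
    if (stat.isFile() && now - stat.mtimeMs > ttlMs) {
      unlinkSync(file)
      removed++
    }
  }
  return removed
}

// Demo: one stale file (mtime set to two days ago) and one fresh file.
const DAY_MS = 24 * 60 * 60 * 1000
const demoDir = mkdtempSync(join(tmpdir(), "spill-demo-"))
const stale = join(demoDir, "bash-old.txt")
const fresh = join(demoDir, "bash-new.txt")
writeFileSync(stale, "old output")
writeFileSync(fresh, "new output")
const twoDaysAgo = new Date(Date.now() - 2 * DAY_MS)
utimesSync(stale, twoDaysAgo, twoDaysAgo)

export const removedCount = cleanupSpillDir(demoDir, DAY_MS)
export const remaining = readdirSync(demoDir)
```

The function could run at agent startup or on a timer. A file that another process deletes between the directory listing and the stat call would make this sketch throw, so production code would want to tolerate that case.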

3. Magic-number limits

2000 lines / 50KB is a threshold chosen from experience. Different workloads need different values: cargo build on a large Rust project may need more, while a simple file listing needs less. There is currently no per-tool override.
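If per-tool overrides were added, one simple shape is a defaults object merged with an optional override table. The sketch below is hypothetical: the tool names `build` and `list`, the override numbers, and the `limitsFor` helper are all invented for illustration.

```typescript
type Limits = { maxLines: number; maxBytes: number }

// Defaults taken from the limits described in this section.
const DEFAULT_LIMITS: Limits = { maxLines: 2_000, maxBytes: 50_000 }

// Hypothetical overrides; tool names and numbers are illustrative only.
const OVERRIDES: Record<string, Partial<Limits>> = {
  build: { maxLines: 5_000, maxBytes: 200_000 },
  list: { maxLines: 500 },
}

// Merges the defaults with any override registered for the tool.
export function limitsFor(toolName: string): Limits {
  return { ...DEFAULT_LIMITS, ...OVERRIDES[toolName] }
}
```

A partial override changes only the fields it names, so `list` gets a lower line cap while keeping the default byte cap.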

Comparison with other harnesses

Harness	Truncation strategy	Spill to disk	Direction
opencode	Tail 2000 lines / 50KB	✅ + path in the notice	✅ head/tail
Claude Code	Similar truncate + file spill	✅	✅
Aider	Byte limit, no spill	❌	❌
Cline	Char limit per tool, no spill	❌	❌
OpenHarness	Line limit, no spill	❌	❌

Implementation recipe

{`
import { writeFileSync, mkdirSync } from "node:fs"
import { join } from "node:path"
import { randomUUID } from "node:crypto"

const SPILL_DIR = join(process.env.TMPDIR ?? "/tmp", "agent-overflow")
mkdirSync(SPILL_DIR, { recursive: true })

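export function truncateOutput(
  output: string,
  opts: { toolName: string; direction?: "head" | "tail" } = { toolName: "tool" }
): string {
  const MAX_LINES = 2_000
  const MAX_BYTES = 50_000
  const { toolName, direction = "tail" } = opts
  const lines = output.split("\n")
  const totalBytes = Buffer.byteLength(output, "utf8")   // bytes, not UTF-16 code units

  if (lines.length <= MAX_LINES && totalBytes <= MAX_BYTES) return output

  // Apply the line cap first
  let kept = direction === "tail"
    ? lines.slice(-MAX_LINES).join("\n")
    : lines.slice(0,  MAX_LINES).join("\n")

  // Then the byte cap: 2,000 long lines can still exceed MAX_BYTES.
  // A multi-byte character split at the cut decodes as U+FFFD.
  const buf = Buffer.from(kept, "utf8")
  if (buf.length > MAX_BYTES) {
    kept = (direction === "tail" ? buf.subarray(-MAX_BYTES) : buf.subarray(0, MAX_BYTES)).toString("utf8")
  }

  // Random suffix keeps spill files from colliding across calls
  const callId  = randomUUID().slice(0, 8)
  const spillPath = join(SPILL_DIR, `${toolName}-${callId}.txt`)
  writeFileSync(spillPath, output)

  return [
    kept,
    "",
    `[Output truncated (${direction} kept, at most ${MAX_LINES} lines / ${MAX_BYTES} bytes); original had ${lines.length} lines.`,
    ` Full output (${totalBytes} bytes) saved to: ${spillPath}`,
    ` Use Read or Grep tool to inspect specific sections.]`,
  ].join("\n")
}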
`}

Recommendation: apply truncation after every tool execution, not before. Truncating at the output layer (not the input layer) ensures the tool always runs in full; only the output sent back to the model is limited.
