Skill.md
Compact Memory Implementation Guide
Agents running long sessions face a fundamental problem: conversation history grows unboundedly, but the model's context window is fixed. Once you hit the limit, you either truncate history (losing critical context) or the session fails. This skill provides a complete engineering implementation: when context approaches the limit, automatically fork a compactor sub-agent to compress the session, inject the result into the next system prompt, and persist it across sessions.
Designed for engineers building agents with the Anthropic API (Python or TypeScript SDK) or the Claude Agent SDK.
7-Step Implementation Framework
| Step | What it covers |
|---|---|
| Step 1 | Understand your setup: SDK, agent architecture, session model |
| Step 2 | Trigger strategy: token threshold (recommended), turn count, phase boundary |
| Step 3 | Fork the compactor: separate API call, Haiku model, synchronous wait |
| Step 4 | Compact output schema: task, current_state, key_decisions, eliminated_approaches, next_steps |
| Step 5 | Memory restoration: system prompt injection (recommended) or first-message injection |
| Step 6 | Full agent loop: complete code with trigger, compact, persist, and restore |
| Step 7 | Chaining compacts: merge across sessions instead of stacking |
Why Fork Instead of Self-Compact
A main agent deep in a long task has drifted — it's focused on current details, not the full picture. A fresh compactor reads the entire history from scratch and produces a more accurate summary. Separation of concerns: executing vs. summarizing are different cognitive tasks. And practically: Haiku is good enough for compaction; reserve the expensive model for the main work.
Supported Use Cases
- Agents that need to maintain state across sessions (research, code review, multi-step execution)
- Sessions of unpredictable length where manual intervention is not acceptable
- Scenarios where key decisions, eliminated approaches, and tool results must survive compaction
- Long-running projects where the compact must stay accurate across many sessions (chaining mode)
Common Pitfalls
| Pitfall | Fix |
|---|---|
| Compact loses tool results needed later | Include summarized results in relevant_tool_results |
| New session ignores compact | Inject into system prompt, not messages |
| Compactor uses the expensive model | Use Haiku for compaction, Opus for main work |
| Compact grows unbounded across sessions | Use chaining compacts — merge, don't stack |
| Compact JSON parse failure crashes session | Add retry with error feedback + fallback compact |
Limitations
This skill provides an implementation framework and code templates, not a drop-in library. Compact quality depends on the compactor's system prompt — if the instructions are vague, the summary will miss key context. Validate the first few compacts manually in development before relying on them in production.