Skill.md
Query Loop Implementation
Most AI products start with a single model call. The moment you add tools, that is no longer enough. The model must be able to ask for a tool, receive the result, reason over the observation, and either call another tool or produce a final answer.
This skill turns that pattern into product infrastructure.
Core Architecture
Use three boundaries:
ConversationManager
Owns durable state: session id, persisted messages, user settings, auth context, usage, and budget.
QueryLoop
Owns one task turn: call the model, detect tool calls, execute tools, append tool results, and decide whether to continue or stop.
ToolRuntime
Owns registered tools: schemas, validation, permission checks, execution, error formatting, and result size limits.
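As a rough illustration of how the three boundaries might be typed, here is a TypeScript sketch. Only the three boundary names come from the text above. Every field, method, and helper type (`Message`, `ToolCall`, `ToolResult`, `AbortLike`, `Usage`, `SessionState`, `InMemoryConversationManager`) is a hypothetical stand-in, and the in-memory store is a placeholder for whatever persistence a real product uses.

```typescript
// Hypothetical shapes for illustration; only the three boundary names come from the text.
interface Message {
  role: "user" | "assistant" | "tool";
  content: string;
}

interface ToolCall {
  id: string;
  name: string;
  input: unknown;
}

interface ToolResult {
  content: string;
  isError: boolean;
}

// Structural stand-in for an abort signal, so the sketch needs no platform types.
interface AbortLike {
  aborted: boolean;
}

interface Usage {
  inputTokens: number;
  outputTokens: number;
}

// Durable state that outlives a single task turn.
interface SessionState {
  sessionId: string;
  messages: Message[];
  usage: Usage;
}

// ToolRuntime: registered tools, validation, permissions, execution.
interface ToolRuntime {
  execute(call: ToolCall, ctx: { signal?: AbortLike }): Promise<ToolResult>;
}

// QueryLoop: one task turn, from the first model call to a stop condition.
interface QueryLoop {
  run(
    messages: Message[],
    opts: { maxTurns: number; signal?: AbortLike }
  ): Promise<{ status: "completed" | "max_turns"; messages: Message[] }>;
}

// ConversationManager: in-memory variant standing in for a persistent store.
class InMemoryConversationManager {
  private sessions: Record<string, SessionState> = {};

  load(sessionId: string): SessionState {
    let session = this.sessions[sessionId];
    if (!session) {
      session = {
        sessionId,
        messages: [],
        usage: { inputTokens: 0, outputTokens: 0 },
      };
      this.sessions[sessionId] = session;
    }
    return session;
  }

  // Persist the messages returned by a finished loop run and add its usage.
  commit(sessionId: string, messages: Message[], usage: Usage): void {
    const session = this.load(sessionId);
    session.messages = messages.slice();
    session.usage.inputTokens += usage.inputTokens;
    session.usage.outputTokens += usage.outputTokens;
  }
}
```

In this sketch the loop never touches the session store: the caller loads messages, runs the loop, and commits whatever the loop returns.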
Minimal Loop
The first production version should be deliberately narrow:
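// Illustrative wrapper: the function name and parameter object are assumptions.
async function runQueryLoop({ model, tools, messages, maxTurns, signal }) {
  for (let turn = 1; turn <= maxTurns; turn++) {
    // Ask the model for the next step, given the full history and the tools.
    const response = await model.generate({ messages, tools })
    messages.push(response.message)

    // No tool calls means the model has produced its final answer.
    const toolCalls = extractToolCalls(response.message)
    if (toolCalls.length === 0) {
      return { status: "completed", finalMessage: response.message, messages }
    }

    // Run each requested tool and append its result so the next turn sees it.
    for (const call of toolCalls) {
      const result = await tools.execute(call, { signal, messages })
      messages.push(makeToolResultMessage(call.id, result))
    }
  }
  // The turn limit was reached before the model produced a final answer.
  return { status: "max_turns", messages }
}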
The model continues after tool use because the loop calls the model again with the tool_result messages appended.
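The loop relies on two helpers, `extractToolCalls` and `makeToolResultMessage`, that the text names but does not define. The sketch below shows one way they might look, assuming a content-block message format. The block types, field names (`toolUseId`, `isError`), and the choice of the `user` role for tool results are assumptions for illustration, not any specific provider's API, so adapt them to the model client in use.

```typescript
// Hypothetical content-block message shape; field names are assumptions.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown }
  | { type: "tool_result"; toolUseId: string; content: string; isError: boolean };

interface Message {
  role: "user" | "assistant";
  content: ContentBlock[];
}

interface ToolCall {
  id: string;
  name: string;
  input: unknown;
}

interface ToolResult {
  content: string;
  isError: boolean;
}

// Collect every tool request in an assistant message, preserving order.
function extractToolCalls(message: Message): ToolCall[] {
  const calls: ToolCall[] = [];
  for (const block of message.content) {
    if (block.type === "tool_use") {
      calls.push({ id: block.id, name: block.name, input: block.input });
    }
  }
  return calls;
}

// Wrap a tool's output so the next model call can match it to the request id.
function makeToolResultMessage(callId: string, result: ToolResult): Message {
  return {
    role: "user", // assumption: results are sent back as a user-role message
    content: [
      {
        type: "tool_result",
        toolUseId: callId,
        content: result.content,
        isError: result.isError,
      },
    ],
  };
}
```

Keeping the call id on the result is what lets the model tie each observation back to the request that produced it, including when several tools run in one turn.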
Safety Requirements
- Require maxTurns.
- Support abort signals and request timeouts.
- Validate every model-produced tool input against a schema.
- Check permission before any side effect.
- Track token, cost, and runtime budgets.
- Size-limit tool output before appending it to history.
- Return recoverable tool errors as tool results.
- Stop on permission denial, budget exhaustion, repeated failure, or fatal tool errors.
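Several of these requirements reduce to small pure functions that the loop can call between turns. The sketch below shows three of them: a stop check, an output size limit, and a formatter that turns a recoverable failure into a tool result. All names (`Limits`, `Progress`, `stopReason`, `limitToolOutput`, `toErrorResult`) and the status strings are hypothetical. Cost tracking, schema validation, and fatal-error handling are left out to keep it short.

```typescript
// Hypothetical limits and counters; adapt names and units to your runtime.
interface Limits {
  maxTurns: number;
  maxTokens: number;
  maxRuntimeMs: number;
  maxConsecutiveFailures: number;
}

interface Progress {
  turnsCompleted: number;
  tokensUsed: number;
  elapsedMs: number;
  consecutiveFailures: number;
  permissionDenied: boolean;
  aborted: boolean;
}

type StopReason =
  | "aborted"
  | "permission_denied"
  | "max_turns"
  | "budget_exhausted"
  | "repeated_failure";

// Returns why the loop must stop, or null if another turn is allowed.
function stopReason(limits: Limits, p: Progress): StopReason | null {
  if (p.aborted) return "aborted";
  if (p.permissionDenied) return "permission_denied";
  if (p.turnsCompleted >= limits.maxTurns) return "max_turns";
  if (p.tokensUsed >= limits.maxTokens || p.elapsedMs >= limits.maxRuntimeMs) {
    return "budget_exhausted";
  }
  if (p.consecutiveFailures >= limits.maxConsecutiveFailures) {
    return "repeated_failure";
  }
  return null;
}

// Caps tool output before it is appended to history, leaving a visible marker.
function limitToolOutput(text: string, maxChars: number): string {
  if (text.length <= maxChars) return text;
  const omitted = text.length - maxChars;
  return text.slice(0, maxChars) + "\n[truncated " + omitted + " characters]";
}

// Turns a recoverable failure into a tool result the model can read and react to.
function toErrorResult(
  error: unknown,
  maxChars: number
): { content: string; isError: boolean } {
  const message = error instanceof Error ? error.message : String(error);
  return {
    content: limitToolOutput("Tool failed: " + message, maxChars),
    isError: true,
  };
}
```

Because the checks are pure, they can be unit-tested without a model or any real tools, and the loop body stays a short sequence of calls.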
When to Use
Use this skill when implementing or reviewing:
- Tool calling in a chat or workflow product
- Function-calling loops
- ReAct-style reasoning-action-observation cycles
- Query engines or agent runtimes
- Claude Code-like agent loop behavior
- Tool result feedback and continuation logic
- Guardrails for max turns, timeouts, budgets, and permissions
What It Deliberately Leaves Out
This skill does not design context-window management. Keep trimming, retrieval, summarization, and compaction in a separate layer. The query loop should accept messages as input, return updated messages, and remain focused on deterministic control flow.
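One way to keep that separation visible in code is to treat context handling as a function applied to messages before the loop runs. The sketch below illustrates only the boundary, not a recommended trimming strategy. `ContextPolicy`, `keepLast`, and `withContextPolicy` are hypothetical names, and a real policy would also need to keep tool calls paired with their results.

```typescript
interface Message {
  role: string;
  content: string;
}

// A context policy is any pure function from messages to messages.
type ContextPolicy = (messages: Message[]) => Message[];

// Trivial placeholder policy: keep only the most recent `n` messages.
function keepLast(n: number): ContextPolicy {
  return (messages) => (n <= 0 ? [] : messages.slice(-n));
}

// Composes a policy with a loop without the loop knowing the policy exists.
// `R` is whatever the loop returns, for example a promise of a loop result.
function withContextPolicy<R>(
  policy: ContextPolicy,
  loop: (messages: Message[]) => R
): (messages: Message[]) => R {
  return (messages) => loop(policy(messages));
}
```

The loop passed to `withContextPolicy` still takes messages in and hands its result back, so swapping the policy does not change its control flow.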