Track AI agent runs and automatically detect cost waste. The AgentYield SDK captures LLM and tool calls, then analyzes them for inefficiencies.
Supported providers: OpenAI, Anthropic, Google, Meta / Llama (via Together AI), Mistral, Groq, DeepSeek, and OpenRouter (prefix upstream models with `openrouter/`). Using a model that isn't built in? Pass `costUsd` and `contextWindowUsed` yourself.
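If a model isn't built in, you can compute `costUsd` and `contextWindowUsed` yourself before passing them to `trackLLMCall()`. A minimal sketch; the per-token rates and the 32K context window below are made-up placeholders, so substitute your provider's real numbers:

```typescript
// Hypothetical pricing for a model the SDK doesn't recognize.
// These rates are placeholders, not real prices.
const INPUT_PRICE = 1.0e-6    // $ per input token  ($1.00 / 1M tokens)
const OUTPUT_PRICE = 3.0e-6   // $ per output token ($3.00 / 1M tokens)
const CONTEXT_WINDOW = 32_000 // tokens

// Compute the two fields trackLLMCall() needs for an unrecognized model.
function manualUsage(inputTokens: number, outputTokens: number) {
  return {
    costUsd: inputTokens * INPUT_PRICE + outputTokens * OUTPUT_PRICE,
    contextWindowUsed: inputTokens / CONTEXT_WINDOW,
  }
}

const usage = manualUsage(1_200, 300)
// usage.costUsd ≈ 0.0021, usage.contextWindowUsed = 0.0375
```

The resulting values plug straight into `trackLLMCall()` as the `costUsd` and `contextWindowUsed` fields.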
```bash
npm install @agentyield/sdk
```
Add ~10 lines to your agent. That's it.
```typescript
import OpenAI from "openai"
import { AgentYield } from "@agentyield/sdk"

const openai = new OpenAI()

const ay = new AgentYield({
  apiKey: process.env.AGENTYIELD_API_KEY,
  agentId: "research-agent",
})

const run = ay.startRun({ metadata: { task: "Q2 competitive research" } })

// Your code — LLM call
const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Summarize the competitive landscape for Q2" }],
})

// [AgentYield] Track the LLM call — SDK computes cost automatically
run.trackLLMCall({
  model: response.model,
  inputTokens: response.usage.prompt_tokens,
  outputTokens: response.usage.completion_tokens,
  costUsd: ay.computeCost(response.model, response.usage),
  contextWindowUsed: response.usage.prompt_tokens / 128_000,
  purpose: "competitive_summary",
})

// Your code — tool call
const searchResults = await webSearch("AI agent monitoring tools 2026")

// [AgentYield] Track tool usage — no cost for free tools, pass 0
run.trackToolCall({
  tool: "web_search",
  input: { query: "AI agent monitoring tools 2026" },
  costUsd: 0, // web_search cost varies by provider — pass actual cost or 0
  metadata: { resultCount: searchResults.length },
})

await run.end({ status: "success" })
```

### `new AgentYield(config)`

Initialize the SDK. Call once per application.
| Parameter | Type | Required | Description |
|---|---|---|---|
| apiKey | string | Yes | Your API key (ay_live_... or ay_test_...) |
| agentId | string | Yes | Identifies which agent is being tracked |
| baseURL | string | No | API endpoint override for self-hosted |
### `ay.startRun(options?)`

Start a new run. Returns a `Run` instance.
| Parameter | Type | Required | Description |
|---|---|---|---|
| runId | string | No | Custom run ID (auto-generated if omitted) |
| metadata | object | No | Arbitrary metadata attached to the run |
### `run.trackLLMCall(call)`

Record an LLM API call within the run.
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model name (e.g. "gpt-4o") |
| inputTokens | number | Yes | Input token count |
| outputTokens | number | Yes | Output token count |
| costUsd | number | Yes | Cost in USD |
| contextWindowUsed | number | No | Proportion of context window used (0.0–1.0) |
| purpose | string | No | Label for this call (e.g. "initial_planning") |
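Provider SDKs name their usage fields differently: the OpenAI SDK reports `prompt_tokens` and `completion_tokens`, while the Anthropic SDK reports `input_tokens` and `output_tokens`. A small hypothetical helper (not part of the SDK) that normalizes either shape into the field names `trackLLMCall()` expects:

```typescript
// Hypothetical helper (not part of the SDK): normalize the usage fields of
// the OpenAI SDK (prompt_tokens/completion_tokens) and the Anthropic SDK
// (input_tokens/output_tokens) into the shape trackLLMCall() expects.
function normalizeUsage(usage: Record<string, number | undefined>) {
  return {
    inputTokens: usage.prompt_tokens ?? usage.input_tokens ?? 0,
    outputTokens: usage.completion_tokens ?? usage.output_tokens ?? 0,
  }
}

normalizeUsage({ prompt_tokens: 812, completion_tokens: 164 })
// → { inputTokens: 812, outputTokens: 164 }

normalizeUsage({ input_tokens: 812, output_tokens: 164 })
// → { inputTokens: 812, outputTokens: 164 }
```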
### `run.trackToolCall(call)`

Record a tool invocation. The input is hashed locally — raw data never leaves your app.
| Parameter | Type | Required | Description |
|---|---|---|---|
| tool | string | Yes | Tool name |
| input | object | Yes | Tool input (hashed locally, never sent raw) |
| costUsd | number | Yes | Cost in USD |
| redact | string[] | No | Fields to remove before hashing |
| metadata | object | No | Additional metadata |
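To make the redact-then-hash behavior concrete, here is a self-contained sketch of the semantics (an illustration only, not the SDK's actual implementation):

```typescript
import { createHash } from "node:crypto"

// Illustration of the redact-then-hash semantics; a sketch of the
// behavior, not the SDK's actual implementation.
function redactAndHash(input: object, redact: string[] = []): string {
  const copy = structuredClone(input) as Record<string, any>
  for (const path of redact) {
    const parts = path.split(".") // dotted paths, e.g. "filter.email"
    let node: any = copy
    for (const key of parts.slice(0, -1)) node = node?.[key]
    if (node && typeof node === "object") delete node[parts[parts.length - 1]]
  }
  // Only this SHA-256 digest would ever be transmitted.
  return createHash("sha256").update(JSON.stringify(copy)).digest("hex")
}

// Two calls with the same non-sensitive shape hash identically,
// regardless of what the redacted fields contained:
const h1 = redactAndHash({ table: "users", filter: { email: "[email protected]" } }, ["filter.email"])
const h2 = redactAndHash({ table: "users", filter: { email: "[email protected]" } }, ["filter.email"])
// h1 === h2
```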
### `run.checkpoint(options?)`

Flush all buffered events to AgentYield as a named time window, then reset the buffer. The run stays open — use this for long-running or always-on agents that need to report metrics periodically. Returns a `Promise<void>`.
| Parameter | Type | Required | Description |
|---|---|---|---|
| label | string | No | Label for this window (defaults to `window_1`, `window_2`, …) |
```typescript
const run = ay.startRun();

// flush metrics every 5 minutes — run stays open
setInterval(async () => {
  await run.checkpoint({ label: "5m-window" });
}, 5 * 60 * 1000);

// later, when the agent shuts down
await run.end({ status: "success" });
```

Each checkpoint sends the events collected since the last checkpoint (or since the run started). After a successful checkpoint the internal buffer is cleared, so `end()` only sends events recorded after the most recent checkpoint.
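The checkpoint buffering described here can be sketched as a simple buffer that drains on every flush (illustrative only, not the SDK source):

```typescript
// Illustrative sketch of the checkpoint buffer semantics (not the SDK source):
// events accumulate between flushes, and each checkpoint drains the buffer.
class EventBuffer<T> {
  private events: T[] = []
  record(event: T) { this.events.push(event) }
  // Returns the events since the last flush and clears the buffer.
  checkpoint(): T[] {
    const window = this.events
    this.events = []
    return window
  }
}

const buf = new EventBuffer<string>()
buf.record("llm_call")
buf.record("tool_call")
buf.checkpoint() // → ["llm_call", "tool_call"]; buffer is now empty
buf.record("llm_call")
buf.checkpoint() // → ["llm_call"], only the event since the previous flush
```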
### `run.end(options)`

Flush any remaining buffered events and close the run. Returns a `Promise<void>`.
| Parameter | Type | Description |
|---|---|---|
| status | "success" \| "error" \| "timeout" | Final run status |
Not every agent starts a task and finishes minutes later. Monitoring bots, always-on assistants, and framework-based agents like OpenClaw can run for hours or indefinitely. The checkpoint() method is designed for these workloads.
The lifecycle for a long-running agent:

1. Call `ay.startRun()` once when the agent starts.
2. Record activity with `trackLLMCall()` and `trackToolCall()`.
3. Call `run.checkpoint()` periodically to flush buffered events. Each checkpoint sends only the events recorded since the last flush.
4. Call `run.end()` to close the run and send any remaining events.

If the process exits unexpectedly, the SDK's built-in exit handler will attempt to flush remaining events automatically.
The simplest pattern — flush every N minutes regardless of activity:
```typescript
import OpenAI from "openai"
// [AgentYield] Import the SDK
import { AgentYield } from "@agentyield/sdk"

const openai = new OpenAI()

// [AgentYield] Initialize with your API key
const ay = new AgentYield({
  apiKey: process.env.AGENTYIELD_API_KEY,
  agentId: "support-monitor",
})

// [AgentYield] Start a run to track this worker
const run = ay.startRun({ metadata: { env: "production" } })

// [AgentYield] Flush to AgentYield every 5 minutes — run stays open
const flushInterval = setInterval(async () => {
  await run.checkpoint({ label: `window-${new Date().toISOString()}` })
}, 5 * 60 * 1000)

// Your code — process tasks from a queue indefinitely
for await (const ticket of supportTicketQueue) {
  try {
    const response = await openai.chat.completions.create({
      model: "gpt-4o",
      messages: [{ role: "user", content: ticket.body }],
    })
    run.trackLLMCall({
      model: response.model,
      inputTokens: response.usage.prompt_tokens,
      outputTokens: response.usage.completion_tokens,
      costUsd: ay.computeCost(response.model, response.usage),
      contextWindowUsed: response.usage.prompt_tokens / ay.contextWindowSize(response.model),
      purpose: "ticket_response",
    })
    await postReply(ticket.id, response.choices[0].message.content)
  } catch (err) {
    // Log the error but keep the agent running
    console.error(`[agent] Failed to process ticket ${ticket.id}:`, err)
  }
}

// Graceful shutdown — flush remaining events and close the run
process.on("SIGTERM", async () => {
  clearInterval(flushInterval)
  // [AgentYield] End the run on shutdown
  await run.end({ status: "success" })
  process.exit(0)
})
```

Flush after completing a unit of work — useful when activity is bursty:
```typescript
import OpenAI from "openai"
import { AgentYield } from "@agentyield/sdk"

const openai = new OpenAI()

const ay = new AgentYield({
  apiKey: process.env.AGENTYIELD_API_KEY,
  agentId: "task-queue-agent",
})

const run = ay.startRun()

async function processTask(task: Task) {
  try {
    // Your code — LLM call
    const response = await openai.chat.completions.create({
      model: "gpt-4o",
      messages: [{ role: "user", content: task.prompt }],
    })
    // [AgentYield] Track the LLM call — SDK handles cost and context math
    run.trackLLMCall({
      model: response.model,
      inputTokens: response.usage.prompt_tokens,
      outputTokens: response.usage.completion_tokens,
      costUsd: ay.computeCost(response.model, response.usage),
      contextWindowUsed: response.usage.prompt_tokens / ay.contextWindowSize(response.model),
      purpose: "task_processing",
    })
    // Your code — write result to database
    await db.write({ id: task.id, result: response.choices[0].message.content })
    // [AgentYield] Track the tool call — no cost for internal db writes
    run.trackToolCall({
      tool: "db_write",
      input: { id: task.id },
      costUsd: 0,
    })
    // [AgentYield] Flush after each task completes
    await run.checkpoint({ label: `task-${task.id}` })
  } catch (err) {
    console.error(`[agent] Failed to process task ${task.id}:`, err)
  }
}

// Your code — process tasks from a queue indefinitely
for await (const task of taskQueue) {
  await processTask(task)
}

// Graceful shutdown
process.on("SIGTERM", async () => {
  await run.end({ status: "success" })
  process.exit(0)
})
```

Labels help you identify windows in your dashboard. If you don't provide one, the SDK auto-generates `window_1`, `window_2`, etc. Use descriptive labels when the context matters:
```typescript
await run.checkpoint({ label: "morning-batch" });
await run.checkpoint({ label: "afternoon-batch" });
await run.checkpoint(); // → "window_3"
```

- Descriptive labels like `task-1234` or `hourly-2pm` make it easier to correlate waste patterns with specific workloads.
- For the final flush on shutdown, `end()` is more reliable.

AgentYield is available as an OpenClaw skill. No SDK code required — install the skill and waste detection runs automatically inside every OpenClaw session.
In any OpenClaw session, run:

```
/skill install agentyield
```
Add your AgentYield API key to your OpenClaw environment config (`.env` or `openclaw.json`):

```
AGENTYIELD_API_KEY=ay_live_xxxxxxxxxxxxxxxxxxxx
```
You'll see checkpoint output like:

```
[AgentYield] Checkpoint sent — Waste Score: 34. View: https://agentyield.co/runs/...
```

The skill accepts optional config fields in the OpenClaw skill frontmatter:
| Field | Default | Description |
|---|---|---|
| checkpointEvery | 50 | Flush a checkpoint every N events |
| checkpointInterval | 30 | Flush a checkpoint every N minutes |
The OpenClaw skill is ideal when you're already using OpenClaw and want zero-code setup. For custom agents, direct integrations, or fine-grained control over event tracking, use the TypeScript SDK instead.
Use an `ay_test_`-prefixed key to validate your integration without sending data. Useful for CI/CD.
```typescript
const ay = new AgentYield({
  apiKey: "ay_test_xxxxxxxxxxxxxxxxxxxx",
  agentId: "my-agent",
})
// Logs: [AgentYield] Test mode — run data not sent
```

Tool inputs are hashed locally using SHA-256. Only the hash is transmitted — raw inputs never leave your application. Use `redact` to exclude sensitive fields:
```typescript
run.trackToolCall({
  tool: "db_query",
  input: {
    table: "users",
    filter: { email: "[email protected]", ssn: "123-45-6789" }
  },
  redact: ["filter.email", "filter.ssn"], // PII removed before hashing
  costUsd: 0, // internal query — no direct API cost
})
// → Hash is computed on { table: "users", filter: {} }
// → Raw values never leave your application
```

Already logging your LLM traffic in another observability tool? Skip the SDK install. AgentYield Connectors pull your existing request history into AgentYield and run our waste detection on top — duplicate calls, oversized context, model mismatches, excessive retries, and redundant reads.
| Platform | Status | Notes |
|---|---|---|
| Helicone | Available | US and EU regions. Uses the bulk ClickHouse query endpoint; bucket runs by `Helicone-Session-Id`. |
| LangSmith | Coming soon | Sync runs and traces from LangSmith projects. |
| Langfuse | Coming soon | Sync traces and observations from Langfuse. |
Need a connector for a different platform? Let us know and we'll prioritize it.
Create a free account and generate an API key to start sending run data.
Package: @agentyield/sdk · npm