Track AI agent runs and automatically detect cost waste. The AgentYield SDK captures LLM and tool calls, then analyzes them for inefficiencies.
Supported providers: OpenAI, Anthropic, Google, Meta / Llama (via Together AI), Mistral, Groq, DeepSeek, and OpenRouter (prefix upstream models with openrouter/). Using a model that isn't built in? Pass costUsd and contextWindowUsed yourself.
npm install @agentyield/sdk
Add ~10 lines to your agent. That's it.
import OpenAI from "openai"
import { AgentYield } from "@agentyield/sdk"
const openai = new OpenAI()
const ay = new AgentYield({
  apiKey: process.env.AGENTYIELD_API_KEY,
  agentId: "research-agent",
})
const run = ay.startRun({ metadata: { task: "Q2 competitive research" } })
// Your code: LLM call
const response = await openai.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Summarize the competitive landscape for Q2" }],
})
// [AgentYield] Track the LLM call; the SDK computes cost automatically
run.trackLLMCall({
  model: response.model,
  inputTokens: response.usage.prompt_tokens,
  outputTokens: response.usage.completion_tokens,
  costUsd: ay.computeCost(response.model, response.usage),
  contextWindowUsed: response.usage.prompt_tokens / 128_000,
  purpose: "competitive_summary",
})
// Your code: tool call
const searchResults = await webSearch("AI agent monitoring tools 2026")
// [AgentYield] Track tool usage; pass 0 for free tools
run.trackToolCall({
  tool: "web_search",
  input: { query: "AI agent monitoring tools 2026" },
  costUsd: 0, // web_search cost varies by provider; pass the actual cost or 0
  metadata: { resultCount: searchResults.length },
})
await run.end({ status: "success" })
Initialize the SDK. Call once per application.
| Parameter | Type | Required | Description |
|---|---|---|---|
| apiKey | string | Yes | Your API key (ay_live_... or ay_test_...) |
| agentId | string | Yes | Identifies which agent is being tracked |
| baseURL | string | No | API endpoint override for self-hosted |
| timeoutMs | number | No | Per-request timeout in ms (default 3000). Slow requests are aborted; the SDK warns and continues — it never throws or hangs your agent. |
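The timeout behavior described for timeoutMs can be pictured with a small sketch. This is not the SDK's internal code: postWithTimeout and the FetchLike type are hypothetical names, and the transport is passed in so the sketch can run against a fake. It shows the general pattern of bounding a request with an AbortController, warning on failure, and returning instead of throwing.

```typescript
// Minimal fetch-like signature (hypothetical) so the sketch works with a fake transport.
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string; signal: AbortSignal },
) => Promise<{ ok: boolean }>;

// Resolves to true on a successful POST and to false on any failure or timeout.
// It never rejects, so a slow network cannot break the caller's loop.
async function postWithTimeout(
  fetchImpl: FetchLike,
  url: string,
  payload: unknown,
  timeoutMs = 3000,
): Promise<boolean> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(payload),
      signal: controller.signal,
    });
    return res.ok;
  } catch (err) {
    // Warn and continue: transport problems are reported, not propagated.
    console.warn("[sketch] request failed or timed out:", err);
    return false;
  } finally {
    clearTimeout(timer);
  }
}
```

In a runtime with a global fetch, that function can be passed as fetchImpl; the caller only ever sees a boolean.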
Start a new run. Returns a Run instance.
| Parameter | Type | Required | Description |
|---|---|---|---|
| runId | string | No | Custom run ID (auto-generated if omitted) |
| metadata | object | No | Arbitrary metadata attached to the run |
Record an LLM API call within the run.
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model name (e.g. "gpt-4o") |
| inputTokens | number | Yes | Input token count |
| outputTokens | number | Yes | Output token count |
| costUsd | number | Yes | Cost in USD |
| contextWindowUsed | number | No | Proportion of context window used (0.0–1.0) |
| purpose | string | No | Label for this call (e.g. "initial_planning") |
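For a model without built-in pricing, costUsd and contextWindowUsed have to be computed by the caller. A minimal sketch of that arithmetic follows; the ModelPricing shape, the computeCallMetrics helper, and any prices you feed it are assumptions for illustration, not part of the SDK.

```typescript
// Hypothetical pricing record for a model the SDK does not know about.
interface ModelPricing {
  inputUsdPerMillion: number;
  outputUsdPerMillion: number;
  contextWindowTokens: number;
}

interface CallMetrics {
  costUsd: number;
  contextWindowUsed: number;
}

function computeCallMetrics(
  inputTokens: number,
  outputTokens: number,
  pricing: ModelPricing,
): CallMetrics {
  // Cost is token count times the per-million-token price, summed over both directions.
  const costUsd =
    (inputTokens * pricing.inputUsdPerMillion + outputTokens * pricing.outputUsdPerMillion) /
    1_000_000;
  // Fraction of the context window consumed by the prompt, clamped to the 0.0 to 1.0 range.
  const contextWindowUsed = Math.min(1, inputTokens / pricing.contextWindowTokens);
  return { costUsd, contextWindowUsed };
}
```

The two returned fields can be spread into the trackLLMCall argument alongside model, inputTokens, and outputTokens.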
Record a tool invocation. The input is hashed locally — raw data never leaves your app.
| Parameter | Type | Required | Description |
|---|---|---|---|
| tool | string | Yes | Tool name |
| input | object | Yes | Tool input (hashed locally, never sent raw) |
| costUsd | number | Yes | Cost in USD |
| redact | string[] | No | Fields to remove before hashing |
| metadata | object | No | Additional metadata |
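To make the redact-then-hash idea concrete, here is a rough sketch using Node's built-in crypto module. The SDK's actual serialization is not documented here, so the sorted-key JSON form and the helper names (removePath, canonicalJson, hashToolInput) are assumptions; only the use of SHA-256 and dotted redact paths comes from this page.

```typescript
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Return a copy of `value` with the property at the dotted `path` removed.
function removePath(value: Json, path: string[]): Json {
  if (value === null || typeof value !== "object" || Array.isArray(value) || path.length === 0) {
    return value;
  }
  const [head, ...rest] = path;
  const copy: { [key: string]: Json } = { ...value };
  if (!(head in copy)) return copy;
  if (rest.length === 0) {
    delete copy[head];
  } else {
    copy[head] = removePath(copy[head], rest);
  }
  return copy;
}

// Serialize with sorted keys so equal inputs always yield the same string (an assumption).
function canonicalJson(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map(canonicalJson).join(",")}]`;
  const obj: { [key: string]: Json } = value;
  const keys = Object.keys(obj).sort();
  return `{${keys.map((k) => `${JSON.stringify(k)}:${canonicalJson(obj[k])}`).join(",")}}`;
}

// Strip redacted fields first, then hash what remains with SHA-256.
function hashToolInput(input: Json, redact: string[] = []): string {
  const redacted = redact.reduce<Json>((acc, path) => removePath(acc, path.split(".")), input);
  return createHash("sha256").update(canonicalJson(redacted)).digest("hex");
}
```

Because redaction happens before hashing, two inputs that differ only in redacted fields produce the same digest.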
Flush all buffered events to AgentYield as a named time window, then reset the buffer. The run stays open — use this for long-running or always-on agents that need to report metrics periodically. Returns a Promise<void>.
| Parameter | Type | Required | Description |
|---|---|---|---|
| label | string | No | Label for this window (defaults to window_1, window_2, …) |
const run = ay.startRun();
// flush metrics every 5 minutes; the run stays open
// each window gets its own label, since checkpoints are idempotent on (runId, label)
setInterval(async () => {
  await run.checkpoint({ label: `5m-window-${new Date().toISOString()}` });
}, 5 * 60 * 1000);
// later, when the agent shuts down
await run.end({ status: "success" });
Each checkpoint sends the events collected since the last checkpoint (or since the run started). After a successful checkpoint the internal buffer is cleared, so end() only sends events recorded after the most recent checkpoint.
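The windowing behavior can be modeled with a small in-memory buffer. This is an illustrative sketch, not the SDK's implementation: WindowBuffer and its types are hypothetical, and it clears the buffer unconditionally because there is no network call that could fail.

```typescript
interface TrackedEvent {
  kind: "llm" | "tool";
  name: string;
}

interface CheckpointPayload {
  label: string;
  events: TrackedEvent[];
  final: boolean;
}

class WindowBuffer {
  private events: TrackedEvent[] = [];
  private windowCount = 0;

  record(event: TrackedEvent): void {
    this.events.push(event);
  }

  // Hand back everything recorded since the previous flush, then reset the buffer.
  checkpoint(label?: string): CheckpointPayload {
    this.windowCount += 1;
    const payload: CheckpointPayload = {
      label: label ?? `window_${this.windowCount}`,
      events: this.events,
      final: false,
    };
    this.events = [];
    return payload;
  }

  // Final flush: only events recorded after the last checkpoint remain.
  end(): CheckpointPayload {
    const payload: CheckpointPayload = { label: "final", events: this.events, final: true };
    this.events = [];
    return payload;
  }
}
```

Recording two events, checkpointing, recording one more, and calling end() yields a first window with two events and a final payload with one.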
Flush any remaining buffered events and close the run immediately by sending a final checkpoint with final: true. The server marks the run completed on receipt — there is no wait on the server-side auto-close timer. Returns a Promise<void>.
| Parameter | Type | Description |
|---|---|---|
| status | "success" \| "error" \| "timeout" | Final run status |
The SDK is designed to never block or crash your agent on transport problems.
- Every request is bounded by an AbortController with a 3-second default. Override with the timeoutMs config option. Slow networks can never hang your run loop.
- When a request fails or times out, the SDK logs with console.warn and returns. Your agent keeps running.
- If the process exits while a run is still open, a beforeExit / SIGINT / SIGTERM handler fires a final fire-and-forget checkpoint with status: "error" and final: true.
Both checkpoint() and end() POST to the same checkpoint endpoint; end() simply sets final: true.
| SDK call | HTTP request | Body |
|---|---|---|
| run.checkpoint({ label }) | POST /v1/runs/{runId}/checkpoint | { agentId, label, events } |
| run.end({ status }) | POST /v1/runs/{runId}/checkpoint | { agentId, label: "final", events, final: true, status } |
Checkpoints are idempotent on (runId, label) — retrying the same checkpoint never double-counts events.
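One way to picture idempotency on (runId, label) is a store keyed by that pair, where a retried checkpoint is recognized and ignored. The sketch below is a hypothetical in-memory model for intuition only; it is not AgentYield's server code, and CheckpointStore is an invented name.

```typescript
class CheckpointStore {
  // Maps a (runId, label) key to the number of events stored for that window.
  private seen = new Map<string, number>();

  // Returns true when the checkpoint is new, false when it is a retry of one already stored.
  accept(runId: string, label: string, eventCount: number): boolean {
    const key = JSON.stringify([runId, label]);
    if (this.seen.has(key)) return false;
    this.seen.set(key, eventCount);
    return true;
  }

  // Total events across all stored windows; retries never inflate this number.
  totalEvents(): number {
    let total = 0;
    this.seen.forEach((count) => {
      total += count;
    });
    return total;
  }
}
```

Under this model a second, different window sent with a previously used label would also be treated as a retry, which is a good reason to give each window a distinct label.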
Not every agent starts a task and finishes minutes later. Monitoring bots, always-on assistants, and framework-based agents like OpenClaw can run for hours or indefinitely. The checkpoint() method is designed for these workloads.
1. Call ay.startRun() once when the agent starts.
2. Record activity as it happens with trackLLMCall() and trackToolCall().
3. Periodically call run.checkpoint() to flush buffered events. Each checkpoint sends only the events recorded since the last flush.
4. On shutdown, call run.end() to close the run and send any remaining events.
If the process exits unexpectedly, the SDK's built-in exit handler will attempt to flush remaining events automatically.
The simplest pattern — flush every N minutes regardless of activity:
import OpenAI from "openai"
// [AgentYield] Import the SDK
import { AgentYield } from "@agentyield/sdk"
const openai = new OpenAI()
// [AgentYield] Initialize with your API key
const ay = new AgentYield({
  apiKey: process.env.AGENTYIELD_API_KEY,
  agentId: "support-monitor",
})
// [AgentYield] Start a run to track this worker
const run = ay.startRun({ metadata: { env: "production" } })
// [AgentYield] Flush to AgentYield every 5 minutes; the run stays open
const flushInterval = setInterval(async () => {
  await run.checkpoint({ label: `window-${new Date().toISOString()}` })
}, 5 * 60 * 1000)
// Graceful shutdown: flush remaining events and close the run.
// Registered before the loop below, which does not return while the queue is open.
process.on("SIGTERM", async () => {
  clearInterval(flushInterval)
  // [AgentYield] End the run on shutdown
  await run.end({ status: "success" })
  process.exit(0)
})
// Your code: process tasks from a queue indefinitely
for await (const ticket of supportTicketQueue) {
  try {
    const response = await openai.chat.completions.create({
      model: "gpt-4o",
      messages: [{ role: "user", content: ticket.body }],
    })
    run.trackLLMCall({
      model: response.model,
      inputTokens: response.usage.prompt_tokens,
      outputTokens: response.usage.completion_tokens,
      costUsd: ay.computeCost(response.model, response.usage),
      contextWindowUsed: response.usage.prompt_tokens / ay.contextWindowSize(response.model),
      purpose: "ticket_response",
    })
    await postReply(ticket.id, response.choices[0].message.content)
  } catch (err) {
    // Log the error but keep the agent running
    console.error(`[agent] Failed to process ticket ${ticket.id}:`, err)
  }
}
Flush after completing a unit of work, which is useful when activity is bursty:
import OpenAI from "openai"
import { AgentYield } from "@agentyield/sdk"
const openai = new OpenAI()
const ay = new AgentYield({
  apiKey: process.env.AGENTYIELD_API_KEY,
  agentId: "task-queue-agent",
})
const run = ay.startRun()
async function processTask(task: Task) {
  try {
    // Your code: LLM call
    const response = await openai.chat.completions.create({
      model: "gpt-4o",
      messages: [{ role: "user", content: task.prompt }],
    })
    // [AgentYield] Track the LLM call; the SDK handles cost and context math
    run.trackLLMCall({
      model: response.model,
      inputTokens: response.usage.prompt_tokens,
      outputTokens: response.usage.completion_tokens,
      costUsd: ay.computeCost(response.model, response.usage),
      contextWindowUsed: response.usage.prompt_tokens / ay.contextWindowSize(response.model),
      purpose: "task_processing",
    })
    // Your code: write result to database
    await db.write({ id: task.id, result: response.choices[0].message.content })
    // [AgentYield] Track the tool call; internal db writes have no cost
    run.trackToolCall({
      tool: "db_write",
      input: { id: task.id },
      costUsd: 0,
    })
    // [AgentYield] Flush after each task completes
    await run.checkpoint({ label: `task-${task.id}` })
  } catch (err) {
    console.error(`[agent] Failed to process task ${task.id}:`, err)
  }
}
// Graceful shutdown: registered before the loop below, which does not return while the queue is open
process.on("SIGTERM", async () => {
  await run.end({ status: "success" })
  process.exit(0)
})
// Your code: process tasks from a queue indefinitely
for await (const task of taskQueue) {
  await processTask(task)
}
Labels help you identify windows in your dashboard. If you don't provide one, the SDK auto-generates window_1, window_2, etc. Use descriptive labels when the context matters:
await run.checkpoint({ label: "morning-batch" });
await run.checkpoint({ label: "afternoon-batch" });
await run.checkpoint(); // → "window_3"
- Labels such as task-1234 or hourly-2pm make it easier to correlate waste patterns with specific workloads.
- The built-in exit handler is a best-effort flush; calling end() explicitly is more reliable.
AgentYield is available as an OpenClaw skill. No SDK code required: install the skill and waste detection runs automatically inside every OpenClaw session.
In any OpenClaw session, run:
openclaw skills install agentyield
Add your AgentYield API key to your OpenClaw environment config (.env or openclaw.json):
AGENTYIELD_API_KEY=ay_live_xxxxxxxxxxxxxxxxxxxx
[AgentYield] Checkpoint sent — Waste Score: 34. View: https://agentyield.co/runs/...
The skill accepts optional config fields in the OpenClaw skill frontmatter:
| Field | Default | Description |
|---|---|---|
| checkpointEvery | 50 | Flush a checkpoint every N events |
| checkpointInterval | 30 | Flush a checkpoint every N minutes |
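Read together, the two fields suggest a flush trigger that fires on whichever threshold is reached first. That "either" reading is an assumption, as is everything in the sketch below apart from the field names and defaults; shouldCheckpoint is a hypothetical helper, not part of the skill.

```typescript
interface CheckpointConfig {
  checkpointEvery: number; // events between checkpoints
  checkpointInterval: number; // minutes between checkpoints
}

// Defaults taken from the table above.
const DEFAULT_CHECKPOINT_CONFIG: CheckpointConfig = { checkpointEvery: 50, checkpointInterval: 30 };

// True when either the event-count or the elapsed-time threshold has been reached.
function shouldCheckpoint(
  eventsSinceFlush: number,
  msSinceFlush: number,
  config: Partial<CheckpointConfig> = {},
): boolean {
  const { checkpointEvery, checkpointInterval } = { ...DEFAULT_CHECKPOINT_CONFIG, ...config };
  return eventsSinceFlush >= checkpointEvery || msSinceFlush >= checkpointInterval * 60_000;
}
```

Lowering checkpointEvery makes busy sessions report sooner, while checkpointInterval bounds how long a quiet session can go unreported.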
The OpenClaw skill is ideal when you're already using OpenClaw and want zero-code setup. For custom agents, direct integrations, or fine-grained control over event tracking, use the TypeScript SDK instead.
Use an ay_test_ prefixed key to validate your integration without sending data. Useful for CI/CD.
const ay = new AgentYield({
  apiKey: "ay_test_xxxxxxxxxxxxxxxxxxxx",
  agentId: "my-agent",
})
// Logs: [AgentYield] Test mode — run data not sent
Tool inputs are hashed locally using SHA-256. Only the hash is transmitted; raw inputs never leave your application. Use redact to exclude sensitive fields:
run.trackToolCall({
  tool: "db_query",
  input: {
    table: "users",
    filter: { email: "user@example.com", ssn: "123-45-6789" }
  },
  redact: ["filter.email", "filter.ssn"], // PII removed before hashing
  costUsd: 0, // internal query, so no direct API cost
})
// → Hash is computed on { table: "users", filter: {} }
// → Raw values never leave your application
Already logging your LLM traffic in another observability tool? Skip the SDK install. AgentYield Connectors pull your existing request history into AgentYield and run our waste detection on top: duplicate calls, oversized context, model mismatches, excessive retries, and redundant reads.
| Platform | Status | Notes |
|---|---|---|
| Helicone | Available | US and EU regions. Bulk-pulls from the Helicone Clickhouse query endpoint. Buckets runs by Helicone-Session-Id → Helicone-Job-Id → user ID + 5-minute window → daily fallback. Cost is auto-estimated for any model where Helicone returns $0. |
| LangSmith | Coming soon | Sync runs and traces from LangSmith projects. |
| Langfuse | Coming soon | Sync traces and observations from Langfuse. |
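The Helicone bucketing order in the table is a fallback chain, which can be sketched as a function that picks the first identifier available. The RequestRecord shape, the key formats, and the alignment of 5-minute windows to the epoch are all assumptions made for illustration; the connector's real logic may differ.

```typescript
interface RequestRecord {
  sessionId?: string; // value of Helicone-Session-Id, if present
  jobId?: string; // value of Helicone-Job-Id, if present
  userId?: string;
  timestampMs: number; // request time in epoch milliseconds
}

const FIVE_MINUTES_MS = 5 * 60 * 1000;

// Pick the most specific grouping key available, falling back step by step.
function bucketKey(record: RequestRecord): string {
  if (record.sessionId) return `session:${record.sessionId}`;
  if (record.jobId) return `job:${record.jobId}`;
  if (record.userId) {
    const windowIndex = Math.floor(record.timestampMs / FIVE_MINUTES_MS);
    return `user:${record.userId}:${windowIndex}`;
  }
  // Daily fallback: group by UTC calendar date.
  return `day:${new Date(record.timestampMs).toISOString().slice(0, 10)}`;
}
```

Requests that share a key would end up in the same run, so better session or job tagging upstream gives cleaner run boundaries.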
Need a connector for a different platform? Let us know and we'll prioritize it.
You can delete any individual run — and all of its events, waste findings, and recommendations — at any time. Two options:
Authenticate with your API key (the same one used by the SDK / OpenClaw skill) and send a DELETE request:
curl -X DELETE \
  -H "Authorization: Bearer ay_live_xxxxxxxxxxxxxxxx" \
  https://agentyield.co/api/v1/runs/{runId}
{runId} may be either the AgentYield-internal run UUID (visible in your dashboard URL) or the external runId string your SDK or OpenClaw skill assigned at checkpoint time.
Response (success):
{ "deleted": true, "runId": "01HZ..." }
Status codes:
- 200: Run and all related data deleted.
- 401: Missing, invalid, or revoked API key.
- 404: Run not found, or not owned by the API key's account.
Test-mode keys (ay_test_) return { deleted: true, testMode: true } as a no-op, since test-mode events are never persisted in the first place.
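For callers scripting deletions, the request and the status codes above might be wrapped as follows. This is a sketch: buildDeleteRequest and interpretDeleteResponse are hypothetical helpers, the URL-encoding of runId is a defensive choice rather than a documented requirement, and it assumes the test-mode no-op is returned with status 200.

```typescript
interface DeleteRequest {
  method: "DELETE";
  url: string;
  headers: Record<string, string>;
}

// Build the DELETE request described above without sending it.
function buildDeleteRequest(apiKey: string, runId: string): DeleteRequest {
  return {
    method: "DELETE",
    url: `https://agentyield.co/api/v1/runs/${encodeURIComponent(runId)}`,
    headers: { Authorization: `Bearer ${apiKey}` },
  };
}

type DeleteOutcome = "deleted" | "test_mode_noop" | "unauthorized" | "not_found" | "unexpected";

// Map the documented status codes and body flags to a single outcome.
function interpretDeleteResponse(
  status: number,
  body: { deleted?: boolean; testMode?: boolean } = {},
): DeleteOutcome {
  if (status === 200 && body.deleted) return body.testMode ? "test_mode_noop" : "deleted";
  if (status === 401) return "unauthorized";
  if (status === 404) return "not_found";
  return "unexpected";
}
```

The request object can be handed to any HTTP client; keeping the interpretation separate makes the status handling easy to unit test.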
Open the run from Runs, then use the delete action on the run detail page. (Coming soon — for now, use the API.)
Deleting a run removes:
- run_events belonging to the run
- run_checkpoints for the run
- waste_findings generated for the run
- recommendations tied to the run
See our Privacy Policy for full retention details, and the OpenClaw skill manifest for the exact telemetry contract.
Create a free account and generate an API key to start sending run data.
Package: @agentyield/sdk · npm