Protocol Deep-Dive · May 16, 2026 · 11 min read

Spend Tokens & Gateway Enforcement: A Reference Model for LLM Tool Calls

By Nomiqon Engineering

Every billable tool invocation in an LLM agent loop is a potential unbounded spend event. We define the spend-token lifecycle (mint, attach, validate, settle, void) and show how Nomiqon's gateway intercepts outbound HTTP before funds move, with reference implementations for the TypeScript and Python SDKs.

Large language model agents invoke tools — search, code execution, payment APIs — through HTTP calls initiated by the model's function-calling layer. Each invocation is potentially billable. Without an enforcement boundary between the model's intent and the network, a hallucinated tool name or injected prompt can trigger unlimited paid requests. Nomiqon's spend-token model inserts a cryptographic authorization step between tool selection and packet transmission.

A spend token is a short-lived JWT that authorises exactly one spend event, binding an amount, a recipient hostname, and an agent identity. The gateway validates it with a median latency under 8 ms.
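
To make the scoping concrete, here is a minimal sketch of the claim checks a gateway might run once the token's signature has been verified. The field names, the `check_claims` helper, and every reason string except `spend_token_expired` are assumptions for illustration, not Nomiqon's actual schema. Signature verification is deliberately left out.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class SpendClaims:
    """Hypothetical decoded token payload (field names are assumptions)."""
    agent_id: str
    recipient_host: str
    amount_cents: int
    expires_at: float  # Unix timestamp in seconds


def check_claims(claims: SpendClaims, agent_id: str, url: str,
                 estimated_cents: int, now: float) -> str:
    """Return "ok", or the reason the request falls outside the token's scope."""
    if now >= claims.expires_at:
        return "spend_token_expired"
    if claims.agent_id != agent_id:
        return "agent_mismatch"          # hypothetical reason code
    if urlsplit(url).hostname != claims.recipient_host:
        return "recipient_mismatch"      # hypothetical reason code
    if estimated_cents > claims.amount_cents:
        return "amount_exceeded"         # hypothetical reason code
    return "ok"
```

Because the token names a single hostname and amount, a request to any other host or for a larger sum fails the check even when the token itself is still valid.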

The Spend Token Lifecycle
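
The introduction names five lifecycle stages: mint, attach, validate, settle, and void. One way to read them is as a small state machine in which a token moves forward one stage at a time and can be voided from any stage before settlement. The transition table below is an assumption drawn from that reading, not a documented Nomiqon contract.

```python
from enum import Enum


class TokenState(Enum):
    MINTED = "minted"
    ATTACHED = "attached"
    VALIDATED = "validated"
    SETTLED = "settled"
    VOIDED = "voided"


# Assumed transitions: the forward path plus a void exit from each live state.
ALLOWED = {
    TokenState.MINTED: {TokenState.ATTACHED, TokenState.VOIDED},
    TokenState.ATTACHED: {TokenState.VALIDATED, TokenState.VOIDED},
    TokenState.VALIDATED: {TokenState.SETTLED, TokenState.VOIDED},
    TokenState.SETTLED: set(),   # terminal
    TokenState.VOIDED: set(),    # terminal
}


def advance(current: TokenState, target: TokenState) -> TokenState:
    """Return the new state, or raise if the transition is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Making settled and voided terminal means a token can never be reused for a second spend event, which matches the "exactly one spend event" framing.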

TypeScript Reference Implementation

typescript
const headers = await nomiqon.agents.getSpendHeaders(agent.id);
// Returns:
// {
//   "x-nomiqon-agent-id":    "ag_01jx...",
//   "x-nomiqon-spend-token": "spt_eyJhbGciOiJFUzI1NiIs...",
// }

const response = await fetch("https://api.openai.com/v1/chat/completions", {
  method: "POST",
  headers: {
    ...headers,
    Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gpt-4o",
    messages: [{ role: "user", content: userPrompt }],
  }),
});

// Gateway flow:
// 1. Intercept spend token before egress
// 2. Evaluate policy against api.openai.com + estimated cost
// 3. Debit committed balance or return 402/403
// 4. Forward request only if approved
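
The four gateway steps in the comments above can be sketched as a single decision function. This is a simplified model under stated assumptions: `AgentPolicy`, its fields, and `authorize` are hypothetical names, and the cap is modelled as a daily total in cents. The status codes and error strings are the ones this article uses.

```python
from dataclasses import dataclass
from typing import Set, Tuple


@dataclass
class AgentPolicy:
    """Hypothetical per-agent policy: allowed hosts and a daily spend cap."""
    allowed_hosts: Set[str]
    daily_cap_cents: int
    spent_today_cents: int = 0


def authorize(policy: AgentPolicy, host: str, estimated_cents: int) -> Tuple[int, str]:
    """Decide whether to forward a request; debit the balance only on approval."""
    if host not in policy.allowed_hosts:
        return 403, "policy_domain_blocked"
    if policy.spent_today_cents + estimated_cents > policy.daily_cap_cents:
        return 402, "policy_cap_exceeded"
    policy.spent_today_cents += estimated_cents  # commit the debit before forwarding
    return 200, "approved"
```

A rejected request leaves the balance untouched, so a blocked domain or exceeded cap never consumes budget.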

Python / httpx Tool Wrapper

python
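import os

import httpx
import nomiqon

client = nomiqon.Client(api_key=os.environ["NOMIQON_API_KEY"])

class NomiqonToolClient:
    """Thin httpx wrapper that attaches spend headers to every tool request."""

    def __init__(self, agent_id: str):
        self.agent_id = agent_id

    def post(self, url: str, **kwargs):
        # Fetch spend headers for this agent on each call.
        headers = client.agents.get_spend_headers(self.agent_id)
        # Caller-supplied headers are merged last, so they win on a key conflict.
        merged = {**headers, **kwargs.pop("headers", {})}
        return httpx.post(url, headers=merged, **kwargs)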

tool_http = NomiqonToolClient(agent_id="ag_01jx...")
tool_http.post(
    "https://api.anthropic.com/v1/messages",
    json={"model": "claude-sonnet-4-20250514", "messages": [...]},
)

Error Semantics Agents Must Handle

Production agent loops must branch on Nomiqon error codes explicitly. Treat 402 policy_cap_exceeded as a hard stop: retrying without a policy change will fail identically. Treat 403 policy_domain_blocked as a potential prompt-injection signal worth escalating. An expired spend token (401 spend_token_expired) warrants a fresh token mint, not a blind retry with the stale header.

typescript
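try {
  await agent.fetch(url, options);
} catch (err) {
  if (err.code === "policy_cap_exceeded") {
    // 402: hard stop. Retrying without a policy change fails identically.
    await notifyOperator(agent.id, "Daily cap reached");
    await nomiqon.agents.pause(agent.id);
  } else if (err.code === "policy_domain_blocked") {
    // 403: possible prompt injection, so escalate.
    await securityAlert(agent.id, url, "Unexpected domain");
  } else if (err.code === "spend_token_expired") {
    // 401: re-mint and retry once, keeping the caller's other headers.
    const headers = await nomiqon.agents.getSpendHeaders(agent.id);
    await agent.fetch(url, {
      ...options,
      headers: { ...options.headers, ...headers },
    });
  } else {
    // Unknown errors should not be swallowed.
    throw err;
  }
}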
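
The same branching can be sketched in Python for agents built on the httpx wrapper. `NomiqonError` and `call_with_policy` are hypothetical names introduced here, since the article does not show which exception type the Python SDK raises. The error codes are the ones listed above.

```python
class NomiqonError(Exception):
    """Hypothetical error type carrying the gateway's error code."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


def call_with_policy(send, mint_headers, on_cap, on_blocked):
    """Call send(headers) once; re-mint and retry a single time on expiry."""
    try:
        return send(mint_headers())
    except NomiqonError as err:
        if err.code == "policy_cap_exceeded":
            on_cap()        # hard stop: retrying cannot succeed
            return None
        if err.code == "policy_domain_blocked":
            on_blocked()    # possible prompt injection: escalate
            return None
        if err.code == "spend_token_expired":
            # Fresh token, one retry only; a second failure propagates.
            return send(mint_headers())
        raise
```

Keeping the retry outside any loop bounds the cost of a misbehaving clock or gateway to one extra request.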

Integrating with LangChain and Vercel AI SDK

Framework hooks (LangChain's BaseCallbackHandler, the Vercel AI SDK's experimental_toolCallStreaming) provide interception points before a tool's HTTP request executes. Wrap the default fetch implementation to inject spend headers universally. Centralising this in one module prevents individual tools from bypassing enforcement during rapid prototyping.
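
A framework-agnostic sketch of that centralised wrapper follows. `with_spend_headers` is a hypothetical helper, not part of any SDK. It takes any send function with a `(url, headers=..., **kwargs)` shape and a callable that returns spend headers, and it lets the spend headers win on a key conflict so a tool cannot overwrite them.

```python
from typing import Any, Callable, Dict, Optional


def with_spend_headers(send: Callable[..., Any],
                       get_headers: Callable[[], Dict[str, str]]) -> Callable[..., Any]:
    """Wrap an HTTP send function so every call carries spend headers."""

    def wrapped(url: str, headers: Optional[Dict[str, str]] = None, **kwargs: Any) -> Any:
        # Spend headers are merged last, so they override caller-supplied keys.
        merged = {**(headers or {}), **get_headers()}
        return send(url, headers=merged, **kwargs)

    return wrapped
```

With the Python client shown earlier, this could be used as `post = with_spend_headers(httpx.post, lambda: client.agents.get_spend_headers(agent_id))`, with tools then importing `post` from one shared module.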

The gateway model means your LLM provider keys remain in a secrets manager while spend authority flows through per-agent tokens. Compromise of a tool implementation leaks capability scoped to that agent's policy — not your entire organisation's API budget.

Spend tokens turn every LLM tool call into an auditable, policy-gated financial event — the minimum viable control plane for autonomous software that spends real money.