Spend Tokens & Gateway Enforcement: A Reference Model for LLM Tool Calls — Nomiqon Blog

Large language model agents invoke tools — search, code execution, payment APIs — through HTTP calls initiated by the model's function-calling layer. Each invocation is potentially billable. Without an enforcement boundary between the model's intent and the network, a hallucinated tool name or injected prompt can trigger unlimited paid requests. Nomiqon's spend-token model inserts a cryptographic authorization step between tool selection and packet transmission.

A spend token is a short-lived JWT authorising exactly one spend event (amount, recipient hostname, and agent identity). The gateway validates it with a median latency under 8 ms.
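To make the "exactly one spend event" scope concrete, here is a minimal sketch of the check a gateway might run once the token's signature has been verified. The claim names (`agentId`, `recipientHost`, `amountMicros`, `expiresAtMs`) and the reason strings are assumptions for illustration; the article does not publish the token schema.

```typescript
// Hypothetical claim layout; the real spend-token schema is not specified here.
interface SpendClaims {
  agentId: string;       // agent identity the token was minted for
  recipientHost: string; // hostname the spend is authorised for
  amountMicros: number;  // authorised amount, in integer micro-units
  expiresAtMs: number;   // expiry as a Unix timestamp in milliseconds
}

interface SpendRequest {
  agentId: string;
  host: string;
  costMicros: number;
}

interface ClaimCheck {
  ok: boolean;
  reason?: string; // set only when ok is false
}

// Checks that already-verified claims cover this specific request.
// Signature verification is assumed to have happened before this call.
function checkClaims(claims: SpendClaims, req: SpendRequest, nowMs: number): ClaimCheck {
  if (nowMs >= claims.expiresAtMs) return { ok: false, reason: "expired" };
  if (claims.agentId !== req.agentId) return { ok: false, reason: "agent_mismatch" };
  // Hostnames are case-insensitive, so compare them in lower case.
  if (claims.recipientHost.toLowerCase() !== req.host.toLowerCase()) {
    return { ok: false, reason: "recipient_mismatch" };
  }
  if (req.costMicros > claims.amountMicros) return { ok: false, reason: "amount_exceeded" };
  return { ok: true };
}
```

Using integer micro-units rather than floating-point amounts is a design choice in this sketch; it keeps the comparison exact.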

The Spend Token Lifecycle

Mint — agent SDK requests token for (amount, recipient) pair.
Attach — token injected into outbound HTTP headers.
Validate — gateway verifies signature, policy, and balance.
Settle — receipt confirmed; batch transfer queued on Solana.
Void — unconfirmed receipts expire after 60 s; committed balance released.
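The five steps above can be read as a small state machine. The sketch below is one way to model it, not Nomiqon's implementation: the state names, the `TrackedToken` shape, and the choice to start the 60 s void clock at validation time are all assumptions.

```typescript
type TokenState = "minted" | "attached" | "validated" | "settled" | "voided";

// Allowed transitions, following the lifecycle steps listed above.
const NEXT_STATES: Record<TokenState, TokenState[]> = {
  minted: ["attached"],
  attached: ["validated"],
  validated: ["settled", "voided"],
  settled: [],
  voided: [],
};

// Unconfirmed receipts expire after 60 s (figure taken from the lifecycle list).
const VOID_AFTER_MS = 60 * 1000;

interface TrackedToken {
  state: TokenState;
  amount: number;         // amount committed against the agent's balance
  validatedAtMs?: number; // assumed start of the void clock
}

// Moves a token to the next state, rejecting transitions the lifecycle does not allow.
function advance(token: TrackedToken, next: TokenState, nowMs: number): TrackedToken {
  if (NEXT_STATES[token.state].indexOf(next) === -1) {
    throw new Error("illegal transition " + token.state + " -> " + next);
  }
  const validatedAtMs = next === "validated" ? nowMs : token.validatedAtMs;
  return { state: next, amount: token.amount, validatedAtMs };
}

// Voids a validated token whose receipt was never confirmed in time and
// reports how much committed balance should be released (0 if nothing changed).
function voidIfStale(token: TrackedToken, nowMs: number): { token: TrackedToken; released: number } {
  const stale =
    token.state === "validated" &&
    token.validatedAtMs !== undefined &&
    nowMs - token.validatedAtMs >= VOID_AFTER_MS;
  if (!stale) return { token, released: 0 };
  return { token: advance(token, "voided", nowMs), released: token.amount };
}
```

A periodic sweep that calls `voidIfStale` on every validated token would be one plausible way to release committed balance; settled and voided tokens are terminal and pass through unchanged.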

TypeScript Reference Implementation

typescript

const headers = await nomiqon.agents.getSpendHeaders(agent.id);
// Returns:
// {
//   "x-nomiqon-agent-id":    "ag_01jx...",
//   "x-nomiqon-spend-token": "spt_eyJhbGciOiJFUzI1NiIs...",
// }

const response = await fetch("https://api.openai.com/v1/chat/completions", {
  method: "POST",
  headers: {
    ...headers,
    Authorization: `Bearer ${process.env.OPENAI_API_KEY}`,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({
    model: "gpt-4o",
    messages: [{ role: "user", content: userPrompt }],
  }),
});

// Gateway flow:
// 1. Intercept spend token before egress
// 2. Evaluate policy against api.openai.com + estimated cost
// 3. Debit committed balance or return 402/403
// 4. Forward request only if approved
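
The gateway flow in the comments above can be sketched as a pure decision function. This is an illustration under stated assumptions: the `AgentPolicy` shape, the allow-list model, the `approved` code, and the use of a daily cap as the only budget rule are hypothetical. Only the 402 and 403 statuses and their error codes come from this article.

```typescript
// Hypothetical policy shape: an allow-list of hostnames plus a daily cap.
interface AgentPolicy {
  allowedHosts: string[]; // lower-case hostnames the agent may pay
  dailyCap: number;       // maximum spend per day, in integer units
}

interface GatewayDecision {
  status: number;     // 200 = forward the request, 402/403 = reject
  code: string;
  spentToday: number; // running total after the decision
}

// Evaluates one spend against the policy and returns the updated running total.
// The request is forwarded only when status is 200.
function evaluateSpend(
  policy: AgentPolicy,
  spentToday: number,
  host: string,
  estimatedCost: number
): GatewayDecision {
  if (policy.allowedHosts.indexOf(host.toLowerCase()) === -1) {
    return { status: 403, code: "policy_domain_blocked", spentToday };
  }
  if (spentToday + estimatedCost > policy.dailyCap) {
    return { status: 402, code: "policy_cap_exceeded", spentToday };
  }
  // Approved: debit the estimated cost before the request leaves the gateway.
  return { status: 200, code: "approved", spentToday: spentToday + estimatedCost };
}
```

Checking the domain before the cap means a blocked host never reveals how much budget remains, which is a design choice of this sketch rather than documented gateway behaviour.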

Python / httpx Tool Wrapper

python

import os

import httpx
import nomiqon

client = nomiqon.Client(api_key=os.environ["NOMIQON_API_KEY"])

class NomiqonToolClient:
    def __init__(self, agent_id: str):
        self.agent_id = agent_id

    def post(self, url: str, **kwargs):
        headers = client.agents.get_spend_headers(self.agent_id)
        merged = {**headers, **kwargs.pop("headers", {})}
        return httpx.post(url, headers=merged, **kwargs)

tool_http = NomiqonToolClient(agent_id="ag_01jx...")
tool_http.post(
    "https://api.anthropic.com/v1/messages",
    json={"model": "claude-sonnet-4-20250514", "messages": [...]},
)

Error Semantics Agents Must Handle

Production agent loops must branch on Nomiqon error codes explicitly. Treat 402 policy_cap_exceeded as a hard stop — retrying without policy change will fail identically. Treat 403 policy_domain_blocked as a potential prompt-injection signal worth escalating. Spend token expiry (401 spend_token_expired) warrants a fresh token mint, not a blind retry of the stale header.

typescript

try {
  await agent.fetch(url, options);
} catch (err) {
  if (err.code === "policy_cap_exceeded") {
    await notifyOperator(agent.id, "Daily cap reached");
    await nomiqon.agents.pause(agent.id);
  } else if (err.code === "policy_domain_blocked") {
    await securityAlert(agent.id, url, "Unexpected domain");
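  } else if (err.code === "spend_token_expired") {
    // Re-mint and retry once, keeping the caller's other headers intact
    const headers = await nomiqon.agents.getSpendHeaders(agent.id);
    await agent.fetch(url, {
      ...options,
      headers: { ...options.headers, ...headers },
    });
  } else {
    // Unrecognised errors must not be swallowed silently
    throw err;
  }
}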

Integrating with LangChain and Vercel AI SDK

Framework hooks such as LangChain's BaseCallbackHandler and the Vercel AI SDK's experimental_toolCallStreaming option surface tool calls before the tool's HTTP request executes. The most dependable enforcement point, though, is the HTTP client itself: wrap the default fetch implementation so that every outbound request carries spend headers. Centralising this in one module prevents individual tools from bypassing enforcement during rapid prototyping.
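
A minimal sketch of such a wrapper follows. It is framework-agnostic and deliberately avoids depending on any particular SDK: `FetchLike`, `RequestOptions`, and `withSpendHeaders` are hypothetical names, and the `getSpendHeaders` callback stands in for whatever mints a token in your setup (for example, a call to `nomiqon.agents.getSpendHeaders` as shown earlier).

```typescript
type HeaderMap = Record<string, string>;

interface RequestOptions {
  method?: string;
  headers?: HeaderMap;
  body?: string;
}

// Minimal fetch-shaped function; R is whatever the underlying client returns.
type FetchLike<R> = (url: string, options?: RequestOptions) => Promise<R>;

// Header names are case-insensitive, so normalise them before merging.
function lowerKeys(headers: HeaderMap): HeaderMap {
  const out: HeaderMap = {};
  for (const key of Object.keys(headers)) out[key.toLowerCase()] = headers[key];
  return out;
}

// Wraps a fetch-like function so every call carries freshly minted spend headers.
function withSpendHeaders<R>(
  baseFetch: FetchLike<R>,
  getSpendHeaders: () => Promise<HeaderMap>
): FetchLike<R> {
  return async (url, options = {}) => {
    const spend = await getSpendHeaders(); // one mint per outbound request
    // Spend headers are merged last so tool code cannot override them.
    const headers = { ...lowerKeys(options.headers ?? {}), ...lowerKeys(spend) };
    return baseFetch(url, { ...options, headers });
  };
}
```

Tools then receive only the wrapped function, so a tool that forgets the headers, or tries to supply its own spend token, still sends the minted one.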

The gateway model means your LLM provider keys remain in a secrets manager while spend authority flows through per-agent tokens. Compromise of a tool implementation leaks capability scoped to that agent's policy — not your entire organisation's API budget.

Spend tokens turn every LLM tool call into an auditable, policy-gated financial event — the minimum viable control plane for autonomous software that spends real money.