How to Measure AI Cost per Feature (Not Just per Token)
Most organisations begin tracking LLM API costs at the token level. This is a natural starting point, as tokens are the unit of consumption exposed by all major AI providers and form the basis of billing models across the industry.
However, while token-level visibility provides a clear view of usage, it does not provide a meaningful view of cost in a business context. As AI moves from experimentation into production systems, organisations need to understand not only how much they are using, but what that usage represents in terms of product delivery, customer value, and financial performance.
This is where token-level tracking begins to fall short.
The Limitations of Token-Level Metrics
Token usage is an infrastructure metric. It describes how AI systems are being consumed at a technical level, but it does not reflect how that usage maps to the structure of a business.
For example, token-level reporting does not answer questions such as:
- What does it cost to operate a specific feature?
- Which features are responsible for the majority of AI spend?
- How does AI usage impact product margins?
- Where should optimisation efforts be focused?
As long as AI usage is viewed primarily through tokens, cost remains disconnected from the way products and services are delivered.
AI Cost Is Driven by Product Behaviour
In SaaS and ISV environments, AI usage is rarely uniform. It is driven by specific product features and workflows, each with its own usage patterns and cost profile.
Examples might include:
- A summarisation capability embedded within a product
- A conversational interface or chatbot
- A document processing pipeline
- A recommendation or classification system
Each of these features generates LLM API requests, and each request incurs cost. However, from a commercial perspective, the relevant unit is not the token, but the feature itself.
Understanding cost per feature allows organisations to connect AI usage directly to revenue, pricing, and margins.
Complexity in Multi-Provider AI Architectures
As AI adoption matures, most organisations move beyond a single provider. They begin to operate in multi-provider AI environments, combining different models and vendors to optimise for performance, cost, and capability.
This introduces additional complexity:
- Different providers have different pricing models
- Token usage varies significantly across models
- Features may rely on multiple providers simultaneously
In this context, provider-level cost analysis becomes increasingly limited, because it does not reflect how AI is actually consumed within the product.
To manage AI effectively, organisations need to analyse cost across business dimensions, not vendor boundaries.
Moving to Feature-Level Cost Attribution
Measuring AI cost per feature requires capturing context at the point of request. Each LLM API call must be associated with the business activity that generated it.
This typically involves attaching structured metadata such as:
- Feature
- Product
- Environment (development versus production)
- Customer or tenant (where applicable)
- Internal versus customer-facing usage
This approach enables AI cost attribution that aligns with how the organisation operates, rather than how providers report usage.
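To make this concrete, the sketch below shows one way to attach that metadata in application code. It is a minimal illustration rather than a production design: `RequestContext` and `record_usage` are hypothetical names, and the JSONL file stands in for whatever metrics store or logging pipeline the organisation actually uses.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json

@dataclass
class RequestContext:
    """Business metadata attached to every LLM API call."""
    feature: str                  # e.g. "document-summarisation"
    product: str                  # e.g. "core-app"
    environment: str              # "development" or "production"
    tenant: str | None = None     # customer or tenant, where applicable
    audience: str = "customer"    # "customer" or "internal"

def record_usage(ctx: RequestContext, prompt_tokens: int,
                 completion_tokens: int, cost_usd: float) -> None:
    """Append one usage record. A local JSONL file keeps the example
    self-contained; in practice this would feed a metrics pipeline."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        **asdict(ctx),
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "cost_usd": cost_usd,
    }
    with open("llm_usage.jsonl", "a") as f:
        f.write(json.dumps(record) + "\n")
```

The important design choice is that the context is mandatory at the call site: every request carries its business dimensions from the moment it is made, rather than having them reverse-engineered from provider invoices later.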
Measuring Cost in a Business Context
Once usage is structured in this way, organisations can begin to analyse AI cost in more meaningful terms.
For each feature, it becomes possible to measure:
- Total cost of operation
- Number of requests
- Average cost per request
- Average token consumption per request
These metrics provide a clearer view of how features behave and how they scale.
They also enable more informed decisions about where to invest, optimise, or redesign.
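Assuming usage records shaped like those in the earlier sketch, these four metrics reduce to a simple aggregation. The function below groups records by feature; the field names match the hypothetical `record_usage` example above.

```python
import json
from collections import defaultdict

def feature_metrics(path: str = "llm_usage.jsonl") -> dict:
    """Aggregate tagged usage records into per-feature metrics:
    total cost, request count, average cost and tokens per request."""
    totals = defaultdict(lambda: {"cost": 0.0, "requests": 0, "tokens": 0})
    with open(path) as f:
        for line in f:
            r = json.loads(line)
            agg = totals[r["feature"]]
            agg["cost"] += r["cost_usd"]
            agg["requests"] += 1
            agg["tokens"] += r["prompt_tokens"] + r["completion_tokens"]
    return {
        feature: {
            "total_cost_usd": round(agg["cost"], 4),
            "requests": agg["requests"],
            "avg_cost_per_request": agg["cost"] / agg["requests"],
            "avg_tokens_per_request": agg["tokens"] / agg["requests"],
        }
        for feature, agg in totals.items()
    }
```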
Identifying Inefficiencies in AI Usage
One of the most valuable capabilities this level of visibility provides is the ability to compare usage patterns across features.
For example, differences in average tokens per request may indicate:
- Inefficient prompt design
- Excessive response lengths
- Suboptimal model selection
- Redundant or repeated processing steps
Without feature-level visibility, these inefficiencies remain hidden within aggregate LLM API costs.
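One simple heuristic is to compare each feature's average tokens per request against the median across all features and flag large outliers. The threshold below is arbitrary and would need tuning; the function consumes the output of the `feature_metrics` sketch above.

```python
from statistics import median

def flag_token_outliers(metrics: dict, factor: float = 2.0) -> list[str]:
    """Return features whose average tokens per request exceed the
    median across features by `factor`. A crude first signal of
    oversized prompts or responses, not a diagnosis."""
    baseline = median(m["avg_tokens_per_request"] for m in metrics.values())
    return [
        feature for feature, m in metrics.items()
        if m["avg_tokens_per_request"] > factor * baseline
    ]
```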
Implications for SaaS Unit Economics
AI introduces variable cost into software delivery, which has direct implications for SaaS unit economics.
If organisations cannot measure cost at the feature level, they may struggle to:
- Align pricing with cost
- Maintain consistent margins
- Forecast usage and spend accurately
- Scale AI-enabled features sustainably
A feature that appears successful from a usage perspective may, in fact, be unprofitable when cost is fully understood.
Feature-level cost attribution provides the clarity needed to detect and manage this risk.
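A worked example makes the margin risk tangible. The figures below are purely illustrative: a feature bundled into a $30 per month plan, with a heavy user triggering 40 AI requests a day at an average of $0.01 each.

```python
# Illustrative figures only; substitute real per-feature metrics.
price_per_user_month = 30.00   # subscription revenue per user
requests_per_day = 40          # heavy-user request volume
cost_per_request = 0.01        # average cost from feature metrics

ai_cost_per_user_month = requests_per_day * 30 * cost_per_request  # $12.00
gross_margin = (price_per_user_month - ai_cost_per_user_month) / price_per_user_month

print(f"AI consumes {ai_cost_per_user_month / price_per_user_month:.0%} of revenue; "
      f"gross margin on this user is {gross_margin:.0%}")
```

In this scenario, AI cost alone absorbs 40% of the user's revenue before any other cost of service, which is exactly the kind of exposure that per-feature attribution is meant to surface.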
Separating Development and Production Usage
Accurate measurement also requires distinguishing between development and production activity.
Development workflows often involve significant LLM API usage through testing, experimentation, and iteration. If this activity is not clearly separated from production usage, it can distort cost analysis and obscure the true economics of a feature.
Environment-level attribution ensures that:
- Production costs reflect real service delivery
- Development activity remains visible but separate
- Forecasting is based on operational usage rather than experimentation
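With environment captured as metadata, this separation becomes a simple filter rather than a forensic exercise. A minimal sketch over the same hypothetical records:

```python
import json
from collections import Counter

def cost_by_environment(path: str = "llm_usage.jsonl") -> Counter:
    """Total spend per environment. Forecasts should be driven by
    the 'production' bucket; 'development' stays visible but separate."""
    totals: Counter = Counter()
    with open(path) as f:
        for line in f:
            r = json.loads(line)
            totals[r["environment"]] += r["cost_usd"]
    return totals
```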
Internal and Customer-Facing AI Usage
In addition to environment separation, organisations must distinguish between internal and customer-facing AI usage.
Internal use cases may include:
- Support automation
- Sales enablement tools
- Internal data processing
These serve a different purpose from product features and should be analysed separately.
Blending these categories can lead to overstated product costs and reduced clarity in profitability analysis.
The Role of an AI Gateway or Proxy Layer
Capturing consistent, structured metadata requires a central control point within the architecture.
An AI Gateway or AI Proxy layer provides this capability by routing all LLM API calls through a unified interface. This enables organisations to:
- Standardise access across multiple providers
- Enforce consistent request structure
- Capture relevant metadata at the source
- Analyse usage across multiple dimensions
In multi-provider AI environments, this approach is particularly valuable, as it provides a unified view across otherwise fragmented systems.
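A gateway does not need to be elaborate to deliver this. The sketch below is one possible shape, reusing the hypothetical `RequestContext` and `record_usage` from the earlier example: provider SDKs are hidden behind a common adapter signature, metadata is enforced on entry, and usage is recorded at the single point every call passes through.

```python
from typing import Callable

# Hypothetical adapter signature: each provider adapter takes a prompt
# and returns (text, prompt_tokens, completion_tokens, cost_usd).
ProviderFn = Callable[[str], tuple[str, int, int, float]]

class AIGateway:
    """Minimal gateway sketch: one entry point for all LLM calls."""

    def __init__(self, providers: dict[str, ProviderFn]):
        self.providers = providers

    def complete(self, prompt: str, provider: str, ctx: RequestContext) -> str:
        # Enforce consistent request structure at the source.
        if ctx.environment not in ("development", "production"):
            raise ValueError("environment must be 'development' or 'production'")
        text, p_tok, c_tok, cost = self.providers[provider](prompt)
        record_usage(ctx, p_tok, c_tok, cost)  # from the earlier sketch
        return text
```

Because every provider sits behind the same interface, adding a vendor changes one adapter rather than every feature that calls it.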
From Measurement to Optimisation
Once AI cost is measured at the feature level, optimisation becomes more targeted and effective.
Teams can focus on:
- Reducing prompt size
- Controlling response length
- Selecting more efficient models
- Caching repeated requests
- Redesigning high-cost workflows
These decisions are difficult to prioritise without clear visibility into where cost is being incurred.
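As one example, caching repeated requests becomes straightforward once all traffic flows through the gateway. The sketch below extends the hypothetical `AIGateway` from earlier with a naive in-memory cache; a real system would add eviction, TTLs, and care around non-deterministic outputs.

```python
import hashlib

class CachingGateway(AIGateway):
    """Gateway sketch with a naive cache: identical prompts to the
    same provider are answered once and never paid for twice."""

    def __init__(self, providers: dict[str, ProviderFn]):
        super().__init__(providers)
        self._cache: dict[str, str] = {}

    def complete(self, prompt: str, provider: str, ctx: RequestContext) -> str:
        key = hashlib.sha256(f"{provider}:{prompt}".encode()).hexdigest()
        if key not in self._cache:
            self._cache[key] = super().complete(prompt, provider, ctx)
        return self._cache[key]
```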
The Bottom Line
Token-level metrics provide a useful starting point for understanding AI usage, but they are not sufficient for managing cost in a business context.
To operate effectively in modern SaaS and ISV environments, organisations need to move towards feature-level cost attribution, particularly as they adopt multi-provider AI architectures.
By capturing context at the point of request and analysing LLM API costs across meaningful dimensions, organisations can establish a more robust foundation for AI FinOps.
This enables a shift from aggregate cost tracking to structured, actionable insight, supporting better pricing, stronger margins, and more scalable AI-driven products.