How to Measure AI Cost per Feature (Not Just per Token)
Most organisations begin tracking LLM API costs at the token level. This is a natural starting point, as tokens are the unit of consumption exposed by all major AI providers and form the basis of billing models across the industry.
However, while token-level visibility provides a clear view of usage, it does not provide a meaningful view of cost in a business context. As AI moves from experimentation into production systems, organisations need to understand not only how much they are using, but what that usage represents in terms of product delivery, customer value, and financial performance.
This is where token-level tracking begins to fall short.
The Limitations of Token-Level Metrics
Token usage is an infrastructure metric. It describes how AI systems are being consumed at a technical level, but it does not reflect how that usage maps to the structure of a business.
For example, token-level reporting does not answer questions such as:
- What does it cost to operate a specific feature?
- Which features are responsible for the majority of AI spend?
- How does AI usage impact product margins?
- Where should optimisation efforts be focused?
As long as AI usage is viewed primarily through tokens, cost remains disconnected from the way products and services are delivered.
AI Cost Is Driven by Product Behaviour
In SaaS and ISV environments, AI usage is rarely uniform. It is driven by specific product features and workflows, each with its own usage patterns and cost profile.
Examples might include:
- A summarisation capability embedded within a product
- A conversational interface or chatbot
- A document processing pipeline
- A recommendation or classification system
Each of these features generates LLM API requests, and each request incurs cost. However, from a commercial perspective, the relevant unit is not the token, but the feature itself.
Understanding cost per feature allows organisations to connect AI usage directly to revenue, pricing, and margins.
Complexity in Multi-Provider AI Architectures
As AI adoption matures, most organisations move beyond a single provider. They begin to operate in multi-provider AI environments, combining different models and vendors to optimise for performance, cost, and capability.
This introduces additional complexity:
- Different providers have different pricing models
- Token usage varies significantly across models
- Features may rely on multiple providers simultaneously
In this context, provider-level cost analysis becomes increasingly limited, because it does not reflect how AI is actually consumed within the product.
To manage AI effectively, organisations need to analyse cost across business dimensions, not vendor boundaries.
Moving to Feature-Level Cost Attribution
Measuring AI cost per feature requires capturing context at the point of request. Each LLM API call must be associated with the business activity that generated it.
This typically involves attaching structured metadata such as:
- Feature
- Product
- Environment (development versus production)
- Customer or tenant (where applicable)
- Internal versus customer-facing usage
This approach enables AI cost attribution that aligns with how the organisation operates, rather than how providers report usage.
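To make this concrete, the sketch below shows one way to attach that metadata in application code. It is a minimal illustration rather than a production design: `RequestContext` and `record_usage` are hypothetical names, and the JSONL file stands in for whatever metrics store or logging pipeline the organisation actually uses.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json

@dataclass
class RequestContext:
    """Business metadata attached to every LLM API call."""
    feature: str                  # e.g. "document-summarisation"
    product: str                  # e.g. "core-app"
    environment: str              # "development" or "production"
    tenant: str | None = None     # customer or tenant, where applicable
    audience: str = "customer"    # "customer" or "internal"

def record_usage(ctx: RequestContext, prompt_tokens: int,
                 completion_tokens: int, cost_usd: float) -> None:
    """Append one usage record. A local JSONL file keeps the example
    self-contained; in practice this would feed a metrics pipeline."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        **asdict(ctx),
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "cost_usd": cost_usd,
    }
    with open("llm_usage.jsonl", "a") as f:
        f.write(json.dumps(record) + "\n")
```

The important design choice is that the context is mandatory at the call site: every request carries its business dimensions from the moment it is made, rather than having them reverse-engineered from provider invoices later.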
Measuring Cost in a Business Context
Once usage is structured in this way, organisations can begin to analyse AI cost in more meaningful terms.
For each feature, it becomes possible to measure:
- Total cost of operation
- Number of requests
- Average cost per request
- Average token consumption per request
These metrics provide a clearer view of how features behave and how they scale.
They also enable more informed decisions about where to invest, optimise, or redesign.
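Assuming usage records shaped like those in the earlier sketch, these four metrics reduce to a simple aggregation. The function below groups records by feature; the field names match the hypothetical `record_usage` example above.

```python
import json
from collections import defaultdict

def feature_metrics(path: str = "llm_usage.jsonl") -> dict:
    """Aggregate tagged usage records into per-feature metrics:
    total cost, request count, average cost and tokens per request."""
    totals = defaultdict(lambda: {"cost": 0.0, "requests": 0, "tokens": 0})
    with open(path) as f:
        for line in f:
            r = json.loads(line)
            agg = totals[r["feature"]]
            agg["cost"] += r["cost_usd"]
            agg["requests"] += 1
            agg["tokens"] += r["prompt_tokens"] + r["completion_tokens"]
    return {
        feature: {
            "total_cost_usd": round(agg["cost"], 4),
            "requests": agg["requests"],
            "avg_cost_per_request": agg["cost"] / agg["requests"],
            "avg_tokens_per_request": agg["tokens"] / agg["requests"],
        }
        for feature, agg in totals.items()
    }
```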
Identifying Inefficiencies in AI Usage
One of the most valuable capabilities this level of visibility provides is the ability to compare usage patterns across features.
For example, differences in average tokens per request may indicate:
- Inefficient prompt design
- Excessive response lengths
- Suboptimal model selection
- Redundant or repeated processing steps
Without feature-level visibility, these inefficiencies remain hidden within aggregate LLM API costs.
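One simple heuristic is to compare each feature's average tokens per request against the median across all features and flag large outliers. The threshold below is arbitrary and would need tuning; the function consumes the output of the `feature_metrics` sketch above.

```python
from statistics import median

def flag_token_outliers(metrics: dict, factor: float = 2.0) -> list[str]:
    """Return features whose average tokens per request exceed the
    median across features by `factor`. A crude first signal of
    oversized prompts or responses, not a diagnosis."""
    baseline = median(m["avg_tokens_per_request"] for m in metrics.values())
    return [
        feature for feature, m in metrics.items()
        if m["avg_tokens_per_request"] > factor * baseline
    ]
```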
Implications for SaaS Unit Economics
AI introduces variable cost into software delivery, which has direct implications for SaaS unit economics.
If organisations cannot measure cost at the feature level, they may struggle to:
- Align pricing with cost
- Maintain consistent margins
- Forecast usage and spend accurately
- Scale AI-enabled features sustainably
A feature that appears successful from a usage perspective may, in fact, be unprofitable when cost is fully understood.
Feature-level cost attribution provides the clarity needed to detect and manage this risk.
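A worked example makes the margin risk tangible. The figures below are purely illustrative: a feature bundled into a $30 per month plan, with a heavy user triggering 40 AI requests a day at an average of $0.01 each.

```python
# Illustrative figures only; substitute real per-feature metrics.
price_per_user_month = 30.00   # subscription revenue per user
requests_per_day = 40          # heavy-user request volume
cost_per_request = 0.01        # average cost from feature metrics

ai_cost_per_user_month = requests_per_day * 30 * cost_per_request  # $12.00
gross_margin = (price_per_user_month - ai_cost_per_user_month) / price_per_user_month

print(f"AI consumes {ai_cost_per_user_month / price_per_user_month:.0%} of revenue; "
      f"gross margin on this user is {gross_margin:.0%}")
```

In this scenario, AI cost alone absorbs 40% of the user's revenue before any other cost of service, which is exactly the kind of exposure that per-feature attribution is meant to surface.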
Separating Development and Production Usage
Accurate measurement also requires distinguishing between development and production activity.
Development workflows often involve significant LLM API usage through testing, experimentation, and iteration. If this activity is not clearly separated from production usage, it can distort cost analysis and obscure the true economics of a feature.
Environment-level attribution ensures that:
- Production costs reflect real service delivery
- Development activity remains visible but separate
- Forecasting is based on operational usage rather than experimentation
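With environment captured as metadata, this separation becomes a simple filter rather than a forensic exercise. A minimal sketch over the same hypothetical records:

```python
import json
from collections import Counter

def cost_by_environment(path: str = "llm_usage.jsonl") -> Counter:
    """Total spend per environment. Forecasts should be driven by
    the 'production' bucket; 'development' stays visible but separate."""
    totals: Counter = Counter()
    with open(path) as f:
        for line in f:
            r = json.loads(line)
            totals[r["environment"]] += r["cost_usd"]
    return totals
```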
Internal and Customer-Facing AI Usage
In addition to environment separation, organisations must distinguish between internal and customer-facing AI usage.
Internal use cases may include:
- Support automation
- Sales enablement tools
- Internal data processing
These serve a different purpose from product features and should be analysed separately.
Blending these categories can lead to overstated product costs and reduced clarity in profitability analysis.
The Role of an AI Gateway or Proxy Layer
Capturing consistent, structured metadata requires a central control point within the architecture.
An AI Gateway or AI Proxy layer provides this capability by routing all LLM API calls through a unified interface. This enables organisations to:
- Standardise access across multiple providers
- Enforce consistent request structure
- Capture relevant metadata at the source
- Analyse usage across multiple dimensions
In multi-provider AI environments, this approach is particularly valuable, as it provides a unified view across otherwise fragmented systems.
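A gateway does not need to be elaborate to deliver this. The sketch below is one possible shape, reusing the hypothetical `RequestContext` and `record_usage` from the earlier example: provider SDKs are hidden behind a common adapter signature, metadata is enforced on entry, and usage is recorded at the single point every call passes through.

```python
from typing import Callable

# Hypothetical adapter signature: each provider adapter takes a prompt
# and returns (text, prompt_tokens, completion_tokens, cost_usd).
ProviderFn = Callable[[str], tuple[str, int, int, float]]

class AIGateway:
    """Minimal gateway sketch: one entry point for all LLM calls."""

    def __init__(self, providers: dict[str, ProviderFn]):
        self.providers = providers

    def complete(self, prompt: str, provider: str, ctx: RequestContext) -> str:
        # Enforce consistent request structure at the source.
        if ctx.environment not in ("development", "production"):
            raise ValueError("environment must be 'development' or 'production'")
        text, p_tok, c_tok, cost = self.providers[provider](prompt)
        record_usage(ctx, p_tok, c_tok, cost)  # from the earlier sketch
        return text
```

Because every provider sits behind the same interface, adding a vendor changes one adapter rather than every feature that calls it.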
From Measurement to Optimisation
Once AI cost is measured at the feature level, optimisation becomes more targeted and effective.
Teams can focus on:
- Reducing prompt size
- Controlling response length
- Selecting more efficient models
- Caching repeated requests
- Redesigning high-cost workflows
These decisions are difficult to prioritise without clear visibility into where cost is being incurred.
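As one example, caching repeated requests becomes straightforward once all traffic flows through the gateway. The sketch below extends the hypothetical `AIGateway` from earlier with a naive in-memory cache; a real system would add eviction, TTLs, and care around non-deterministic outputs.

```python
import hashlib

class CachingGateway(AIGateway):
    """Gateway sketch with a naive cache: identical prompts to the
    same provider are answered once and never paid for twice."""

    def __init__(self, providers: dict[str, ProviderFn]):
        super().__init__(providers)
        self._cache: dict[str, str] = {}

    def complete(self, prompt: str, provider: str, ctx: RequestContext) -> str:
        key = hashlib.sha256(f"{provider}:{prompt}".encode()).hexdigest()
        if key not in self._cache:
            self._cache[key] = super().complete(prompt, provider, ctx)
        return self._cache[key]
```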
The Bottom Line
Token-level metrics provide a useful starting point for understanding AI usage, but they are not sufficient for managing cost in a business context.
To operate effectively in modern SaaS and ISV environments, organisations need to move towards feature-level cost attribution, particularly as they adopt multi-provider AI architectures.
By capturing context at the point of request and analysing LLM API costs across meaningful dimensions, organisations can establish a more robust foundation for AI FinOps.
This enables a shift from aggregate cost tracking to structured, actionable insight, supporting better pricing, stronger margins, and more scalable AI-driven products.