Create a generalized shared cost billing model
Type
Metrics / Dimensions
Proposed Change
Provide a normalized spec for exposing software-defined allocated costs, so that these costs can be easily tied to the corresponding virtual or dedicated hardware.
Context / Supporting information
Service-based costs and rollups exist in AWS' CUR and GCP's GKE cost datasets, but there is no overarching, normalized model for representing service-based cost allocation or its attribution to the corresponding virtual or dedicated hardware.
Given the number of orchestrators and their similar operating models (allocating a subset of resources as processing entities), a generalized model is needed.
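To make the idea concrete, here is a minimal sketch of the kind of relationship such a model would need to express: a parent resource's cost split across the processing entities that share it. Everything here is an assumption for illustration only (the `AllocatedCost` shape, the `allocate` helper, the proportional-by-usage rule, and the explicit "unallocated" row); it is not how AWS or GCP compute their allocations and not a proposed FOCUS schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AllocatedCost:
    """One allocated slice of a parent resource's cost (hypothetical shape)."""
    parent_resource_id: str  # the virtual or dedicated hardware being shared
    entity_id: str           # e.g. a pod, task, or job; "unallocated" for the remainder
    cost: float


def allocate(parent_resource_id: str, parent_cost: float, capacity: float,
             usage_by_entity: dict[str, float]) -> list[AllocatedCost]:
    """Split parent_cost across entities in proportion to usage / capacity.

    Capacity not claimed by any entity is reported as an explicit
    "unallocated" row, so the rows always tie back to parent_cost.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    used = sum(usage_by_entity.values())
    if used > capacity:
        raise ValueError("usage exceeds capacity")
    rows = [
        AllocatedCost(parent_resource_id, entity_id, parent_cost * usage / capacity)
        for entity_id, usage in usage_by_entity.items()
    ]
    rows.append(AllocatedCost(parent_resource_id, "unallocated",
                              parent_cost * (capacity - used) / capacity))
    return rows


# Example: a 4-vCPU node costing 10.0, shared by two pods.
rows = allocate("node-1", parent_cost=10.0, capacity=4.0,
                usage_by_entity={"pod-a": 1.0, "pod-b": 2.0})
for row in rows:
    print(row)
```

The point of the sketch is only that each allocated row keeps a reference to its parent resource, so allocated costs can be reconciled against the hardware charge regardless of which orchestrator produced them.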
@cnharris10 Could you please chase this down and see if this should become a work item or a discussion topic?
@cnharris10 The group asked for further information during the TF-1 call on May 28.
I'm interested in this one for v1.1
Some context for the AWS and GCP implementations of this idea:
https://docs.aws.amazon.com/cur/latest/userguide/split-cost-allocation-data.html https://cloud.google.com/kubernetes-engine/docs/how-to/cost-allocations
Would the scope of this be only for containers, or would we be looking to expand it beyond that to include any service with usage-based breakdowns? For example, here's a bunch of hoops one can jump through to allocate shared Fabric costs. Painful, but super important for allocation purposes.
https://pbi-guy.com/2024/03/30/how-to-extract-data-from-the-fabric-metrics-app-part-1/
AWS supports tasks (ECS), pods (EKS), and jobs (Batch). I'd hope we could solve generally across various orchestration systems: K8s, EMR, Dataproc/Dataflow, Spark, Flink, etc.
Discussed in Oct 22 TF1 call. Need to talk about this one a little bit more to align on scope: is it just orchestration and/or container services, or is it more holistic than that? This is a big conceptual topic, and I believe Chris' proposal is sticking to a narrow scope for 1.2 -- but let's discuss more in calls to ensure that everyone understands the overall concept.
K8s accounts for a growing share of cloud costs (passing the 50% mark), so a better breakdown of those costs is becoming essential.
In today's Oct 29 TF1 call, we discussed revising the scope of this Work Item so that it generically handles not only compute clusters (e.g. AWS ECS, GCP GKE) but also other types of services that can be allocated (e.g. OCI pluggable databases), as well as any other services that may be attributed down the road. This would holistically cover the use case "Consume allocated costs as calculated by the provider". @cnharris10, do you agree with this approach, and if so, are you amenable to revising this Work Item to reflect that? Happy to huddle and discuss if you like.
FYI, we also discussed a net-new Work Item to cover a separate but related use case, "Consume usage metrics that facilitate cost allocations as calculated by the practitioner", which would likely result in supporting content rather than a spec change, and which @ahullah and @tobrien will craft.
The intention of this work item is to classify an approach for similar shared allocation models. If "compute" is too narrow and can be expanded to other examples that closely relate, then I'm in support.
Action Items from TF-1 call on Oct 29:
- [ ] [#72] Alex @ahullah & Tim @tobrien : Draft a work item detailing concepts for future holistic allocation patterns beyond computing clusters.
- [ ] [#72] Chris @cnharris10 & Shawn @shawnalpay : Expand the current work item to outline patterns for generic cost allocation across cloud services, ensuring the scalability of new services as they emerge.
@cnharris10 I have now modified this Work Item to more holistically include all provider-generated shared cost allocations, not just compute clusters. Give it a look and let me know if it looks alright to you.
Action Items from Members' call on Oct 31:
What about direct cost allocation (not shared)? The allocation vision and strategy for FOCUS in general need to be discussed.
Maintainers notes from Nov 4 call:
**Context:** This task involves developing a model for shared cost allocation within compute clusters. Initial discussions focused on the broader concept of shared cost allocation but were narrowed down to provider-generated data to simplify the scope. This distinction helps streamline the process and make implementation feasible within a single release.
**Level of Effort Required:** Very High. Handling shared costs for compute clusters, especially in containerized environments, involves complex many-to-many relationships and provider-specific solutions, necessitating decomposition of the task.
**Level of Impact:** Very High. This work item has a significant impact on practitioners, as shared cost allocation is essential for accurate cost distribution, particularly in complex, containerized environments. Effective cost allocation is a key metric for resource optimization in FinOps.
Action Items from the TF-1 call on November 5:
Comments from the Members' call on November 7:
#72: TF-1 is working on cost allocation strategies for multi-provider models, addressing cases where multiple resources feed into a single service element, such as clustered resources. The current focus is on allowing providers to share their allocation metadata within the specification.