telemetry-analysis-service
Billing e-mail for Spark and map-reduce jobs
Originally reported in https://bugzilla.mozilla.org/show_bug.cgi?id=1173429
"We have a lot more people running telemetry analyses than we used to. That's great, but we should inform users of the cost of their analyses after their cluster/machine is terminated. It's rather easy to spawn larger clusters for no good reason and/or keep them running without actually utilising them." (@vitillo)
Since we now record the dates and times at which a job was scheduled, run, and terminated, we could estimate the cost as long as we can get the spot instance price at the time of the job run. I looked at the EMR boto3 docs but only saw one API that returned the bid price, and that API is being deprecated. Perhaps there's another way to capture the price info and store it with the job run. Once that's in place, it should be relatively easy to calculate costs from redash.
The ec2 client module has the ability to get the spot price for a given date/time range: https://boto3.readthedocs.io/en/latest/reference/services/ec2.html#EC2.Client.describe_spot_price_history
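A rough sketch of how that call could feed a cost estimate, assuming we know the instance type, availability zone, and instance count of each job run. The function names, the default region, and the `Linux/UNIX` product description are placeholders for illustration, not anything that exists in this repo. It only covers the EC2 spot price, so any other charges would have to be added separately.

```python
from datetime import datetime, timedelta, timezone


def fetch_spot_prices(instance_type, availability_zone, start, end,
                      region="us-west-2"):
    """Return [(timestamp, price_per_hour)] sorted by time.

    Hypothetical helper; `region` and the product description are assumptions.
    `start` and `end` should be timezone-aware datetimes.
    """
    import boto3  # imported lazily so estimate_cost() works without AWS access

    client = boto3.client("ec2", region_name=region)
    paginator = client.get_paginator("describe_spot_price_history")
    pages = paginator.paginate(
        InstanceTypes=[instance_type],
        AvailabilityZone=availability_zone,
        ProductDescriptions=["Linux/UNIX"],
        StartTime=start,
        EndTime=end,
    )
    # SpotPrice comes back as a string, Timestamp as a datetime.
    prices = [
        (item["Timestamp"], float(item["SpotPrice"]))
        for page in pages
        for item in page["SpotPriceHistory"]
    ]
    return sorted(prices)


def estimate_cost(prices, start, end, instance_count=1):
    """Integrate a piecewise-constant hourly price over [start, end]."""
    total = 0.0
    for i, (changed_at, price) in enumerate(prices):
        # Each price holds until the next price change, or until the run ends.
        next_change = prices[i + 1][0] if i + 1 < len(prices) else end
        seg_start = max(changed_at, start)
        seg_end = min(next_change, end)
        if seg_end > seg_start:
            hours = (seg_end - seg_start).total_seconds() / 3600
            total += hours * price * instance_count
    return total
```

The price history includes the last price change before `StartTime`, which is why `estimate_cost` clamps each segment to the run window instead of assuming the first entry starts exactly at job start. The result could be stored alongside the job run's timestamps so that redash only needs to sum a column.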