
Track DeepL API usage

Open · svenseeberg opened this issue on Apr 06 '22

Motivation

We have an automatic translation feature via DeepL. However, we have to pay for these translations, so we want to keep track of our used quota.

Proposed Solution

Query our used words quota before and after sending translation requests and store the difference in our database (InfluxDB?)

Alternatives

The total amount of words translated could be stored in an attribute for the region.

svenseeberg · Apr 06 '22

See https://www.deepl.com/de/docs-api/other-functions/monitoring-usage/

ulliholtgrave · Apr 06 '22

@svenseeberg I'm a bit unsure why we'd want to store the used quota. Wouldn't it make more sense to just query the DeepL API (see Ulli's link) for our usage any time it's relevant (e.g. before sending a translation request, in the planned translation budget widget, ...)? That way it's always up to date.

So for now I'd add a function to check the usage, and raise an exception in the deepl_translation function if the translation would exceed the limit.
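
A minimal sketch of what such a check could look like with the deepl Python client's get_usage() — not the repo's actual implementation; the exception type and the characters-per-word factor are placeholders (the factor comes up later in this thread):

```python
import deepl

translator = deepl.Translator("auth-key")  # placeholder key

# Hypothetical average characters per word (the factor discussed further down)
CHARS_PER_WORD = 7


def check_usage(word_count):
    """Raise if translating `word_count` more words would exceed the DeepL quota."""
    usage = translator.get_usage()
    if usage.any_limit_reached:
        raise RuntimeError("DeepL quota already exhausted")
    if usage.character.count + word_count * CHARS_PER_WORD > usage.character.limit:
        raise RuntimeError("Translation would exceed the remaining DeepL quota")
```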

charludo · Nov 28 '22

Wouldn't it make more sense to just query the DeepL API (see Ulli's link) for our usage any time it's relevant (e.g. before sending a translation request, in the planned translation budget widget, ...)? That way it's always up to date.

I think DeepL only tracks our global credit count, but we want to keep track of which portion of that is used by which region... this is not only done to prevent errors during the automatic translation, but also for our business calculation to see whether the usage roughly matches the price we charge the regions for this feature. This way, we could identify e.g. "power-user" regions which use the feature above average and maybe need an individual rate or whatever...

So for now I'd add a function to check the usage, and raise an exception in the deepl_translation function if the translation would exceed the limit.

Not sure whether that's required - can we just "try" to translate and then handle the error thrown by DeepL when the quota is reached? I'm also not sure whether we're able to calculate the limit in exactly the same way as DeepL... what about special HTML characters, white spaces etc., are they billed exactly like translatable text?
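
For reference, the "just try" variant could lean on the client's dedicated quota exception; a rough sketch (the fallback behaviour here is made up):

```python
import deepl

translator = deepl.Translator("auth-key")  # placeholder key


def try_translate(text, target_lang):
    """Attempt a translation and degrade gracefully when the quota is used up."""
    try:
        return translator.translate_text(text, target_lang=target_lang).text
    except deepl.QuotaExceededException:
        # The account-wide character quota is exhausted; let the caller handle it
        return None
```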

timobrembeck · Nov 28 '22

keep track of which portion of that is used by which region

Ah, I see, that makes sense.

I'm also not sure whether we're able to calculate the limit in exactly the same way as DeepL... what about special HTML characters, white spaces etc., are they billed exactly like translatable text?

That's a fair point. Related question: from what I can see in the wiki, the plan is to offer a budget of 50,000 words. However, as far as I can tell, DeepL tracks usage by character. Is this an oversight, or are we actually tracking and displaying the translation usage of each community by word, regardless of word lengths/character counts?

charludo · Nov 28 '22

from what I can see in the wiki, the plan is to offer a budget of 50,000 words. However, as far as I can tell, DeepL tracks usage by character. Is this an oversight, or are we actually tracking and displaying the translation usage of each community by word, regardless of word lengths/character counts?

I'm not 100% sure, but yes, as I understood it this is intentional: we're offering the regions a different payment plan than the one we have with DeepL. We somehow have to calculate internally that the external per-character costs for DeepL + our support hours + our developer hours are covered by the per-word fees paid by the regions. Maybe @dkehne knows more...

timobrembeck · Nov 28 '22

But yes, good point: this kind of means that we need to keep track of two separate counters, right? The character count billed by DeepL, which we need for our internal calculations, and the word count, which we need to show the regions how much of their limit is used?

Then your earlier suggestion to check the limit beforehand also makes perfect sense, because the DeepL API won't throw an exception as long as there are enough global credits left, even if an individual region has exceeded the limit we gave them? :thinking:

timobrembeck · Nov 28 '22

Yeah, if this is how it's supposed to be tracked, then I think we have to check the word usage beforehand, and then update it in the DB after a successful translation.

charludo · Nov 28 '22

@charludo @timoludwig maybe to clarify the word/character counting mechanism: we are aware that DeepL counts characters, but because the translation offices charge per word and all our translation reports are word-based, we decided to go with words for our DeepL package as well. We assumed an average of 7 characters per word, which is why we are buying 7 million characters from DeepL per region and giving them 1 million words. What we need to be able to track is how many words each region has used/uses, because 50,000 words are free of charge and after that the municipalities (Kommunen) need to sign a contract to get 1 million words for €1,000. I am not sure whether it is actually necessary to track the character usage in DeepL? But what we need to be able to do is tell a municipality that the amount of words they are trying to translate exceeds their word budget (if that is ever the case), so I think we need to track/know the usage before sending anything for translation, right?

osmers · Dec 01 '22

I am not sure whether it is actually necessary to track the character usage in DeepL?

If you're not interested in knowing whether your assumption of 7 characters per word is correct, then not. I just have a few small concerns that our own calculation might be completely off and we need to adjust our business calculation in the future.

E.g., consider the following page fragment:

<a href='https://thisisanextremelylonglink/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'>word</a>

Then, this would be counted as two "words" (<a and href='https://thisisanextremelylonglink/AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA'>word</a>) of lengths 2 and 138, so the average character count per word would be off by a factor of 10.

Maybe we can improve this method during PR review, but the underlying problem will stay: it's always hard to calculate the number of words in HTML source code and we don't exactly know how much DeepL charges for sending special characters to the API which are not translated (maybe this has to be tested). So in this case I kind of expect our calculations to be wrong no matter what, and it could be useful to track the character count from DeepL, so that e.g. after half a year we can say "ok, X translated words were billed by DeepL with Y characters, maybe the value Z would be a better average word length than what we thought initially".

But what we need to be able to do is tell a municipality that the amount of words they are trying to translate exceeds their word budget (if that is ever the case), so I think we need to track/know the usage before sending anything for translation, right?

Yes, this is exactly what @charludo implemented in #1909 I think :+1:

timobrembeck · Dec 01 '22

@timoludwig yes, that's basically what I implemented, though I now think there's some potential for improvement, e.g. storing the translated character count as well, like you said.

I'm not sure if we should calculate the character count ourselves (potentially error-prone compared to DeepL's calculation), or query DeepL (also potentially error-prone if another region happens to translate at the same time).

So maybe storing the total amount of characters/words over all regions makes the most sense?

charludo · Dec 01 '22

I just have a few small concerns that our own calculation might be completely off and we need to adjust our business calculation in the future.

True, I always forget that we are sending HTML code - so maybe it would be interesting to track, also to check whether our assumption of 7 characters per word is right or not. If we can test how DeepL behaves with HTML code, even better!

osmers · Dec 01 '22

So maybe storing the total amount of characters/words over all regions makes the most sense?

But then we would not be able to determine how many words each region used, right?

osmers · Dec 01 '22

query DeepL (also potentially error-prone if another region happens to translate at the same time).

Theoretically we could use a simple threading.Lock() for this, shouldn't be too much effort.
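
A rough sketch of that idea, assuming we serialize translations so the before/after usage delta can be attributed to one request (all names are illustrative, and it assumes DeepL's usage counter updates synchronously, which would need verifying):

```python
import threading

import deepl

translator = deepl.Translator("auth-key")  # placeholder key
usage_lock = threading.Lock()


def translate_and_count(text, target_lang):
    """Translate and return (translated text, characters billed for this request)."""
    with usage_lock:
        # No other thread may translate between the two usage queries,
        # so the delta is attributable to this one request
        before = translator.get_usage().character.count
        result = translator.translate_text(text, target_lang=target_lang)
        after = translator.get_usage().character.count
    return result.text, after - before
```

Note that a threading.Lock only covers a single process; with multiple workers or servers we'd need some database-level locking instead.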

So maybe storing the total amount of characters/words over all regions makes the most sense?

But yes, this is the simpler version. I'm just not sure whether we even need to store this - we could probably just query the usage after a specific point in time and sum the consumed words of all regions to calculate the global average?

But then we would not be able to determine how many words each region used, right?

@osmers No, I think the word counter for each region is not affected by the new field where we track the global character count from DeepL.

timobrembeck · Dec 01 '22

we don't exactly know how much DeepL charges for sending special characters to the API which are not translated (maybe this has to be tested)

I just did some quick tests. The following event description:

<div>
<p>Das ist ein Test, um zu sehen, ob DeepL HTML Zeichen berechnet.</p>
<p><a href="https://www.das-sind-deutsche-worte.de">Beschreibung des Links</a></p>
<p>Und hier noch ein Bild: <img src="https://integreat.com/Mittagessen" alt="Das Bild zeigt mein Mittagessen"></p>
</div>

has a length of 277 characters excluding spaces, and that is exactly the number of credits DeepL consumes for the translation (I already accounted for the event title), even though far fewer characters actually get translated:

<div>
<p>This is a test to see if DeepL calculates HTML characters.</p>
<p><a href="https://www.das-sind-deutsche-worte.de">Description of the link</a></p>
.
<p>And here's another picture: <img src="https://integreat.com/Mittagessen" alt="The picture shows my lunch"></p>
</div>

As you can see, DeepL is context-aware enough to translate the alt text of an image while leaving e.g. the link alone, but the characters that make up the link still get counted.

According to TinyMCE, the original snippet contains 20 words, meaning in this (not representative) example, the average word length is just under 14 characters.

charludo · Dec 02 '22

Two other "weird" things I noticed:

  • I don't know why the . is added between the last two <p> tags. Adding a . after Beschreibung des Links to mark the end of the "sentence" doesn't stop this.
  • Words with umlauts are not translated at all, but I think that's because we send HTML-escaped characters instead of the actual umlaut (see the sketch after this list). @timoludwig should I open a new issue for this or fix it in my current PR?
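
If the cause really is HTML entities, unescaping before sending might already fix it; a minimal sketch with the standard library (the sample string is made up):

```python
import html

# What we currently send vs. what DeepL should see
escaped = "Sch&ouml;ne Gr&uuml;&szlig;e"
print(html.unescape(escaped))  # prints "Schöne Grüße", with real umlauts
```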

charludo · Dec 02 '22

By the way, the character count for actual words, including the img alt text, is 141, so pretty much exactly 7 characters/word.

As I said, I don't think my example is representative, since most pages contain a lot more translatable text than non-translatable links, images, and so on; but add in a couple of links that look like the one @timoludwig mentioned, and we could be looking at significantly more than 7 characters per "word".

@osmers @timoludwig I know HTML parsing isn't exactly easy, but do you think the discrepancy here is large enough to justify implementing some kind of (e.g.) beautifulsoup solution to gather all the actual text ourselves, then only send those words off to DeepL and put the result back into the original HTML structure?

I can see a very real (monetary) benefit in this, but also some danger of fragmenting text and grammatical structure in e.g. inline links.

charludo · Dec 02 '22

has a length of 277 characters excluding spaces, and that is exactly the number of credits DeepL consumes for the translation (I already accounted for the event title), even though far fewer characters actually get translated:

Ok, I was afraid of that, but it kind of makes sense and at least makes it easy for us to know exactly how many credits a translation consumes.

Words with umlauts are not translated at all, but I think that's because we send HTML-escaped characters instead of the actual umlaut. @timoludwig should I open a new issue for this or fix it in my current PR?

Since this is an independent problem, please open another issue for it, but feel free to fix it in your PR as well :smile:

@osmers @timoludwig I know HTML parsing isn't exactly easy, but do you think the discrepancy here is large enough to justify implementing some kind of (e.g.) beautifulsoup solution to gather all the actual text ourselves, then only send those words off to DeepL and put the result back into the original HTML structure?

This is pretty much what I did for SUMM.AI, since their API does not support HTML yet. For SUMM.AI, we just accepted the fact that inline formatting is gone, but I guess we could also improve this solution to work with smaller placeholders which can be re-inserted after the translation is complete? Definitely not without pitfalls, but I think it's worth considering this possibility.
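
Not the actual SUMM.AI integration, but a toy sketch of that text-extraction idea with BeautifulSoup (all names are illustrative):

```python
from bs4 import BeautifulSoup


def extract_text_nodes(html_content):
    """Collect all non-empty text nodes so only translatable text hits the API."""
    soup = BeautifulSoup(html_content, "html.parser")
    nodes = [node for node in soup.find_all(string=True) if node.strip()]
    return soup, nodes


def reinsert_translations(nodes, translations):
    """Swap each original text node for its translation, keeping the markup."""
    for node, translated in zip(nodes, translations):
        node.replace_with(translated)
```

Each text node becomes its own segment here, which is exactly where the fragmentation problem comes from: a sentence interrupted by an inline link is split into pieces that get translated without shared context.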

timobrembeck · Dec 02 '22

Ok, I was afraid of that, but it kind of makes sense and at least makes it easy for us to know exactly how many credits a translation consumes.

Does it make sense? They are charging for characters they intentionally leave untranslated.

Since this is an independent problem, please open another issue for it, but feel free to fix it in your PR as well :smile:

👍🏼

This is pretty much what I did for SUMM.AI, since their API does not support HTML yet. For SUMM.AI, we just accepted the fact that inline formatting is gone, but I guess we could also improve this solution to work with smaller placeholders which can be re-inserted after the translation is complete? Definitely not without pitfalls, but I think it's worth considering this possibility.

Neat, I didn't know that!

Fortunately, I think I just stumbled upon an easier solution: their API offers the option to directly translate a number of document formats, including HTML, and when using that method to translate the same HTML snippet as before, only 110 character credits are used! 🥳

The discrepancy from the 141 before is most likely due to the img alt text no longer being translated, though it isn't adding up exactly and I'm not sure why.

Also I just tried a translation the normal way and simply put a single German word as the href in a link, and this actually does get translated, which is far from ideal.

I think the most sensible way forward is to either:

  • write the description/content of a page/poi/event to a tempfile, send it to DeepL, and let the Python DeepL library automatically write the result to a new file; read it, delete both. This is probably the simpler solution, but I'm not crazy about constantly creating tempfiles. (Though we could probably just wrap the text in StringIO? Not 100% sure on this.)
  • ditch the DeepL Python library for this one type of translation, post the request manually, and get the result without ever storing it in a file. If DeepL significantly changes their API, we will have to adjust this manually, though I'm not sure that is very likely.

AFAIK image alt tags are the only thing that won't be translated with this solution. Maybe that is acceptable @osmers?

charludo · Dec 02 '22

write the description/content of a page/poi/event to a tempfile, send it to DeepL, and let the Python DeepL library automatically write the result to a new file; read it, delete both. This is probably the simpler solution, but I'm not crazy about constantly creating tempfiles. (Though we could probably just wrap the text in StringIO? Not 100% sure on this.)

As I understand the Readme of the Python client, translate_document() works with IO objects and does not write the files to disk?

Also, there is this tag_handling="html" option which can be passed to the translate_text() function, maybe also worth trying out? (seems to be in beta though)
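
If the beta option works as documented, the call would be as simple as this (sketch):

```python
import deepl

translator = deepl.Translator("auth-key")  # placeholder key

result = translator.translate_text(
    "<p>Das ist ein <b>Test</b>.</p>",
    target_lang="EN-US",
    tag_handling="html",  # marked as beta in the docs at the time of writing
)
print(result.text)
```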

But both options definitely sound better than doing the HTML parsing ourselves 😅

timobrembeck · Dec 02 '22

translate_document() works with IO objects and does not write the files to disk

Does it not? For that example, an input and output file are opened before calling translate_document, but it looks like that function then handles reading and writing? Anyways, doesn't really matter, I'll just try it out.

Also, there is this tag_handling="html" option which can be passed to the translate_text() function, maybe also worth trying out? (seems to be in beta though)

Once again, reading the entire docs seems like it would have been a good idea... 😅

charludo · Dec 02 '22

Does it not? For that example, an input and output file are opened before calling translate_document, but it looks like that function then handles reading and writing?

Exactly, in the example the files from the disk are converted to IO objects via with open(input_path, "rb") as in_file, but we can just skip this step and directly create in-memory IO objects.
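
Something like this, assuming translate_document() accepts arbitrary binary IO objects plus a filename hint for format detection (worth double-checking against the client version we pin):

```python
import io

import deepl

translator = deepl.Translator("auth-key")  # placeholder key

in_file = io.BytesIO("<p>Hallo Welt</p>".encode("utf-8"))
out_file = io.BytesIO()
translator.translate_document(
    in_file,
    out_file,
    target_lang="EN-US",
    filename="content.html",  # assumed hint so the payload is treated as HTML
)
translated_html = out_file.getvalue().decode("utf-8")
```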

Once again, reading the entire docs seems like it would have been a good idea... 😅

Well, yeah, don't ask me why I didn't notice this in the review of the initial implementation 🙈

timobrembeck · Dec 02 '22

AFAIK image alt tags are the only thing that won't be translated with this solution. Maybe that is acceptable @osmers?

I'm not even sure they get translated right now with MemoQ and DeepL translations - I'd have to check. Also, most municipalities don't use that many pictures, which is why it should be OK. Would it be something that can be changed in case there are a lot of complaints?

osmers · Dec 12 '22