Revise support for UTF-8-valid string truncation
Problem Statement
As discussed in https://github.com/open-telemetry/opentelemetry-proto/issues/426, the specification does not currently make clear how to truncate string-valued attributes that exceed the specified limits, and yet the OTLP protobuf encoding requires valid UTF-8.
While https://github.com/open-telemetry/opentelemetry-go/pull/3156 offered a quick fix meant to alleviate the pain for users, this deserves careful consideration. It is possible to implement an O(1) truncation, if that is desired, although even with UTF-8-correct truncation, users can still enter invalid UTF-8 that we have not specified how to handle.
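To make the O(1) option concrete, here is a minimal sketch. It assumes the limit is counted in bytes (the specification does not settle this) and that a negative limit means "no limit"; `truncateBytes` is a hypothetical name, not an existing SDK function. Because a UTF-8 sequence is at most `utf8.UTFMax` (4) bytes long, finding the nearest character boundary at or before the limit takes at most three backward steps, regardless of string length.

```go
package main

import "unicode/utf8"

// truncateBytes is a hypothetical helper: it cuts s to at most limit bytes
// without splitting a multi-byte UTF-8 sequence. It inspects at most
// utf8.UTFMax bytes, so the cost does not depend on len(s).
// Assumption: a negative limit means "no limit".
func truncateBytes(s string, limit int) string {
	if limit < 0 || len(s) <= limit {
		return s
	}
	// s[limit] is the first byte that would be dropped. If it is a
	// continuation byte, back up to the start of the sequence it belongs to.
	for i := limit; i > limit-utf8.UTFMax && i >= 0; i-- {
		if utf8.RuneStart(s[i]) {
			return s[:i]
		}
	}
	// More than three continuation bytes in a row: the input was already
	// invalid UTF-8, so this sketch simply cuts at the byte limit.
	return s[:limit]
}
```

For example, `truncateBytes("héllo", 2)` yields `"h"` rather than a string ending in half of `é`. The sketch performs no validation, so invalid input can still produce invalid output.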
Proposed Solution
In the next release cycle (after 1.10.x), consider either faster support with less validation (i.e., an O(1) truncation approach) or a more comprehensive approach to validation (i.e., ensuring valid UTF-8 for all strings, not only truncated attribute values).
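As a rough illustration of the more comprehensive option, the sketch below validates every string and truncates by character count. The names `sanitize` and `truncateRunes` are hypothetical, and replacing invalid bytes with U+FFFD is only one possible policy (dropping them or rejecting the value are others); none of this is specified today. Both functions are O(n) in the length of the string.

```go
package main

import (
	"strings"
	"unicode/utf8"
)

// sanitize is a hypothetical helper: it returns s unchanged when it is
// already valid UTF-8, and otherwise replaces invalid byte sequences with
// the Unicode replacement character U+FFFD (an assumed policy).
func sanitize(s string) string {
	if utf8.ValidString(s) {
		return s
	}
	return strings.ToValidUTF8(s, "\uFFFD")
}

// truncateRunes is a hypothetical helper: it keeps at most limit
// characters (runes) of s. Assumption: a negative limit means "no limit".
func truncateRunes(s string, limit int) string {
	if limit < 0 {
		return s
	}
	n := 0
	// Ranging over a string yields the byte index of each rune start.
	for i := range s {
		if n == limit {
			return s[:i]
		}
		n++
	}
	return s
}
```

Applying `sanitize` before `truncateRunes` would guarantee valid UTF-8 output for any input, at the cost of scanning every string rather than only those that exceed the limit.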
Alternatives
Discussed in https://github.com/open-telemetry/opentelemetry-proto/issues/426#issuecomment-1242337687
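I asked the OTel-Java group how this is handled. Because the Java String.substring() method indexes by UTF-16 code units (Java chars) rather than bytes, it never cuts through the middle of a UTF-8 byte sequence, so I believe it roughly matches the behavior introduced in #3156. One difference is that it can still split a surrogate pair for characters outside the Basic Multilingual Plane.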