
LLM: Standardized fields for LLM Security and protection [LLM detection rules category]

susan-shu-c opened this issue 1 year ago • 2 comments

Area(s)

area:gen-ai, llm

Is your change request related to a problem? Please describe.

Continuation of https://github.com/open-telemetry/semantic-conventions/issues/1007

To prevent threats to LLM systems, such as misuse, and to log content-filter activity, we propose standardized fields for secure and safe LLM usage, based on frameworks such as OWASP's LLM Top 10 and MITRE ATLAS.

For example, a user may be using various LLM vendors or their own deployments and wish to log all of them in a standardized manner. Our team has published a blog proposing standardized fields for LLM Security, led by @Mikaayenson.

Following the previous discussion in https://github.com/open-telemetry/semantic-conventions/issues/1007, and to make this proposal easier to move forward, here is a prioritized, narrowed-down subset of the proposed fields.

The code of our shipped detection rules can be viewed here:

Describe the solution you'd like

Proposing the following fields, which are used in our shipped detection rules. The rules detect and prevent DoS, inappropriate usage, and LLMJacking.

| Category | Field | Type | Description | Rules (details linked in above section) | Comments |
|---|---|---|---|---|---|
| Policy Enforcement Fields | gen_ai.policy.name | keyword | Name of the specific policy that was triggered. | aws_bedrock_guardrails_multiple_violations_in_single_request | |
| | gen_ai.policy.action | keyword | Action taken due to a policy violation, such as blocking, alerting, or modifying the content. | aws_bedrock_guardrails_multiple_violations_in_single_request, aws_bedrock_high_confidence_misconduct_blocks_detected | |
| | gen_ai.policy.confidence | float | Confidence level in the policy match that triggered the action, quantifying how closely the identified content matched the policy criteria. | aws_bedrock_high_confidence_misconduct_blocks_detected | |
| Compliance Fields | gen_ai.compliance.violation_detected | boolean | Indicates if any compliance violation was detected during the interaction. | aws_bedrock_guardrails_multiple_violations_by_single_user | |
| | gen_ai.compliance.violation_code | keyword | Code identifying the specific compliance rule that was violated. | aws_bedrock_high_confidence_misconduct_blocks_detected | |
| Performance Metric Fields | gen_ai.performance.request_size | long | Size of the request payload in bytes. | llm_dos_resource_exhaustion_detection | |
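As a rough illustration (not part of the proposal itself), a producer could attach these attributes to a telemetry record as a flat attribute map. The attribute names are the proposed ones; the values and the `validate_attributes` helper are made up for the sketch:

```python
# Hypothetical sketch: the proposed gen_ai.* security attributes as a flat
# attribute map, as an OpenTelemetry span or log record would carry them.
# All values below are illustrative only.

ATTRIBUTE_TYPES = {
    "gen_ai.policy.name": str,                      # keyword
    "gen_ai.policy.action": str,                    # keyword
    "gen_ai.policy.confidence": float,              # float
    "gen_ai.compliance.violation_detected": bool,   # boolean
    "gen_ai.compliance.violation_code": str,        # keyword
    "gen_ai.performance.request_size": int,         # long
}

def validate_attributes(attrs: dict) -> None:
    """Check that each proposed attribute uses its declared type."""
    for key, value in attrs.items():
        expected = ATTRIBUTE_TYPES.get(key)
        if expected is not None and not isinstance(value, expected):
            raise TypeError(
                f"{key} expects {expected.__name__}, got {type(value).__name__}"
            )

event_attrs = {
    "gen_ai.policy.name": "misconduct-filter",
    "gen_ai.policy.action": "BLOCKED",
    "gen_ai.policy.confidence": 0.97,
    "gen_ai.compliance.violation_detected": True,
    "gen_ai.compliance.violation_code": "MISCONDUCT",
    "gen_ai.performance.request_size": 2048,
}
validate_attributes(event_attrs)  # raises TypeError if a type is wrong
```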

Describe alternatives you've considered

An alternative would be to submit these fields only to ECS; however, since ECS was donated to OpenTelemetry, the standard practice is to discuss and propose such fields to OTel.

Additional context

No response

susan-shu-c · May 13 '24 18:05

Thanks for the discussion during the working group; I will address some feedback and create a PR. Capturing some of the points to answer/address below.

OTel already has something existing for user.id. We can reuse the docs and guidelines around it, as it can be PII.

  • https://opentelemetry.io/docs/specs/semconv/attributes-registry/enduser/
  • Will need to check if there is documentation on whether it's opt in or not
  • this enduser stays the same across sessions

Removing our gen_ai.performance.start_response_time, as the pending https://github.com/open-telemetry/semantic-conventions/pull/955 adds gen_ai.request.duration, which has start and end times from which this can be derived.
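For illustration, assuming a span records start and end timestamps in nanoseconds (as OpenTelemetry spans do), the duration can be computed rather than stored as a separate field:

```python
# Sketch: deriving a request duration from recorded start/end timestamps,
# instead of emitting a separate response-time field.
# Timestamps are nanoseconds since the epoch, as OpenTelemetry spans use.

def request_duration_seconds(start_time_ns: int, end_time_ns: int) -> float:
    """Duration in seconds, derived from span start/end timestamps."""
    return (end_time_ns - start_time_ns) / 1e9
```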

Our proposal's error code: perhaps we can reuse the existing error.type (spans/metrics/finish reason exists).

[Edit] Updated the main body of this issue with this feedback.

susan-shu-c · May 16 '24 15:05

Updating with discussions here.

Why gen_ai.performance.request_size vs token size?

gen_ai.performance.request_size measures the size of the actual content, which is different from token count. Token counts may depend on how the embedding is generated, which can differ from algorithm to algorithm. From a detections perspective, request size is more generic across all LLMs.
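A toy comparison of the two measures (a naive whitespace split stands in for a real tokenizer, which varies by model; both helper names are made up for this sketch):

```python
# Toy illustration of why payload byte size differs from token count.
# Real tokenizers (BPE, SentencePiece, etc.) are model-specific; a naive
# whitespace split stands in here only to show the two numbers diverge.

def request_size_bytes(payload: str) -> int:
    """gen_ai.performance.request_size: size of the content in bytes,
    the same for every LLM vendor."""
    return len(payload.encode("utf-8"))

def naive_token_count(payload: str) -> int:
    """Stand-in for a model-specific token count, which depends on the
    tokenization algorithm used."""
    return len(payload.split())

prompt = "Summarize the incident report in two sentences."
size = request_size_bytes(prompt)   # vendor-independent
tokens = naive_token_count(prompt)  # tokenizer-dependent
```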

Policy vs. compliance fields

"Compliance": aimed more toward external compliance; while "Policy" refers more to a user and organization's internal policies, internal system specific factors (AWS Bedrock guardrails).

What is the difference between error.code and response_finish_reason?

Azure's response has separate error.code and innererror.code fields, which aren't the same as finish_reason values such as stop.

susan-shu-c · Jul 31 '24 18:07

Hi, I am closing this issue, as 4 of the 6 originally proposed fields have been covered by the merged "add new namespace security_rule.*" #903. I've updated the table in the top-level description (Comments column), but here they are for reference:

| Field proposed by this issue | Covered by security_rule field |
|---|---|
| gen_ai.policy.name | security_rule.ruleset |
| gen_ai.policy.action | security_rule.name |
| gen_ai.compliance.violation_detected | security_rule.category (can propose others) |
| gen_ai.compliance.violation_code | security_rule.name |
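For reference, the mapping above can be expressed as a simple lookup (a sketch; the `migrate_attributes` helper is hypothetical, and the field names come from the table, not from any emitted telemetry):

```python
# Mapping from the fields originally proposed in this issue to the
# security_rule.* fields merged in #903, per the table above.
GEN_AI_TO_SECURITY_RULE = {
    "gen_ai.policy.name": "security_rule.ruleset",
    "gen_ai.policy.action": "security_rule.name",
    "gen_ai.compliance.violation_detected": "security_rule.category",
    "gen_ai.compliance.violation_code": "security_rule.name",
}

def migrate_attributes(attrs: dict) -> dict:
    """Rename proposed gen_ai.* keys to their security_rule.* equivalents;
    unmapped keys pass through unchanged."""
    return {GEN_AI_TO_SECURITY_RULE.get(k, k): v for k, v in attrs.items()}
```

Note that two proposed fields both map to security_rule.name, so a record carrying both would need disambiguation before migrating.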

susan-shu-c · Mar 25 '25 13:03

cc @peasead

susan-shu-c · Mar 25 '25 14:03