Azure AI Evaluation SDK v1.13.7: start_red_team_run() produces runs with target=null, causing completed external scans to be invisible in Foundry portal
Environment
- SDK & package: Azure AI Evaluation SDK `azure-ai-evaluation` v1.13.7
- Python: 3.10.x (also observed on 3.9.x)
- Auth: `DefaultAzureCredential()`
- Project architecture:
  - Legacy Hub: `MachineLearningServices/workspaces` (primary)
  - Also tested with Foundry: `CognitiveServices/accounts` via project endpoint
- Region: (e.g., eastus)
- OS: Windows 11 / Ubuntu 22.04 (observed on both)
- Portal: Azure AI Foundry (Evaluations → AI red teaming)
Description

When initiating AI Red Teaming scans against non-Azure-hosted targets (external callbacks, Kong, LM Studio, etc.) using `RedTeam.start_red_team_run()` / `scan()`, the SDK uploads a `RedTeamUpload` that contains a `displayName` but leaves the `target` field as `null`.
The run completes successfully and is retrievable via az rest. The run does not appear in the Foundry portal’s AI red teaming list, which appears to filter on target. Portal‑initiated runs (GUI) against Foundry‑deployed models populate target and display correctly. All 10 SDK‑initiated external runs show target=null and are invisible despite status=Completed.
This behavior is reproducible primarily in legacy Hub projects; the Foundry architecture using `AIProjectClient` with model configs sets `target` and shows up, but the external callback path remains affected.

Impact
- Portal visibility gap: Completed external scans are invisible in the Foundry UI, breaking centralized reporting, review workflows, and compliance evidence gathering.
- Operational friction: Teams must use `az rest`/raw APIs to find runs; portal drill-downs (risk breakdown, row-level attack/response) are unavailable.
- CI/CD & governance: Automation that expects portal surfacing of evaluation artifacts cannot rely on the SDK for external targets.
Repro steps
Create a `RedTeam` using the Evaluation SDK pointing to a Hub project (legacy ML workspace) or Foundry project endpoint:

```python
from azure.ai.evaluation.red_team import RedTeam, RiskCategory
from azure.identity import DefaultAzureCredential

azure_ai_project = {
    "subscription_id": "...",
    "resource_group_name": "...",
    "project_name": "...",
}  # or Foundry project URL

red_team = RedTeam(
    azure_ai_project=azure_ai_project,
    credential=DefaultAzureCredential(),
    risk_categories=[RiskCategory.Violence, RiskCategory.HateUnfairness],
)

# External (non-Azure-hosted) callback target
async def external_callback(prompt: str) -> str:
    return "external system response"
```
Confirm completion via CLI:

```shell
az rest --method get --url "https://{account}.services.ai.azure.com/api/projects/{project}/redteams/runs/{runId}"
```
Observe payload:

```json
{
  "id": "4b4ce898-045f-48b3-9335-9b18c338a97f",
  "displayName": "scan_20251205_195954",
  "target": null,
  "status": "Completed"
}
```
Open Foundry portal → Project → Evaluations → AI red teaming.

- Actual: Run does not appear.
- Expected: Completed run should be listed with drill-down.
Expected behavior

For any completed red team run, including external callback targets, the SDK should persist a non-null `target` descriptor (or the portal should not require `target` for visibility). The run should appear in the Foundry portal's AI red teaming tab with full drill-down.

Actual behavior
`RedTeamUpload.target == null` for SDK-initiated external scans. The portal hides those runs; only GUI-initiated runs, or SDK runs with explicit target configs, are displayed.
Additional observations & references
Foundry docs show portal visibility when a target is set via AIProjectClient and model configuration; external callback examples do not demonstrate a persisted target value, aligning with our observations.
- Cloud run with `AIProjectClient` and target config: Learn: "Run AI Red Teaming Agent in the cloud" (payload includes `target`)
- Local agent & external callback pattern (no persisted `target` shown): Learn: "Run AI Red Teaming Agent locally"
- Portal viewing guidance (assumes run is logged with usable metadata): Learn: "View AI red teaming results" / "Run scans…"
Hypothesis
The SDK’s serialization path for external callback targets does not construct a target object (e.g., ExternalCallback or GenericEndpoint), leaving it null. The Foundry portal UI filters/queries rely on target to display runs and do not include runs with target=null.
Requested fix
SDK enhancement:
- Provide a `target` schema for external/non-Azure targets (e.g., `type: ExternalCallback`, with identifier fields), and populate it when `scan(target=callable)` is used.
- Alternatively, allow setting a `target` descriptor explicitly in `RedTeam.scan()`/`start_red_team_run()` for callbacks.
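To make the first option concrete, here is a minimal sketch of what a serializable descriptor for a callback target could look like. The `ExternalCallback` type name and the `identifier` field are hypothetical assumptions, not part of the current API schema:

```python
# Hypothetical sketch only: the "ExternalCallback" discriminator value and the
# "identifier" field are assumptions; no such TargetConfig subtype exists today.
from typing import Any, Callable


def build_external_target_descriptor(callback: Callable[..., Any]) -> dict:
    """Derive a minimal, serializable target descriptor from a callback."""
    return {
        "type": "ExternalCallback",
        "identifier": getattr(callback, "__qualname__", repr(callback)),
    }


async def external_callback(prompt: str) -> str:
    return "external system response"


descriptor = build_external_target_descriptor(external_callback)
```

Even a descriptor this thin would give the portal something non-null to key on while preserving the callback's identity for auditing.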
Portal tolerance (optional):
Do not filter out runs where target == null; display them with a generic “External target” label, so results remain discoverable.
Backfill/patch API (nice to have):
Support updating run metadata to attach a target after creation, enabling visibility for already completed runs.
Workarounds we tested
- `AIProjectClient` with model configuration → visible (sets `target`).
- External callback with SDK → invisible (no `target`).
- Manual review via `az rest` → possible but not viable for teams relying on portal analytics.
Logs / payloads
Example run (IDs sanitized above). We can provide full request/response traces privately if needed.
Severity High for customers using external endpoints: prevents usage of Foundry portal for red teaming result review, reporting, and governance.
Thanks for the feedback! We are routing this to the appropriate team for follow-up. cc @luigiw @needuv @singankit.
@ctava-msft red teaming will not work in eastus region. Please use a project in a supported region: https://learn.microsoft.com/en-us/azure/ai-foundry/how-to/develop/run-scans-ai-red-teaming-agent?view=foundry&preserve-view=true#region-support . Once you have switched regions please confirm if the issue still occurs. The UI does not filter by target, so you should be unblocked once you are in a supported region.
Hi @ctava-msft. Thank you for opening this issue and giving us the opportunity to assist. To help our team better understand your issue and the details of your scenario please provide a response to the question asked above or the information requested above. This will help us more accurately address your issue.
@slister1001 @ctava-msft
Comprehensive Testing Results & SDK Code Trace
We completed extensive A/B testing across multiple dimensions and traced the issue to a specific location in the SDK code. This response provides evidence that the issue is not related to regional support or external vs internal targets.
1. Test Environment
| Component | Value |
|---|---|
| SDK Version | azure-ai-evaluation==1.13.7 |
| Python | 3.12.8 |
| Architecture | New Foundry (Microsoft.CognitiveServices/accounts with AIServices kind) |
| Regions Tested | East US 2 (recommended), Sweden Central (recommended) |
2. A/B Test Matrix
We tested 4 combinations to isolate variables:
| Test | Region | Target Type | Target Endpoint | SDK Console Output | API `target` Field |
|---|---|---|---|---|---|
| A | East US 2 | External | LM Studio (localhost:1234) | `Track in AI Foundry: None` | `null` |
| A2 | East US 2 | External | LM Studio (localhost:1234) | `Track in AI Foundry: None` | `null` |
| B | East US 2 | Foundry-hosted | Azure OpenAI gpt-4o-mini | `Track in AI Foundry: None` | `null` |
| B2 | Sweden Central | Foundry-hosted | Azure OpenAI gpt-4o-mini | `Track in AI Foundry: None` | `null` |
Test B/B2 Setup (Foundry-Hosted Target)
To isolate whether the issue was specific to external callbacks, we deployed gpt-4o-mini in the same Foundry project and created a callback targeting it:
```python
import os

import httpx
from azure.ai.evaluation.red_team import RedTeam, RiskCategory
from azure.identity import DefaultAzureCredential

# Deployed gpt-4o-mini in same Foundry project
AZURE_OPENAI_ENDPOINT = "https://airtagent-ai-eastus2.openai.azure.com"
AZURE_OPENAI_API_KEY = os.environ["AZURE_OPENAI_API_KEY"]
DEPLOYMENT_NAME = "gpt-4o-mini"

def create_azure_openai_callback():
    def callback(query: str) -> str:
        url = f"{AZURE_OPENAI_ENDPOINT}/openai/deployments/{DEPLOYMENT_NAME}/chat/completions?api-version=2024-08-01-preview"
        headers = {"Content-Type": "application/json", "api-key": AZURE_OPENAI_API_KEY}
        payload = {"messages": [{"role": "user", "content": query}], "temperature": 0.7, "max_tokens": 2048}
        with httpx.Client(timeout=120.0) as client:
            response = client.post(url, json=payload, headers=headers)
            response.raise_for_status()
            return response.json()["choices"][0]["message"]["content"]
    return callback

# SDK scan using Foundry-hosted model as target
red_team = RedTeam(
    azure_ai_project="https://airtagent-ai-eastus2.services.ai.azure.com/api/projects/redteam",
    credential=DefaultAzureCredential(),
    risk_categories=[RiskCategory.Violence],
)
await red_team.scan(target=create_azure_openai_callback(), scan_name="test-foundry-target", num_objectives=3)
```
Result
Both external AND Foundry-hosted targets produce target=null. The issue is not related to where the target is hosted.
3. Variables Eliminated
| Hypothesis | Tested | Result |
|---|---|---|
| External targets cause `target=null` | ✅ Tested Foundry-hosted Azure OpenAI | Same behavior |
| Non-recommended regions cause issue | ✅ Tested East US 2 + Sweden Central | Same behavior |
| Legacy Hub architecture causes issue | ✅ Tested new CognitiveServices/accounts | Same behavior |
| Regional API differences | ✅ Both regions accept SDK uploads | Same behavior |
4. Version History Analysis
We analyzed the SDK across multiple versions:
| Version | `red_team` Module | `_mlflow_integration.py` | `RedTeamUpload.target` Field | Code Populates `target`? |
|---|---|---|---|---|
| 1.0.0 | ❌ Not present | N/A | N/A | N/A |
| 1.8.0 | ✅ Present | ❌ Not present | N/A | N/A |
| 1.12.0 | ✅ Present | ✅ Present | ❌ Field NOT in model | N/A |
| 1.13.0 | ✅ Present | ✅ Present | ✅ Field ADDED to model | ❌ NO |
| 1.13.7 | ✅ Present | ✅ Present | ✅ Present | ❌ NO |
Finding: The target field was added to RedTeamUpload model in v1.13.0, but the SDK code that creates RedTeamUpload instances was never updated to populate it.
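A quick way to confirm which behavior an installed version exhibits is to introspect the model class. The internal module path below is an assumption based on the file names cited in this report and may differ between releases, so treat this as a sketch:

```python
# Sketch: check whether the installed SDK's RedTeamUpload model declares a
# `target` attribute (added in 1.13.0 per the table above). The internal
# module path "azure.ai.evaluation.red_team._models" is an assumption.
import importlib


def redteamupload_declares_target():
    """Return True/False when the model can be inspected, None if unavailable."""
    try:
        models = importlib.import_module("azure.ai.evaluation.red_team._models")
        upload_cls = getattr(models, "RedTeamUpload")
    except (ImportError, AttributeError):
        return None
    return hasattr(upload_cls, "target")


result = redteamupload_declares_target()
```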
5. SDK Code Trace
Location: _mlflow_integration.py lines 111-116
```python
if self._one_dp_project:
    response = self.generated_rai_client._evaluation_onedp_client.start_red_team_run(
        red_team=RedTeamUpload(
            display_name=run_name or f"redteam-agent-{datetime.now().strftime('%Y%m%d-%H%M%S')}",
        )  # <-- ONLY display_name is passed
    )
```
The RedTeamUpload Model (_models.py lines 5074-5148)
```python
class RedTeamUpload(_Model):
    display_name: Optional[str] = rest_field(name="displayName", ...)
    target: Optional["_models.TargetConfig"] = rest_field(...)  # <-- Field exists but not populated
```
Additional Context
`MLflowIntegration.__init__()` (lines 44-62) does not receive target information from `RedTeam.scan()`. The target would need to be passed through the call chain.
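One possible shape for that plumbing, sketched with hypothetical names (`describe_target`, its dict keys, and the `ExternalCallback` type are assumptions, not part of the SDK):

```python
# Hypothetical patch sketch: derive a serializable descriptor from whatever
# scan(target=...) received, so the upload path could populate
# RedTeamUpload.target. All names and the "ExternalCallback" type are assumed.
from typing import Any, Optional


def describe_target(target: Any) -> Optional[dict]:
    if callable(target):
        # No TargetConfig subtype exists for callbacks today (see Section 6);
        # the service would need to accept a new discriminator value.
        return {
            "type": "ExternalCallback",
            "identifier": getattr(target, "__qualname__", repr(target)),
        }
    if isinstance(target, dict) and "azure_deployment" in target:
        # Matches the documented AzureOpenAIModelConfiguration shape.
        return {
            "type": "AzureOpenAIModel",
            "modelDeploymentName": target["azure_deployment"],
        }
    return None
```

The callback branch is the gap: until the API accepts a callback-shaped subtype, there is nothing valid for the SDK to emit there.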
6. API Schema Analysis
We reviewed the API specification to understand what target values are valid.
Source: TypeSpec Definition
```typespec
@discriminator("type")
model TargetConfig {
  type: string;
}

model AzureOpenAIModelConfiguration extends TargetConfig {
  type: "AzureOpenAIModel";
  modelDeploymentName: string;
}
```
Source: REST API - Red Teams Create
The API documentation confirms TargetConfig uses a discriminator pattern with AzureOpenAIModelConfiguration as the only documented subtype.
Implication
The API schema currently has one TargetConfig subtype: AzureOpenAIModelConfiguration (discriminator: "AzureOpenAIModel"). There is no subtype for callback-based or external endpoint targets.
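For contrast, a sketch of the one payload shape the schema can represent today, a populated `target` for a Foundry-hosted deployment, with the discriminator and field name taken from the TypeSpec above:

```python
# Sketch of the only documented TargetConfig subtype
# (AzureOpenAIModelConfiguration); field names come from the TypeSpec above.
def build_azure_openai_target(deployment_name: str) -> dict:
    return {"type": "AzureOpenAIModel", "modelDeploymentName": deployment_name}


target = build_azure_openai_target("gpt-4o-mini")
```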
7. Root Cause Analysis
The issue has two layers:
| Layer | Issue | Evidence |
|---|---|---|
| SDK | `_mlflow_integration.py` doesn't populate `target` field | Code trace in Section 5 |
| API | No `TargetConfig` subtype exists for callback targets | TypeSpec shows only `AzureOpenAIModelConfiguration` |
For callback-based scans (the SDK's primary use case for external LLMs), there is currently no valid TargetConfig type to populate.
8. API Evidence
Runs are accessible via REST API but invisible in portal:
```json
{
  "value": [
    {
      "id": "6fd37129-c620-4f61-8578-058378bcff4f",
      "displayName": "test-166-foundry-target-20251213-174501",
      "target": null,
      "status": "Completed",
      "properties": {
        "AiStudioEvaluationUri": "https://ai.azure.com/resource/build/redteaming/..."
      }
    }
  ]
}
```
Portal shows: "No red teams found"
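As a stopgap, the hidden runs can still be enumerated over REST and filtered locally. A sketch of the filtering step, using the list-payload shape shown above (fetching and auth are left out, as in the `az rest` call from the repro steps):

```python
# Workaround sketch: given the REST list payload (shape shown above), pick out
# runs the portal hides: status "Completed" but a null target.
def find_hidden_runs(payload: dict) -> list:
    return [
        run for run in payload.get("value", [])
        if run.get("status") == "Completed" and run.get("target") is None
    ]


sample = {
    "value": [
        {
            "id": "6fd37129-c620-4f61-8578-058378bcff4f",
            "displayName": "test-166-foundry-target-20251213-174501",
            "target": None,
            "status": "Completed",
        },
    ]
}
hidden = find_hidden_runs(sample)
```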
9. Questions
- Portal visibility: You mentioned "The UI does not filter by target." What field(s) determine whether a run appears in the portal? Our runs have `status: "Completed"` and a valid `AiStudioEvaluationUri` but remain invisible.
- Callback target support: Is there a plan to add a `TargetConfig` subtype for callbacks? The SDK documentation shows callbacks as a primary target type, but the API has no way to represent them.
- Acceptable behavior for `target=null`: Should SDK-initiated callback scans display in the portal with a placeholder (e.g., "Custom Target") rather than being hidden?
Summary
| Finding | Evidence |
|---|---|
| Issue is NOT regional | Tested East US 2 + Sweden Central |
| Issue is NOT external vs internal targets | Tested LM Studio + Azure OpenAI gpt-4o-mini |
| Issue is NOT Legacy vs New Foundry | Tested new CognitiveServices/accounts |
| SDK doesn't populate `target` | `_mlflow_integration.py:112-116` |
| API has no callback target type | TypeSpec shows only `AzureOpenAIModelConfiguration` |
Hi @ctava-msft, we're sending this friendly reminder because we haven't heard back from you in 7 days. We need more information about this issue to help address it. Please be sure to give us your input. If we don't hear back from you within 14 days of this comment the issue will be automatically closed. Thank you!