bug: Can't extract data from extractor

Open locez opened this issue 9 months ago • 9 comments

  • [x] I have looked for existing issues (including closed) about this

Bug Report

I use an extractor to get data from an LLM, but it always returns Err(NoData). Manually using an agent to get JSON and deserializing it into the struct works fine.
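
The manual path mentioned above is not shown in this report. As a rough sketch only, assuming the agent's reply is free-form text that may wrap the JSON in prose or a Markdown fence, a helper like the hypothetical `first_json_object` below could isolate the JSON object before it is handed to a deserializer such as `serde_json::from_str`. It uses only the standard library and is not part of rig.

```rust
/// Hypothetical helper (not part of rig): returns the first balanced
/// `{ ... }` span found in a model's free-text reply, or `None` if the
/// reply contains no complete JSON object.
fn first_json_object(reply: &str) -> Option<&str> {
    let start = reply.find('{')?;
    let mut depth = 0usize;
    let mut in_string = false;
    let mut escaped = false;

    for (offset, ch) in reply[start..].char_indices() {
        if in_string {
            // Inside a JSON string: braces are ignored, escapes are honored.
            if escaped {
                escaped = false;
            } else if ch == '\\' {
                escaped = true;
            } else if ch == '"' {
                in_string = false;
            }
            continue;
        }
        match ch {
            '"' => in_string = true,
            '{' => depth += 1,
            '}' => {
                depth -= 1;
                if depth == 0 {
                    // Include the closing brace in the returned slice.
                    return Some(&reply[start..start + offset + ch.len_utf8()]);
                }
            }
            _ => {}
        }
    }
    None
}
```

For a made-up reply such as `Sure: {"title": "Yuru Camp", "season": 3, "episode": 1}`, the helper returns only the brace-delimited span, which can then be deserialized into `VideoMetadata`.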

Reproduction

use rig::providers::openai;

#[derive(Debug, serde::Serialize, serde::Deserialize, schemars::JsonSchema)]
pub struct VideoMetadata {
    pub title: String,
    pub season: Option<u32>,
    pub episode: Option<u32>,
}

#[tokio::main]
async fn main() {
    // Uses the https://www.volcengine.com service.
    let client = openai::Client::from_url(
        "TOKEN",
        "https://ark.cn-beijing.volces.com/api/v3",
    );

    // deepseek-v3 and other models do not work either.
    let extractor = client
        .extractor::<VideoMetadata>("deepseek-v3-241226")
        // Preamble in English: "You are a professional video metadata extraction
        // assistant. Your task is to extract the title, season number, and episode
        // number from video file names. Please analyze the file name and extract
        // this information."
        .preamble("你是一个专业的视频元数据提取助手。你的任务是从视频文件名中提取标题、季数和集数。请分析文件名并提取这些信息。")
        .build();

    let filename =
        "[Airota&VCB-Studio] Yuru Camp Season 3 [01][Ma10p_1080p][x265_flac].mkv".to_string();

    // Prints Err(NoData) instead of the extracted metadata.
    let result = extractor.extract(&filename).await;
    println!("{:?}", result);
}

Cargo.toml

[package]
name = "bug-report"
version = "0.1.0"
edition = "2024"

[dependencies]
rig-core = "0.9.1"
schemars = "0.8.22"
serde = { version = "1.0.219", features = ["derive"] }
serde_json = "1.0.140"
tokio = { version = "1.44.1", features = ["full", "macros", "rt-multi-thread"] }

Expected behavior

The extractor should return the extracted VideoMetadata instead of an error.

Screenshots

Image

Additional context

locez, Mar 16 '25 09:03

Hi, can you provide a URL for the documentation of the service you are using? This appears to be an issue related to the API and not rig's extract API

joshua-mo-143, Mar 18 '25 00:03

Your API key has an error; it needs to be set as an environment variable to use the extractor.

deep60, Mar 21 '25 05:03

> Hi, can you provide a URL for the documentation of the service you are using? This appears to be an issue related to the API and not rig's extract API

I am using a third-party deepseek-v3-chat model service that provides an OpenAI-compatible API.

locez, Mar 21 '25 14:03

> Your API key has an error; it needs to be set as an environment variable to use the extractor.

let extractor = client
    .extractor::<VideoMetadata>("deepseek-v3-241226")
...

The client has already been initialized with the API key. Does an extractor obtained this way still not work?

locez, Mar 21 '25 14:03

> > Hi, can you provide a URL for the documentation of the service you are using? This appears to be an issue related to the API and not rig's extract API
>
> I am using a third-party deepseek-v3-chat model service that provides an OpenAI-compatible API.

Is there a URL for an API reference? I suspect the root cause might be that the API is not actually 100% compliant with the OpenAI API.
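
To make the suspicion concrete: one possible failure mode, stated here as an assumption rather than a confirmed diagnosis, is that the extractor expects the model to answer with a tool call and reports no data when the service returns plain text instead. The sketch below models that behavior with hypothetical types; `Content`, `ExtractError`, `submit_arguments`, and the tool name `submit` are illustrative and are not rig's actual definitions.

```rust
/// Simplified, hypothetical stand-in for one item of an assistant reply.
#[derive(Debug, PartialEq)]
enum Content {
    /// Plain text, e.g. JSON written directly into the message body.
    Text(String),
    /// A structured tool call with its JSON-encoded arguments.
    ToolCall { name: String, arguments: String },
}

/// Hypothetical error type mirroring the symptom in this issue.
#[derive(Debug, PartialEq)]
enum ExtractError {
    NoData,
}

/// Assumed name of the tool the model is expected to call.
const SUBMIT_TOOL: &str = "submit";

/// Returns the arguments of the first `submit` tool call in the reply.
/// A reply that contains only text yields `NoData`, even if that text
/// happens to be valid JSON.
fn submit_arguments(reply: &[Content]) -> Result<&str, ExtractError> {
    reply
        .iter()
        .find_map(|item| match item {
            Content::ToolCall { name, arguments } if name.as_str() == SUBMIT_TOOL => {
                Some(arguments.as_str())
            }
            _ => None,
        })
        .ok_or(ExtractError::NoData)
}
```

Under this model, a service that ignores the tool definition and replies with text would reproduce the reported error, while prompting an agent directly and parsing its text output would still work.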

joshua-mo-143, Mar 21 '25 17:03

API reference URL: https://www.volcengine.com/docs/82379/1298454

locez, Mar 23 '25 11:03

I cannot reproduce this, as I'm unable to obtain an API key for the service.

0xMochan, Apr 24 '25 02:04

Maybe you could tell me how to debug this or where to add some output, and I can try to fix it myself.

locez, May 01 '25 15:05

I don't have time to try this now, so I plan to close this issue for the time being. I have recently learned more ways to build agents, although those are based on Python. In the future, I will come back and try to build related applications with Rig, hopefully with a less crude approach than my earlier usage.

locez, Jul 03 '25 17:07