
Required lcel-teacher-eval dataset configuration for langgraph_code_assistant.ipynb is unclear

Open · donbr opened this issue Mar 04 '24

Checked other resources

  • [X] I added a very descriptive title to this issue.
  • [X] I searched the LangChain documentation with the integrated search.
  • [X] I used the GitHub search to find a similar question and didn't find it.
  • [X] I am sure that this is a bug in LangChain rather than my code.

Example Code

The following code:

import uuid

from langsmith import Client

client = Client()

# `chain_base_case` and `evaluation_config` are defined earlier in the notebook.
# Run eval on base chain
run_id = uuid.uuid4().hex[:4]
project_name = "context-stuffing-no-langgraph"
client.run_on_dataset(
    dataset_name="lcel-teacher-eval",
    llm_or_chain_factory=lambda: (lambda x: x["question"]) | chain_base_case,
    evaluation=evaluation_config,
    project_name=f"{run_id}-{project_name}",
)

Error Message and Stack Trace (if applicable)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[23], line 38
     36 run_id = uuid.uuid4().hex[:4]
     37 project_name = "context-stuffing-with-langgraph"
---> 38 client.run_on_dataset(
     39     dataset_name="lcel-teacher-eval",
     40     llm_or_chain_factory=model,
     41     evaluation=evaluation_config,
     42     project_name=f"{run_id}-{project_name}",
     43 )

File ~/forks/langchain/langgraph/.venv/lib/python3.10/site-packages/langsmith/client.py:3394, in Client.run_on_dataset(self, dataset_name, llm_or_chain_factory, evaluation, concurrency_level, project_name, project_metadata, verbose, tags, input_mapper, revision_id)
   3389 except ImportError:
   3390     raise ImportError(
   3391         "The client.run_on_dataset function requires the langchain"
   3392         "package to run.\nInstall with pip install langchain"
   3393     )
-> 3394 return _run_on_dataset(
   3395     dataset_name=dataset_name,
   3396     llm_or_chain_factory=llm_or_chain_factory,
   3397     concurrency_level=concurrency_level,
   3398     client=self,
   3399     evaluation=evaluation,
   3400     project_name=project_name,
   3401     project_metadata=project_metadata,
   3402     verbose=verbose,
   3403     tags=tags,
   3404     input_mapper=input_mapper,
   3405     revision_id=revision_id,
   3406 )

File ~/forks/langchain/langgraph/.venv/lib/python3.10/site-packages/langchain/smith/evaluation/runner_utils.py:1297, in run_on_dataset(client, dataset_name, llm_or_chain_factory, evaluation, concurrency_level, project_name, project_metadata, verbose, tags, revision_id, **kwargs)
   1289     warn_deprecated(
   1290         "0.0.305",
   1291         message="The following arguments are deprecated and "
   (...)
   1294         removal="0.0.305",
   1295     )
   1296 client = client or Client()
-> 1297 container = _DatasetRunContainer.prepare(
   1298     client,
   1299     dataset_name,
   1300     llm_or_chain_factory,
   1301     project_name,
   1302     evaluation,
   1303     tags,
   1304     input_mapper,
   1305     concurrency_level,
   1306     project_metadata=project_metadata,
   1307     revision_id=revision_id,
   1308 )
   1309 if concurrency_level == 0:
   1310     batch_results = [
   1311         _run_llm_or_chain(
   1312             example,
   (...)
   1317         for example, config in zip(container.examples, container.configs)
   1318     ]

File ~/forks/langchain/langgraph/.venv/lib/python3.10/site-packages/langchain/smith/evaluation/runner_utils.py:1125, in _DatasetRunContainer.prepare(cls, client, dataset_name, llm_or_chain_factory, project_name, evaluation, tags, input_mapper, concurrency_level, project_metadata, revision_id)
   1123         project_metadata = {}
   1124     project_metadata.update({"revision_id": revision_id})
-> 1125 wrapped_model, project, dataset, examples = _prepare_eval_run(
   1126     client,
   1127     dataset_name,
   1128     llm_or_chain_factory,
   1129     project_name,
   1130     project_metadata=project_metadata,
   1131     tags=tags,
   1132 )
   1133 tags = tags or []
   1134 for k, v in (project.metadata.get("git") or {}).items():

File ~/forks/langchain/langgraph/.venv/lib/python3.10/site-packages/langchain/smith/evaluation/runner_utils.py:971, in _prepare_eval_run(client, dataset_name, llm_or_chain_factory, project_name, project_metadata, tags)
    969 examples = list(client.list_examples(dataset_id=dataset.id))
    970 if not examples:
--> 971     raise ValueError(f"Dataset {dataset_name} has no example rows.")
    973 try:
    974     git_info = get_git_info()

ValueError: Dataset lcel-teacher-eval has no example rows.

Description

  • I'm trying to compare the behavior and quality of code assistants with and without LangGraph.
  • The major roadblock is proper configuration of the LangSmith dataset.
  • I had to manually create the lcel-teacher-eval dataset to get past an initial error saying the dataset didn't exist.
    • Currently I'm getting the error captured above.
    • I don't remember the specific difference in configuration (or dependencies), but at one point it was writing to my personal lcel-teacher-eval dataset consistently. In that case, the issue was that the input element was captured as input:input rather than input:question (a sketch of seeding the dataset with the expected question key follows below).
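For reference, a minimal sketch of seeding the dataset by hand so that each example's input key is question rather than input; the sample row is a placeholder, not the actual eval data:

from langsmith import Client

client = Client()

# Create the dataset if it doesn't already exist.
dataset = client.create_dataset(
    dataset_name="lcel-teacher-eval",
    description="Questions about LCEL for evaluating code assistants",
)

# Seed it with rows keyed on "question" so run_on_dataset passes inputs in
# the shape the chain expects (avoiding the input:input vs input:question issue).
client.create_example(
    inputs={"question": "How do I build a RAG chain with LCEL?"},  # placeholder row
    dataset_id=dataset.id,
)

Alternatively, run_on_dataset accepts an input_mapper argument (visible in its signature in the trace above) for adapting dataset inputs to the chain's expected shape.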

System Info

System Information
------------------
> OS:  Linux
> OS Version:  #1 SMP Thu Oct 5 21:02:42 UTC 2023
> Python Version:  3.10.13 (main, Feb  7 2024, 15:27:48) [GCC 11.4.0]

Package Information
-------------------
> langchain_core: 0.1.28
> langchain: 0.1.8
> langchain_community: 0.0.21
> langsmith: 0.1.4
> langchain_openai: 0.0.8
> langchainhub: 0.1.14
> langgraph: 0.0.26

Packages not installed (Not Necessarily a Problem)
--------------------------------------------------
The following packages were not found:

> langserve

donbr commented Mar 04 '24 05:03

@rlancemartin - I created an issue for langgraph_code_assistant.ipynb based on our discussion on X. The issue isn't with LangGraph behavior; it's with the integration with the (personal) LangSmith dataset. This is a critical use case, and I'm hoping to have a working code example to go with it.

What I expected to see on the referenced public dataset was the ability to clone it to a personal dataset. Does that capability exist, and did I just miss it?

donbr commented Mar 04 '24 05:03

There was a commit midday on Sunday that overwrote some of your changes and changed the behavior. (It was working better prior to the changes.)

https://github.com/langchain-ai/langgraph/commits/main/examples/code_assistant/langgraph_code_assistant.ipynb

LangSmith tracing was configured prior to the change and I was able to add additional items to the dataset from the trace.
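For context, a sketch of the tracing setup that was in place, assuming the standard LangSmith environment variables; the project name is a placeholder:

import os

# Enable LangSmith tracing so runs are captured and can be added to a dataset.
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = "<your-langsmith-api-key>"
os.environ["LANGCHAIN_PROJECT"] = "langgraph-code-assistant"  # placeholder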

donbr commented Mar 04 '24 05:03

@rlancemartin when you have a moment, please:

  1. Move the dataset to the langchain-blogs tenant (so we don't accidentally delete it at some point)
  2. Publish the dataset (share it as a public dataset)
  3. Add a client.clone_public_dataset() call with the URL at the top of the notebook to make it usable externally (a sketch of that call follows below)
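For reference, a minimal sketch of that call, assuming the langsmith client's clone_public_dataset method; the share URL below is a placeholder for the actual public link:

from langsmith import Client

client = Client()

# Clone the shared public dataset into the personal tenant so the notebook's
# run_on_dataset call finds a populated lcel-teacher-eval dataset.
client.clone_public_dataset(
    "https://smith.langchain.com/public/<share-token>/d",  # placeholder URL
)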

hinthornw commented Mar 04 '24 17:03

thanks @hinthornw . this is a great use of langgraph and langsmith. appreciate your comments and follow-up

donbr commented Mar 04 '24 23:03

Will do!

rlancemartin commented Mar 06 '24 01:03

Cannot emphasize this enough ♥️♥️♥️

I've been following the YouTube channel and working through many of the notebooks (more than 800 in the repo now!), and this one is the most interesting:

  • LangGraph integration
  • LangSmith used with evals
  • A public evals dataset that the community can use to start creating benchmarks across the different components of chains and graphs

Hopefully, an evals leaderboard integrated into LangSmith is in the pipeline 🤞

Keep up the great work! I'm a huge fanboy

ht55ght55 commented Mar 06 '24 19:03