Required lcel-teacher-eval dataset configuration for langgraph_code_assistant.ipynb is unclear
Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the LangChain documentation with the integrated search.
- [X] I used the GitHub search to find a similar question and didn't find it.
- [X] I am sure that this is a bug in LangChain rather than my code.
Example Code
The following code produces the error shown below:
import uuid

from langsmith import Client

client = Client()

# Run eval on the base chain; chain_base_case and evaluation_config are
# defined earlier in the notebook.
run_id = uuid.uuid4().hex[:4]
project_name = "context-stuffing-no-langgraph"
client.run_on_dataset(
    dataset_name="lcel-teacher-eval",
    # Wrap the chain so each dataset row's "question" field is passed as input
    llm_or_chain_factory=lambda: (lambda x: x["question"]) | chain_base_case,
    evaluation=evaluation_config,
    project_name=f"{run_id}-{project_name}",
)
Error Message and Stack Trace (if applicable)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[23], line 38
     36 run_id = uuid.uuid4().hex[:4]
     37 project_name = "context-stuffing-with-langgraph"
---> 38 client.run_on_dataset(
     39     dataset_name="lcel-teacher-eval",
     40     llm_or_chain_factory=model,
     41     evaluation=evaluation_config,
     42     project_name=f"{run_id}-{project_name}",
     43 )

File ~/forks/langchain/langgraph/.venv/lib/python3.10/site-packages/langsmith/client.py:3394, in Client.run_on_dataset(self, dataset_name, llm_or_chain_factory, evaluation, concurrency_level, project_name, project_metadata, verbose, tags, input_mapper, revision_id)
   3389 except ImportError:
   3390     raise ImportError(
   3391         "The client.run_on_dataset function requires the langchain"
   3392         "package to run.\nInstall with pip install langchain"
   3393     )
-> 3394 return _run_on_dataset(
   3395     dataset_name=dataset_name,
   3396     llm_or_chain_factory=llm_or_chain_factory,
   3397     concurrency_level=concurrency_level,
   3398     client=self,
   3399     evaluation=evaluation,
   3400     project_name=project_name,
   3401     project_metadata=project_metadata,
   3402     verbose=verbose,
   3403     tags=tags,
   3404     input_mapper=input_mapper,
   3405     revision_id=revision_id,
   3406 )

File ~/forks/langchain/langgraph/.venv/lib/python3.10/site-packages/langchain/smith/evaluation/runner_utils.py:1297, in run_on_dataset(client, dataset_name, llm_or_chain_factory, evaluation, concurrency_level, project_name, project_metadata, verbose, tags, revision_id, **kwargs)
   1289 warn_deprecated(
   1290     "0.0.305",
   1291     message="The following arguments are deprecated and "
   (...)
   1294     removal="0.0.305",
   1295 )
   1296 client = client or Client()
-> 1297 container = _DatasetRunContainer.prepare(
   1298     client,
   1299     dataset_name,
   1300     llm_or_chain_factory,
   1301     project_name,
   1302     evaluation,
   1303     tags,
   1304     input_mapper,
   1305     concurrency_level,
   1306     project_metadata=project_metadata,
   1307     revision_id=revision_id,
   1308 )
   1309 if concurrency_level == 0:
   1310     batch_results = [
   1311         _run_llm_or_chain(
   1312             example,
   (...)
   1317         for example, config in zip(container.examples, container.configs)
   1318     ]

File ~/forks/langchain/langgraph/.venv/lib/python3.10/site-packages/langchain/smith/evaluation/runner_utils.py:1125, in _DatasetRunContainer.prepare(cls, client, dataset_name, llm_or_chain_factory, project_name, evaluation, tags, input_mapper, concurrency_level, project_metadata, revision_id)
   1123     project_metadata = {}
   1124 project_metadata.update({"revision_id": revision_id})
-> 1125 wrapped_model, project, dataset, examples = _prepare_eval_run(
   1126     client,
   1127     dataset_name,
   1128     llm_or_chain_factory,
   1129     project_name,
   1130     project_metadata=project_metadata,
   1131     tags=tags,
   1132 )
   1133 tags = tags or []
   1134 for k, v in (project.metadata.get("git") or {}).items():

File ~/forks/langchain/langgraph/.venv/lib/python3.10/site-packages/langchain/smith/evaluation/runner_utils.py:971, in _prepare_eval_run(client, dataset_name, llm_or_chain_factory, project_name, project_metadata, tags)
    969 examples = list(client.list_examples(dataset_id=dataset.id))
    970 if not examples:
--> 971     raise ValueError(f"Dataset {dataset_name} has no example rows.")
    973 try:
    974     git_info = get_git_info()

ValueError: Dataset lcel-teacher-eval has no example rows.
Description
- I'm trying to compare the behavior and quality of code assistants both with and without LangGraph.
- The major roadblock is the proper configuration of the LangSmith dataset.
- I had to manually create the lcel-teacher-eval dataset to get past an initial error about the dataset not existing (see the sketch after this list).
- Currently I'm getting the error captured above.
- I don't remember the specific difference in configuration (or dependencies), but at one point runs were writing to my personal lcel-teacher-eval dataset consistently. At that point the issue was that the input element was captured as input:input rather than input:question.
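For reference, my manual workaround looked roughly like the sketch below. This is an assumption-laden sketch, not the notebook's code: the question/answer rows are invented placeholders, and the "question" input key is inferred from the chain's x["question"] access in the example code above.

from langsmith import Client

client = Client()

# Sketch: create the dataset and seed it with at least one example row so
# run_on_dataset has something to iterate over.
dataset = client.create_dataset(
    dataset_name="lcel-teacher-eval",
    description="Questions for evaluating the LCEL code assistant",
)

# The chain reads x["question"], so the input key must be "question"
# (not "input"); the row contents here are placeholders.
client.create_example(
    inputs={"question": "How do I compose two runnables in LCEL?"},
    outputs={"answer": "Use the | operator to pipe one runnable into another."},
    dataset_id=dataset.id,
)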
System Info
System Information
------------------
> OS: Linux
> OS Version: #1 SMP Thu Oct 5 21:02:42 UTC 2023
> Python Version: 3.10.13 (main, Feb 7 2024, 15:27:48) [GCC 11.4.0]
Package Information
-------------------
> langchain_core: 0.1.28
> langchain: 0.1.8
> langchain_community: 0.0.21
> langsmith: 0.1.4
> langchain_openai: 0.0.8
> langchainhub: 0.1.14
> langgraph: 0.0.26
Packages not installed (Not Necessarily a Problem)
--------------------------------------------------
The following packages were not found:
> langserve
@rlancemartin - I created an issue for langgraph_code_assistant.ipynb based on our discussion on X. The issue isn't with LangGraph behavior - it's with the integration with the (personal) LangSmith dataset. This is a critical use case, and I'm hoping to end up with a working code example to go with it.
What I was expecting to see on the referenced public dataset was the ability to clone it to a personal dataset. Does that capability exist and I just missed it?
There was a commit midday on Sunday that overwrote some of your changes and changed the behavior (it was working better prior to the changes):
https://github.com/langchain-ai/langgraph/commits/main/examples/code_assistant/langgraph_code_assistant.ipynb
LangSmith tracing was configured prior to the change, and I was able to add additional items to the dataset from the trace.
@rlancemartin when you have a moment, please:
- Move the dataset to the langchain-blogs tenant (so we don't accidentally delete it at some point)
- Publish the dataset (share it as a public dataset)
- Add a client.clone_public_dataset() call with the URL to the top of the notebook to make it usable externally (see the sketch after this list)
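For the last item, something along these lines at the top of the notebook should work with recent langsmith versions; the share URL below is a placeholder to be swapped for the real public link once the dataset is actually published.

from langsmith import Client

client = Client()

# Placeholder share link: substitute the real public URL once the dataset is published.
client.clone_public_dataset("https://smith.langchain.com/public/<share-token>/d")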
Thanks @hinthornw - this is a great use of LangGraph and LangSmith. Appreciate your comments and follow-up.
Will do!
Cannot emphasize this enough ♥️♥️♥️
I've been following the YouTube channel and working through many of the notebooks (more than 800 in the repo now!), and this one is the most interesting:
- LangGraph integration
- LangSmith used with evals
- A public evals dataset that the community can use to start creating benchmarks across the different components of chains and graphs
Hopefully, an evals leaderboard integrated into LangSmith is in the pipeline 🤞
Keep up the great work! I'm a huge fanboy