Structured parser doesn't parse array

Open abdallamourad opened this issue 2 years ago • 7 comments

Hi, I am trying to parse an array of data using structured parser. The json I am trying to parse is

{
  "ids": [number]
}

I am getting this error:

map[] json: cannot unmarshal array into Go value of type string

When I changed the data type of the parser to map[string]any, it worked perfectly.
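A minimal sketch that reproduces the behaviour with encoding/json alone, assuming the parser decodes the model output into a map[string]string. The helper names decodeStrings and decodeAny are hypothetical and are not part of langchaingo:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decodeStrings mimics a parser that only accepts string values
// (hypothetical helper, not the langchaingo implementation).
func decodeStrings(raw string) (map[string]string, error) {
	var out map[string]string
	err := json.Unmarshal([]byte(raw), &out)
	return out, err
}

// decodeAny accepts arbitrary JSON values, including arrays.
func decodeAny(raw string) (map[string]any, error) {
	var out map[string]any
	err := json.Unmarshal([]byte(raw), &out)
	return out, err
}

func main() {
	const raw = `{"ids": [1, 2, 3]}`

	// An array cannot be stored in a string value, so this returns an error.
	if _, err := decodeStrings(raw); err != nil {
		fmt.Println("map[string]string failed:", err)
	}

	// With map[string]any the array is decoded as []any of float64 values.
	v, err := decodeAny(raw)
	if err != nil {
		fmt.Println("map[string]any failed:", err)
		return
	}
	fmt.Println("map[string]any succeeded:", v["ids"])
}
```

Note that with map[string]any the numbers arrive as float64, so callers still need a type assertion and a conversion before using them as integer ids.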

I was wondering if this case is handled somewhere else? Or can we add it to the existing parser?

abdallamourad avatar Jul 05 '23 03:07 abdallamourad

The current structured parser only works with key-value pairs of strings, but maybe the comma-separated list output parser works for your case? A parser that can handle arbitrary JSON is definitely needed.

FluffyKebab avatar Jul 05 '23 18:07 FluffyKebab

Thank you @FluffyKebab for your response!

I was wondering whether this feature would benefit the project for the time being. I could probably pick it up?

abdallamourad avatar Jul 07 '23 07:07 abdallamourad

Yes, please do! Here are some references from the other versions: TS version, Python version. What do you think is the best Go equivalent of a Zod schema in the TS version and Pydantic in the Python version?

FluffyKebab avatar Jul 07 '23 10:07 FluffyKebab

A PR was opened that used a better OpenAI client and included a structured output parser, but it wasn't merged in time and went stale.

The better openai client has a structured output parser https://github.com/sashabaranov/go-openai/blob/1153eb2595d1529927757dd6df4de71faaafde02/jsonschema/json.go#L23C2-L23C2

I also raised a similar issue asking whether this is the right way to go: https://github.com/tmc/langchaingo/issues/197

steinathan avatar Jul 26 '23 01:07 steinathan

The OpenAI client targeted by the PR is what I was using before deciding to switch to langchain. I would definitely recommend finishing the PR; Sasha's OpenAI client is really nice to use.

dallman2 avatar Sep 05 '23 21:09 dallman2

I hit this limit too and was wondering what the best solution would be. I think the most efficient solution would be not to use an external schema language such as Zod or JSON Schema, but instead to rely on native struct tagging. That is, one would create a Go type and tag the fields, just like one always does. The output parser should then take that type as input for configuration and return a value of that type as the response.
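A rough sketch of what that could look like, using only the standard library. Everything here is hypothetical and not an existing langchaingo API: the Result type, the describe tag, and the FormatInstructions and Parse functions are invented for illustration, and the sketch assumes the model replies with a single JSON object somewhere in its output.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"reflect"
	"strings"
)

// Result is a hypothetical target type. The `describe` tag is an
// invented convention for this sketch, not a standard Go tag.
type Result struct {
	IDs []int `json:"ids" describe:"list of matching numeric ids"`
}

// FormatInstructions builds a prompt hint from the struct tags of T.
// It returns an empty string if T is not a struct type.
func FormatInstructions[T any]() string {
	var zero T
	t := reflect.TypeOf(zero)
	if t == nil || t.Kind() != reflect.Struct {
		return ""
	}
	var b strings.Builder
	b.WriteString("Reply with a JSON object containing these fields:\n")
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		name := strings.Split(f.Tag.Get("json"), ",")[0]
		if name == "-" {
			continue // field is excluded from JSON
		}
		if name == "" {
			name = f.Name
		}
		fmt.Fprintf(&b, "- %q (%s): %s\n", name, f.Type, f.Tag.Get("describe"))
	}
	return b.String()
}

// Parse extracts the outermost {...} span from the model output and
// decodes it into a value of type T.
func Parse[T any](text string) (T, error) {
	var out T
	start := strings.Index(text, "{")
	end := strings.LastIndex(text, "}")
	if start < 0 || end < start {
		return out, errors.New("no JSON object found in model output")
	}
	if err := json.Unmarshal([]byte(text[start:end+1]), &out); err != nil {
		return out, fmt.Errorf("decode model output: %w", err)
	}
	return out, nil
}

func main() {
	fmt.Print(FormatInstructions[Result]())

	res, err := Parse[Result](`Sure, here is the result: {"ids": [1, 2, 3]}`)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	fmt.Println(res.IDs)
}
```

The appeal of this design is that the array case from the original report needs no special handling: encoding/json already knows how to fill a []int field, and the caller gets a typed value back instead of a map.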

ewintr avatar Jan 20 '24 09:01 ewintr

Also a fan of aiming for a native struct but I'm open to suggested alternatives.

tmc avatar Mar 19 '24 23:03 tmc

@tmc I have a solution similar to what is described here, using struct tagging. This provides a similar developer experience to using Pydantic to extract structured data in LangChain.

See PydanticOutputParser in the LangChain API docs.

The developer experience parallels what's described in this Build an Extraction Chain tutorial.

Would you like me to send a pull request?

erictse avatar May 24 '24 18:05 erictse

+1

xuning888 avatar May 27 '24 08:05 xuning888

+1

lucaronca avatar Jun 14 '24 16:06 lucaronca