
ValueError for tutorial about JSONLoader

Open labdmitriy opened this issue 1 year ago • 2 comments

System Info

  • LangChain version: 0.0.200
  • Platform: Ubuntu 20.04 LTS
  • Python version: 3.10.4

Who can help?

No response

Information

  • [X] The official example notebooks/scripts
  • [ ] My own modified scripts

Related Components

  • [ ] LLMs/Chat Models
  • [ ] Embedding Models
  • [ ] Prompts / Prompt Templates / Prompt Selectors
  • [ ] Output Parsers
  • [X] Document Loaders
  • [ ] Vector Stores / Retrievers
  • [ ] Memory
  • [ ] Agents / Agent Executors
  • [ ] Tools / Toolkits
  • [ ] Chains
  • [ ] Callbacks/Tracing
  • [ ] Async

Reproduction

  1. Reproduce the "Using JSONLoader" section of the JSONLoader tutorial.

  2. After executing the following code:

from langchain.document_loaders import JSONLoader

loader = JSONLoader(
    file_path='./example_data/facebook_chat.json',
    jq_schema='.messages[].content'
)

data = loader.load()

the following error is displayed: ValueError: Expected page_content is string, got <class 'NoneType'> instead. Set `text_content=False` if the desired input for `page_content` is not a string

  3. If we select a single string instead of a list:

loader = JSONLoader(
    file_path='./example_data/facebook_chat.json',
    jq_schema='.title'
)

data = loader.load()

there are no errors

  4. If we set text_content to False in the original code:

loader = JSONLoader(
    file_path='./example_data/facebook_chat.json',
    jq_schema='.messages[].content',
    text_content=False
)

data = loader.load()

then there are also no errors.
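
One way to narrow this down is to look for entries that are not strings. The sketch below is only a diagnostic under an assumption the issue does not confirm: that some messages have a missing or null `content` value. It uses the standard library alone and mimics what the jq expression `.messages[].content` selects. The sample data and the helper name `find_non_string_content` are hypothetical.

```python
import json

# Hypothetical sample shaped like a chat export; the real file may differ.
SAMPLE = """
{
  "title": "Example chat",
  "messages": [
    {"sender_name": "User 1", "content": "Hello"},
    {"sender_name": "User 2"},
    {"sender_name": "User 1", "content": null},
    {"sender_name": "User 2", "content": "Bye"}
  ]
}
"""


def find_non_string_content(data):
    """Return (index, value) pairs for messages whose content is not a str.

    Mirrors the jq expression `.messages[].content`, which yields null
    for a message object that has no "content" key.
    """
    extracted = [message.get("content") for message in data["messages"]]
    return [
        (index, value)
        for index, value in enumerate(extracted)
        if not isinstance(value, str)
    ]


data = json.loads(SAMPLE)
bad_entries = find_non_string_content(data)
print(bad_entries)
```

Running the same helper on the parsed contents of the real file would show whether any entries come back as None, which is the type named in the error message.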

Expected behavior

  • The code and documentation must match each other
  • The text_content argument needs a clearer description of when it should be used
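
To illustrate the kind of description being asked for, here is a minimal sketch of a check that would produce the error message quoted above. It is inferred only from that message and is not taken from the LangChain source; the function name `to_page_content` and the fallback to `str()` when the check is disabled are assumptions.

```python
def to_page_content(value, text_content=True):
    """Hypothetical validation step, inferred only from the error message."""
    if text_content and not isinstance(value, str):
        # Strict mode: anything other than a string is rejected.
        raise ValueError(
            f"Expected page_content is string, got {type(value)} instead. "
            "Set `text_content=False` if the desired input for "
            "`page_content` is not a string"
        )
    # Assumption: with the check disabled, non-string values are stringified.
    return value if isinstance(value, str) else str(value)


# A string passes in both modes.
print(to_page_content("Hello"))

# A non-string value passes only when the check is disabled.
print(to_page_content(None, text_content=False))
```

Under this reading, text_content=True acts as a guard that every extracted value is already a string, and text_content=False trades that guard for a conversion.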

labdmitriy, Jun 14 '23 08:06

This problem is also in further sections of the same tutorial.

labdmitriy, Jun 14 '23 10:06

I am using the latest LangChain version, 0.0.225, and I get an error when I pass the text_content parameter:

loader = JSONLoader(
    file_path='file_path',
    jq_schema='.',
    text_content=False)

the error: JSONLoader.__init__() got an unexpected keyword argument 'text_content'

Do you know if any changes were made to the method signature?! I can't see any!!

abdallamourad, Jul 07 '23 04:07
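
A quick way to check what a constructor accepts, without reading the source, is the standard library's `inspect.signature`. The sketch below is self-contained: `OldLoader` and `NewLoader` are hypothetical stand-ins, not LangChain classes, and the helper names `accepts_keyword` and `build_loader` are illustrative.

```python
import inspect


class OldLoader:
    """Hypothetical stand-in: constructor without text_content."""

    def __init__(self, file_path, jq_schema):
        self.file_path = file_path
        self.jq_schema = jq_schema


class NewLoader:
    """Hypothetical stand-in: constructor that accepts text_content."""

    def __init__(self, file_path, jq_schema, text_content=True):
        self.file_path = file_path
        self.jq_schema = jq_schema
        self.text_content = text_content


def accepts_keyword(cls, name):
    """Report whether cls.__init__ accepts `name` as a keyword argument."""
    params = inspect.signature(cls.__init__).parameters
    if name in params:
        return params[name].kind in (
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY,
        )
    # A **kwargs parameter accepts any keyword.
    return any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )


def build_loader(cls, **kwargs):
    """Construct cls, dropping keywords its constructor does not accept."""
    supported = {k: v for k, v in kwargs.items() if accepts_keyword(cls, k)}
    return cls(**supported)


print(accepts_keyword(OldLoader, "text_content"))
print(accepts_keyword(NewLoader, "text_content"))
loader = build_loader(OldLoader, file_path="file_path", jq_schema=".", text_content=False)
```

Calling `accepts_keyword(JSONLoader, "text_content")` against the installed package would show whether that environment's constructor takes the argument, and `importlib.metadata.version("langchain")` reports which version is actually installed. Silently dropping an unsupported keyword, as `build_loader` does, hides the mismatch, so it suits a quick experiment more than production code.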

I'm facing the same issue...

kshiv05, Aug 03 '23 13:08

Hi, @labdmitriy! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

From what I understand, you raised an issue regarding a ValueError that occurs when using the JSONLoader in a tutorial. The issue seems to be related to the text_content argument, which requires a clearer description in the documentation. Other users, such as @abdallamourad and @kshiv05, have also reported facing the same issue and are seeking clarification on any changes made to the method signature.

Before we close this issue, we wanted to check with you if it is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on this issue. Otherwise, feel free to close the issue yourself, or it will be automatically closed in 7 days.

Thank you for your contribution to the LangChain project!

dosubot[bot], Nov 02 '23 16:11