
ObsidianLoader doesn't parse the frontmatter of Obsidian files as YAML, but line by line and only for key-value pairs

Open effi opened this issue 1 year ago • 0 comments

System Info

Python 3, Google Colab/macOS Conda env, LangChain v0.0.173, Dev2049/obsidian patch #4204

Who can help?

No response

Information

  • [ ] The official example notebooks/scripts
  • [X] My own modified scripts

Related Components

  • [ ] LLMs/Chat Models
  • [ ] Embedding Models
  • [ ] Prompts / Prompt Templates / Prompt Selectors
  • [ ] Output Parsers
  • [X] Document Loaders
  • [ ] Vector Stores / Retrievers
  • [ ] Memory
  • [ ] Agents / Agent Executors
  • [ ] Tools / Toolkits
  • [ ] Chains
  • [ ] Callbacks/Tracing
  • [ ] Async

Reproduction

See Colab: https://colab.research.google.com/drive/1YSfKTQ92RZEJ-z-hlwHL7f5rQ-evTgx4?usp=sharing

  1. Have Obsidian files whose frontmatter contains more than simple key-value pairs (for example lists or nested dictionaries)
  2. Import ObsidianLoader and create a loader for the Obsidian vault
  3. Load the documents

YAML frontmatter is parsed to: metadata={'source': 'Demo-Note.md', 'path': 'Obsidian_Vault/Demo-Note.md', 'created': 1684513990.4561925, 'last_modified': 1684513964.4224126, 'last_accessed': 1684514007.4610083, 'normal': 'frontmatter is just key-value based', 'list_frontmatter': '', 'dictionary_frontmatter': '', 'can': 'contain', 'different': 'key-value-paris', 'and': '[even, nest, these, types]'}
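The result above is what you would get from splitting each frontmatter line on its first colon. The following is a minimal sketch of that kind of line-by-line parsing, not the loader's actual source; the function name `parse_front_matter_line_by_line` and the sample note are assumptions made for illustration.

```python
import re

# Frontmatter block delimited by '---' lines at the very start of the note.
FRONT_MATTER_REGEX = re.compile(r"^---\n(.*?)\n---\n", re.DOTALL)

# Hypothetical note mirroring the frontmatter described in this issue.
SAMPLE_NOTE = """---
normal: frontmatter is just key-value based
list_frontmatter:
  - can
  - be
  - an array
dictionary_frontmatter:
  can: contain
  different: key-value-paris
  and: [even, nest, these, types]
---
Note body.
"""


def parse_front_matter_line_by_line(content: str) -> dict:
    """Naive parse: every line containing ':' becomes a flat string pair."""
    match = FRONT_MATTER_REGEX.search(content)
    front_matter = {}
    if match:
        for line in match.group(1).split("\n"):
            if ":" not in line:
                # List items such as '  - can' have no colon and are dropped.
                continue
            key, value = line.split(":", 1)
            # Indentation is stripped, so nested keys land at the top level.
            front_matter[key.strip()] = value.strip()
    return front_matter


if __name__ == "__main__":
    print(parse_front_matter_line_by_line(SAMPLE_NOTE))
```

In this sketch the list items are lost, `list_frontmatter` and `dictionary_frontmatter` become empty strings, and the nested keys `can`, `different`, and `and` are flattened into the top-level metadata, which matches the symptoms reported here.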

Expected behavior

YAML Frontmatter should be parsed according to YAML rules

    normal: frontmatter is just key-value based
    list_frontmatter:
      - can
      - be
      - an array
    dictionary_frontmatter:
      can: contain
      different: key-value-paris
      and: [even, nest, these, types]

-->

{
    'normal': 'frontmatter is just key-value based',
    'list_frontmatter': ['can', 'be', 'an array'],
    'dictionary_frontmatter': {
        'can': 'contain',
        'different': 'key-value-pairs',
        'and': ['even', 'nest', 'these', 'types']
    }
}
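One possible way to get this structure is to hand the whole frontmatter block to a YAML parser instead of splitting lines. The sketch below assumes the third-party PyYAML package is installed and uses its `yaml.safe_load`; the function name `parse_front_matter_yaml` and the sample note are hypothetical and are not part of LangChain.

```python
import re

import yaml  # PyYAML, third-party (assumed installed: pip install pyyaml)

# Frontmatter block delimited by '---' lines at the very start of the note.
FRONT_MATTER_REGEX = re.compile(r"^---\n(.*?)\n---\n", re.DOTALL)

# Hypothetical note mirroring the frontmatter described in this issue.
SAMPLE_NOTE = """---
normal: frontmatter is just key-value based
list_frontmatter:
  - can
  - be
  - an array
dictionary_frontmatter:
  can: contain
  different: key-value-paris
  and: [even, nest, these, types]
---
Note body.
"""


def parse_front_matter_yaml(content: str) -> dict:
    """Parse the frontmatter block as YAML, keeping lists and nesting."""
    match = FRONT_MATTER_REGEX.search(content)
    if not match:
        return {}
    try:
        parsed = yaml.safe_load(match.group(1))
    except yaml.YAMLError:
        # Malformed frontmatter: return no metadata rather than fail the load.
        return {}
    # A frontmatter block that is not a mapping is treated as empty here.
    return parsed if isinstance(parsed, dict) else {}


if __name__ == "__main__":
    print(parse_front_matter_yaml(SAMPLE_NOTE))
```

Note that metadata values then become lists and dictionaries rather than plain strings, so anything downstream that assumes flat string metadata would need to handle the nested types.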

effi, May 19 '23 16:05