langchain
ObsidianLoader doesn't Parse Frontmatter of Obsidian Files according to YAML, but line-by-line and only for key-value pairs
System Info
Python 3, Google Colab/Mac-OS Conda Env, Langchain v0.0.173, Dev2049/obsidian patch #4204
Who can help?
No response
Information
- [ ] The official example notebooks/scripts
- [X] My own modified scripts
Related Components
- [ ] LLMs/Chat Models
- [ ] Embedding Models
- [ ] Prompts / Prompt Templates / Prompt Selectors
- [ ] Output Parsers
- [X] Document Loaders
- [ ] Vector Stores / Retrievers
- [ ] Memory
- [ ] Agents / Agent Executors
- [ ] Tools / Toolkits
- [ ] Chains
- [ ] Callbacks/Tracing
- [ ] Async
Reproduction
See Colab: https://colab.research.google.com/drive/1YSfKTQ92RZEJ-z-hlwHL7f5rQ-evTgx4?usp=sharing
- Have Obsidian files whose frontmatter contains more than simple key-value pairs (e.g. lists or nested mappings)
- Import ObsidianLoader and create a loader for the Obsidian vault
- Load the documents
YAML frontmatter is parsed to: metadata={'source': 'Demo-Note.md', 'path': 'Obsidian_Vault/Demo-Note.md', 'created': 1684513990.4561925, 'last_modified': 1684513964.4224126, 'last_accessed': 1684514007.4610083, 'normal': 'frontmatter is just key-value based', 'list_frontmatter': '', 'dictionary_frontmatter': '', 'can': 'contain', 'different': 'key-value-paris', 'and': '[even, nest, these, types]'}
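
The metadata above is consistent with a parser that splits every frontmatter line on its first colon and ignores YAML structure. The following is a minimal stdlib-only sketch of that behavior, not the loader's actual source: the function name, the regex, and the `---` delimiters around the frontmatter are assumptions made for illustration.

```python
import re

# Hypothetical stand-in for line-by-line frontmatter handling (illustrative names).
# Assumes the frontmatter sits between two '---' lines at the top of the note.
FRONT_MATTER = re.compile(r"^---\n(.*?)\n---\n", re.DOTALL)


def parse_front_matter_linewise(text: str) -> dict:
    """Treat every 'key: value' line as a flat string pair."""
    match = FRONT_MATTER.search(text)
    if match is None:
        return {}
    metadata = {}
    for line in match.group(1).split("\n"):
        if ":" not in line:
            continue  # list items such as "- can" contain no colon and are dropped
        key, value = line.split(":", 1)
        metadata[key.strip()] = value.strip()
    return metadata


NOTE = """---
normal: frontmatter is just key-value based
list_frontmatter:
  - can
  - be
  - an array
dictionary_frontmatter:
  can: contain
  different: key-value-paris
  and: [even, nest, these, types]
---
Body of the note.
"""

flat = parse_front_matter_linewise(NOTE)
print(flat)
```

With this approach the list is lost, the nested keys are flattened into the top level, and the inline array stays a plain string, which matches the frontmatter-derived keys in the reported metadata.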
Expected behavior
YAML Frontmatter should be parsed according to YAML rules
normal: frontmatter is just key-value based
list_frontmatter:
  - can
  - be
  - an array
dictionary_frontmatter:
  can: contain
  different: key-value-paris
  and: [even, nest, these, types]
-->
{
    'normal': 'frontmatter is just key-value based',
    'list_frontmatter': ['can', 'be', 'an array'],
    'dictionary_frontmatter': {
        'can': 'contain',
        'different': 'key-value-paris',
        'and': ['even', 'nest', 'these', 'types']
    }
}
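
One possible way to get the expected structure is to hand the whole frontmatter block to a YAML parser instead of splitting lines. This is a hedged sketch only: it assumes the third-party PyYAML package is installed, that the frontmatter is delimited by `---` lines, and the helper name `parse_front_matter_yaml` is hypothetical rather than part of ObsidianLoader.

```python
import yaml  # PyYAML (third-party); assumed to be available


def parse_front_matter_yaml(text: str) -> dict:
    """Parse a leading '---' delimited block as YAML; return {} if absent."""
    lines = text.split("\n")
    if not lines or lines[0].strip() != "---":
        return {}
    # Find the closing delimiter after the opening line.
    end = next(
        (i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"),
        None,
    )
    if end is None:
        return {}
    loaded = yaml.safe_load("\n".join(lines[1:end]))
    # Frontmatter that is empty or not a mapping yields no metadata.
    return loaded if isinstance(loaded, dict) else {}


NOTE = """---
normal: frontmatter is just key-value based
list_frontmatter:
  - can
  - be
  - an array
dictionary_frontmatter:
  can: contain
  different: key-value-paris
  and: [even, nest, these, types]
---
Body of the note.
"""

parsed = parse_front_matter_yaml(NOTE)
print(parsed)
```

`yaml.safe_load` is used rather than `yaml.load` so that frontmatter from untrusted notes cannot construct arbitrary Python objects. Note that nested values may need flattening or serializing before they are passed to vector stores that only accept scalar metadata.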