Add ability to parse code

Open santib opened this issue 2 years ago • 3 comments

Something like this is needed so we can load codebases into vector DBs.

santib avatar Aug 12 '23 19:08 santib

@santib This opens up an interesting problem -- how do we chunk code?

andreibondarev avatar Aug 13 '23 09:08 andreibondarev

> @santib This opens up an interesting problem -- how do we chunk code?

Yeah, I just used the Text one for simplicity. Checking https://python.langchain.com/docs/modules/data_connection/document_transformers/text_splitters/code_splitter, though, it seems they just look up the separators for each language, and that's it.

I can change this PR to do something similar if you want.
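
For illustration, here is a rough sketch of what "separators per language" could look like in plain Ruby. It is not the code from this PR and not an existing langchainrb API: `SEPARATORS`, `split_code`, and the separator lists themselves are hypothetical names and values chosen for the example. The idea is to try the coarsest separator first (class or method boundaries), fall back to finer ones only for pieces that are still too large, and then merge small neighbours back together.

```ruby
# Hypothetical per-language separator tables, ordered coarse to fine.
# These lists are illustrative assumptions, not taken from any library.
SEPARATORS = {
  ruby: ["\nclass ", "\nmodule ", "\ndef ", "\n\n", "\n", " "],
  python: ["\nclass ", "\ndef ", "\n\n", "\n", " "]
}.freeze

# Split text into chunks of at most max_size characters, preferring to
# cut at the coarsest separator available for the given language.
def split_code(text, language:, max_size: 200, separators: SEPARATORS.fetch(language))
  return [text] if text.length <= max_size

  sep, *rest = separators
  # No separators left: fall back to fixed-size slices.
  return text.scan(/.{1,#{max_size}}/m) if sep.nil?

  # Lookahead split keeps each separator at the start of the next piece,
  # so joining the pieces reproduces the input exactly.
  pieces = text.split(/(?=#{Regexp.escape(sep)})/)

  # Pieces that are still too large are split again with finer separators.
  chunks = pieces.flat_map do |piece|
    split_code(piece, language: language, max_size: max_size, separators: rest)
  end

  # Greedily merge neighbours back together while they fit in max_size.
  chunks.each_with_object([]) do |chunk, merged|
    if merged.any? && merged.last.length + chunk.length <= max_size
      merged[-1] += chunk
    else
      merged << chunk
    end
  end
end
```

Calling `split_code(File.read("app.rb"), language: :ruby, max_size: 500)` would return an array of strings that concatenate back to the original file. An unknown language raises `KeyError` from `SEPARATORS.fetch`, which seemed like a safer default for a sketch than silently falling back to plain-text splitting.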

santib avatar Aug 14 '23 00:08 santib

@santib Yeah, I like that!

andreibondarev avatar Aug 14 '23 17:08 andreibondarev

Closing due to inactivity.

andreibondarev avatar Oct 23 '24 20:10 andreibondarev