langchain
Basic HTML loader with Crawly
- Added Document and DocumentLoader Behaviours
- Added Crawly DocumentLoader
Hey @brainlid I wanted to split up my work into smaller chunks so I can get it in (and others can play with the blocks/revamp/etc.). How does this one look?
@brainlid I see this has been sitting for a while. I am planning on doing some data loading from APIs soon, and was wondering if there are plans to integrate this PR, or some sort of Document model in general?
I think this effort has stalled out. I’m open to new work in this area. What do you need?
On Sat, Aug 24, 2024 at 6:54 AM Matt Husby wrote:
@brainlid I see this has been sitting for a while. I am planning on doing some data loading from APIs soon, and was wondering if there are plans to integrate this PR, or some sort of Document model in general?
I am not doing anything too fancy, just planning to pull in some Jira tickets and maybe GitHub issues.
My main question is what do you think of using the Document model that is in this PR? I would like to stick to a standard way of doing document loading. At first glance this seems fine, but I wanted to make sure I wasn't missing something.
what do you think of using the Document model that is in this PR
I think the Document model was incomplete. The idea was to base it on the Document concept from the TS/Python LangChain. I'm not using it personally, nor do I have any short-term needs for it. However, I'm open to that approach.