Lance Martin comments

Results: 87 comments of Lance Martin

Add new types of document transformers

> Added the notebook examples - let me know if it's good to merge @rlancemartin and I can add the blob stuff in a followup PR, or if you want...

Add new types of document transformers

Also, functionality here is: 1/ QA 2/ Translate. Looks like Doctran can [do more](https://github.com/psychic-api/doctran/tree/main) in terms of metadata extraction. Plans to add that in a follow-up? We can discuss, too....

Add new types of document transformers

> Added translate and QA as parsers, as well as metadata extraction since you mentioned that was valuable. Will clean up the old files and update the notebooks early this...

Add new types of document transformers

> re: arguments, ah yeah nevermind - didn't notice that the loader accepts an instance of a parser rather than the class, so I can just include set them in...

Add new types of document transformers

> @rlancemartin I might be missing something here, but is there an easy way to load strings into Blobs? Right now I have to define a new BlobLoader class in...

Add new types of document transformers

> @rlancemartin Good to know! It wasn't a large amount of data so I just kept it in the notebook :)
>
> All the changes are in!...

Add new types of document transformers

Also, see tests:

```
poetry run mypy .
tests/integration_tests/test_document_transformers.py:2: error: Module "langchain.document_transformers" has no attribute "EmbeddingsClusteringFilter" [attr-defined]
```

Add new types of document transformers

> Fixed the test failure as well! It was a result of moving `document_transformers.py` into a directory with an `__init__.py`. Thanks for catching these @rlancemartin

hmm, pulled your latest but...
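For readers following the `[attr-defined]` failure discussed in this thread: one common way a module-to-package move produces that error is a package `__init__.py` that does not re-export the names callers import. The thread does not state the actual fix, so the sketch below only assumes that cause. The package name `demo_transformers` and the class `ClusteringFilter` are hypothetical, chosen for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

PKG = "demo_transformers"  # hypothetical package name, not the real library


def _purge(name: str) -> None:
    """Drop the package and its submodules from the import cache."""
    for key in [k for k in sys.modules if k == name or k.startswith(name + ".")]:
        del sys.modules[key]


def has_attribute(reexport: bool) -> bool:
    """Build a throwaway package and report whether the class is visible on it."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / PKG
        pkg.mkdir()
        # The class lives in a submodule, as it would after a module-to-package move.
        (pkg / "filters.py").write_text("class ClusteringFilter:\n    pass\n")
        # Without this re-export, `from demo_transformers import ClusteringFilter` fails.
        init_body = "from .filters import ClusteringFilter\n" if reexport else ""
        (pkg / "__init__.py").write_text(init_body)

        sys.path.insert(0, tmp)
        try:
            _purge(PKG)
            importlib.invalidate_caches()
            module = importlib.import_module(PKG)
            return hasattr(module, "ClusteringFilter")
        finally:
            sys.path.remove(tmp)
            _purge(PKG)


print(has_attribute(reexport=True))
print(has_attribute(reexport=False))
```

The first call sees the class because `__init__.py` re-exports it; the second does not, which is the situation a type checker would report as a missing attribute on the package.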

Add new types of document transformers

> @rlancemartin sorry about that I was running the notebooks with the API key passed in as a param so wasn't running into that bug, but tested it with the...

Add new types of document transformers

@jasonwcfan i chatted w/ @baskaryan. this is better as a `document transformer` since it clearly ingests a document (rather than loading from a novel source) and transforms it. i moved...
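To make the loader vs. transformer distinction concrete, here is a minimal sketch of the "ingests a document and transforms it" shape. It deliberately avoids the real library: the `Document` dataclass and the `transform_documents` helper are hypothetical stand-ins, and the uppercase function is just a placeholder for something like translation or QA generation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    """Hypothetical stand-in for a document: text plus metadata."""
    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


def transform_documents(
    docs: List[Document], fn: Callable[[str], str], tag: str
) -> List[Document]:
    """Return new documents with transformed text; the inputs are left untouched."""
    return [
        Document(
            page_content=fn(doc.page_content),
            # Record which transform produced this document.
            metadata={**doc.metadata, "transform": tag},
        )
        for doc in docs
    ]


docs = [Document("hello world", {"source": "notebook"})]
shouted = transform_documents(docs, str.upper, tag="uppercase")
print(shouted[0].page_content)
print(shouted[0].metadata)
```

A loader, by contrast, would start from a source such as a file or URL and produce the initial `Document` list; the transformer only ever maps documents to documents.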