Lance Martin
> Added the notebook examples - let me know if it's good to merge @rlancemartin and I can add the blob stuff in a followup PR, or if you want...
Also, functionality here is: 1/ QA 2/ Translate. Looks like Doctran can [do more](https://github.com/psychic-api/doctran/tree/main) in terms of metadata extraction. Plans to add that in a follow-up? We can discuss, too....
> Added translate and QA as parsers, as well as metadata extraction since you mentioned that was valuable. Will clean up the old files and update the notebooks early this...
> re: arguments, ah yeah nevermind - didn't notice that the loader accepts an instance of a parser rather than the class, so I can just set them in...
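
To illustrate the instance-versus-class point, here is a minimal sketch. `UppercaseParser` and `SimpleLoader` are hypothetical stand-ins written for this example, not LangChain or Doctran classes. Because the loader receives an already-constructed parser, any arguments can be set on the parser before it is handed over.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class UppercaseParser:
    """Hypothetical parser; `suffix` stands in for any constructor argument."""

    suffix: str = ""

    def parse(self, text: str) -> str:
        # Apply the configured behaviour to a single piece of text.
        return text.upper() + self.suffix


class SimpleLoader:
    """Hypothetical loader that takes a parser instance, not a parser class."""

    def __init__(self, texts: List[str], parser: UppercaseParser) -> None:
        self.texts = texts
        self.parser = parser

    def load(self) -> List[str]:
        # The loader never constructs the parser, so it needs no parser arguments.
        return [self.parser.parse(t) for t in self.texts]


# Arguments are fixed when the parser is built, then the instance is passed in.
loader = SimpleLoader(["hello", "world"], parser=UppercaseParser(suffix="!"))
results = loader.load()
```

If the loader took the class instead, it would also need some way to forward constructor arguments, which is the complication the comment above was worried about.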
> @rlancemartin I might be missing something here, but is there an easy way to load strings into Blobs? Right now I have to define a new BlobLoader class in...
> @rlancemartin Good to know! It wasn't a large amount of data so I just kept it in the notebook :) > > > > All the changes are in!...
Also, see tests:
```
poetry run mypy .
tests/integration_tests/test_document_transformers.py:2: error: Module "langchain.document_transformers" has no attribute "EmbeddingsClusteringFilter" [attr-defined]
```
> Fixed the test failure as well! It was a result of moving `document_transformers.py` into a directory with an `__init__.py`. Thanks for catching these @rlancemartin

hmm, pulled your latest but...
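
For context on that kind of failure, here is a small sketch of one common cause: when a module becomes a package, names are only importable from the package itself if its `__init__.py` re-exports them. The package name `demo_transformers` and the class `ClusteringFilter` are made up for this example, and this is not necessarily the exact fix applied in this PR.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # Build a throwaway package: demo_transformers/{__init__.py, filters.py}
    pkg = Path(tmp) / "demo_transformers"
    pkg.mkdir()
    (pkg / "filters.py").write_text("class ClusteringFilter:\n    pass\n")
    # Re-export the class so `from demo_transformers import ClusteringFilter` works.
    (pkg / "__init__.py").write_text(
        "from demo_transformers.filters import ClusteringFilter\n"
        "\n"
        "__all__ = ['ClusteringFilter']\n"
    )

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        module = importlib.import_module("demo_transformers")
        # True only because __init__.py re-exports the name.
        exported = hasattr(module, "ClusteringFilter")
    finally:
        sys.path.remove(tmp)
```

With an empty `__init__.py`, the class would still exist in `demo_transformers.filters`, but a type checker would report that the package has no such attribute.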
> @rlancemartin sorry about that. I was running the notebooks with the API key passed in as a param, so I wasn't running into that bug, but tested it with the...
@jasonwcfan i chatted w/ @baskaryan. this is better as a `document transformer` since it clearly ingests a document (rather than loading from a novel source) and transforms it. i moved...
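
A rough sketch of the distinction, for reference: a loader produces documents from a source, while a document transformer takes existing documents and returns modified ones. `Doc` and `ShoutTransformer` below are hypothetical stand-ins for illustration only. The method name `transform_documents` mirrors LangChain's transformer interface, but nothing here imports LangChain or Doctran.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence


@dataclass
class Doc:
    """Hypothetical stand-in for a document with text and metadata."""

    page_content: str
    metadata: Dict[str, str] = field(default_factory=dict)


class ShoutTransformer:
    """Hypothetical transformer: documents in, transformed documents out."""

    def transform_documents(self, documents: Sequence[Doc]) -> List[Doc]:
        # Return new documents rather than mutating the inputs.
        return [
            Doc(
                page_content=d.page_content.upper(),
                metadata={**d.metadata, "transformed": "true"},
            )
            for d in documents
        ]


docs = [Doc("hello there", {"source": "note-1"})]
transformed = ShoutTransformer().transform_documents(docs)
```

The input list is left untouched, so the same documents can be passed through several transformers in turn.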