pytorch-ie
PyTorch-IE: State-of-the-art Information Extraction in PyTorch
[Rendered](https://github.com/ChristophAlt/pytorch-ie/tree/docs_pie_concepts#-concepts--architecture)
The reason is that there may exist datasets that are so large they can't be iterated over in a reasonable time. For efficiency reasons we should allow the user to...
`encode_inputs` should not do anything that depends on the state of `is_training`; the respective code can live in `encode_targets`. This will ease separation of concerns and testing. To implement this it may...
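A minimal sketch of that separation, under stated assumptions: `ToyTaskModule`, `ToyEncoding`, and the `encode` helper are hypothetical stand-ins written for illustration, not the actual PyTorch-IE classes or signatures. The point is only that `encode_inputs` has no training-dependent branch, so it can be tested on its own.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToyEncoding:
    # Hypothetical container; not the pytorch-ie encoding class.
    inputs: List[str]
    targets: Optional[List[int]] = None


class ToyTaskModule:
    """Illustrative stand-in, not the real pytorch-ie TaskModule."""

    def __init__(self, labels: List[str]) -> None:
        self.label_to_id = {label: i for i, label in enumerate(labels)}

    def encode_inputs(self, text: str) -> ToyEncoding:
        # No is_training branch here: the result depends only on the text.
        return ToyEncoding(inputs=text.lower().split())

    def encode_targets(self, encoding: ToyEncoding, label: str) -> None:
        # Everything that is only needed for training lives here.
        encoding.targets = [self.label_to_id[label]]

    def encode(self, text: str, label: Optional[str] = None) -> ToyEncoding:
        # Targets are added only when a label is supplied.
        encoding = self.encode_inputs(text)
        if label is not None:
            self.encode_targets(encoding, label)
        return encoding
```

With this layout, the inputs produced for a document are identical whether or not targets are encoded.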
This was broken because pytorch-lightning tries to move the output of `TransformerSeq2SeqTaskModule.collate` to a device via `pytorch_lightning.core.datamodule.LightningDataModule.transfer_batch_to_device` that internally uses [`pytorch_lightning.utilities.apply_func.apply_to_collection`](https://pytorch-lightning.readthedocs.io/en/stable/api/pytorch_lightning.utilities.apply_func.html#pytorch_lightning.utilities.apply_func.apply_to_collection). This method fails if any part of the input...
Until #183 is implemented, we need at least descriptions for the most relevant parts of PyTorch-IE in the README to make it usable by the public. This may require the...
Approach: The base taskmodule now also has a `_prepare()` method, which should be overridden in derived classes instead of `prepare()`. `prepare()` now does the following: 1. it checks if the...
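The `prepare()` / `_prepare()` split resembles a template-method pattern, sketched below. What `prepare()` actually checks is cut off in the description, so the `_is_prepared` flag is purely an assumption made for illustration, and `BaseTaskModule` and `LabelTaskModule` are hypothetical stand-ins rather than the PyTorch-IE classes.

```python
from typing import List, Sequence


class BaseTaskModule:
    """Illustrative stand-in, not the real pytorch-ie base class."""

    def __init__(self) -> None:
        self._is_prepared = False  # hypothetical flag, assumed for this sketch

    def prepare(self, documents: Sequence[str]) -> None:
        # Assumed check: skip the work if preparation already happened.
        if self._is_prepared:
            return
        self._prepare(documents)
        self._is_prepared = True

    def _prepare(self, documents: Sequence[str]) -> None:
        # Hook for derived classes; the base class has nothing to prepare.
        pass


class LabelTaskModule(BaseTaskModule):
    def __init__(self) -> None:
        super().__init__()
        self.vocab: List[str] = []

    def _prepare(self, documents: Sequence[str]) -> None:
        # Derived classes override _prepare(), never prepare().
        self.vocab = sorted({token for doc in documents for token in doc.split()})
```

Keeping the bookkeeping in `prepare()` means derived classes only describe what to prepare, not when.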
If `documents` passed to the pipeline is of type `Dataset`, use `documents.map` to add the predictions. In this case, a `Dataset` is returned instead of `Sequence[Document]`. Note: Builds on top...
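The type-dependent return value could look roughly like the following sketch. `ToyDataset`, `add_prediction`, and `run_pipeline` are hypothetical names; `ToyDataset` only imitates a dataset with a `map()` method and is not the Hugging Face `Dataset` or the PyTorch-IE pipeline.

```python
from typing import Callable, Dict, List, Sequence, Union

Doc = Dict[str, str]


class ToyDataset:
    """Minimal stand-in for a dataset with a map() method (not HF datasets)."""

    def __init__(self, rows: List[Doc]) -> None:
        self.rows = rows

    def map(self, function: Callable[[Doc], Doc]) -> "ToyDataset":
        # Returns a new dataset, leaving the original rows untouched.
        return ToyDataset([function(dict(row)) for row in self.rows])


def add_prediction(doc: Doc) -> Doc:
    # Hypothetical "model": the prediction is just the text length.
    doc["prediction"] = str(len(doc["text"]))
    return doc


def run_pipeline(
    documents: Union[ToyDataset, Sequence[Doc]]
) -> Union[ToyDataset, List[Doc]]:
    if isinstance(documents, ToyDataset):
        # Dataset in, dataset out: predictions are added via map().
        return documents.map(add_prediction)
    # Plain sequence in, list of documents out.
    return [add_prediction(dict(doc)) for doc in documents]
```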
For instance, `dataset.train_test_split(...)` returns a HF Dataset, which then breaks the serialization and deserialization logic. Not sure if there's a better solution but the quick fix would be to wrap the methods...
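One way such a wrapper might look is sketched below, with loud caveats: `BaseDataset` stands in for the third-party class, `DocumentDataset` and its `document_type` attribute are hypothetical, and the real `train_test_split` has a different signature. The sketch only shows the idea of overriding a method so that its results are re-wrapped in the subclass.

```python
from typing import Dict, List


class BaseDataset:
    """Stand-in for a third-party dataset class (not the HF Dataset itself)."""

    def __init__(self, rows: List[int]) -> None:
        self.rows = rows

    def train_test_split(self, test_size: int) -> Dict[str, "BaseDataset"]:
        # Mirrors the problem described above: always returns the base type.
        cut = len(self.rows) - test_size
        return {
            "train": BaseDataset(self.rows[:cut]),
            "test": BaseDataset(self.rows[cut:]),
        }


class DocumentDataset(BaseDataset):
    """Hypothetical subclass carrying extra state needed for (de)serialization."""

    def __init__(self, rows: List[int], document_type: str) -> None:
        super().__init__(rows)
        self.document_type = document_type

    def train_test_split(self, test_size: int) -> Dict[str, "DocumentDataset"]:
        # Wrap the parent method and re-wrap each split in the subclass.
        splits = super().train_test_split(test_size)
        return {
            name: DocumentDataset(split.rows, self.document_type)
            for name, split in splits.items()
        }
```

The drawback of this approach is that every method returning the base type needs its own wrapper.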