Peter
Feel free to open issues/PRs! Sometimes it's hard for me to make things clear because I wrote the code and it all looks straightforward to me. Let me know...
Yes, that's planned (#83). I'm currently fixing the huggingface part, so that should be supported in the near future. Also, `tfckpt2bson` will eventually be removed.
It's on the roadmap, but it won't be covered by the GSoC project. I'm considering wrapping the Rust implementation in [huggingface/tokenizers](https://github.com/huggingface/tokenizers) with BinaryBuilder.jl.
Yes. I'd love to see more people contributing to this project. I'm currently quite busy (working and studying), so I can't spend much time on it. I...
@SeanLee97 Actually, we already have a word piece tokenizer in Transformers.jl. See [here](https://github.com/chengchingwen/Transformers.jl/blob/master/src/bert/tokenizer.jl) and [here](https://github.com/chengchingwen/Transformers.jl/blob/master/src/bert/wordpiece.jl)
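For anyone unfamiliar with it, word piece tokenization is usually described as a greedy longest-match-first lookup against a vocabulary. Here is a minimal Python sketch of that idea. It is not the code in Transformers.jl; the function name `wordpiece_tokenize`, the toy vocabulary, and the `[UNK]` fallback are made up for illustration, and the `##` continuation prefix follows the BERT convention.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]", prefix="##"):
    """Greedy longest-match-first split of one word into sub-word pieces.

    Hypothetical helper for illustration only; not the Transformers.jl API.
    """
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it is in the vocabulary.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                # Pieces that do not start the word carry the continuation prefix.
                candidate = prefix + candidate
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            # No piece matches at this position: the whole word is unknown.
            return [unk_token]
        pieces.append(match)
        start = end
    return pieces


# Toy vocabulary (an assumption; real vocabularies have tens of thousands of entries).
toy_vocab = {"un", "##aff", "##able", "token", "##izer"}

print(wordpiece_tokenize("unaffable", toy_vocab))
print(wordpiece_tokenize("tokenizer", toy_vocab))
print(wordpiece_tokenize("xyz", toy_vocab))
```

The sketch handles a single, already-split word; a full BERT-style tokenizer would first do basic whitespace and punctuation splitting before this step.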
It is written in the docstring for `@pretrain_str`, but I agree that it might be a little misleading.
The code in the tutorial is outdated. Please refer to the example folder for the latest working code: https://github.com/chengchingwen/Transformers.jl/tree/master/example/AttentionIsAllYouNeed
I think it's doable, but integrating the two would take some effort. There are some problems and behavior differences that need to be resolved. Things like: 1. What do we...
@oxinabox I just tagged a new release of OhMyArtifacts.jl. It should be able to handle directories now. I can try to make a PR to switch the backend to it,...
For some unknown reason, the datadep storage path is not writable on the macOS CI: https://github.com/oxinabox/DataDeps.jl/runs/7153763089?check_suite_focus=true#step:6:83