Jiaming Shen

12 comments by Jiaming Shen

I suspect that if we reverse the document and conduct keyword matching in the reversed order, we can get both.

```
document = "ABC DE FGHI"
keywords = ["ABC DE",...
```
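A minimal sketch of that idea, assuming greedy non-overlapping matching is the underlying matcher (the second keyword, "DE FGHI", is an assumption, since the original list is truncated): a left-to-right pass lets "ABC DE" consume the shared "DE", so "DE FGHI" is missed; running the same matcher on the word-reversed document recovers it.

```python
# Hypothetical sketch: greedy non-overlapping keyword matching, run forward
# and again on the reversed document, so keywords anchored at either end of
# an overlap are both recovered across the two passes.
def greedy_match(document, keywords):
    """Return keywords found by greedy left-to-right, non-overlapping matching."""
    found, pos = [], 0
    while pos < len(document):
        for kw in sorted(keywords, key=len, reverse=True):
            if document.startswith(kw, pos):
                found.append(kw)
                pos += len(kw)
                break
        else:
            pos += 1
    return found

def rev(text):
    """Reverse word order (not characters) so keywords stay readable."""
    return " ".join(reversed(text.split()))

document = "ABC DE FGHI"
keywords = ["ABC DE", "DE FGHI"]   # second keyword is an illustrative assumption

forward = greedy_match(document, keywords)
backward = [rev(m) for m in greedy_match(rev(document), [rev(k) for k in keywords])]
# forward finds "ABC DE"; backward recovers "DE FGHI"
```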

Hi @pvcastro, which version of the transformers library are you using? In my local environment with transformers version 2.8.0, the tokenizer works fine. I've put a screenshot below for your...

Indeed, that looks strange. I don't know what's happening here. Maybe you can restart the IPython kernel and run the pipeline again from scratch?

My tokenizers library version is 0.5.2.

I don't cache the tokenizer intentionally. Maybe the transformers library does that automatically? I think you can ask this question in the transformers repository and get some support...

Below is the `transformers-cli env` output:
- `transformers` version: 2.8.0
- Platform: Linux-4.15.0-72-generic-x86_64-with-debian-buster-sid
- Python version: 3.7.4
- PyTorch version (GPU?): 1.4.0 (True)
- Tensorflow version (GPU?): not installed (NA)...

Hi @liuyaduo, thanks for the detailed comparison. Indeed, this code does not include that additional fully connected layer + activation function. You can easily add it as follows: ```...

Thanks for this comment. I initially chose these six skipgram positions to align with the existing literature. You can definitely change them to other positions, and I...
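To make the idea concrete, here is a hedged sketch of skip-gram feature extraction around a target token. The six (left, right) context-window shapes below are assumptions for illustration only; the positions actually used in the code may differ.

```python
# Hypothetical sketch: each (left, right) pair takes `left` words before the
# target and `right` words after it, with the target replaced by "__".
WINDOWS = [(1, 0), (2, 0), (0, 1), (0, 2), (1, 1), (2, 2)]  # assumed shapes

def skipgrams(tokens, i):
    """Return skip-gram context features for the token at index i."""
    feats = []
    for left, right in WINDOWS:
        if i - left < 0 or i + right >= len(tokens):
            continue  # window falls off the sentence boundary
        ctx = tokens[i - left:i] + ["__"] + tokens[i + 1:i + right + 1]
        feats.append(" ".join(ctx))
    return feats

feats = skipgrams(["new", "york", "is", "a", "city"], 1)
```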

Thanks for pointing this out. The seed entities need to appear in the generated entity2id.txt file. I think the phrases are joined with "_" during embedding learning and corpus...
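A small sketch of that check, under the assumption that entity2id.txt holds one tab-separated `entity<TAB>id` pair per line (the exact format is an assumption; the underscore-joining comes from the comment above):

```python
# Hypothetical sketch: normalize a seed phrase the same way the corpus
# processing does (spaces -> "_") and verify it exists in entity2id.txt.
def normalize(phrase):
    return phrase.strip().replace(" ", "_")

def load_entity_ids(lines):
    """Parse assumed 'entity<TAB>id' lines into a dict."""
    mapping = {}
    for line in lines:
        entity, eid = line.rstrip("\n").split("\t")
        mapping[entity] = int(eid)
    return mapping

# Toy stand-in for the real file's contents.
entity2id = load_entity_ids(["machine_learning\t0", "data_mining\t1"])
seed = "machine learning"
ok = normalize(seed) in entity2id   # seed must survive normalization
```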

The default Elasticsearch 5.4.0 configuration should suffice. The one thing you need is to enable the script scoring function by adding the following two lines to the config/elasticsearch.yml file:

```
script.inline: on
script.indexed: on
```
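For illustration, this is the kind of request those settings unlock: a `function_score` query with an inline script, which Elasticsearch rejects unless inline scripting is enabled. The field name and scoring formula below are hypothetical, not taken from the repository.

```python
import json

# Hypothetical sketch of an ES 5.x function_score query with an inline
# script_score. "text", "weight", and the formula are illustrative only.
query = {
    "query": {
        "function_score": {
            "query": {"match": {"text": "data mining"}},
            "script_score": {
                "script": {
                    "inline": "_score * doc['weight'].value",
                    "lang": "painless",
                }
            },
        }
    }
}
body = json.dumps(query)   # request body to POST to /<index>/_search
```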