Gensim's FastText model reads in unsupported modes from Facebook's FastText

Open mpenkov opened this issue 3 years ago • 11 comments

In gensim/models/fasttext.py:

    model = FastText(
        vector_size=m.dim,
        window=m.ws,
        epochs=m.epoch,
        negative=m.neg,
        # FIXME: these next 2 lines read in unsupported FB FT modes (loss=3 softmax or loss=4 onevsall,
        # or model=3 supervised) possibly creating inconsistent gensim model likely to fail later. Displaying
        # clear error/warning with explanatory message would be far better - even if there might be some reason
        # to continue with the load - such as providing read-only access to word-vectors trained those ways. (See:
        # https://github.com/facebookresearch/fastText/blob/2cc7f54ac034ae320a9af784b8145c50cc68965c/src/args.h#L19
        # for FB FT mode definitions.)
        hs=int(m.loss == 1),
        sg=int(m.model == 2),
        bucket=m.bucket,
        min_count=m.min_count,
        sample=m.t,
        min_n=m.minn,
        max_n=m.maxn,
    )
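
One way the check requested in the FIXME could look is sketched below. This is only an illustration, not gensim's actual code: the helper name `check_fb_modes`, the lookup tables, and the choice of `NotImplementedError` are all assumptions. The integer codes follow the FIXME comment (loss=3 softmax, loss=4 onevsall, model=3 supervised) and the `hs`/`sg` lines above; treating loss=2 as negative sampling and model=1 as CBOW is taken from the linked `args.h`.

```python
# Hypothetical helper, not part of gensim's API.
# Integer codes mirror the Facebook fastText enums referenced in the FIXME.
SUPPORTED_LOSSES = {1: "hs", 2: "ns"}
UNSUPPORTED_LOSSES = {3: "softmax", 4: "onevsall"}
SUPPORTED_MODELS = {1: "cbow", 2: "skipgram"}
UNSUPPORTED_MODELS = {3: "supervised"}


def check_fb_modes(loss, model):
    """Raise NotImplementedError if loss/model is not a supported mode."""
    problems = []
    if loss not in SUPPORTED_LOSSES:
        name = UNSUPPORTED_LOSSES.get(loss, "unknown")
        problems.append(f"loss={loss} ({name})")
    if model not in SUPPORTED_MODELS:
        name = UNSUPPORTED_MODELS.get(model, "unknown")
        problems.append(f"model={model} ({name})")
    if problems:
        # Fail early with an explanatory message instead of silently
        # building a model that may break later.
        raise NotImplementedError(
            "unsupported Facebook fastText mode(s): " + ", ".join(problems)
        )
```

Under these assumptions it would be called as `check_fb_modes(m.loss, m.model)` just before the `FastText(...)` constructor. If read-only access to the word vectors should remain possible, the exception could be swapped for a logged warning.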

mpenkov, Jun 22 '21 01:06

May I work on this?

amin110314, Jun 22 '21 02:06

Sure.

mpenkov, Jun 22 '21 06:06

Since there have been no commits for a month, can I take up this task? (I am just starting with open-source contributions; any references or pointers are welcome.)

kitrakrev, Aug 23 '21 04:08

In #3222 I have tried to add error handling before loading another fastText model into gensim's FastText. Please review it and tell me whether it will work.

Tewatia5355, Aug 30 '21 16:08

@mpenkov this is my first contribution to this project. Can you take a look at PR #3223 and let me know if it's good to go?

karan121bhukar, Aug 31 '21 08:08

Hi, is this issue still available to be worked on? Would love to get started with contributions on this project!

ryandelano, Nov 30 '23 18:11