Problem reading FAR file from farcompilestrings

Open mmcauliffe opened this issue 7 years ago • 7 comments

Hello,

I'm using the latest commit generated from CMake (version 3.12.2) along with OpenGRM-ngram 1.3.4 (https://github.com/MontrealCorpusTools/opengrm-ngram), both compiled using VS 2017. When I generate a FAR file using farcompilestrings, it completes successfully, but using it as input to any of the ngram binaries, such as ngramcount, gives the following error:

ERROR: Fst::Read: Unknown FST type vector (arc type = standard): <unspecified>

From the OpenFST changelog (http://www.openfst.org/twiki/pub/FST/FstDownload/NEWS) it looks like this was addressed somewhat in 1.6.8 (or at least the reporting was). The same versions of everything work on Linux, and the OpenGRM-ngram binaries compiled on Linux can read the generated FAR just fine, so I'm a bit stumped.

mmcauliffe avatar Sep 20 '18 22:09 mmcauliffe

The FAR extension is not compiled into the current build. Let me see what I can do over the weekend.

kkm000 avatar Sep 21 '18 19:09 kkm000

> I'm using the latest commit generated from CMake

@mmcauliffe, I am a little bit unclear: are you using CMake to build OpenFST from this repo? Or VS projects/solutions?

kkm000 avatar Sep 23 '18 10:09 kkm000

/cc @jtrmal Yenda, if you have some spare cycles, could you please chime in? I am not building the FAR extension binaries at all (yet? I probably should, as it appears there is demand out there). So I am almost sure this is CMake-related. Do you understand what is going on? I was getting almost exactly the same error before we added the /WHOLEARCHIVE trick last time. But that was different: the mainstream library tore off a file with static initialization only, so the library member was never pulled into executables on link. With FAR, it's different: the declarations are in a file that also has some code (https://github.com/kkm000/openfst/blob/d4dd88e17/src/extensions/far/far-class.cc#L38). I do not understand, though, whether these functions, FarReaderClass::Open and others in the same source file, are in fact a dependency of something that is actually compiled in. If not, we have the same problem with the FAR executables.
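
For anyone following along, here is a minimal sketch of the static-registration pattern that can produce an "Unknown FST type" error like this one. It is an illustration only: `Registry`, `Registerer`, and `Read` are hypothetical names made up for this example, not OpenFst's real API. The point is that a type becomes readable only if the constructor of a static object has run. If that object lives in a static-library member that nothing else references, the linker may leave the member out, the constructor never runs, and the lookup fails at run time. /WHOLEARCHIVE forces every member of the library into the link.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a type registry; NOT OpenFst's actual API.
using Factory = std::function<std::string()>;

// Function-local static, so the map exists before any Registerer uses it.
std::map<std::string, Factory>& Registry() {
  static std::map<std::string, Factory> registry;
  return registry;
}

// Constructing a Registerer adds an entry as a side effect.
struct Registerer {
  Registerer(const std::string& type, Factory factory) {
    Registry()[type] = std::move(factory);
  }
};

// In a real library this object would sit in its own .cc file inside a
// static library. If no symbol from that file is referenced, the linker
// may skip the file, and this registration silently never happens.
static Registerer register_vector("vector",
                                  [] { return std::string("VectorFst"); });

// Looks up a type by name; fails for anything that was never registered.
std::string Read(const std::string& type) {
  auto it = Registry().find(type);
  if (it == Registry().end()) return "ERROR: Unknown FST type " + type;
  return it->second();
}
```

In this single-file sketch the registration always runs, so `Read("vector")` succeeds and only an unregistered name such as `Read("const")` takes the error path. The linker-dropping case needs the registration to sit in a separate archive member.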

kkm000 avatar Sep 23 '18 10:09 kkm000

I'm gonna try to find a Windows box and try it. Yes, it looks like a registration issue (/WHOLEARCHIVE) to me as well. Y.

jtrmal avatar Sep 23 '18 12:09 jtrmal

@kkm000 Right, so the VS projects/solutions checked into the repo don't include the FAR extensions, so I was following the CMake instructions from kaldi/windows (https://github.com/kaldi-asr/kaldi/blob/master/windows/INSTALL.md#compiling-openfst), as those do generate the FAR extensions (and they seem to work alright with each other).

mmcauliffe avatar Sep 23 '18 14:09 mmcauliffe

I'll be working on adding them to the VS build then. No ETA at this moment, I'll try my best to do it this week.

kkm000 avatar Sep 24 '18 00:09 kkm000

I finally got some breathing space to work on this. @mmcauliffe, I'd really appreciate it if you could check whether the version I have so far works for you. It builds the FAR extensions with Visual Studio.

Extensions are compiled into the same libfst and libfstscript libraries, and binaries go into the same build_output/ directory as the fst*.exe executables.

The branch name is wip-build-optional-features. far is an optional feature that is currently enabled in this branch; see openfst.user.props. You can just build and see if it does the trick for you. My quick command-line checks apparently worked:

c:\projects\openfst\src>..\build_output\x64\Debug\bin\farcreate mdy_nosym.fst number_nosym.fst test.far

c:\projects\openfst\src>..\build_output\x64\Debug\bin\farinfo test.far
far type                                          sttable
arc type                                          standard
fst type                                          vector
# of FSTs                                         2
total # of states                                 24805
total # of arcs                                   54994
total # of final states                           2

I changed the MSBuild build process quite significantly. The project files have been moved to src/ from its subdirectories, and all objects are placed into a tree under an obj/ directory under the project root. If you are reusing the working directory, it makes sense to delete the whole src/ directory and then do a git reset --hard to resurrect it in a clean state, so you won't end up with 2G of orphaned object files. Then do git fetch and git checkout wip-build-optional-features.

kkm000 avatar Oct 06 '18 09:10 kkm000