kaldi icon indicating copy to clipboard operation
kaldi copied to clipboard

[WIP] Fix/update kaldi for openfst 1.8

Open auroraanon38 opened this issue 2 years ago • 17 comments

WIP to bring Kaldi into a workable state with openfst 1.8 as per this issue

Discussion and feedback required. As well as help with more complicated C++ modifications difficult with my current skills.

auroraanon38 avatar Mar 24 '22 05:03 auroraanon38

Apparently the Travis template will also need updating, as it isn't aware of C++17

auroraanon38 avatar Mar 24 '22 05:03 auroraanon38

Currently encountering an issue with operator+ for type conversion between basic string & basic string view.

./lat/arctic-weight.h:52:59: error: no match for ‘operator+’ (operand types are ‘std::string’ {aka ‘std::__cxx11::basic_string<char>’} and ‘std::string_view’ {aka ‘std::basic_string_view<char>’})
   52 |     static const std::string type = std::string("arctic") +
      |                                          ~~~~~~~~~~~~~~~~~^
   53 |         FloatWeightTpl<T>::GetPrecisionString();
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~            
In file included from /gnu/store/069aq2v993kpc41yabp5b6vm4wb9jkhg-gcc-10.3.0/include/c++/bits/stl_algobase.h:67,
                 from /gnu/store/069aq2v993kpc41yabp5b6vm4wb9jkhg-gcc-10.3.0/include/c++/vector:60,
                 from ../lat/lattice-functions.h:27,
                 from kws-functions2.cc:21:
/gnu/store/069aq2v993kpc41yabp5b6vm4wb9jkhg-gcc-10.3.0/include/c++/bits/stl_iterator.h:533:5: note: candidate: ‘template<class _Iterator> constexpr std::reverse_iterator<_Iterator> std::operator+(typename std::reverse_iterator<_Iterator>::difference_type, const std::reverse_iterator<_Iterator>&)’
  533 |     operator+(typename reverse_iterator<_Iterator>::difference_type __n,
      |     ^~~~~~~~
/gnu/store/069aq2v993kpc41yabp5b6vm4wb9jkhg-gcc-10.3.0/include/c++/bits/stl_iterator.h:533:5: note:   template argument deduction/substitution failed:
In file included from ../kws/kaldi-kws.h:25,
                 from ../kws/kws-functions.h:27,
                 from kws-functions2.cc:22:
../lat/arctic-weight.h:52:59: note:   ‘std::string_view’ {aka ‘std::basic_string_view<char>’} is not derived from ‘const std::reverse_iterator<_Iterator>’
   52 |     static const std::string type = std::string("arctic") +
      |                                          ~~~~~~~~~~~~~~~~~^
   53 |         FloatWeightTpl<T>::GetPrecisionString();
      |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~            
In file included from /gnu/store/069aq2v993kpc41yabp5b6vm4wb9jkhg-gcc-10.3.0/include/c++/bits/stl_algobase.h:67,
                 from /gnu/store/069aq2v993kpc41yabp5b6vm4wb9jkhg-gcc-10.3.0/include/c++/vector:60,
                 from ../lat/lattice-functions.h:27,
                 from kws-functions2.cc:21:
/gnu/store/069aq2v993kpc41yabp5b6vm4wb9jkhg-gcc-10.3.0/include/c++/bits/stl_iterator.h:1578:5: note: candidate: ‘template<class _Iterator> constexpr std::move_iterator<_IteratorL> std::operator+(typename std::move_iterator<_IteratorL>::difference_type, const std::move_iterator<_IteratorL>&)’
 1578 |     operator+(typename move_iterator<_Iterator>::difference_type __n,
      |     ^~~~~~~~

Is it preferable for me to extend the operator to cover this specific case, or replace the calls with explicit conversions?

auroraanon38 avatar Mar 24 '22 05:03 auroraanon38

The tricky thing, and the reason we hvaen't done this earlier, is that we don't want to break existing builds with older OpenFST versions. I assume OpenFST must export some kind of flag saying what the version is; we might be able to use ifdefs based on that....

danpovey avatar Mar 24 '22 07:03 danpovey

.. and you can probably fix the operator+ problem by casting the second arg to std::string, i.e replace x.GetPrecisionString() with std::string(x.GetPrecisionString()).

danpovey avatar Mar 24 '22 07:03 danpovey

Responding to: "Is it preferable for me to extend the operator to cover this specific case, or replace the calls with explicit conversions?", which is not appearing on the webpage for some reason.. Explicit conversions. You shouldn't be extending the standard library for sure.

danpovey avatar Mar 24 '22 08:03 danpovey

I have certain doubts about enforcing C++17 y.

On Thu, Mar 24, 2022 at 4:16 AM Daniel Povey @.***> wrote:

Responding to: "Is it preferable for me to extend the operator to cover this specific case, or replace the calls with explicit conversions?", which is not appearing on the webpage for some reason.. Explicit conversions. You shouldn't be extending the standard library for sure.

— Reply to this email directly, view it on GitHub https://github.com/kaldi-asr/kaldi/pull/4716#issuecomment-1077352876, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACUKYX67VHUUOKIKN2B653LVBQQG3ANCNFSM5RQAE42A . You are receiving this because you are subscribed to this thread.Message ID: @.***>

jtrmal avatar Mar 24 '22 13:03 jtrmal

The tricky thing, and the reason we hvaen't done this earlier, is that we don't want to break existing builds with older OpenFST versions. I assume OpenFST must export some kind of flag saying what the version is; we might be able to use ifdefs based on that....

Right, that's reasonable. I'll look to see if it has such a flag. The configure file has the version and might be putting it somewhere useful, but it might not be adding any such symbol inside the library itself.

.. and you can probably fix the operator+ problem by casting the second arg to std::string, i.e replace x.GetPrecisionString() with std::string(x.GetPrecisionString()).

Noted, will do.

Responding to: "Is it preferable for me to extend the operator to cover this specific case, or replace the calls with explicit conversions?", which is not appearing on the webpage for some reason.. Explicit conversions. You shouldn't be extending the standard library for sure.

Yeah, I've gotten too used to multimethod patterns.

I have certain doubts about enforcing C++17

Is it possible to build C++ libraries with C++17 but use them in programs compiled with lower versions?

In any case you have a point that it could and should be turned into a conditional variable dependent on openfst use inside those Makefiles, rather than hardcoded as I did.

auroraanon38 avatar Mar 24 '22 18:03 auroraanon38

Ad the C++17 Lots of the OpenFST stuff is templated and now written with the C++17 features, so I'm afraid you will need C++17 compliant compiler every time you will need to compile something. Unless there will be some compatibility layer or something.... y.

On Thu, Mar 24, 2022 at 2:46 PM Aurora @.***> wrote:

The tricky thing, and the reason we hvaen't done this earlier, is that we don't want to break existing builds with older OpenFST versions. I assume OpenFST must export some kind of flag saying what the version is; we might be able to use ifdefs based on that....

Right, that's reasonable. I'll look to see if it has such a flag. The configure file has the version and might be putting it somewhere useful, but it might not be adding any such symbol inside the library itself.

.. and you can probably fix the operator+ problem by casting the second arg to std::string, i.e replace x.GetPrecisionString() with std::string(x.GetPrecisionString()).

Noted, will do.

https://github.com/kaldi-asr/kaldi/pull/4716#issuecomment-1077352876Responding to: "Is it preferable for me to extend the operator to cover this specific case, or replace the calls with explicit conversions?", which is not appearing on the webpage for some reason.. Explicit conversions. You shouldn't be extending the standard library for sure.

Yeah, I've gotten too used to the Lisp multimethod patterns.

I have certain doubts about enforcing C++17

Is it possible to build C++ libraries with C++17 but use them in programs compiled with lower versions?

In any case you have a point that it could and should be turned into a conditional variable dependent on openfst use inside those Makefiles, rather than hardcoded as I did.

— Reply to this email directly, view it on GitHub https://github.com/kaldi-asr/kaldi/pull/4716#issuecomment-1077936841, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACUKYX44Z3L4K5FYIO35SXTVBS2BFANCNFSM5RQAE42A . You are receiving this because you commented.Message ID: @.***>

jtrmal avatar Mar 24 '22 18:03 jtrmal

Noting that this PR also pertains to https://github.com/kaldi-asr/kaldi/issues/4131 and https://github.com/kaldi-asr/kaldi/issues/4393

auroraanon38 avatar Mar 24 '22 18:03 auroraanon38

Ad the C++17 Lots of the OpenFST stuff is templated and now written with the C++17 features, so I'm afraid you will need C++17 compliant compiler every time you will need to compile something. Unless there will be some compatibility layer or something.... y.

Alright.

auroraanon38 avatar Mar 24 '22 18:03 auroraanon38

I wonder whether, if it's not practical to make everything conditional, we could create a separate branch of Kaldi for use with the more recent OpenFST versions?

danpovey avatar Mar 25 '22 05:03 danpovey

Daniel Povey @.***> writes:

I wonder whether, if it's not practical to make everything conditional, we could create a separate branch of Kaldi for use with the more recent OpenFST versions?

That would likely be the simplest and cleanest option.

auroraanon38 avatar Mar 25 '22 15:03 auroraanon38

I like many of these changes, not even related to OpenFST per se. We should certainly divorce Kaldi from OpenFST primitive types.

So far in this changeset, there is not so much stuff that would require complex shimming. 3 or 4 small functions will do. We support much larger divergence in CUDA. I do not think maintaining a forked branch is a viable option. Someone has to be responsible for replicating changes to it, and then our test pipelines are barely holding, and the amount of testing immediately doubles. We'll require a lower version of OpenFST in the out-of-the-box, "standard" build, but support 1.8+ and C++17 for custom builds. C++17 is fully supported already.

@danpovey, @jtrmal, what do you think? There was another issue with changed command line arguments to one of the FST binaries, breaking scripts, somewhere between 1.7.2 and 1.7.7.

I do not really understand the benefits of using FST 1.8 as the default. Is it faster, or fixing any issues? If we're not using features added to higher versions, then the default for the Kaldi-as-toolkit can be 1.7.x, where x is the version right before the breaking command line change. Kaldi-as-library, which the users integrate, should not use command line tools, and is not affected.

kkm000 avatar Apr 01 '22 03:04 kkm000

I'm ok with that

On Fri, Apr 1, 2022 at 6:38 AM kkm000 @.***> wrote:

I like many of these changes, not even related to OpenFST per se. We should certainly divorce Kaldi from OpenFST primitive types.

So far in this changeset, there is not so much stuff that would require complex shimming. 3 or 4 small functions will do. We support much larger divergence in CUDA. I do not think maintaining a forked branch is a viable option. Someone has to be responsible for replicating changes to it, and then our test pipelines are barely holding, and the amount of testing immediately doubles. We'll require a lower version of OpenFST in the out-of-the-box, "standard" build, but support 1.8+ and C++17 for custom builds. C++17, is fully supported.

@danpovey https://github.com/danpovey, @jtrmal https://github.com/jtrmal, what do you think? There was another issue with changed command line arguments to one of the FST binaries, breaking scripts, somewhere between 1.7.2 and 1.7.7.

I do not really understand the benefits of using FST 1.8 as the default. Is it faster, or fixing any issues? If we're not using features added to higher versions, then the default for the Kaldi-as-toolkit can be 1.7.x, where x is the version right before the breaking command line change. Kaldi-as-library, which the users integrate, should not use command line tools, and is not affected.

— Reply to this email directly, view it on GitHub https://github.com/kaldi-asr/kaldi/pull/4716#issuecomment-1085382642, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACUKYX5RZRREUBWE6DNDDFLVCZVTPANCNFSM5RQAE42A . You are receiving this because you were mentioned.Message ID: @.***>

jtrmal avatar Apr 01 '22 07:04 jtrmal

This issue has been automatically marked as stale by a bot solely because it has not had recent activity. Please add any comment (simply 'ping' is enough) to prevent the issue from being closed for 60 more days if you believe it should be kept open.

stale[bot] avatar Jun 04 '22 03:06 stale[bot]

Sorry I missed this earlier, @kkm000 I think your proposal is OK. I suppose you are talking about making our codebase support both older and newer version of OpenFST.

danpovey avatar Jun 04 '22 06:06 danpovey

This issue has been automatically marked as stale by a bot solely because it has not had recent activity. Please add any comment (simply 'ping' is enough) to prevent the issue from being closed for 60 more days if you believe it should be kept open.

stale[bot] avatar Aug 12 '22 01:08 stale[bot]

makes sense to me. Apologies if I sounded too negative. I dont even oppose to switching to C++17, it just feels it would cut off quite a large portion of the user base. y..

On Fri, Mar 25, 2022 at 11:19 AM Aurora @.***> wrote:

Daniel Povey @.***> writes:

I wonder whether, if it's not practical to make everything conditional, we could create a separate branch of Kaldi for use with the more recent OpenFST versions?

That would likely be the simplest and cleanest option.

— Reply to this email directly, view it on GitHub https://github.com/kaldi-asr/kaldi/pull/4716#issuecomment-1079132724, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACUKYX4SUJJHW7QIJZTQVATVBXKQVANCNFSM5RQAE42A . You are receiving this because you commented.Message ID: @.***>

jtrmal avatar Oct 11 '22 08:10 jtrmal

This issue has been automatically marked as stale by a bot solely because it has not had recent activity. Please add any comment (simply 'ping' is enough) to prevent the issue from being closed for 60 more days if you believe it should be kept open.

stale[bot] avatar Dec 16 '22 04:12 stale[bot]

Bad bot! Boo!

kkm000 avatar Dec 17 '22 02:12 kkm000

This issue has been automatically marked as stale by a bot solely because it has not had recent activity. Please add any comment (simply 'ping' is enough) to prevent the issue from being closed for 60 more days if you believe it should be kept open.

stale[bot] avatar Feb 18 '23 03:02 stale[bot]

ping

danpovey avatar Feb 18 '23 07:02 danpovey

@auroraanon38, is there some work still going on in this issue? We have some patches that fix compatibility with newer versions of OpenFST.

I might upstream them though they will not bring perfect compatibility out of the box.

@danpovey, is there any kind of a todolist fixed somewhere which describes the whole migration process?

georgthegreat avatar Feb 18 '23 20:02 georgthegreat

Yuriy Chernyshov @.***> writes:

@auroraanon38, is there some work still going on in this issue? We have some patches that fix compatibility with newer versions of OpenFST. Nope, no work is currently ongoing, it is legitimately stale.

auroraanon38 avatar Feb 23 '23 17:02 auroraanon38

Nope, no work is currently ongoing, it is legitimately stale.

Could you, please, elaborate a bit? What is the rationale behind the stale?

Would you (or maybe someone else in the project) accept the PRs supporting newer versions of OpenFST (I assume this would mean dropping the support for any other version though).

georgthegreat avatar Feb 23 '23 18:02 georgthegreat

He, we would support but it would have to be done more carefully than just dropping old code, I think the reason is that the new OpenFST changes some interfaces and needs C++17(?) standard So resolving this is not trivial -- probably would need conditional compilation and stuff But if you just check how much code changes is in in C++ codes only, we/I could then try to figure out the rest y.

On Thu, Feb 23, 2023 at 1:08 PM Yuriy Chernyshov @.***> wrote:

Nope, no work is currently ongoing, it is legitimately stale.

Could you, please, elaborate a bit? What is the rationale behind the stale?

Would you (or maybe someone else in the project) accept the PRs supporting newer versions of OpenFST (I assume this would mean dropping the support for any other version though).

— Reply to this email directly, view it on GitHub https://github.com/kaldi-asr/kaldi/pull/4716#issuecomment-1442214774, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACUKYX67OF47WYGUNTBAYG3WY6RQPANCNFSM5RQAE42A . You are receiving this because you were mentioned.Message ID: @.***>

jtrmal avatar Feb 23 '23 18:02 jtrmal

@jtrmal, I have just submitted PR #4829 with a chain of commits needed to support openfst version up to 1.8.2.

georgthegreat avatar Feb 23 '23 18:02 georgthegreat

I apologize I fell out of the loop, but I want to remind that a command line of one tool has changed somewhere between minor v1.7 points, most probably in [1.7.2, 1.7.8] range. I forgot which binary was that.

I have not only nothing against bumping Kaldi to C++17, but all for it. We haven't yet even embraced full C++11 support for system-independent language features, really: some stuff like aligned memory allocation and memory model in general, and PRNG have been standardized, so we can get rid of quite a few #if WIN32 and such stuff. In fact, we wrote two very multithreaded and lock-contending Kaldi-based production servers; the first in C++14, xplat Linux/Windows; the second together with @jtrmal in C++17, without a single system call in the main logic paths¹

I don't see much value in sticking with a 10-year-C++. Or even 5-year-C++, to that matter. In any case, Git tags and maintenance branches exist, and for a good measure.

¹ The three exceptions outside of the main course are (1) service startup/termination, of which there are five standards: init, upstart, systemd and containers on Linux and Service Control Manager on Windows, (2) Windows-only built-in ultra-high throughput metric collector, with no Linux counterpart of similar performance, and (3) three different logging systems on Linux: syslog, systemd journal and in-container Docker logging, and yet the fourth on Windows.

kkm000 avatar Feb 23 '23 19:02 kkm000

OK, but is it possible to use C++17 with the older OpenFST version? I think there is some desirability for keeping back-compatibility, and if it's not possible to support the newer OpenFST without a seamless upgrade process (e.g. enabled via #ifdefs on the OpenFST version number) then I'm not convinced the benefits outweigh the costs.

danpovey avatar Feb 24 '23 06:02 danpovey

We used to compile OpenFST with c++17 (we currently compile it with c++20).

We are maintaining a huge monorepo with hundreds of opensource projects. The only way to make opensource work at scale we were able to find so far is update everytihng to the most recent version and fix small incompatibilities by patches, then upstream patches.

So here I am, upstreaming our patches.

georgthegreat avatar Feb 24 '23 11:02 georgthegreat