kaldi
kaldi copied to clipboard
[WIP] Fix/update kaldi for openfst 1.8
WIP to bring Kaldi into a workable state with openfst 1.8 as per this issue
Discussion and feedback required. As well as help with more complicated C++ modifications difficult with my current skills.
Apparently the Travis template will also need updating, as it isn't aware of C++17
Currently encountering an issue with operator+
for type conversion between basic string & basic string view.
./lat/arctic-weight.h:52:59: error: no match for ‘operator+’ (operand types are ‘std::string’ {aka ‘std::__cxx11::basic_string<char>’} and ‘std::string_view’ {aka ‘std::basic_string_view<char>’})
52 | static const std::string type = std::string("arctic") +
| ~~~~~~~~~~~~~~~~~^
53 | FloatWeightTpl<T>::GetPrecisionString();
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /gnu/store/069aq2v993kpc41yabp5b6vm4wb9jkhg-gcc-10.3.0/include/c++/bits/stl_algobase.h:67,
from /gnu/store/069aq2v993kpc41yabp5b6vm4wb9jkhg-gcc-10.3.0/include/c++/vector:60,
from ../lat/lattice-functions.h:27,
from kws-functions2.cc:21:
/gnu/store/069aq2v993kpc41yabp5b6vm4wb9jkhg-gcc-10.3.0/include/c++/bits/stl_iterator.h:533:5: note: candidate: ‘template<class _Iterator> constexpr std::reverse_iterator<_Iterator> std::operator+(typename std::reverse_iterator<_Iterator>::difference_type, const std::reverse_iterator<_Iterator>&)’
533 | operator+(typename reverse_iterator<_Iterator>::difference_type __n,
| ^~~~~~~~
/gnu/store/069aq2v993kpc41yabp5b6vm4wb9jkhg-gcc-10.3.0/include/c++/bits/stl_iterator.h:533:5: note: template argument deduction/substitution failed:
In file included from ../kws/kaldi-kws.h:25,
from ../kws/kws-functions.h:27,
from kws-functions2.cc:22:
../lat/arctic-weight.h:52:59: note: ‘std::string_view’ {aka ‘std::basic_string_view<char>’} is not derived from ‘const std::reverse_iterator<_Iterator>’
52 | static const std::string type = std::string("arctic") +
| ~~~~~~~~~~~~~~~~~^
53 | FloatWeightTpl<T>::GetPrecisionString();
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /gnu/store/069aq2v993kpc41yabp5b6vm4wb9jkhg-gcc-10.3.0/include/c++/bits/stl_algobase.h:67,
from /gnu/store/069aq2v993kpc41yabp5b6vm4wb9jkhg-gcc-10.3.0/include/c++/vector:60,
from ../lat/lattice-functions.h:27,
from kws-functions2.cc:21:
/gnu/store/069aq2v993kpc41yabp5b6vm4wb9jkhg-gcc-10.3.0/include/c++/bits/stl_iterator.h:1578:5: note: candidate: ‘template<class _Iterator> constexpr std::move_iterator<_IteratorL> std::operator+(typename std::move_iterator<_IteratorL>::difference_type, const std::move_iterator<_IteratorL>&)’
1578 | operator+(typename move_iterator<_Iterator>::difference_type __n,
| ^~~~~~~~
Is it preferable for me to extend the operator to cover this specific case, or replace the calls with explicit conversions?
The tricky thing, and the reason we hvaen't done this earlier, is that we don't want to break existing builds with older OpenFST versions. I assume OpenFST must export some kind of flag saying what the version is; we might be able to use ifdefs based on that....
.. and you can probably fix the operator+ problem by casting the second arg to std::string, i.e replace x.GetPrecisionString() with std::string(x.GetPrecisionString()).
Responding to: "Is it preferable for me to extend the operator to cover this specific case, or replace the calls with explicit conversions?", which is not appearing on the webpage for some reason.. Explicit conversions. You shouldn't be extending the standard library for sure.
I have certain doubts about enforcing C++17 y.
On Thu, Mar 24, 2022 at 4:16 AM Daniel Povey @.***> wrote:
Responding to: "Is it preferable for me to extend the operator to cover this specific case, or replace the calls with explicit conversions?", which is not appearing on the webpage for some reason.. Explicit conversions. You shouldn't be extending the standard library for sure.
— Reply to this email directly, view it on GitHub https://github.com/kaldi-asr/kaldi/pull/4716#issuecomment-1077352876, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACUKYX67VHUUOKIKN2B653LVBQQG3ANCNFSM5RQAE42A . You are receiving this because you are subscribed to this thread.Message ID: @.***>
The tricky thing, and the reason we hvaen't done this earlier, is that we don't want to break existing builds with older OpenFST versions. I assume OpenFST must export some kind of flag saying what the version is; we might be able to use ifdefs based on that....
Right, that's reasonable. I'll look to see if it has such a flag. The configure file has the version and might be putting it somewhere useful, but it might not be adding any such symbol inside the library itself.
.. and you can probably fix the operator+ problem by casting the second arg to std::string, i.e replace x.GetPrecisionString() with std::string(x.GetPrecisionString()).
Noted, will do.
Responding to: "Is it preferable for me to extend the operator to cover this specific case, or replace the calls with explicit conversions?", which is not appearing on the webpage for some reason.. Explicit conversions. You shouldn't be extending the standard library for sure.
Yeah, I've gotten too used to multimethod patterns.
I have certain doubts about enforcing C++17
Is it possible to build C++ libraries with C++17 but use them in programs compiled with lower versions?
In any case you have a point that it could and should be turned into a conditional variable dependent on openfst use inside those Makefiles, rather than hardcoded as I did.
Ad the C++17 Lots of the OpenFST stuff is templated and now written with the C++17 features, so I'm afraid you will need C++17 compliant compiler every time you will need to compile something. Unless there will be some compatibility layer or something.... y.
On Thu, Mar 24, 2022 at 2:46 PM Aurora @.***> wrote:
The tricky thing, and the reason we hvaen't done this earlier, is that we don't want to break existing builds with older OpenFST versions. I assume OpenFST must export some kind of flag saying what the version is; we might be able to use ifdefs based on that....
Right, that's reasonable. I'll look to see if it has such a flag. The configure file has the version and might be putting it somewhere useful, but it might not be adding any such symbol inside the library itself.
.. and you can probably fix the operator+ problem by casting the second arg to std::string, i.e replace x.GetPrecisionString() with std::string(x.GetPrecisionString()).
Noted, will do.
https://github.com/kaldi-asr/kaldi/pull/4716#issuecomment-1077352876Responding to: "Is it preferable for me to extend the operator to cover this specific case, or replace the calls with explicit conversions?", which is not appearing on the webpage for some reason.. Explicit conversions. You shouldn't be extending the standard library for sure.
Yeah, I've gotten too used to the Lisp multimethod patterns.
I have certain doubts about enforcing C++17
Is it possible to build C++ libraries with C++17 but use them in programs compiled with lower versions?
In any case you have a point that it could and should be turned into a conditional variable dependent on openfst use inside those Makefiles, rather than hardcoded as I did.
— Reply to this email directly, view it on GitHub https://github.com/kaldi-asr/kaldi/pull/4716#issuecomment-1077936841, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACUKYX44Z3L4K5FYIO35SXTVBS2BFANCNFSM5RQAE42A . You are receiving this because you commented.Message ID: @.***>
Noting that this PR also pertains to https://github.com/kaldi-asr/kaldi/issues/4131 and https://github.com/kaldi-asr/kaldi/issues/4393
Ad the C++17 Lots of the OpenFST stuff is templated and now written with the C++17 features, so I'm afraid you will need C++17 compliant compiler every time you will need to compile something. Unless there will be some compatibility layer or something.... y.
Alright.
I wonder whether, if it's not practical to make everything conditional, we could create a separate branch of Kaldi for use with the more recent OpenFST versions?
Daniel Povey @.***> writes:
I wonder whether, if it's not practical to make everything conditional, we could create a separate branch of Kaldi for use with the more recent OpenFST versions?
That would likely be the simplest and cleanest option.
I like many of these changes, not even related to OpenFST per se. We should certainly divorce Kaldi from OpenFST primitive types.
So far in this changeset, there is not so much stuff that would require complex shimming. 3 or 4 small functions will do. We support much larger divergence in CUDA. I do not think maintaining a forked branch is a viable option. Someone has to be responsible for replicating changes to it, and then our test pipelines are barely holding, and the amount of testing immediately doubles. We'll require a lower version of OpenFST in the out-of-the-box, "standard" build, but support 1.8+ and C++17 for custom builds. C++17 is fully supported already.
@danpovey, @jtrmal, what do you think? There was another issue with changed command line arguments to one of the FST binaries, breaking scripts, somewhere between 1.7.2 and 1.7.7.
I do not really understand the benefits of using FST 1.8 as the default. Is it faster, or fixing any issues? If we're not using features added to higher versions, then the default for the Kaldi-as-toolkit can be 1.7.x, where x is the version right before the breaking command line change. Kaldi-as-library, which the users integrate, should not use command line tools, and is not affected.
I'm ok with that
On Fri, Apr 1, 2022 at 6:38 AM kkm000 @.***> wrote:
I like many of these changes, not even related to OpenFST per se. We should certainly divorce Kaldi from OpenFST primitive types.
So far in this changeset, there is not so much stuff that would require complex shimming. 3 or 4 small functions will do. We support much larger divergence in CUDA. I do not think maintaining a forked branch is a viable option. Someone has to be responsible for replicating changes to it, and then our test pipelines are barely holding, and the amount of testing immediately doubles. We'll require a lower version of OpenFST in the out-of-the-box, "standard" build, but support 1.8+ and C++17 for custom builds. C++17, is fully supported.
@danpovey https://github.com/danpovey, @jtrmal https://github.com/jtrmal, what do you think? There was another issue with changed command line arguments to one of the FST binaries, breaking scripts, somewhere between 1.7.2 and 1.7.7.
I do not really understand the benefits of using FST 1.8 as the default. Is it faster, or fixing any issues? If we're not using features added to higher versions, then the default for the Kaldi-as-toolkit can be 1.7.x, where x is the version right before the breaking command line change. Kaldi-as-library, which the users integrate, should not use command line tools, and is not affected.
— Reply to this email directly, view it on GitHub https://github.com/kaldi-asr/kaldi/pull/4716#issuecomment-1085382642, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACUKYX5RZRREUBWE6DNDDFLVCZVTPANCNFSM5RQAE42A . You are receiving this because you were mentioned.Message ID: @.***>
This issue has been automatically marked as stale by a bot solely because it has not had recent activity. Please add any comment (simply 'ping' is enough) to prevent the issue from being closed for 60 more days if you believe it should be kept open.
Sorry I missed this earlier, @kkm000 I think your proposal is OK. I suppose you are talking about making our codebase support both older and newer version of OpenFST.
This issue has been automatically marked as stale by a bot solely because it has not had recent activity. Please add any comment (simply 'ping' is enough) to prevent the issue from being closed for 60 more days if you believe it should be kept open.
makes sense to me. Apologies if I sounded too negative. I dont even oppose to switching to C++17, it just feels it would cut off quite a large portion of the user base. y..
On Fri, Mar 25, 2022 at 11:19 AM Aurora @.***> wrote:
Daniel Povey @.***> writes:
I wonder whether, if it's not practical to make everything conditional, we could create a separate branch of Kaldi for use with the more recent OpenFST versions?
That would likely be the simplest and cleanest option.
— Reply to this email directly, view it on GitHub https://github.com/kaldi-asr/kaldi/pull/4716#issuecomment-1079132724, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACUKYX4SUJJHW7QIJZTQVATVBXKQVANCNFSM5RQAE42A . You are receiving this because you commented.Message ID: @.***>
This issue has been automatically marked as stale by a bot solely because it has not had recent activity. Please add any comment (simply 'ping' is enough) to prevent the issue from being closed for 60 more days if you believe it should be kept open.
Bad bot! Boo!
This issue has been automatically marked as stale by a bot solely because it has not had recent activity. Please add any comment (simply 'ping' is enough) to prevent the issue from being closed for 60 more days if you believe it should be kept open.
ping
@auroraanon38, is there some work still going on in this issue? We have some patches that fix compatibility with newer versions of OpenFST.
I might upstream them though they will not bring perfect compatibility out of the box.
@danpovey, is there any kind of a todolist fixed somewhere which describes the whole migration process?
Yuriy Chernyshov @.***> writes:
@auroraanon38, is there some work still going on in this issue? We have some patches that fix compatibility with newer versions of OpenFST. Nope, no work is currently ongoing, it is legitimately stale.
Nope, no work is currently ongoing, it is legitimately stale.
Could you, please, elaborate a bit? What is the rationale behind the stale?
Would you (or maybe someone else in the project) accept the PRs supporting newer versions of OpenFST (I assume this would mean dropping the support for any other version though).
He, we would support but it would have to be done more carefully than just dropping old code, I think the reason is that the new OpenFST changes some interfaces and needs C++17(?) standard So resolving this is not trivial -- probably would need conditional compilation and stuff But if you just check how much code changes is in in C++ codes only, we/I could then try to figure out the rest y.
On Thu, Feb 23, 2023 at 1:08 PM Yuriy Chernyshov @.***> wrote:
Nope, no work is currently ongoing, it is legitimately stale.
Could you, please, elaborate a bit? What is the rationale behind the stale?
Would you (or maybe someone else in the project) accept the PRs supporting newer versions of OpenFST (I assume this would mean dropping the support for any other version though).
— Reply to this email directly, view it on GitHub https://github.com/kaldi-asr/kaldi/pull/4716#issuecomment-1442214774, or unsubscribe https://github.com/notifications/unsubscribe-auth/ACUKYX67OF47WYGUNTBAYG3WY6RQPANCNFSM5RQAE42A . You are receiving this because you were mentioned.Message ID: @.***>
@jtrmal, I have just submitted PR #4829 with a chain of commits needed to support openfst version up to 1.8.2.
I apologize I fell out of the loop, but I want to remind that a command line of one tool has changed somewhere between minor v1.7 points, most probably in [1.7.2, 1.7.8] range. I forgot which binary was that.
I have not only nothing against bumping Kaldi to C++17, but all for it. We haven't yet even embraced full C++11 support for system-independent language features, really: some stuff like aligned memory allocation and memory model in general, and PRNG have been standardized, so we can get rid of quite a few #if WIN32
and such stuff. In fact, we wrote two very multithreaded and lock-contending Kaldi-based production servers; the first in C++14, xplat Linux/Windows; the second together with @jtrmal in C++17, without a single system call in the main logic paths¹
I don't see much value in sticking with a 10-year-C++. Or even 5-year-C++, to that matter. In any case, Git tags and maintenance branches exist, and for a good measure.
¹ The three exceptions outside of the main course are (1) service startup/termination, of which there are five standards: init, upstart, systemd and containers on Linux and Service Control Manager on Windows, (2) Windows-only built-in ultra-high throughput metric collector, with no Linux counterpart of similar performance, and (3) three different logging systems on Linux: syslog, systemd journal and in-container Docker logging, and yet the fourth on Windows.
OK, but is it possible to use C++17 with the older OpenFST version? I think there is some desirability for keeping back-compatibility, and if it's not possible to support the newer OpenFST without a seamless upgrade process (e.g. enabled via #ifdefs on the OpenFST version number) then I'm not convinced the benefits outweigh the costs.
We used to compile OpenFST with c++17 (we currently compile it with c++20).
We are maintaining a huge monorepo with hundreds of opensource projects. The only way to make opensource work at scale we were able to find so far is update everytihng to the most recent version and fix small incompatibilities by patches, then upstream patches.
So here I am, upstreaming our patches.