
WIP: add compute-post.

Open csukuangfj opened this issue 4 years ago • 9 comments

Usage:

$ snowfall net compute-post -m /ceph-fj/model-jit.pt -f exp/data/cuts_test-clean.json.gz -o exp

I find that there is one issue with the TorchScript module: we have to know the signature of the model's forward function as well as its subsampling factor.
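To illustrate why the subsampling factor matters: the posteriors have fewer frames than the input features, so any time stamp derived from them depends on that factor. Below is a minimal sketch, assuming a 10 ms feature frame shift and a subsampling factor of 4. Both values and the helper name are assumptions for illustration; neither value can be read from the scripted module itself.

```python
def posterior_frame_to_seconds(frame_index, frame_shift=0.01, subsampling_factor=4):
    """Map a posterior frame index to its start time in seconds.

    Assumed values: frame_shift is the feature frame shift (10 ms here) and
    subsampling_factor is how many feature frames one output frame covers.
    """
    return frame_index * frame_shift * subsampling_factor


# With the assumed defaults, posterior frame 25 starts at 25 * 0.01 * 4 seconds.
print(posterior_frame_to_seconds(25))
```

With a wrong subsampling factor, every time stamp computed from the posteriors would be scaled by the wrong amount, which is why the tool needs this value from the user.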


Working on compute-ali and will submit them together.

csukuangfj avatar Jun 10 '21 09:06 csukuangfj

I just created a pull-request in Lhotse https://github.com/lhotse-speech/lhotse/pull/319 to add posteriors to the class Cut. The motivation is to reuse the serialization and dataset code from it.


Also, I find the alignment information contained in the supervision is too simple, see https://github.com/lhotse-speech/lhotse/blob/ef7a037426f1b602a54f4d9ea43e711007e85719/lhotse/supervision.py#L24

    symbol: str    
    start: Seconds    
    duration: Seconds

Can we move the alignment class from snowfall to lhotse? https://github.com/k2-fsa/snowfall/blob/bce73304f40c321a6dad809058b12e559962c321/snowfall/tools/ali.py#L20-L28

csukuangfj avatar Jun 10 '21 09:06 csukuangfj

The usage of compute-ali:

$ snowfall ali compute-ali -l data/lang_nosp -p ./exp/cuts_post.json --max-duration=500 -o exp

csukuangfj avatar Jun 10 '21 13:06 csukuangfj

> Also, I find the alignment information contained in the supervision is too simple

Can you describe the issue more? I'm not sure I understand what's missing there. We could move Snowfall's frame-wise alignment to Lhotse but I'm not sure how to make the two representations compatible with each other (the CTM-like description seems more general to me as you can cast it to frame-wise representation with different frame shifts).
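To make the casting concrete, here is a minimal sketch of turning a CTM-like description into a frame-wise one. The `AlignmentItem` class only mirrors the three fields quoted above (with `Seconds` treated as a plain float); the `to_frames` helper, the `<blk>` filler symbol, and the rounding policy are assumptions for illustration, not code from Lhotse or Snowfall.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AlignmentItem:
    # Mirrors the fields quoted above; Seconds is treated as float here.
    symbol: str
    start: float
    duration: float


def to_frames(
    items: List[AlignmentItem],
    frame_shift: float,
    num_frames: int,
    blank: str = "<blk>",  # hypothetical filler for frames without a symbol
) -> List[str]:
    """Cast time-stamped symbols to one label per frame."""
    frames = [blank] * num_frames
    for item in items:
        # Round interval boundaries to the nearest frame index.
        first = int(round(item.start / frame_shift))
        last = int(round((item.start + item.duration) / frame_shift))
        for i in range(max(first, 0), min(last, num_frames)):
            frames[i] = item.symbol
    return frames


ali = [AlignmentItem("a", 0.0, 0.04), AlignmentItem("b", 0.04, 0.04)]
print(to_frames(ali, frame_shift=0.02, num_frames=4))
print(to_frames(ali, frame_shift=0.04, num_frames=2))
```

The same time-stamped items yield different frame-wise sequences depending on the chosen frame shift, whereas going the other way requires knowing the frame shift that produced the frames.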

pzelasko avatar Jun 10 '21 14:06 pzelasko

BTW I wonder if we should support piping these programs together, Kaldi-style. Click easily allows doing that with file type arguments.

We could do that by writing/reading JSONL-serialized manifests in a streaming manner. Since most operations on CutSet refer to individual operations on Cut, this seems feasible without the need to re-write too much code. There is a function in Lhotse that tries to figure out the right manifest type from a dict, which can be used to parse individual lines (BTW @csukuangfj I just realized that you might need to extend that function to handle the posterior manifests in your Lhotse PR).
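A rough sketch of what the streaming read side could look like, using only the standard library. The `guess_manifest_type` heuristic and the field names it checks are hypothetical stand-ins; this is not the Lhotse function mentioned above, whose actual logic may differ.

```python
import io
import json
from typing import Any, Dict, Iterator, TextIO, Tuple


def guess_manifest_type(data: Dict[str, Any]) -> str:
    # Hypothetical heuristic keyed on field names (assumed, not Lhotse's logic).
    if "sources" in data:
        return "Recording"
    if "supervisions" in data:
        return "Cut"
    if "text" in data:
        return "Supervision"
    raise ValueError(f"Cannot infer manifest type from keys: {sorted(data)}")


def stream_manifests(lines: TextIO) -> Iterator[Tuple[str, Dict[str, Any]]]:
    """Yield (type, dict) pairs one line at a time, without loading the whole set."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        data = json.loads(line)
        yield guess_manifest_type(data), data


# In a pipe, `lines` would be sys.stdin; a StringIO stands in for it here.
demo = io.StringIO('{"id": "c1", "supervisions": []}\n{"id": "r1", "sources": []}\n')
for kind, item in stream_manifests(demo):
    print(kind, item["id"])
```

Because each line is parsed independently, a downstream program can start working before the upstream one has finished writing.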

WDYT?

pzelasko avatar Jun 10 '21 18:06 pzelasko

... there is also some code for line-by-line incremental JSONL writing in Lhotse that could be extended to support this.
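For the write side, a minimal sketch of line-by-line JSONL output follows. The `JsonlWriter` class is a hypothetical illustration written with the standard library only; it is not the existing Lhotse writer.

```python
import io
import json
from typing import Any, Dict, TextIO


class JsonlWriter:
    """Write one JSON object per line and flush after each item."""

    def __init__(self, stream: TextIO) -> None:
        self.stream = stream

    def write(self, item: Dict[str, Any]) -> None:
        self.stream.write(json.dumps(item) + "\n")
        # Flush so that a program reading from the pipe sees the item right away.
        self.stream.flush()


# In a pipe, the stream would be sys.stdout; a StringIO stands in for it here.
buf = io.StringIO()
writer = JsonlWriter(buf)
writer.write({"id": "c1"})
writer.write({"id": "c2"})
print(buf.getvalue(), end="")
```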

pzelasko avatar Jun 10 '21 18:06 pzelasko

This is cool; I'm afraid I'm not following it in detail. Just a reminder: this is more of an "experimental direction" at this point. We'll have to learn from experience whether these kinds of command-line utilities are actually useful.

danpovey avatar Jun 11 '21 16:06 danpovey

Fair enough. The idea is to allow something like:

snowfall net compute-post <some-inputs-args..> - | snowfall ali compute-ali - <some-more-args..>

but I just realized that with the current way things are done in Lhotse, we would have to store the actual arrays/tensors on disk and just pass the manifests around, which might not be optimal. Maybe it's not relevant for now and we can see how to do that in the future, if needed at all.

pzelasko avatar Jun 11 '21 16:06 pzelasko

BTW, I tend to think being able to do something at all tends to be more important than that thing being efficient (premature optimization being the root of all evil, etc.), although I did plenty of it in Kaldi. I don't know what the optimal solution is here; I am afraid I have not been following this PR closely enough.

danpovey avatar Jun 11 '21 16:06 danpovey

Agreed. But for the record, the full quote is actually:

"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%."

pzelasko avatar Jun 11 '21 16:06 pzelasko