croissant icon indicating copy to clipboard operation
croissant copied to clipboard

Preserve native sample rate, don't auto-resample to 22050

Open tombagby opened this issue 9 months ago • 6 comments

The default in librosa is sr=22050, which forces resampling, it has to be set explicitly to None to preserve the original sample rate. I doubt this is intentional?

If resampling is desired, the field description should support an explicit target sample rate to enable.

tombagby avatar Apr 15 '25 04:04 tombagby

MLCommons CLA bot All contributors have signed the MLCommons CLA ✍️ ✅

github-actions[bot] avatar Apr 15 '25 04:04 github-actions[bot]

recheck

tombagby avatar May 06 '25 01:05 tombagby

Hi @tombagby , thank you for your contribution!

I think you'd need to ru:

mlcroissant load --jsonld ../../datasets/0.8/audio_test/metadata.json --record_set records --num_records 2 --debug --update_output
mlcroissant load --jsonld ../../datasets/1.0/audio_test/metadata.json --record_set records --num_records 2 --debug --update_output

to update the test outputs, which should hopefully fix the failing tests.

ccl-core avatar May 07 '25 12:05 ccl-core

@ccl-core in case you have a maintainer right to re-run the test, you can try that as well. I think it should work since we already have #863 merged into the main branch (which is the branch that this PR trying to merge to).

Click on "..." -> "View details" Screenshot 2568-05-07 at 15 07 34

Click on "Re-run this job" Screenshot 2568-05-07 at 15 08 40

bact avatar May 07 '25 14:05 bact

Uhm, I thought that the change in this PR would also affect the output records? In any case, I rerun the CI tests and there still are errors: https://github.com/mlcommons/croissant/actions/runs/14850204797/job/41806335710?pr=849

ccl-core avatar May 07 '25 15:05 ccl-core

@ccl-core You're right. It could work with other PRs but not this one.

bact avatar May 07 '25 16:05 bact