unsupervised_topic_segmentation Share code that takes the AMI data and formats it for your internal db please

Open KeithYJohnson opened this issue 2 years ago • 1 comment

Hey guys,

It'd be wonderful to have a snippet of whatever code transforms your source of the AMI dataset, and to know what that source is. Did you start from https://huggingface.co/datasets/ami? Or maybe you just unzipped and transformed https://groups.inf.ed.ac.uk/ami/download/? It looks like you computed sentence start times (AMI itself only has segment- and word-level timings), but I've noticed that the AMI test set has some duplicate segment start times. So it'd be interesting to see how you handled such cases.
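To make the duplicate start time question concrete, here is a rough sketch of the kind of check I mean. It assumes the segments have already been loaded into a list of dicts; the field names `start`, `speaker`, and `text` and the sample rows are made up for illustration and are not the actual AMI schema or your internal format.

```python
from collections import defaultdict


def find_duplicate_starts(segments):
    """Return {start_time: [segments]} for start times shared by 2+ segments."""
    by_start = defaultdict(list)
    for seg in segments:
        by_start[seg["start"]].append(seg)
    return {start: segs for start, segs in by_start.items() if len(segs) > 1}


# Hypothetical rows for illustration only, not real AMI data.
segments = [
    {"start": 12.5, "speaker": "A", "text": "okay"},
    {"start": 12.5, "speaker": "B", "text": "right"},
    {"start": 14.0, "speaker": "A", "text": "let's start"},
]

duplicates = find_duplicate_starts(segments)
for start, segs in duplicates.items():
    print(start, [s["speaker"] for s in segs])
```

Whether such collisions should be merged, ordered by speaker, or dropped is exactly the part I can't infer from the repo.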

Nov 23 '22 22:11 KeithYJohnson

I'm not one of the authors, but after a long search I found that you can use this dataset: https://github.com/Yale-LILY/QMSum

Feb 24 '23 14:02 BMukhtar
