ml-stuttering-events-dataset

Improvements to avoid errors

Open Vyaas99 opened this issue 1 year ago • 11 comments

When I tried running the code today, I encountered a lot of errors and I just improved a few parts.

I improved the downloading of files using the requests library.
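A minimal sketch of what a requests-based download step could look like. This is not the code from this issue; the function name `download_file`, the injectable `get` parameter, the chunk size, and the timeout are all assumptions made for illustration. The printed messages mirror the "Download failed for:" / "Error:" format seen in the error report later in this thread.

```python
import os

import requests


def download_file(url, dest_path, get=requests.get, timeout=30):
    """Stream `url` to `dest_path`; return True on success, False on failure.

    `get` is injectable (hypothetical design choice) so the function can be
    exercised without network access.
    """
    try:
        response = get(url, stream=True, timeout=timeout)
        response.raise_for_status()  # turn 4xx/5xx responses into exceptions
        with open(dest_path, "wb") as fh:
            for chunk in response.iter_content(chunk_size=8192):
                fh.write(chunk)
        return True
    except requests.RequestException as err:
        print(f"Download failed for: {url}")
        print(f"Error: {err}")
        # Remove a partially written file so a rerun does not treat it as complete.
        if os.path.exists(dest_path):
            os.remove(dest_path)
        return False
```

Returning a boolean instead of raising lets a calling loop continue past dead links and collect the failures at the end.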

I improved the conversion of .mp4 to .wav in a more efficient way. The current one was throwing errors for me.
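For readers who want to try the conversion step in isolation, here is a hedged sketch that shells out to ffmpeg. It assumes ffmpeg is installed and on the PATH; the 16 kHz mono output and the helper names `build_ffmpeg_command` and `convert_to_wav` are illustrative assumptions, not taken from the code discussed in this issue.

```python
import subprocess


def build_ffmpeg_command(src_path, wav_path, sample_rate=16000, channels=1):
    """Return the ffmpeg argument list for converting `src_path` to WAV."""
    return [
        "ffmpeg",
        "-y",                     # overwrite an existing output file
        "-i", src_path,           # input file (.mp4, .mp3, ...)
        "-vn",                    # drop any video stream
        "-ac", str(channels),     # number of audio channels (assumed mono)
        "-ar", str(sample_rate),  # output sample rate in Hz (assumed 16 kHz)
        wav_path,
    ]


def convert_to_wav(src_path, wav_path):
    """Run ffmpeg; return True if it exited with status 0."""
    try:
        result = subprocess.run(
            build_ffmpeg_command(src_path, wav_path),
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        print("ffmpeg executable not found on PATH")
        return False
    if result.returncode != 0:
        print(f"Conversion failed for: {src_path}")
        print(result.stderr)
    return result.returncode == 0
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.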

Vyaas99 avatar Jan 22 '24 01:01 Vyaas99

Hello, I noticed that you have been using this repository recently. Have you encountered download failures caused by broken URLs in the stutteringiscool and strongvoices sections of /SEP-28k_episodes.csv?

alanshaoTT avatar Jan 23 '24 10:01 alanshaoTT

I recall downloading those episodes from SEP-28k_episodes.csv, but I deleted the whole thing because it was taking too long to run and I was trying to download from fluencybank_episodes instead.

I am still not able to run the whole thing because it stops partway through for no apparent reason. Were you able to run the whole thing? If so, was that with the original code or my code?

Vyaas99 avatar Jan 24 '24 00:01 Vyaas99

I'm using the original code, with one change: table = np.genfromtxt(episode_uri, dtype=str, delimiter=", ", encoding='utf-8'). I am downloading the dataset and the code is working, but the problem is that some URLs are not working.
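To illustrate what the np.genfromtxt call quoted above returns, here is a small sketch run on an in-memory stand-in for the episode CSV. The two sample rows, their column layout, and the example.com URLs are hypothetical; only the call's arguments come from the comment.

```python
import io

import numpy as np

# Hypothetical two-row stand-in for the episode CSV (column layout is assumed).
sample_csv = io.StringIO(
    "ShowA, https://example.com/ep1.mp3, 1\n"
    "ShowB, https://example.com/ep2.mp3, 2\n"
)

# Same arguments as in the comment: every field is read as a string,
# fields are split on comma-plus-space, and the input is decoded as UTF-8.
table = np.genfromtxt(sample_csv, dtype=str, delimiter=", ", encoding="utf-8")

print(table.shape)   # rows x columns
print(table[:, 1])   # the (assumed) URL column
```

Note that genfromtxt treats `#` as a comment marker by default, so a URL containing `#` would be cut off unless `comments=None` is passed.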

alanshaoTT avatar Jan 24 '24 05:01 alanshaoTT

It's still downloading, but it's very slow.

alanshaoTT avatar Jan 24 '24 05:01 alanshaoTT

Hi! I tried running the code again to check and you are right. The strongvoices and stutteringiscool sections are not downloading. The links are not working. It is good to know that the original code with just that one change is working for you though.

For me, the files seem to be downloading and converting very fast. Can you try running my code and letting me know if there is any improvement in the speed?

Vyaas99 avatar Jan 26 '24 02:01 Vyaas99

I have just tried your code, but something seems to be wrong; you can see the error report below. As for my slow downloads, I think it's because I'm in China and there are network restrictions on my machine. So is there no way to get the missing data? I'll try to contact the authors of papers that used SEP-28K, hoping that they have the whole dataset.

$ python new_dowanload.py --episodes SEP-28k_episodes_new.csv --wavs /home/work_nfs10/mcshao/SEP-28K/Raw_Data
Download failed for: https://stutterrockstar.files.wordpress.com/2013/08/episode-108-with-roisin.mp3
Error: 404 Client Error: Not Found for url: https://stutterrockstar.files.wordpress.com/2013/08/episode-108-with-roisin.mp3
Download failed for: https://stutterrockstar.files.wordpress.com/2013/10/episode-109-with-nelly.mp3
Error: 404 Client Error: Not Found for url: https://stutterrockstar.files.wordpress.com/2013/10/episode-109-with-nelly.mp3
Download failed for: https://stutterrockstar.files.wordpress.com/2013/11/episode-111-with-lois-cookie.mp3
Error: 404 Client Error: Not Found for url: https://stutterrockstar.files.wordpress.com/2013/11/episode-111-with-lois-cookie.mp3
Download failed for: https://stutterrockstar.files.wordpress.com/2014/02/episode-113-with-sarah-o.mp3
Error: 404 Client Error: Not Found for url: https://stutterrockstar.files.wordpress.com/2014/02/episode-113-with-sarah-o.mp3
Download failed for: https://stutterrockstar.files.wordpress.com/2014/03/episode-114-with-courteny.mp3
Error: 404 Client Error: Not Found for url: https://stutterrockstar.files.wordpress.com/2014/03/episode-114-with-courteny.mp3
Download failed for: https://stutterrockstar.files.wordpress.com/2014/03/episode-115-with-cora.mp3
Error: 404 Client Error: Not Found for url: https://stutterrockstar.files.wordpress.com/2014/03/episode-115-with-cora.mp3
Download failed for: https://stutterrockstar.files.wordpress.com/2014/05/episode-117-with-jamila.mp3
Error: 404 Client Error: Not Found for url: https://stutterrockstar.files.wordpress.com/2014/05/episode-117-with-jamila.mp3
Download failed for: https://stutterrockstar.files.wordpress.com/2014/05/episode-118-with-natalie.mp3
Error: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
Download failed for: https://stutterrockstar.files.wordpress.com/2014/07/episode-122-with-yousra.mp3
Error: 404 Client Error: Not Found for url: https://stutterrockstar.files.wordpress.com/2014/07/episode-122-with-yousra.mp3
Download failed for: https://stutterrockstar.files.wordpress.com/2014/07/episode-123-with-carmen.mp3
Error: 404 Client Error: Not Found for url: https://stutterrockstar.files.wordpress.com/2014/07/episode-123-with-carmen.mp3
Download failed for: https://stutterrockstar.files.wordpress.com/2014/07/episode-124-with-natalie-b.mp3
Error: 404 Client Error: Not Found for url: https://stutterrockstar.files.wordpress.com/2014/07/episode-124-with-natalie-b.mp3
Download failed for: https://stutterrockstar.files.wordpress.com/2014/08/episode-125-with-satu.mp3
Error: 404 Client Error: Not Found for url: https://stutterrockstar.files.wordpress.com/2014/08/episode-125-with-satu.mp3
Download failed for: https://stutterrockstar.files.wordpress.com/2014/08/episode-126-w-christine-b.mp3
Error: 404 Client Error: Not Found for url: https://stutterrockstar.files.wordpress.com/2014/08/episode-126-w-christine-b.mp3
Download failed for: https://stutterrockstar.files.wordpress.com/2014/08/episode-127-with-annie-b.mp3
Error: 404 Client Error: Not Found for url: https://stutterrockstar.files.wordpress.com/2014/08/episode-127-with-annie-b.mp3
Download failed for: https://stutterrockstar.files.wordpress.com/2014/08/episode-128-with-farah.mp3
Error: 404 Client Error: Not Found for url: https://stutterrockstar.files.wordpress.com/2014/08/episode-128-with-farah.mp3
Download failed for: https://stutterrockstar.files.wordpress.com/2014/09/episode-129-with-lashanda.mp3
Error: 404 Client Error: Not Found for url: https://stutterrockstar.files.wordpress.com/2014/09/episode-129-with-lashanda.mp3
Download failed for: https://stutterrockstar.files.wordpress.com/2014/10/episode-130-with-debbie.mp3
Error: 404 Client Error: Not Found for url: https://stutterrockstar.files.wordpress.com/2014/10/episode-130-with-debbie.mp3
Download failed for: https://stutterrockstar.files.wordpress.com/2014/11/episode-131-with-vanna.mp3
Error: 404 Client Error: Not Found for url: https://stutterrockstar.files.wordpress.com/2014/11/episode-131-with-vanna.mp3
Download failed for: https://stutterrockstar.files.wordpress.com/2014/12/episode-132-with-emma.mp3
Error: 404 Client Error: Not Found for url: https://stutterrockstar.files.wordpress.com/2014/12/episode-132-with-emma.mp3
Download failed for: https://stutterrockstar.files.wordpress.com/2015/01/episode-133-with-shilpa.mp3
Error: 404 Client Error: Not Found for url: https://stutterrockstar.files.wordpress.com/2015/01/episode-133-with-shilpa.mp3
Download failed for: https://stutterrockstar.files.wordpress.com/2015/01/episode-134-with-margaret.mp3
Error: 404 Client Error: Not Found for url: https://stutterrockstar.files.wordpress.com/2015/01/episode-134-with-margaret.mp3
Download failed for: https://stutterrockstar.files.wordpress.com/2015/03/episode-136-with-dori.mp3
Error: 404 Client Error: Not Found for url: https://stutterrockstar.files.wordpress.com/2015/03/episode-136-with-dori.mp3
Download failed for: https://stutterrockstar.files.wordpress.com/2015/03/episode-137-with-autumn.mp3
Error: 404 Client Error: Not Found for url: https://stutterrockstar.files.wordpress.com/2015/03/episode-137-with-autumn.mp3
Download failed for: https://stutterrockstar.files.wordpress.com/2015/04/episode-138-with-mery.mp3
Error: 404 Client Error: Not Found for url: https://stutterrockstar.files.wordpress.com/2015/04/episode-138-with-mery.mp3
Download failed for: https://stutterrockstar.files.wordpress.com/2015/04/episode-139-with-heidi.mp3
Error: 404 Client Error: Not Found for url: https://stutterrockstar.files.wordpress.com/2015/04/episode-139-with-heidi.mp3
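A report like the one above can be condensed into a list of missing URLs, which may be handy when asking someone else for the missing files. The sketch below uses only the standard library; the function name `summarize_failures` and the two-way "404" / "other" grouping are illustrative assumptions, and the parser relies only on the "Download failed for:" and "Error:" markers visible in the report.

```python
import re
from collections import Counter

# Matches one "Download failed for: <url> Error: <message>" pair, whether the
# pairs are on separate lines or run together on a single line.
FAILURE_RE = re.compile(
    r"Download failed for:\s*(\S+)\s+Error:\s*(.*?)(?=\s*Download failed for:|\Z)",
    re.DOTALL,
)


def summarize_failures(log_text):
    """Return (list of (url, error) pairs, Counter of failure kinds)."""
    failures = [(url, err.strip()) for url, err in FAILURE_RE.findall(log_text)]
    kinds = Counter(
        "404" if err.startswith("404") else "other" for _, err in failures
    )
    return failures, kinds
```

Separating 404 responses from other errors matters because a dropped connection may succeed on retry, while a 404 usually means the file is gone from that address.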

alanshaoTT avatar Jan 26 '24 04:01 alanshaoTT

I see. I guess different machines work differently. I faced a lot of errors in the original code but this one is working very well for me. Do send me a direct message on any platform as I would like to hear what work you are planning to do. Maybe we can even work together. It is always nice to collaborate internationally as it showcases good teamwork skills on our resumes.

Vyaas99 avatar Jan 26 '24 17:01 Vyaas99

Hi! Sorry for the slow reply. Yes, over the years some of the files have been deleted from their sources. At some point I tried to find updated links but some episodes were seemingly deleted entirely from the web.

Your best bet is to try getting a copy from someone who has used the dataset recently. Sorry I'm not able to provide a better solution!

colincsl avatar Jan 26 '24 18:01 colincsl

Hi Colin,

Would you have any idea who to contact to get the extracted clips? Or is there any way to extract these clips without needing the StutteringIsCool and Strong Voices episodes?

Thanks

LocknLoad95 avatar Jan 28 '24 07:01 LocknLoad95

Of course! I will send an email to your address about what I plan to do with this dataset.

alanshaoTT avatar Jan 29 '24 04:01 alanshaoTT