Samsul Rahmadani (munggok)

10 issues by Samsul Rahmadani (munggok)

Hi @NielsRogge, I'm trying to use an external OCR engine (PaddleOCR or Google Vision) for processing with LayoutLMv2. The [docs](https://huggingface.co/docs/transformers/model_doc/layoutlmv2) state that you need to normalize each word's bounding box with...

Hi, thank you for releasing this tool! Since CC dumps are very demanding in size and disk space, can we have an optional pipeline step like this: 1. immediately process (pipeline step) each file in the download command...

enhancement

Thank you for releasing the code. Since this implementation requires less memory than other implementations, adding VAD (voice activity detection) should be more suitable. Voice activity detection makes whisper more...

Hi, first of all, awesome app. AFAIK, right now only the "official" whisper.cpp models are supported in the app. It would be really great if we could specify/load any custom whisper.cpp converted [format](https://github.com/ggerganov/whisper.cpp/tree/master/models#fine-tuned-models) this...

It works fine if I use GPT-J. I guess this is because of the tokenizer and this: https://github.com/databrickslabs/dolly/blob/03bf3852daa42e6091a39483dda0714c02de7382/training/trainer.py#L52 Any tips for adjusting it so it can use a model other than GPT-J?...

First of all, thanks for releasing the code. I have a dataset (mC4) of about 110 GB. My machine has a 96-core CPU and 350 GB of RAM. I've successfully created 524GB...

Congrats on releasing Parler-TTS v2, @sanchit-gandhi, @ylacombe, and anyone else involved. I am trying to reproduce the multilingual training and have some questions about training it multilingually: 1....

Thanks for creating the Emilia pipeline. I am using it now for my language, and so far so good. Is there a way to speed it up? Right now, 1 hour audio...

Hey @adelacvg, thanks for sharing the code. After reading it, I want to ask you a few questions about the new 24k model, if you don't mind: 1. what makes different...

Hello, I notice that in the scripts you provide pretraining for the speech encoder and video (CMIIW), but I can't seem to find the code for training the TTS (vita/model/vita_tts). AFAIK, VITA 1.5...