Nikhil Ranjan

Results 3 issues of Nikhil Ranjan

Hi, I am trying to use this model for inference, but the plain-text tagging comes out very weird. Input: "Hello, today is Monday but not Tuesday, maybe...

I am trying to convert the Redpajama-github dataset to streaming format but am getting the error below. To replicate: python llm-foundry/scripts/data_prep/convert_dataset_json.py \ --path github/split1 \ --out_root github/split1 --split train \ --concat_tokens...

bug

I think the PMC URLs have not been updated since 2016. Is it possible to get the latest bulk data in JSONL format?