Enrico Shippole
**Describe the issue**: I am having an error when trying to run NNI with a tensor dataset and parquet file. Is there a way to serialize this easily and properly?...
## Describe the bug Following the example in CodeParrot, I receive an array size limitation error when deduplicating larger datasets. ## Steps to reproduce the bug ```python dataset_name = "the_pile"...
Hi @marcvanzee , There seemed to be another issue with the number of commits in the original pull request after resolving the initial branch conflicts. When following the troubleshooting section...
Hello @lvwerra, I have a few questions about the iterable dataset class in the Code Parrot blog post. 1. How is `num_of_sequences` chosen? 2. If I am using a `seq_length`...
Hi all, Per the request of @ver217 from this [discussion](https://github.com/hpcaitech/ColossalAI/discussions/1474), I am opening an issue with the same name, comments, and question. Recent discoveries from GLM-130B and researchers at Tsinghua...
Hi Phil, I recently spoke to Aohan Zeng of Tsinghua. Aohan was kind enough to provide me with detailed information regarding the architecture of GLM for a basic PyTorch implementation....
Hi Phil, Firstly, thank you for the amazing work yet again! I was wondering if you had done any benchmarking with mid-tier GPUs. I ran the benchmarks on my local...
I believe I am currently having an issue when training both from scratch and from the pre-trained tacotron2 model. I have collected 14 to 17 hours of pre-processed wav files of...
Hi all, Firstly, thank you for the awesome work. I was wondering if there were any plans to release the full dataset? I believe this would be immensely beneficial for...
Hi @hwchase17, Per our previous discussion, I am updating the Google and Bing Search API utilities to include a method for returning metadata in the form `[{'snippet':'hello world', 'title': 'foo',...