Knut(Ke) Chen comments

Results: 20 comments of Knut(Ke) Chen

Has this framework's output been compared with other features?

Hi, not really. HTS-AT itself is our proposed audio transformer; in this paper, we only use it for audio classification and SED tasks. But we use this HTS-AT architecture...

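Spectrogram encoding

Hello, the input spectrogram size is 256 x 256. In fact, the spectrogram does not need to be converted to the model's input size: the same patching can also be done on the original spectrogram size of 1024 * 64. However, because we wanted to use the Swin Transformer pretrained model to improve performance, we did a rearrange.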
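As a rough illustration of the rearrangement mentioned above, the sketch below folds a 1024 x 64 spectrogram (time x frequency) into a 256 x 256 grid by cutting the time axis into four 256-frame chunks and placing them side by side along the frequency axis. This is an assumption about one possible fold, not necessarily the exact layout used in HTS-AT; `fold_time_into_freq` is a hypothetical helper name, and plain Python lists stand in for tensors.

```python
def fold_time_into_freq(spec, target_t):
    """Fold a time-major spectrogram (T rows of F bins) into target_t rows.

    Hypothetical helper: the T frames are split into T // target_t
    consecutive chunks, and the chunks are concatenated along the
    frequency axis, giving target_t rows of (T // target_t) * F values.
    """
    total_t = len(spec)
    if total_t % target_t != 0:
        raise ValueError("target_t must divide the number of time frames")
    n_chunks = total_t // target_t
    folded = []
    for t in range(target_t):
        row = []
        for c in range(n_chunks):
            # Frame t of chunk c contributes one block of F frequency bins.
            row.extend(spec[c * target_t + t])
        folded.append(row)
    return folded


# Toy spectrogram whose cells record their own (time, freq) position.
spec = [[(t, f) for f in range(64)] for t in range(1024)]
img = fold_time_into_freq(spec, 256)
print(len(img), len(img[0]))  # rows and columns of the folded grid
```

No values are dropped or resampled in this fold; only the layout changes, so the result has the square shape an image-pretrained backbone expects.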

the size of the input spectrum

Already answered in another issue.

Audioset dataset for pretraining

Hi, our pretrained checkpoint is released; please check our README ([released link](https://drive.google.com/drive/folders/1f5VYMk0uos_YnuBshgmaTVioXbs7Kmz6)). For AudioSet, you can refer to [this repo](https://github.com/qiuqiangkong/audioset_tagging_cnn); we use their stored AudioSet (please check the referred repo's...

cannot pickle 'module' object

Hi, sorry for the late reply. You can refer to [this](https://github.com/RetroCirce/HTS-Audio-Transformer/issues/21) issue. Basically, the cause is the environment and some hyperparameter changes. My environment (when I did this project)...

How can I train using my own dataset?

Hi, sorry for the late reply. You need to revise or refer to the data_processor.py file to change the dataset loader and dataset classes. I use the "LGSPDataset" module to...

type of GPU

Training on 8 V100 GPUs for 1-2 days reaches the reported performance. With a single GPU, you can probably reach it as well with 5-7 days of training.

add audio spectrogram transformer, and full audio clip

Hi @lucidrains, we have briefly scanned your code and it looks great to us. After you finish the code, just let us know. We will mainly go over the spec-augment...

Add velocity in encoding

Hi, with this project's encoding method, it is not easy to add velocity. But some follow-up works using the transformer architecture and advanced representations of music (in 2021 and 2022)...

Hey, Chen

Hi, I used dataset_idx/dataloader_idx because I evaluate multiple test/validation sets when training the model. I.e., after training for 1 epoch, I test each validation set (namely idx...
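To illustrate the idea of indexing several validation sets, here is a minimal framework-free sketch: after each training epoch, every validation set is evaluated in turn and its score is stored under its index. The names `evaluate_all`, `model_fn`, and `val_sets` are hypothetical and are not taken from the project's code; a real training framework would pass the index into its own validation hook.

```python
def evaluate_all(model_fn, val_sets):
    """Evaluate model_fn on each validation set, keyed by its index.

    val_sets is a list of validation sets; each set is a list of
    (inputs, labels) batches. Returns {dataloader_idx: accuracy}.
    """
    results = {}
    for dataloader_idx, batches in enumerate(val_sets):
        correct = 0
        total = 0
        for inputs, labels in batches:
            preds = model_fn(inputs)
            correct += sum(p == y for p, y in zip(preds, labels))
            total += len(labels)
        results[dataloader_idx] = correct / total
    return results


# Toy "model" that predicts the parity of each input.
def parity_model(inputs):
    return [x % 2 for x in inputs]


val_sets = [
    [([1, 2, 3], [1, 0, 1]), ([4], [0])],  # validation set 0
    [([1, 2], [0, 0])],                    # validation set 1
]
results = evaluate_all(parity_model, val_sets)
print(results)
```

Keeping the per-set scores separate, rather than averaging them, makes it possible to see which validation set a regression comes from.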