fast-bert
Feature request: allow other delimiters and encodings when reading CSV files
Hi,
file: data_cls.py:
def get_train_examples( ... data_df = pd.read_csv(os.path.join(self.data_dir, filename) .. )
reads only CSV files with "," as the separator and UTF-8 encoding. Could you please make this configurable so that other delimiters (";" and "|" are commonly used in Europe) and other encodings can be used? I tried saving my CSV files with "," as the delimiter, but the code then crashes at a later point because my data cannot be parsed correctly.
I modified your function (hard-coded):
data_df = pd.read_csv(os.path.join(self.data_dir, filename), delimiter=';', encoding='utf-8')
and now the model is training :-)
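A configurable version could look roughly like the sketch below. The helper name `read_examples_csv` and its `delimiter` and `encoding` parameters are my suggestion only, not existing fast-bert API; the defaults are chosen so that current callers would see no change.

```python
import os

import pandas as pd


def read_examples_csv(data_dir, filename, delimiter=",", encoding="utf-8"):
    """Hypothetical helper: load a CSV with a caller-chosen delimiter and encoding."""
    path = os.path.join(data_dir, filename)
    # The defaults (comma, UTF-8) mirror what a plain pd.read_csv(path) does,
    # so existing comma-separated files keep working unchanged.
    return pd.read_csv(path, delimiter=delimiter, encoding=encoding)
```

Inside get_train_examples this would presumably be called as `read_examples_csv(self.data_dir, filename, delimiter=';')`, with the two values passed down from wherever the data settings are configured.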