Feature request: allow other delimiters and encodings when reading CSV files

Open woiza opened this issue 4 years ago • 0 comments

Hi,

file: data_cls.py:

def get_train_examples( ... data_df = pd.read_csv(os.path.join(self.data_dir, filename) .. )

reads only CSV files that use "," as the separator and UTF-8 as the encoding. Could you please make this configurable, so that other delimiters (";" and "|" are commonly used in Europe) and other encodings are supported? I tried saving my CSV files with "," as the delimiter instead, but the code then crashes at a later point because my data cannot be parsed correctly.

I modified your function (hard-coded): data_df = pd.read_csv(os.path.join(self.data_dir, filename), delimiter=';', encoding='utf-8') and now the model is training :-)
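To illustrate the request, here is a minimal sketch of what a configurable version might look like. The helper name `read_examples_csv` and its `delimiter` and `encoding` parameters are hypothetical and not part of fast-bert; only `pd.read_csv` and its `delimiter` and `encoding` arguments are existing pandas API. The defaults keep the current behaviour ("," and UTF-8).

```python
import os
import tempfile

import pandas as pd


def read_examples_csv(data_dir, filename, delimiter=",", encoding="utf-8"):
    """Hypothetical helper (not fast-bert API): read a CSV file from data_dir
    with a caller-supplied delimiter and encoding instead of fixed values."""
    path = os.path.join(data_dir, filename)
    return pd.read_csv(path, delimiter=delimiter, encoding=encoding)


# Usage: write a small semicolon-delimited, Latin-1 encoded file and read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    with open(os.path.join(tmp_dir, "train.csv"), "w", encoding="latin-1") as f:
        f.write("text;label\nGrüße aus Köln;positive\n")
    demo_df = read_examples_csv(tmp_dir, "train.csv", delimiter=";", encoding="latin-1")
```

In data_cls.py the same two arguments could presumably be passed down from the caller to the existing pd.read_csv call, so that nothing has to be hard-coded.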

woiza, Jul 30 '20 13:07