conversational-datasets

How to run?

Open Saurhub69 opened this issue 3 years ago • 2 comments

I don't want to run this code on Google Cloud. I only need it up to the "Extract the data and split it into shards" step, but I don't know how to do that. Can someone explain how to run these commands?

This one:

PROJECT="your-google-cloud-project"

DATADIR="gs://${BUCKET?}/opensubtitles/$(date +"%Y%m%d")"

python opensubtitles/create_data.py \
  --output_dir ${DATADIR?} \
  --sentence_files gs://${BUCKET?}/opensubtitles/raw/lines/lines-* \
  --runner DataflowRunner \
  --temp_location ${DATADIR?}/temp \
  --staging_location ${DATADIR?}/staging \
  --project ${PROJECT?} \
  --dataset_format TF

Saurhub69, Dec 16 '21 04:12

Create a Google Cloud account, create a project, and enable BigQuery and Google Cloud Storage. These commands should be run in the Google Cloud console.

duongkstn, Dec 28 '21 02:12

Hello duongkstn, what I'm asking is how to run this whole project in VS Code on my local machine. It keeps failing: first with a TensorFlow error, then an Apache Beam error, and now an assertion error. Can you tell me how to run this project locally?

Saurhub69, Jan 15 '22 10:01
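
For anyone trying the same thing, here is a rough sketch of what a local run might look like. It is not a confirmed recipe from the repository. It assumes that the script forwards unrecognized flags to Apache Beam (so Beam's standard `--runner DirectRunner` option applies), that `--output_dir` and `--sentence_files` accept local paths as well as `gs://` URIs, and that the extracted, sharded sentence files already sit under a directory such as `~/opensubtitles/raw/lines/`. That directory layout is made up for illustration. The Dataflow-specific flags (`--project`, `--temp_location`, `--staging_location`) are left out.

```shell
# Hypothetical local layout; adjust to wherever the sharded files were extracted.
DATADIR="${HOME}/opensubtitles/$(date +"%Y%m%d")"
# Quoted so the shell does not expand the glob; Beam is assumed to expand it.
SENTENCE_FILES="${HOME}/opensubtitles/raw/lines/lines-*"

if [ -f opensubtitles/create_data.py ]; then
  # DirectRunner executes the Beam pipeline on the local machine
  # instead of submitting it to Google Cloud Dataflow.
  python opensubtitles/create_data.py \
    --output_dir "${DATADIR}" \
    --sentence_files "${SENTENCE_FILES}" \
    --runner DirectRunner \
    --dataset_format TF
else
  echo "Run this from the root of the conversational-datasets checkout." >&2
fi
```

If this still ends in a TensorFlow, Apache Beam, or assertion error, the installed package versions may not match the ones the repository pins, so posting the full traceback would help narrow it down.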