nebula-importer
Importer reads multiple files at once when the file path contains a wildcard
@wey-gu
Using the config file below:
- When multiple CSV data files match the ./students/*.CSV path, Importer tries to read all of the files at once.
- Each CSV data file is 4 GB in size.
- Why not read one file at a time?
Thanks in advance.
version: v2
description: example
removeTempFiles: false
clientSettings:
  retry: 3
  concurrency: 1 # number of graph clients
  channelBufferSize: 1
  space: StudentCentral
  connection:
    user: root
    password: nebula
    address: rp-nebula-graphd-svc:9669
  postStart:
    commands: |
      DROP SPACE IF EXISTS StudentCentral;
      CREATE SPACE IF NOT EXISTS StudentCentral(partition_num=6, replica_factor=2, vid_type=FIXED_STRING(80));
      USE StudentCentral;
      CREATE TAG IF NOT EXISTS Student(sudentId string, hcs string, docInstance string);
      maritalStatusId int, raceIds string);
    afterPeriod: 8s
logPath: /csv_data/err/test.log
files:
  - path: ./students/*.CSV
    batchSize: 10000
    inOrder: false
    type: csv
    csv:
      withHeader: false
      withLabel: false
      delimiter: ","
    schema:
      type: vertex
      vertex:
        vid:
          type: string
          index: 0
        tags:
          - name: Patient
            props:
              - name: sudentId
                type: string
              - name: hcs
                type: string
              - name: docInstance
                type: string
Sorry for the late response, I had not managed to clear the notifications in my mailbox.
Yes, this should be done on demand: the importer should yield each matched file separately and process it on its own, instead of loading all of them into RAM in one go.