pgfutter
index out of range error
pgfutter crashes when trying to import a large CSV file:
pgfutter --port 1337 csv train_transaction.csv
394 columns
[TransactionID isFraud TransactionDT TransactionAmt ProductCD card1 card2 card3 card4 card5 card6 addr1 addr2 dist1 dist2 P_emaildomain R_emaildomain C1 C2 C3 C4 C5 C6 C7 C8 C9 C10 C11 C12 C13 C14 D1 D2 D3 D4 D5 D6 D7 D8 D9 D10 D11 D12 D13 D14 D15 M1 M2 M3 M4 M5 M6 M7 M8 M9 V1 V2 V3 V4 V5 V6 V7 V8 V9 V10 V11 V12 V13 V14 V15 V16 V17 V18 V19 V20 V21 V22 V23 V24 V25 V26 V27 V28 V29 V30 V31 V32 V33 V34 V35 V36 V37 V38 V39 V40 V41 V42 V43 V44 V45 V46 V47 V48 V49 V50 V51 V52 V53 V54 V55 V56 V57 V58 V59 V60 V61 V62 V63 V64 V65 V66 V67 V68 V69 V70 V71 V72 V73 V74 V75 V76 V77 V78 V79 V80 V81 V82 V83 V84 V85 V86 V87 V88 V89 V90 V91 V92 V93 V94 V95 V96 V97 V98 V99 V100 V101 V102 V103 V104 V105 V106 V107 V108 V109 V110 V111 V112 V113 V114 V115 V116 V117 V118 V119 V120 V121 V122 V123 V124 V125 V126 V127 V128 V129 V130 V131 V132 V133 V134 V135 V136 V137 V138 V139 V140 V141 V142 V143 V144 V145 V146 V147 V148 V149 V150 V151 V152 V153 V154 V155 V156 V157 V158 V159 V160 V161 V162 V163 V164 V165 V166 V167 V168 V169 V170 V171 V172 V173 V174 V175 V176 V177 V178 V179 V180 V181 V182 V183 V184 V185 V186 V187 V188 V189 V190 V191 V192 V193 V194 V195 V196 V197 V198 V199 V200 V201 V202 V203 V204 V205 V206 V207 V208 V209 V210 V211 V212 V213 V214 V215 V216 V217 V218 V219 V220 V221 V222 V223 V224 V225 V226 V227 V228 V229 V230 V231 V232 V233 V234 V235 V236 V237 V238 V239 V240 V241 V242 V243 V244 V245 V246 V247 V248 V249 V250 V251 V252 V253 V254 V255 V256 V257 V258 V259 V260 V261 V262 V263 V264 V265 V266 V267 V268 V269 V270 V271 V272 V273 V274 V275 V276 V277 V278 V279 V280 V281 V282 V283 V284 V285 V286 V287 V288 V289 V290 V291 V292 V293 V294 V295 V296 V297 V298 V299 V300 V301 V302 V303 V304 V305 V306 V307 V308 V309 V310 V311 V312 V313 V314 V315 V316 V317 V318 V319 V320 V321 V322 V323 V324 V325 V326 V327 V328 V329 V330 V331 V332 V333 V334 V335 V336 V337 V338 V339]
4.00 KiB / 651.69 MiB [>---------------------------------------------------------------------------------------------------------] 0.00%
panic: runtime error: index out of range
goroutine 1 [running]:
main.copyCSVRows(0xc42023b980, 0xc420090ac0, 0x0, 0x73fa7a, 0x1, 0xc4201fa000, 0x18a, 0x200, 0xc42023b980, 0x0, ...)
/usr/src/pgfutter/csv.go:99 +0x747
main.importCSV(0x7ffe96d02f0c, 0x15, 0xc420174000, 0x64, 0x741dde, 0x6, 0xc420088580, 0x11, 0x0, 0x0, ...)
/usr/src/pgfutter/csv.go:172 +0x3f6
main.main.func2(0xc4200e2420, 0xc4200a5300, 0xc4200e2420)
/usr/src/pgfutter/pgfutter.go:164 +0x38f
github.com/codegangsta/cli.HandleAction(0x6e19c0, 0x7586c8, 0xc4200e2420, 0x0, 0xc4200a07e0)
/gopath/src/github.com/codegangsta/cli/app.go:501 +0xd2
github.com/codegangsta/cli.Command.Run(0x73fda9, 0x3, 0x0, 0x0, 0x0, 0x0, 0x0, 0x74b480, 0x18, 0x0, ...)
/gopath/src/github.com/codegangsta/cli/command.go:165 +0x4bb
github.com/codegangsta/cli.(*App).Run(0xc420142000, 0xc42008c0f0, 0x5, 0x5, 0x0, 0x0)
/gopath/src/github.com/codegangsta/cli/app.go:259 +0x740
main.main()
/usr/src/pgfutter/pgfutter.go:170 +0xcce
The file can be found here: https://www.kaggle.com/c/ieee-fraud-detection
What version of pgfutter? I ask because I had a similar panic on v1.2, but when I built from source it ran fine. There have been a few fixes since v1.2 that may have addressed your issue.
Change the code at https://github.com/lukasmartinelli/pgfutter/blob/master/csv.go#L100 from
for i, col := range record {
	cols[i] = strings.Replace(col, "\x00", "", -1)
	// bytes.Trim(b, "\x00")
	// cols[i] = col
}
to
for idx, col := range record {
	cols[idx] = strings.Replace(col, "\x00", "", -1)
	// bytes.Trim(b, "\x00")
	// cols[i] = col
}
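Renaming the loop variable leaves the behaviour of the loop unchanged, so a length check may be closer to what is needed. The following is a minimal sketch, assuming the panic is raised when a record has more fields than the destination slice has slots. `sanitizeRecord` is a hypothetical helper written for illustration; it is not part of pgfutter.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeRecord is a hypothetical helper, not pgfutter's actual code.
// It copies a parsed CSV record into a slice of numCols values, stripping
// NUL bytes, and returns an error instead of panicking when the record's
// field count does not match the header's.
func sanitizeRecord(record []string, numCols int) ([]interface{}, error) {
	if len(record) != numCols {
		return nil, fmt.Errorf("expected %d fields, got %d", numCols, len(record))
	}
	cols := make([]interface{}, numCols)
	for i, col := range record {
		cols[i] = strings.Replace(col, "\x00", "", -1)
	}
	return cols, nil
}

func main() {
	// Illustrative data only: the second row has one field too many.
	header := []string{"a", "b", "c"}
	rows := [][]string{
		{"1", "2", "3"},
		{"4", "5", "6", "7"},
	}
	for n, row := range rows {
		if _, err := sanitizeRecord(row, len(header)); err != nil {
			fmt.Printf("row %d rejected: %v\n", n+1, err)
		}
	}
}
```

With a check like this, a malformed row produces an error that names the mismatch rather than an index out of range panic.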
I got the same error with a 4MB file, version v1.2.
The problem was that the CSV file was malformed: a few rows had extra columns. I discovered this by validating the CSV file at https://csvlint.io/; the output was very helpful.
Are you guys still looking for a fix? I can help!
Yes, still seeing this, thank you!
On Linux, for some reason, some files need Windows-style line breaks (CRLF) for the import to work.
My file had no issues on csvlint except the line-break warning.
Converting the file with the unix2dos command first made it work for me.
Can confirm that running my CSV that was having the index out of range issue through unix2dos resolved this for macOS.
Thank you @waynegraham
Can also confirm that running unix2dos on 18GB of tab-delimited files fixed this issue on Linux 👍
This issue isn't 100% consistent. Some files worked as-is (with Unix line breaks), some did not. After running unix2dos on all, they all worked.
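Where unix2dos is not installed, the line-ending conversion described above can be approximated in a few lines of Go. This is a sketch of the same idea, not a reimplementation of unix2dos, and `toCRLF` is a hypothetical name. It normalizes any existing CRLF pairs first so they are not doubled.

```go
package main

import (
	"bytes"
	"fmt"
)

// toCRLF converts Unix (LF) line endings to Windows (CRLF) line endings.
// Existing CRLF pairs are collapsed to LF first so they are not doubled.
// Lone carriage returns are left untouched.
func toCRLF(data []byte) []byte {
	normalized := bytes.ReplaceAll(data, []byte("\r\n"), []byte("\n"))
	return bytes.ReplaceAll(normalized, []byte("\n"), []byte("\r\n"))
}

func main() {
	// Illustrative input mixing LF and CRLF endings.
	in := []byte("a,b\n1,2\r\n3,4\n")
	fmt.Printf("%q\n", toCRLF(in))
}
```

This version holds the whole input in memory, so it suits small files only. For multi-gigabyte inputs like the ones mentioned in this thread, a streaming approach or unix2dos itself is the better choice.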