h2database icon indicating copy to clipboard operation
h2database copied to clipboard

csvread fails on Windows with file: protocols

Open ashleymercer opened this issue 7 years ago • 5 comments

The following call fails on Windows:

select * from csvread('file:/D:/Users/AshleyM/test.csv')

with an exception:

liquibase.exception.DatabaseException: IO Exception: "IOException reading file:/C:/Users/AshleyM/test.csv"; SQL statement:
select * from csvread('file:/D:/Users/AshleyM/test.csv') [90028-196] [Failed SQL: select * from csvread('file:/D:/Users/AshleyM/test.csv')]
                ...
Caused by: org.h2.jdbc.JdbcSQLException: IO Exception: "IOException reading file:/D:/Users/AshleyM/test.csv"; SQL statement:
select * from csvread('file:/D:/Users/AshleyM/test.csv') [90028-196]
                at org.h2.message.DbException.getJdbcSQLException(DbException.java:345)
                at org.h2.message.DbException.get(DbException.java:168)
                at org.h2.tools.Csv.convertException(Csv.java:597)
                at org.h2.tools.Csv.read(Csv.java:196)
                ...
Caused by: java.net.MalformedURLException: no protocol: /D:/Users/AshleyM/test.csv
                at java.net.URL.<init>(URL.java:593)
                at java.net.URL.<init>(URL.java:490)
                at java.net.URL.<init>(URL.java:439)
                at org.h2.store.fs.FilePathDisk.newInputStream(FilePathDisk.java:317)
                at org.h2.store.fs.FileUtils.newInputStream(FileUtils.java:218)
                at org.h2.tools.Csv.initRead(Csv.java:315)
                at org.h2.tools.Csv.readResultSet(Csv.java:218)
                at org.h2.tools.Csv.read(Csv.java:194)

A similar call on Linux with a file URL (e.g. file:/var/tmp/test.csv) works as expected.

ashleymercer avatar Jun 26 '18 13:06 ashleymercer

http://h2database.com/html/functions.html#csvread

wrong URL format

grandinj avatar Jun 26 '18 13:06 grandinj

Please can you explain how I can fix it? In my case, the URLs are returned from java.lang.ClassLoader.getResources() so I would have assumed they were valid?

(I'm aware this might be classified as a support question, but the documentation only says any URL may be used, and gives some examples - if the URL must be in one of a limited subset of the URL spec, I can submit a PR to update the documentation?)

ashleymercer avatar Jun 26 '18 14:06 ashleymercer

file:///c:/temp/example.csv

grandinj avatar Jun 28 '18 19:06 grandinj

Ahh I had the same thought at first: I would normally expect to see an additional // after the protocol (certainly the Wikipedia examples look like that!) except:

  1. these URLs are returned by Java's own ClassLoader.getResources() method, so I would hope they're kosher

  2. even with three slashes it doesn't seem to work: I get the same exception no protocol: ///D:/Users/AshleyM/test.csv

I did a little bit of debugging and I think I've figured out what's going on:

  1. Csv.initRead ultimately ends up calling FilePath.get("file:///D:/Users/AshleyM/test.csv")

  2. this correctly finds FilePathDisk for the file: protocol

  3. it then calls FilePathDisk.getPath(...) which calls into FilePathDisk.translateFileName which strips the leading "file:" part of the path, if it exists

However, I think [3] is where the issue arises: it only strips the exact String "file:", and doesn't take account of any slashes that might follow. I'm happy to try and fix this, but I'd be wary about submitting a patch without your input: we can't, for example, simply strip all leading slashes after the protocol, since that would then break UNIX-style paths. I'm also not sure about the interaction between this and later methods that get called (for example, FilePathDisk.newInputStream then also checks the start of the name to see if it's a classpath resource, so I wouldn't want to mangle the URL earlier and break this).

Any thoughts?

ashleymercer avatar Jun 29 '18 12:06 ashleymercer

Good work!

FilePath and friends is not expecting to handle anything approaching the file URL syntax, it only expects to handle the limited subset we can specify in the H2 database URL, so any fix would need to live above this, probably in the CsvRead code.

Feel free to have a bash at this and submit a PR

grandinj avatar Jun 29 '18 12:06 grandinj