http_server
http_server uploaded file handling
Originally opened as dart-lang/sdk#14303
This issue was originally filed by [email protected]
The current http_body implementation decodes or parses HttpBodyFileUpload.content based on the Content-Type header of the part.
However, file server applications need the raw uploaded file data, because they save received files to their file system without modification. I think we need a switch that disables decoding and parsing for all content types and simply returns a List<int> as HttpBodyFileUpload.content.
Another solution would be to simply return List<int> for all uploaded files. I am not sure how much this would affect other kinds of applications. However, when a .json file is uploaded from an HTML form, IE sends it as “Content-Type: text/plain” and Chrome, Firefox and Safari send it as “Content-Type: application/octet-stream” (I don’t know why). Either way, .json files uploaded through an HTML form are never parsed as JSON.
Regarding filenames with multibyte characters: although Windows uses UTF-8, I think it might be safe to keep LATIN1 decoding. I am not familiar with other file systems. On Windows, we can recover the name with UTF8.decode(LATIN1.encode(part.filename)), assuming that LATIN1.decode(bytes) simply produces a String in which each character has the same value as the corresponding byte of bytes.
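The round trip described above can be sketched as follows. This is a minimal illustration, not code from http_server: the helper name `recoverUtf8Filename` is hypothetical, and it uses the lowercase `utf8` / `latin1` constants from current `dart:convert` (the `UTF8` / `LATIN1` spellings in this thread are the older names). It relies on the stated assumption that Latin-1 decoding maps each byte to the character with the same value.

```dart
import 'dart:convert';

/// Hypothetical helper: recovers a filename whose UTF-8 bytes were
/// decoded as Latin-1 by the multipart parser.
String recoverUtf8Filename(String latin1Decoded) {
  // latin1.encode turns each character (0..255) back into the same byte value.
  final bytes = latin1.encode(latin1Decoded);
  // Re-interpret those bytes as UTF-8; throws FormatException if they are not.
  return utf8.decode(bytes);
}

void main() {
  const original = '日本語.txt';
  // Simulate what the server sees: UTF-8 bytes on the wire, decoded as Latin-1.
  final asParsed = latin1.decode(utf8.encode(original));
  print(recoverUtf8Filename(asParsed) == original);
}
```

The recovery only works when the browser really sent UTF-8; for any other encoding `utf8.decode` throws a `FormatException`.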
Comment by madsager
cc @skabet. Added Area-IO, Triaged labels.
Comment by skabet
Set owner to @skabet. Removed Area-IO label. Added Area-Pkg, Library-HttpServer, Accepted labels.
Comment by skabet
Hi Terry,
You are right, in most cases with file uploads, it's the raw binary one wants to access. What if we do the following:
- Always provide the raw List<int> data.
- Add a method to the FileUpload class: 'parsedData()', that will try to parse/decode the data depending on the mime type. We can even throw in an optional 'mimeType' argument for it, so one can override the default mime type, e.g. parse as 'text/utf-8' instead of 'application/json'.
Regarding the filename, I think we should do a test and see what the different browsers upload. If we can hit a 90% success rate with some default encoding, that could be the way to go.
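The two points above could look roughly like this. It is only a sketch of the idea: the class layout, the field names and the set of handled mime types are assumptions for illustration, and text is assumed to be UTF-8. It is not the http_server API.

```dart
import 'dart:convert';

/// Hypothetical sketch of the proposed shape; not the http_server API.
class FileUpload {
  final String filename;
  final String mimeType; // assumed to come from the part's Content-Type header
  final List<int> data; // always the raw, unmodified bytes

  FileUpload(this.filename, this.mimeType, this.data);

  /// Decodes [data] on request. The optional [mimeType] overrides the
  /// stored one. Text is assumed to be UTF-8 in this sketch.
  dynamic parsedData({String? mimeType}) {
    final type = mimeType ?? this.mimeType;
    if (type == 'application/json') return jsonDecode(utf8.decode(data));
    if (type.startsWith('text/')) return utf8.decode(data);
    return data; // unknown types stay as raw bytes
  }
}

void main() {
  // A .json file that a browser labelled as application/octet-stream.
  final upload = FileUpload(
      'a.json', 'application/octet-stream', utf8.encode('{"a": 1}'));
  print(upload.parsedData() is List<int>); // default type: bytes untouched
  print(upload.parsedData(mimeType: 'application/json')); // caller overrides
}
```

A file server would only ever read `data`, while an application that trusts the upload can opt in to parsing and correct a wrong Content-Type at the call site.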
Comment by skabet
I just tried with both Chrome and IE on Windows, and I get the following:
With <meta charset="UTF-8" />:
- Chrome: as utf8
- IE: as utf8
Without <meta charset="UTF-8" />:
- Chrome: multi-bytes replaced with ?
- IE: as utf8
I think it's fine to use utf8-decoding for filenames.
This comment was originally written by [email protected]
I confirmed it on my Windows Vista machine using the following HTML:
001 <!DOCTYPE html>
002 <html>
003 <head>
004 <title>file_upload_test</title>
005 <meta http-equiv="content-type" content="text/html; charset=UTF-8">
006 </head>
007 <body>
008 <form action="http://localhost:8080/DumpHttpMultipart"
009 enctype="multipart/form-data"
010 accept-charset="UTF-8"
011 method="POST"> <br>
012 What is your name? <input type="text" name="submitter"> <br>
013 What files are you sending? <input type="file" name="content"> <br>
014 <input type="submit" value="Send File">
015 </form>
016 </body>
017 </html>
If line 005 or 010 is present, Chrome, Firefox and Safari send filenames with multi-byte characters as UTF-8. Otherwise, such filenames are transmitted as Shift_JIS characters (one of the most popular Japanese character encodings). Regardless of whether line 005 or 010 is present, IE sends them as UTF-8.
I agree with using UTF-8 decoding for filenames (the current implementation uses ISO-8859-1 decoding). It is common to include line 005 in such applications.
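A filename decoder following this conclusion might look like the sketch below. The function name `decodeFilename` is hypothetical, and the fallback to Latin-1 for bytes that are not valid UTF-8 (for example a Shift_JIS name from a page without a charset declaration) is an assumption added here, not something decided in this thread.

```dart
import 'dart:convert';

/// Hypothetical helper: decodes the raw filename bytes of a multipart part.
String decodeFilename(List<int> rawBytes) {
  try {
    // Preferred: browsers send UTF-8 when the page declares it.
    return utf8.decode(rawBytes);
  } on FormatException {
    // Assumed fallback: keep a byte-for-byte Latin-1 string so the
    // caller can still re-encode it and try another charset.
    return latin1.decode(rawBytes);
  }
}

void main() {
  print(decodeFilename(utf8.encode('日本語.txt')));
  // 0x93 0xFA is not valid UTF-8, so the Latin-1 fallback is used.
  print(decodeFilename([0x93, 0xFA]).length);
}
```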
Comment by skabet
Hi
What do you think about the following API?
/**
 * A HTTP content body produced by [HttpBodyHandler] for either [HttpRequest]
 * or [HttpClientResponse].
 */
abstract class HttpBody {
  /** The actual data of the request. */
  List<int> get data;

  /**
   * Convert the data using mimeType.
   *
   * If mimeType is left unspecified, the Content-Type header will be used.
   */
  dynamic asMimeType({String mimeType});

  /**
   * Parse the [data] as text.
   *
   * If the headers contains a charset hint, that charset will be used.
   */
  String asText();

  /**
   * Parse the [data] as JSON.
   */
  dynamic asJSON();

  /**
   * Parse the data as either multipart/form-data or
   * application/x-www-form-urlencoded.
   *
   * The Content-Type header will be used to identify the parsing.
   */
  Map asFormPost();
}

/**
 * The [HttpBody] of a [HttpClientResponse] will be of type
 * [HttpClientResponseBody].
 */
abstract class HttpClientResponseBody implements HttpBody, HttpClientResponse { }

/**
 * The [HttpBody] of a [HttpRequest] will be of type [HttpRequestBody].
 */
abstract class HttpRequestBody implements HttpBody, HttpRequest { }

/**
 * A [HttpBodyFileUpload] object wraps a file upload, presenting a way for
 * extracting filename, contentType and the data of the uploaded file.
 */
abstract class HttpBodyFileUpload {
  /** The filename of the uploaded file. */
  String get filename;

  /**
   * The [ContentType] of the uploaded file.
   */
  ContentType get contentType;

  /**
   * The content of the file.
   */
  List<int> get content;
}
cc @sethladd.
Comment by sethladd
Thanks! I like how HttpRequestBody implements HttpRequest now. Also, I like how I can control how I get the body (json, text, etc) because sometimes a content-type is not set on the request.
This comment was originally written by [email protected]
I think this will give us more flexible POST body data handling.
Comment by anders-sandholm
Removed Library-HttpServer label. Added Pkg-HttpServer label.