Sending Files with Non-latin-1 Filenames Fails with Encoding Error Regardless of EXTERNAL-FORMAT-OUT

Open Shinmera opened this issue 9 years ago • 2 comments

How to reproduce (the pathname does not have to exist on the system):

(drakma:http-request "http://example.com" :method :post :parameters '(("file" . #p"~/尻.txt")) :external-format-out :utf-8)

Expected behaviour: the file is sent with its filename encoded in UTF-8.

What happens instead:

Error:
#\U5C3B (code 23611) is not a LATIN-1 character.
   [Condition of type FLEXI-STREAMS:EXTERNAL-FORMAT-ENCODING-ERROR]

Restarts:
 0: [RETRY] Retry SLIME REPL evaluation request.
 1: [*ABORT] Return to SLIME's top level.
 2: [ABORT] Abort thread (#<THREAD "repl-thread" RUNNING {10053380B3}>)

Backtrace:
  0: (FLEXI-STREAMS::SIGNAL-ENCODING-ERROR #<FLEXI-STREAMS::FLEXI-LATIN-1-FORMAT (:ISO-8859-1 :EOL-STYLE :LF) {1003573FE3}> "~S (code ~A) is not a LATIN-1 character." #\U5C3B 23611)
  1: ((:METHOD FLEXI-STREAMS::WRITE-SEQUENCE* (FLEXI-STREAMS::FLEXI-LATIN-1-FORMAT T T T T)) #<unavailable argument> #<unavailable argument> #<unavailable argument> #<unavailable argument> #<unavailable ar..
  2: ((:METHOD TRIVIAL-GRAY-STREAMS:STREAM-WRITE-SEQUENCE (FLEXI-STREAMS:FLEXI-OUTPUT-STREAM T T T)) #<unavailable argument> #<unavailable argument> #<unavailable argument> #<unavailable argument>) [fast-m..
  3: ((SB-PCL::EMF TRIVIAL-GRAY-STREAMS:STREAM-WRITE-SEQUENCE) #<unused argument> #<unused argument> #<FLEXI-STREAMS:FLEXI-IO-STREAM {10087520B3}> "尻.txt" 0 5)
  4: (SB-IMPL::%WRITE-STRING "尻.txt" #<FLEXI-STREAMS:FLEXI-IO-STREAM {10087520B3}> 0 NIL)
  5: ((LABELS SB-IMPL::HANDLE-IT :IN SB-KERNEL:OUTPUT-OBJECT) #<FLEXI-STREAMS:FLEXI-IO-STREAM {10087520B3}>)
  6: (PRINC "尻.txt" #<FLEXI-STREAMS:FLEXI-IO-STREAM {10087520B3}>)
  7: ((LAMBDA (STREAM &OPTIONAL (#:FORMAT-ARG151 (ERROR (QUOTE SB-FORMAT:FORMAT-ERROR) :COMPLAINT "required argument missing" :CONTROL-STRING "; filename=\"~A\"" :OFFSET 13)) &REST SB-FORMAT::ARGS) :IN "/h..
  8: (FORMAT #<FLEXI-STREAMS:FLEXI-IO-STREAM {10087520B3}> #<FUNCTION (LAMBDA (STREAM &OPTIONAL (#:FORMAT-ARG151 #) &REST SB-FORMAT::ARGS) :IN "/home/linus/quicklisp/dists/quicklisp/software/drakma-1.3.10/..
  9: ((LAMBDA (STREAM) :IN DRAKMA::MAKE-FORM-DATA-FUNCTION) #<FLEXI-STREAMS:FLEXI-IO-STREAM {10087520B3}>)
 10: ((LAMBDA (STREAM) :IN DRAKMA::MAKE-FORM-DATA-FUNCTION) #<FLEXI-STREAMS:FLEXI-IO-STREAM {10087520B3}>) [external]
 11: ((LABELS DRAKMA::FINISH-REQUEST :IN DRAKMA:HTTP-REQUEST) #<CLOSURE (LAMBDA (STREAM) :IN DRAKMA::MAKE-FORM-DATA-FUNCTION) {100874CF1B}> NIL)
 12: (DRAKMA:HTTP-REQUEST #<PURI:URI http://example.com/> :METHOD :POST :PARAMETERS (("file" . #P"~/尻.txt")) :EXTERNAL-FORMAT-OUT :UTF-8)
 13: (SB-INT:SIMPLE-EVAL-IN-LEXENV (DRAKMA:HTTP-REQUEST "http://example.com" :METHOD :POST :PARAMETERS (QUOTE (#)) ...) #<NULL-LEXENV>)
 14: (EVAL (DRAKMA:HTTP-REQUEST "http://example.com" :METHOD :POST :PARAMETERS (QUOTE (#)) ...))
 --more--

It seems that the :external-format-out parameter is not taken into account for the filename header at all. Neither setting drakma:*drakma-default-external-format* nor setting an implementation-specific default encoding variable seems to have any effect.

Shinmera, Dec 08 '14 13:12

<fe[nl]ix> H4ns: you have to encode the UTF8 file name to octets, then encode
           octets to ASCII using URL-encoding, then split the header onto
           multiple lines if the name is too long
<fe[nl]ix> and to specify that the multiple parameters of content-disposition
           are to be concatenated, the parameters must have an index suffix
<fe[nl]ix> so instead of name="foo.jpg", you have name*0="start"\r\n
           name*1="end"
<fe[nl]ix> H4ns: and you can(must in practice?) also specify the encoding
<fe[nl]ix> see section 4.1 of https://tools.ietf.org/html/rfc2231

See also RFC 5987, RFC 2231 and RFC 2047.
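
To make the first two steps from the IRC log concrete (encode the name to UTF-8 octets, then percent-encode them to ASCII with the charset stated up front), here is a minimal sketch of the RFC 5987 extended-value form. The function names utf-8-octets, attr-char-p and rfc5987-encode are hypothetical and not part of Drakma, and the code assumes CHAR-CODE returns Unicode code points (true on SBCL, as used in the backtrace above). Splitting long values across numbered continuation parameters (RFC 2231) is left out, and whether a given server accepts this form inside multipart/form-data is not something this thread establishes.

```lisp
;; Sketch only: these helpers are hypothetical, not Drakma API.

(defun utf-8-octets (string)
  "Return the UTF-8 encoding of STRING as a list of octets.
Assumes CHAR-CODE yields Unicode code points."
  (let ((octets '()))
    (flet ((emit (octet) (push octet octets)))
      (loop for char across string
            for code = (char-code char)
            do (cond ((< code #x80)              ; 1-byte sequence (ASCII)
                      (emit code))
                     ((< code #x800)             ; 2-byte sequence
                      (emit (logior #xC0 (ash code -6)))
                      (emit (logior #x80 (logand code #x3F))))
                     ((< code #x10000)           ; 3-byte sequence
                      (emit (logior #xE0 (ash code -12)))
                      (emit (logior #x80 (logand (ash code -6) #x3F)))
                      (emit (logior #x80 (logand code #x3F))))
                     (t                          ; 4-byte sequence
                      (emit (logior #xF0 (ash code -18)))
                      (emit (logior #x80 (logand (ash code -12) #x3F)))
                      (emit (logior #x80 (logand (ash code -6) #x3F)))
                      (emit (logior #x80 (logand code #x3F)))))))
    (nreverse octets)))

(defun attr-char-p (octet)
  "True if OCTET may appear unescaped in an RFC 5987 ext-value (attr-char)."
  (and (< octet #x80)
       (let ((char (code-char octet)))
         (or (alphanumericp char)
             (find char "!#$&+-.^_`|~")))))

(defun rfc5987-encode (string)
  "Return STRING as an RFC 5987 ext-value: charset, empty language tag,
then the UTF-8 octets with everything outside attr-char percent-encoded."
  (with-output-to-string (out)
    (write-string "UTF-8''" out)
    (dolist (octet (utf-8-octets string))
      (if (attr-char-p octet)
          (write-char (code-char octet) out)
          (format out "%~2,'0X" octet)))))

;; The filename from the report:
;; (rfc5987-encode "尻.txt")  ; => "UTF-8''%E5%B0%BB.txt"
```

The result is pure ASCII, so it can be written to a LATIN-1 header stream without triggering the encoding error shown in the backtrace. It would be sent as filename*=UTF-8''%E5%B0%BB.txt, unquoted and with the trailing asterisk on the parameter name.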

hanshuebner, Dec 08 '14 14:12

Also, the file name parameter to Content-Disposition should be "filename", not "name". I suppose that many servers recognize the latter for robustness, but it's non-standard.

sionescu, Dec 08 '14 14:12