clj-apache-http

Escaped spaces in URL break call

Open reuableahcim opened this issue 14 years ago • 2 comments

If I make a call with escaped spaces in the URL, it breaks when parsing into a full URI. For example, if I make a call like this:

(http/get "http://www.someurl.com/page/A%20Page/123" :query {:some_param "value"} :as :string)

I get the following error:

java.net.URISyntaxException: Illegal character in path at index 52: http://www.someurl.com/page/A Page/123?some_param=value

It looks like whatever is merging the querystring into the base URL is decoding the space when it attaches the bit after the ? and this fails the java.net.URI check.

When I pass in a full URL string, with the querystring already attached, everything is fine, e.g.:

(http/get "http://www.someurl.com/page/A%20Page/123?some_param=value" :as :string)

Here's the stack trace, starting from the point I make my call to http/get:

at java.net.URI$Parser.fail(URI.java:2809)
at java.net.URI$Parser.checkChars(URI.java:2982)
at java.net.URI$Parser.parseHierarchical(URI.java:3066)
at java.net.URI$Parser.parse(URI.java:3014)
at java.net.URI.<init>(URI.java:578)
at org.apache.http.client.utils.URIUtils.createURI(URIUtils.java:106)
at com.twinql.clojure.http$resolve_uri.invoke(http.clj:415)
at com.twinql.clojure.http$get.doInvoke(http.clj:495)
at clojure.lang.RestFn.invoke(RestFn.java:486)
at singularity.request$make_request.invoke(request.clj:37)

reuableahcim Sep 14 '11 19:09

The problem stems from lines 424-436 in http.clj. The URI gets parsed then reassembled with the query string. Unfortunately, java.net.URI is helpful enough that it decodes the path for you when you call .getPath on it:

user=> (def uri (java.net.URI. "http://localhost:4000/p/abc/d/Just%20For%20Fun/m"))
#'user/uri
user=> uri
#<URI http://localhost:4000/p/abc/d/Just%20For%20Fun/m>
user=> (.getPath uri)
"/p/abc/d/Just For Fun/m"

So when you call (.getPath u) on line 431, you assemble a decoded string back into the URI, which causes java.net.URI to barf. I'm pretty sure the problem will occur with UTF-8 paths too, since obviously java.net.URI isn't automatically encoding the path.
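
The failure can be reproduced without the library at all. The following is a minimal sketch, not the actual code from http.clj: rebuild-with-decoded-path is a hypothetical helper that glues the parts back together with plain string concatenation, standing in for whatever the library does on those lines.

```clojure
(import '(java.net URI URISyntaxException))

;; Hypothetical helper (an assumption, not the library's code): rebuilds a
;; URL string from a parsed URI using the DECODED path, then appends a query.
(defn rebuild-with-decoded-path
  [^String url ^String query]
  (let [u (URI. url)]
    (str (.getScheme u) "://" (.getAuthority u) (.getPath u) "?" query)))

;; The %20 in the input has been decoded, so the rebuilt string now
;; contains a literal space in its path.
(def rebuilt
  (rebuild-with-decoded-path
    "http://www.someurl.com/page/A%20Page/123" "some_param=value"))

;; Parsing the rebuilt string fails because a raw space is not legal
;; in a URI path.
(def parse-result
  (try
    (URI. rebuilt)
    :parsed
    (catch URISyntaxException e
      :uri-syntax-exception)))
```

Here rebuilt holds the same string that appears in the exception message from the original report, and parse-result ends up as :uri-syntax-exception.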

sbowman Sep 20 '11 20:09

Instead of (.getPath u), you should be using (.getRawPath u):

user=> (def uri (java.net.URI. "http://localhost:4000/p/abc/d/Just%20For%20Fun/m"))
#'user/uri
user=> uri
#<URI http://localhost:4000/p/abc/d/Just%20For%20Fun/m>
user=> (.getPath uri)
"/p/abc/d/Just For Fun/m"
user=> (.getRawPath uri)
"/p/abc/d/Just%20For%20Fun/m"
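
As a rough sketch of what the reassembly could look like with the raw accessors: append-query below is a hypothetical helper, not the library's resolve-uri, and it assumes the query string passed in is already percent-encoded. It also drops any fragment, which a real fix would probably want to preserve.

```clojure
(import '(java.net URI))

;; Hypothetical helper (an assumption, not the library's code): appends an
;; already-encoded query string while keeping the path exactly as given.
(defn append-query
  [^String url ^String encoded-query]
  (let [u        (URI. url)
        existing (.getRawQuery u)
        ;; Merge with any query already present on the URL.
        query    (if existing
                   (str existing "&" encoded-query)
                   encoded-query)]
    ;; getRawAuthority and getRawPath return the undecoded components,
    ;; so %20 survives the round trip.
    (URI. (str (.getScheme u) "://" (.getRawAuthority u)
               (.getRawPath u) "?" query))))

(def fixed
  (append-query "http://www.someurl.com/page/A%20Page/123"
                "some_param=value"))
```

With this, (str fixed) keeps the %20 in the path, while (.getPath fixed) still gives the decoded form for anyone who needs it for display.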

sbowman Sep 20 '11 20:09