esapi-java-legacy
Add Support for mailto URLs in Encoder.getCanonicalizedURI()
The unit test for this in the baseline is incorrect, but the one below correctly shows that we don't properly canonicalize a mailto URL. For the record, the regex we currently use as a default restricts URLs to the schemes "ftp" and "https*", so this is purely a future enhancement.
public void testGetCanonicalizedUriWithMailto() throws Exception {
    Encoder e = ESAPI.encoder();
    String expectedUri = "mailto:[email protected]?subject=Doc, do da dance";
    // Please note that section 3.2.1 of RFC-3986 explicitly states not to encode
    // password information as in http://palpatine:[email protected], and this will
    // not appear in the userinfo field.
    String input = "mailto:[email protected]?subject=Doc,%20do%20da%20dance";
    URI uri = new URI(input);
    System.out.println(uri.toString());
    assertEquals(expectedUri, e.getCanonicalizedURI(uri));
}
After some digging: mailto URIs have their own syntax, separate from the generic URI syntax.
From RFC 6068:
mailtoURI    = "mailto:" [ to ] [ hfields ]
to           = addr-spec *("," addr-spec )
hfields      = "?" hfield *( "&" hfield )
hfield       = hfname "=" hfvalue
hfname       = *qchar
hfvalue      = *qchar
addr-spec    = local-part "@" domain
local-part   = dot-atom-text / quoted-string
domain       = dot-atom-text / "[" *dtext-no-obs "]"
dtext-no-obs = %d33-90 / ; Printable US-ASCII
               %d94-126  ; characters not including
                         ; "[", "]", or "\"
qchar        = unreserved / pct-encoded / some-delims
some-delims  = "!" / "$" / "'" / "(" / ")" / "*"
             / "+" / "," / ";" / ":" / "@"
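For illustration, the grammar can be walked by hand without decoding anything. This is only a minimal sketch: `MailtoParts` is a hypothetical helper (not part of ESAPI) and the example.org addresses are made up. It treats the first "?" as the start of hfields and every literal "," as an address separator, so it assumes those characters are percent-encoded whenever they occur inside an address.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not an ESAPI class: splits a mailto URI along the
// RFC 6068 productions and keeps every piece in its raw (still encoded) form.
final class MailtoParts {
    final List<String> to = new ArrayList<>();                  // raw addr-spec values
    final Map<String, String> hfields = new LinkedHashMap<>();  // raw hfname -> raw hfvalue

    static MailtoParts parse(String uri) {
        String scheme = "mailto:";
        if (!uri.regionMatches(true, 0, scheme, 0, scheme.length())) {
            throw new IllegalArgumentException("not a mailto URI: " + uri);
        }
        String rest = uri.substring(scheme.length());

        // mailtoURI = "mailto:" [ to ] [ hfields ]; hfields starts at the first "?"
        int q = rest.indexOf('?');
        String toPart = q < 0 ? rest : rest.substring(0, q);
        String fieldPart = q < 0 ? "" : rest.substring(q + 1);

        MailtoParts parts = new MailtoParts();
        // to = addr-spec *("," addr-spec)
        for (String addr : toPart.split(",")) {
            if (!addr.isEmpty()) {
                parts.to.add(addr);
            }
        }
        // hfields = "?" hfield *( "&" hfield ); hfield = hfname "=" hfvalue
        for (String field : fieldPart.split("&")) {
            if (field.isEmpty()) {
                continue;
            }
            int eq = field.indexOf('=');
            String name = eq < 0 ? field : field.substring(0, eq);
            String value = eq < 0 ? "" : field.substring(eq + 1);
            parts.hfields.put(name, value);  // a repeated hfname overwrites the earlier one
        }
        return parts;
    }
}
```

Keeping the pieces raw is deliberate here: each one can then be decoded on its own, so a decoded "&" or "=" inside a value cannot be mistaken for a separator.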
@xeno6696, I think we can do this if we can convert the URI to a URL (I'm not sure how feasible that is):
URI.toURL()
At that point we can check the protocol of the URL to see whether it is 'mailto'. If it is, getPath() returns the address.
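A rough sketch of that check, assuming the JDK's built-in mailto protocol handler is available. `MailtoViaUrl` is a hypothetical name and the address is a made-up example.org one. The scheme is tested on the URI first so that `toURL()` is only attempted for mailto.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;

// Hypothetical helper, not an ESAPI class.
final class MailtoViaUrl {
    /**
     * Returns {address part, query part} for a mailto URI, or null for any
     * other scheme. The query part is null when the URI has no "?".
     */
    static String[] split(URI uri) {
        if (!"mailto".equalsIgnoreCase(uri.getScheme())) {
            return null;
        }
        try {
            URL url = uri.toURL();
            return new String[] { url.getPath(), url.getQuery() };
        } catch (MalformedURLException ex) {
            // Thrown when no handler for the protocol is installed.
            throw new IllegalArgumentException("cannot convert to URL: " + uri, ex);
        }
    }

    public static void main(String[] args) {
        String[] parts = split(URI.create("mailto:alice@example.org?subject=current-issue"));
        System.out.println("address: " + parts[0]);
        System.out.println("query:   " + parts[1]);
    }
}
```

Both values come back still percent-encoded, so any decoding has to happen afterwards.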
I grabbed some sample mailto addresses from RFC 6068 (https://tools.ietf.org/html/rfc6068)
and built a simple test case that pumps them through the URL class and dumps the values of all the get* methods. The results seem pretty consistent.
import java.lang.reflect.Method;
import java.net.URL;
import org.junit.Test;
import org.owasp.esapi.codecs.HTMLEntityCodec;

public class MailToUriTest {
    // Samples from https://tools.ietf.org/html/rfc6068
    String[] basic = new String[] {
        "mailto:[email protected]",
        "mailto:[email protected]?subject=current-issue",
        "mailto:[email protected]?body=send%20current-issue",
        "mailto:infobot@\r\n" +
            "example.com?body=send%20current-issue%0D%0Asend%20index",
        "mailto:[email protected]?In-Reply-To=%3C3469A91.D10AF4C@\r\n" +
            " example.com%3E",
        "mailto:[email protected]?body=subscribe%20bamboo-l",
        "mailto:[email protected][email protected]&body=hello",
        "mailto:[email protected][email protected]?body=hello",
        "mailto:gorby%[email protected]"
    };
    // Samples with quoted-string local parts
    String[] complicated = new String[] {
        "mailto:\"not@me\"@example.org",
        "mailto:\"oh\\\\no\"@example.org",
        "mailto:\"\\\\\\\"it's\\ ugly\\\\\\\"\"@example.org"
    };

    @Test
    public void testURI() throws Exception {
        //String mailto = "mailto:[email protected]?subject";
        HTMLEntityCodec codec = new HTMLEntityCodec();
        for (String mailto : basic) {
            mailto = codec.decode(mailto);
            System.out.println(mailto);
            URL url = new URL(mailto);
            dumpGetMethods(url);
        }
        for (String mailto : complicated) {
            mailto = codec.decode(mailto);
            System.out.println(mailto);
            URL url = new URL(mailto);
            dumpGetMethods(url);
        }
    }

    // Prints the name and value of every public String-returning getter on URL.
    private void dumpGetMethods(URL url) throws Exception {
        for (Method m : URL.class.getMethods()) {
            if (m.getName().startsWith("get") && m.getReturnType().equals(String.class)) {
                System.out.println("\t" + m.getName() + " " + m.invoke(url));
            }
        }
    }
}
mailto:[email protected]
getAuthority null
getPath [email protected]
getQuery null
getFile [email protected]
getHost
getProtocol mailto
getRef null
getUserInfo null
mailto:[email protected]?subject=current-issue
getAuthority null
getPath [email protected]
getQuery subject=current-issue
getFile [email protected]?subject=current-issue
getHost
getProtocol mailto
getRef null
getUserInfo null
mailto:[email protected]?body=send%20current-issue
getAuthority null
getPath [email protected]
getQuery body=send%20current-issue
getFile [email protected]?body=send%20current-issue
getHost
getProtocol mailto
getRef null
getUserInfo null
mailto:infobot@
example.com?body=send%20current-issue%0D%0Asend%20index
getAuthority null
getPath infobot@
example.com
getQuery body=send%20current-issue%0D%0Asend%20index
getFile infobot@
example.com?body=send%20current-issue%0D%0Asend%20index
getHost
getProtocol mailto
getRef null
getUserInfo null
mailto:[email protected]?In-Reply-To=%3C3469A91.D10AF4C@
example.com%3E
getAuthority null
getPath [email protected]
getQuery In-Reply-To=%3C3469A91.D10AF4C@
example.com%3E
getFile [email protected]?In-Reply-To=%3C3469A91.D10AF4C@
example.com%3E
getHost
getProtocol mailto
getRef null
getUserInfo null
mailto:[email protected]?body=subscribe%20bamboo-l
getAuthority null
getPath [email protected]
getQuery body=subscribe%20bamboo-l
getFile [email protected]?body=subscribe%20bamboo-l
getHost
getProtocol mailto
getRef null
getUserInfo null
mailto:[email protected][email protected]&body=hello
getAuthority null
getPath [email protected]
getQuery [email protected]&body=hello
getFile [email protected][email protected]&body=hello
getHost
getProtocol mailto
getRef null
getUserInfo null
mailto:[email protected][email protected]?body=hello
getAuthority null
getPath [email protected][email protected]
getQuery body=hello
getFile [email protected][email protected]?body=hello
getHost
getProtocol mailto
getRef null
getUserInfo null
mailto:gorby%[email protected]
getAuthority null
getPath gorby%[email protected]
getQuery null
getFile gorby%[email protected]
getHost
getProtocol mailto
getRef null
getUserInfo null
mailto:"not@me"@example.org
getAuthority null
getPath "not@me"@example.org
getQuery null
getFile "not@me"@example.org
getHost
getProtocol mailto
getRef null
getUserInfo null
mailto:"oh\\no"@example.org
getAuthority null
getPath "oh\\no"@example.org
getQuery null
getFile "oh\\no"@example.org
getHost
getProtocol mailto
getRef null
getUserInfo null
mailto:"\\\"it's\ ugly\\\""@example.org
getAuthority null
getPath "\\\"it's\ ugly\\\""@example.org
getQuery null
getFile "\\\"it's\ ugly\\\""@example.org
getHost
getProtocol mailto
getRef null
getUserInfo null
So we may be able to use this to split the URI up and run each part through the "appropriate" codecs?
What do you think?
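To make the question concrete, here is one way the split-then-decode idea could look. It is a sketch under assumptions, not a proposal for the actual implementation: `MailtoCanonicalizer` is a hypothetical name, the address is a made-up example.org one, and a single pass of plain percent-decoding (via `URLDecoder` with the `Charset` overload, so Java 10 or later) stands in for whichever ESAPI codecs turn out to be appropriate for each part.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch, not an ESAPI class.
final class MailtoCanonicalizer {

    // Stand-in for the real codec: one pass of percent-decoding.
    // URLDecoder would turn "+" into a space, which is wrong for mailto,
    // so literal "+" is protected by encoding it first.
    static String decode(String s) {
        return URLDecoder.decode(s.replace("+", "%2B"), StandardCharsets.UTF_8);
    }

    static String canonicalize(String mailto) {
        String scheme = "mailto:";
        if (!mailto.regionMatches(true, 0, scheme, 0, scheme.length())) {
            throw new IllegalArgumentException("not a mailto URI: " + mailto);
        }
        String rest = mailto.substring(scheme.length());
        int q = rest.indexOf('?');

        StringBuilder out = new StringBuilder(scheme);
        // Address part: everything before the first "?"
        out.append(decode(q < 0 ? rest : rest.substring(0, q)));
        if (q >= 0) {
            out.append('?');
            String[] fields = rest.substring(q + 1).split("&");
            for (int i = 0; i < fields.length; i++) {
                if (i > 0) {
                    out.append('&');
                }
                // Decode name and value separately so that an encoded "="
                // or "&" is never treated as a separator.
                int eq = fields[i].indexOf('=');
                if (eq < 0) {
                    out.append(decode(fields[i]));
                } else {
                    out.append(decode(fields[i].substring(0, eq)))
                       .append('=')
                       .append(decode(fields[i].substring(eq + 1)));
                }
            }
        }
        return out.toString();
    }
}
```

With an input shaped like the one in the unit test at the top of this issue, `canonicalize("mailto:alice@example.org?subject=Doc,%20do%20da%20dance")` returns `mailto:alice@example.org?subject=Doc, do da dance`. One open point: the reassembled string can be ambiguous if a decoded value itself contains "&" or "=".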