v icon indicating copy to clipboard operation
v copied to clipboard

vweb: Upper case letters of paths get mapped lower cased.

Open francescortiz opened this issue 1 year ago • 5 comments

Describe the bug

HTTP 404 not found when mapping uppercase paths using vweb method attributes. If you access to the lower case counterpart it works.

Expected Behavior

case should be respected for paths.

Current Behavior

For upper case paths, the lower case equivalent gets mapped.

Reproduction Steps

  1. Create a web server.
  2. Add a path with upper case letters.
  3. Try to access the path with lower case letters.

Possible Solution

No response

Additional Information/Context

No response

V version

V 0.3.3 d1f57ea

Environment details (OS name and version, etc.)

Manjaro Linux

francescortiz avatar Apr 14 '23 18:04 francescortiz

I would like to work on this. According to the path parsing code in vweb/parse.v paths are converted to lower case for case-insensitive comparisons. So when getting the path from a request this should also be converted to lower case. Is this the desired behavior @medvednikov ? If so we would need to convert the path from the request to lower case at some point before it is handled.

jacksonmowry avatar Apr 19 '23 11:04 jacksonmowry

module main

import vweb

struct App {
	vweb.Context
}

fn main() {
	mut app := &App{}
	vweb.run(app, 8080)
}

['/Index']
pub fn (mut app App) index() vweb.Result {
	return app.text('hi\n')
}
jackson@jtop:~/vcode/uc_server_test$ v run .
[Vweb] Running app on http://localhost:8080/

Request works with lowercase...

jackson@jtop:~$ curl localhost:8080/index
hi

Request fails with uppercase...

jackson@jtop:~$ curl localhost:8080/Index
404 Not Found

Minimum fix, but I am unsure that this is the correct approach

	// URL Parse
	 
	// url := urllib.parse(req.url) or { // Chaing req.url => req.url.to_lower() fixes the case sensitivity issue
	url := urllib.parse(req.url.to_lower()) or {
		eprintln('error parsing path: ${err}')
		return
	}

jacksonmowry avatar Apr 19 '23 13:04 jacksonmowry

I agree, either convert everything to lowercase and display an error/warning when a route is given a path with a capital letter (should probably also check for invalid characters). Or allow capitalized routes. Now there are two ways to do something.

Casper64 avatar Apr 19 '23 20:04 Casper64

The thing is that search engines treat urls very differently if you have a capital letter in them. So the route "/Index" does not automatically refer to the index page in the eyes of a search engine. Capital letters in urls could cause bad effects for SEO unintentionally.

Plus there are some browsers/applications/tools/bots that automatically convert urls into lowercase when they visit them. Thus resulting in them not finding your content.

Casper64 avatar Apr 19 '23 20:04 Casper64

In my case I am making a mock server for a service that has capitals in the URL. Why don't go for the simpler approach of leaving the case untouched and do case sensitive comparisons? Less code, less checks, and full support for whatever exists in the wild or the developer wants to implement.

francescortiz avatar Apr 20 '23 07:04 francescortiz

According to the HTML spec, URLs should be treated as case sensitive after the machine name.

The problem with mixing upper & lowercase is that you can wind up with valid URLs that look like duplicates, or you can actually have duplicate pages if you're really not paying attention. http://foo.com/A is a completely different URL from http://foo.com/a.

Search engines will also treat these as separate URLs. Google recommends only using lowercase, and having a rewrite rule such that if there are any uppercase characters after the machine name, they get converted to all lowercase, with a 301 return code (redirect) for the client.

That said, V is doing this incorrectly by forcing the URLs to be treated as lowercase without the 301 return code.

JalonSolov avatar Jun 19 '23 02:06 JalonSolov

Google recommends only using lowercase, and having a rewrite rule such that if there are any uppercase characters after the machine name, they get converted to all lowercase, with a 301 return code (redirect) for the client.

This is a SEO recommendation, not a standard. IMHO it is not the domain of a simple HTTP server to worry about it but of a higher level SEO framework or middleware.

francescortiz avatar Jun 19 '23 06:06 francescortiz