Crates.io returns 404 when not specifying `Accept: text/html`

Open kud1ing opened this issue 8 years ago • 13 comments

  • awesome-rust uses awesome_bot to check links. Unfortunately links to crates.io return 404.
  • curl https://crates.io/keywords/cassandra receives a 404, too.

Current summary

  • This is confirmed as a bug that still exists as long as this issue is open. We do not need any further confirmations of the issue at this time.
  • What we do need is a PR fixing the problem; those are quite welcome.
  • The workaround for this issue is to pass an Accept header of text/html.

kud1ing avatar Jun 20 '17 07:06 kud1ing

I think we are doing something nonstandard, but I'm not sure what. Until we figure this out, a workaround is specifying HTML as the content type you want:

curl -v 'https://crates.io/crates/assert_approx_eq' -H 'Accept: text/html'

returns a 200 for me. Hope that helps!

carols10cents avatar Jun 20 '17 12:06 carols10cents

crates.io was whitelisted in https://github.com/rust-unofficial/awesome-rust/pull/310. I'd like to revert this eventually so that we can verify links to crates, categories and keywords.

kud1ing avatar Jun 20 '17 16:06 kud1ing

Yeah, this is caused by https://github.com/rust-lang/crates.io/blob/a20eb96877d32bff6b5b9ac8ea3dfc497b5351b9/src/dist.rs#L38-L52

Interestingly, requesting any URL with Accept: text/html will return 200, since the server doesn't know what the valid routes are (e.g. curl -H 'Accept: text/html' -I https://crates.io/foo/bar).

I don't think there's any way to have a bot validate URLs on crates.io without executing the JavaScript and checking whether the page loads successfully, at least while crates.io does client-side-only rendering. If server-side rendering of at least the initial page is ever added, the server should then know whether the requested URL is valid.
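To make the behavior described above concrete, here is a rough model in Rust. It is a sketch under assumptions, not the code in dist.rs: the `Served` enum and the `dispatch` function are hypothetical names, and the literal substring check on the header is a guess at what the real `wants_html` test does.

```rust
/// Outcome of the hypothetical dispatch step.
#[derive(Debug, PartialEq)]
enum Served {
    /// The static Ember index page, sent with status 200.
    IndexHtml,
    /// The request is handed on to the API, which answers 404 for frontend paths.
    PassedToApi,
}

/// Assumed check: the Accept header must literally mention `text/html`.
fn wants_html(accept: Option<&str>) -> bool {
    accept.map_or(false, |value| value.contains("text/html"))
}

/// Hypothetical dispatch. The path is ignored on purpose: in this model the
/// server has no list of valid frontend routes, so it cannot validate the URL.
fn dispatch(_path: &str, accept: Option<&str>) -> Served {
    if wants_html(accept) {
        Served::IndexHtml
    } else {
        Served::PassedToApi
    }
}
```

Under this model, `dispatch("/foo/bar", Some("text/html"))` yields the index page even though the route does not exist, while `dispatch("/keywords/cassandra", Some("*/*"))` falls through to the API, which matches the two curl observations in this thread.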

Nemo157 avatar Jul 02 '17 14:07 Nemo157

FYI, you can track server-side rendering efforts at https://github.com/rust-lang/crates.io/pull/819.

locks avatar Jul 02 '17 18:07 locks

This is still not fixed.

AaronFriel avatar Dec 15 '17 02:12 AaronFriel

@AaronFriel Yes, that's why this issue is still open.

carols10cents avatar Dec 15 '17 16:12 carols10cents

Would a commit altering the wants_html test to include wildcards be accepted? I'm just not sure why this has languished and I'm guessing there is something I'm missing.
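For reference, a minimal sketch of what a wildcard-aware check might look like. The function name `wants_html_with_wildcards` is hypothetical, the sketch ignores quality values such as `;q=0`, and treating a missing header as "does not want HTML" is a choice made here for illustration, not something the crates.io code is known to do. curl sends `Accept: */*` by default, which is the case this would change.

```rust
/// Returns true if any media range in the Accept header admits HTML.
/// Hypothetical variant of `wants_html`; quality values are not interpreted.
fn wants_html_with_wildcards(accept: Option<&str>) -> bool {
    let value = match accept {
        Some(value) => value,
        // Assumption of this sketch: no header means no HTML fallback.
        None => return false,
    };
    value.split(',').any(|range| {
        // Drop parameters such as `;q=0.8` and surrounding whitespace.
        let media = range.split(';').next().unwrap_or("").trim();
        media.eq_ignore_ascii_case("text/html")
            || media.eq_ignore_ascii_case("text/*")
            || media == "*/*"
    })
}
```

With this check, `Some("*/*")` and `Some("application/json, text/*;q=0.5")` both count as wanting HTML, while `Some("application/json")` does not. Note that this only changes which clients receive the index page; the server still would not know whether the requested route exists.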

AaronFriel avatar Dec 16 '17 00:12 AaronFriel

Yes, I am not sure of the scope of the solution needed or the effects of various solutions on the frontend and backend. Please give it a try and let me know if you have questions!

carols10cents avatar Dec 17 '17 17:12 carols10cents

Any progress on this?

mark-i-m avatar Sep 15 '18 19:09 mark-i-m

@mark-i-m this is indirectly being tracked at https://github.com/rust-lang/crates.io/issues/204

locks avatar Sep 17 '18 15:09 locks

Any updates on this?

luciusmagn avatar Jul 19 '19 12:07 luciusmagn

I've recently added PR #1788, which would impact this bug. This PR would make the behavior consistent, regardless of whether an Accept: text/html header was sent. All such requests will now return a status 200 with the Ember index HTML (as is currently done for browsers and other clients that set the header).

This will in effect invert the issue described here. Instead of always returning a 404 with a JSON response, the site will now always respond with a status 200 with static HTML. Instead of false negatives for crates that do exist, there would be false positives for crates that do not exist.

jtgeibel avatar Jul 25 '19 02:07 jtgeibel