SwiftPackageIndex-Server
Invalid URL 500s
There have been a number of invalid URL 500s starting Friday, July 22 morning:
Abort.500: http://spi-prod-docs.s3-website.us-east-2.amazonaws.com/nordicsemiconductor/ios-dfu-library/raw/main/App/fastlane/screenshots/en-US/iPhone 8 Plus-01Devices.png is an invalid URL
Abort.500: http://spi-prod-docs.s3-website.us-east-2.amazonaws.com/nordicsemiconductor/ios-dfu-library/raw/main/App/fastlane/screenshots/en-US/iPhone 8 Plus-02Settings.png is an invalid URL
Abort.500: http://spi-prod-docs.s3-website.us-east-2.amazonaws.com/airbnb/lottie-ios/raw/master/_Gifs/Community 2_3.gif is an invalid URL
Abort.500: http://spi-prod-docs.s3-website.us-east-2.amazonaws.com/tuist/tuist/raw/main/assets/companies/Depop Logo.svg is an invalid URL
Abort.500: http://spi-prod-docs.s3-website.us-east-2.amazonaws.com/tuist/tuist/raw/main/assets/companies/hh mono.svg is an invalid URL
Abort.500: http://spi-prod-docs.s3-website.us-east-2.amazonaws.com/tuist/tuist/raw/main/assets/companies/Compass Black Logo.png is an invalid URL
These all seem to be related to lack of quoting of the requested url.
Actually, this doesn't seem to be doc related:
[Screenshot 2022-07-22 at 14 53 49]
Well, I guess in a sense it is: I suspect our docc proxy route is triggering and redirecting these resources to S3 instead of fetching from GitHub. This might be related to the very recent change where I added a route to handle docc favicons via
app.get(":owner", ":repository", ":reference", "**") {
try await packageController.documentation(req: $0, fragment: .root)
}
That's too wide a route. I should have constrained it to favicon.{svg,ico}.
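For illustration, here is a rough sketch of what "constrained to favicon.{svg,ico}" could mean as a plain check on the path components. The function name `isDocFaviconRequest` and the assumption that the favicon sits directly under owner/repository/reference are hypothetical, not taken from the actual codebase.

```swift
// Hypothetical helper, not part of the SwiftPackageIndex codebase: decides whether
// the catch-all documentation route should handle a request at all.
func isDocFaviconRequest(_ pathComponents: [String]) -> Bool {
    // Assumption: a favicon request has exactly four components,
    // owner / repository / reference / file name, and nothing deeper.
    guard pathComponents.count == 4, let last = pathComponents.last else {
        return false
    }
    // Only the two favicon file names are accepted; everything else falls through.
    return ["favicon.svg", "favicon.ico"].contains(last)
}
```

With a guard like this, a README image path such as /tuist/tuist/raw/main/assets/companies/hh mono.svg would not be captured by the documentation route.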
That doesn't appear to be the problem, or at least not the entire problem. If I take out this docc proxy route I still get
[ ERROR ] RouteNotFound.404: Not Found: /NordicSemiconductor/IOS-DFU-Library/blob/main/App/fastlane/screenshots/en-US/iPhone%208%20Plus-01Devices.png [component: server]
when running locally. It looks like something with the image link in the readme is off and because it isn't handled elsewhere it also ends up in the docc proxy route that is too wide.
I wonder if this is even a new problem. Could it be that we've always been showing broken images but we're only noticing it now because the additional problem of "capturing" the url in the docc proxy promoted the 404 to a 500 which we see in Rollbar?
It's possible. I can look into this on Monday.
This is a bug in processRelativeImages in PackageReadme.Model, and it's entirely my fault!
In trying to correct relative URLs, I am accidentally instantiating them into a Swift URL object, which immediately fails for relative URLs since relative URLs are not real URLs.
PR incoming.
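As a minimal sketch of one way to make such paths safe before building a URL from them (this is not the code from the PR, and `percentEncodedPath` is a hypothetical name), the relative path can be percent-encoded with Foundation so that spaces become %20:

```swift
import Foundation

// Hypothetical helper: percent-encodes a relative path taken from a README
// so it can be appended to a base URL without containing raw spaces.
func percentEncodedPath(_ path: String) -> String? {
    // Decode first so an already-encoded "%20" is not double-encoded to "%2520".
    let decoded = path.removingPercentEncoding ?? path
    // .urlPathAllowed keeps "/" and other path-safe characters, but not spaces.
    return decoded.addingPercentEncoding(withAllowedCharacters: .urlPathAllowed)
}
```

For example, `percentEncodedPath("_Gifs/Community 2_3.gif")` gives `_Gifs/Community%202_3.gif`, and a path that is already encoded is returned unchanged.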
OK, I've tested quite a few of these new URLs now that it's deployed and it's working as expected. I set all the Rollbar errors to resolved. Let's see if any come back.
Re-opening as we had another error overnight
It's a space problem again with SwiftGenKit Contexts but I can't find that path anywhere in the README (raw version here so others can confirm). I have no idea where this is coming from.
Edit 1: Ignore me, I found it! I couldn't see it because it is encoded. I was searching for the space, since that should be the problem. I'll investigate
Edit 2: There is a link to Documentation/SwiftGenKit%20Contexts/ in the Markdown (line 293) but we are dealing with that fine and it's also not a link to /blob/6.2.1/Documentation/SwiftGenKit Contexts/MigrationGuide.md, which is the URL that's actually causing issues. I'm stumped again.
I can't find any reference in that file either. There was no recent edit of it that could have removed the link since the Rollbar item was triggered.
The only explanation I can think of is that it's not this README that has a faulty link. We're only seeing the link that's being fetched, not the page where it's embedded. It's natural we assume it's in the README for that project but it doesn't have to be. I've looked through the sibling project's README (StencilSwiftKit) but it's not there either. Unlikely as it is, it could even be an entirely external page that links to this broken url. I don't think we have much of a chance finding the source unless we add more detail to the logs.
Unlikely as it is
Very unlikely, because they'd have to be embedding a https://swiftpackageindex.com/... rooted url in their page with the path that trips us up.
I don't think we have much of a chance finding the source unless we add more detail to the logs.
The one bit of information that'd help us track this down, if we decide to investigate further, would be the referrer from the HTTP request. Let's see what the next few days bring in terms of occurrences.
i thought i responded earlier, but i was pentesting the site last month and found that it returns a 500 whenever:
- four or more url path components are present, and
- at least one component contains a percent-encoded space (%20)
here is an example of an erroring url: / /x/x/x
as far as i could tell %20 is the only character that causes this behavior, other escape sequences just return a 404. i did not exhaustively test all possible escape sequences because i do not have a local development setup for this project and did not want to flood the production server with excessive requests. #1922 did not fix it because %20 is already properly-escaped.
i doubt it is related to HTTP referrers or user-generated content; this smells like a URI parsing bug.
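The two conditions reported above can be written down as a small predicate, which may be handy for filtering logs for matching requests. This is only a sketch of the reported trigger pattern, not of the server's routing; `matchesReportedPattern` is a hypothetical name, and it expects the raw, still percent-encoded request path.

```swift
import Foundation

// Hypothetical predicate describing the reported trigger condition:
// four or more path components, at least one containing "%20".
func matchesReportedPattern(_ rawPath: String) -> Bool {
    // Split on "/" and drop empty pieces from leading or trailing slashes.
    let components = rawPath.split(separator: "/", omittingEmptySubsequences: true)
    return components.count >= 4 && components.contains { $0.contains("%20") }
}
```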
Hi @kelvin13, I'm going to assume you misspoke when you say you "pentested" the site without asking for permission.
Can you please stop doing this (it triggers alerts) and DM us on Discord what exactly you attempted?
on the internet, anyone can do anything to a publicly-accessible service, and i have learned this the hard way myself.
in my case, i came across this issue two weeks ago and thought i would help a fellow swift ecosystem site maintainer out by debugging your uri parser for you. i got distracted and forgot to reply to this issue until i came back today to ask about the breadcrumbs on the site frontend.
i don’t know what alert setup you have, but i am sorry for the notifications, and will not pentest your site again. i do not know which alerts or what percentage i was responsible for, but assuming i was able to trigger an error from my iphone safari browser, it’s likely others (and automated processes, and automated processes that manage automated processes) with worse intentions have also been doing this for quite some time.
i have DMed you on discord with an outline of what i remember testing, and wish you luck resolving this issue.
on the internet, anyone can do anything to a publicly-accessible service, and i have learned this the hard way myself.
Both Sven and I are very well aware of this, and this is not the problem we have with what you did.
It’s about courtesy.
If you had either set out to pentest the site and let us know with a quick message here or on Discord (anywhere, really), I’m sure we would have encouraged what you were planning to do. Or, if you had accidentally come across an issue and had then reported it, that would also have been helpful, and we’d have been thankful for your efforts, as we are with everyone else who logs a bug.
in my case, i came across this issue two weeks ago and thought i would help a fellow swift ecosystem site maintainer out by debugging your uri parser for you.
I don’t see any evidence of helping or debugging at all. You were “pentesting” the site, found a problem and didn’t report it to us. I don’t know how that helps us?
I know you got distracted and didn’t report it. We’re all human and that happens. I just don’t know how you could possibly mistake that for being helpful. You “pentested” the site, triggered alerts that set us debugging, and didn’t report the issue or tell us what you were doing.
That isn’t helpful.
i doubt it is related to HTTP referrers or user-generated content; this smells like a URI parsing bug.
I was not suggesting it was related to either of those two things. The bug is relatively simple and is caused by us being too lenient with a URL route for the documentation system.
That part of the conversation related to me trying to figure out where these bad URLs were coming from. Some were being generated by README files (fixed in #1922), but I then spent significant time trying to find the source of the other bad links, in case we were generating them or they were coming out of DocC. If I had known that you were typing the bad URLs in, I could have avoided wasting my time.
All we ask is courtesy of communication about what you were doing. To come here and claim you were helping is the opposite.
i am sorry that i caused you to waste time investigating the bad urls. i really am. i should have given you guys a heads up, and i didn’t anticipate that it would resemble a naturally occurring error.
of the 20 or so uris i remember navigating to, probably about 17 or 18 of them were of the form “/x/ /x/x/x” (with a character x and a single encoded space), so i thought it would look like an intentional probe in your logs.
when i did this, the most recent comment on this thread was Sven’s
Very unlikely, because they'd have to be embedding a https://swiftpackageindex.com/... rooted url in their page with the path that trips us up.
from 14 days ago. so any invalid requests from before then were from other sources. i also did not visit any erroring endpoints since then (as i had forgotten about it) until today when i visited the / /x/x/x endpoint again to check if the bug was still present. if you have other errors in your logs that did not contain x’s, it was most likely a web crawler that picked up the url expressions in this thread, or a malicious script fuzzing valid swiftpackageindex.com urls.
i have removed the anchor and the hostname from the example case in my comment from earlier today to prevent web crawlers (and github) from requesting that endpoint in the future.
once again, i am sorry for wasting your time by failing to report my findings.
Hi @kelvin13, thanks for clarifying what you attempted. The term "pentest" had me concerned, having had sites go through actual pentests (which this wasn't) in the past. I'm glad we cleared that up!
i don’t know what alert setup you have
You see, that's exactly why it's polite not to poke a site and trigger 5xx errors unannounced - because you don't know if you might end up paging someone :)
However, my main concern was that "pentesting" meant other, more invasive actions and I wanted to make sure we were clear on what you actually tried.
That's good to know Kelvin, thanks for the apology and the explanation.
This should be addressed. I'm closing the Rollbars, we can re-open this should they re-appear.