[JENKINS-75337] Jenkins git plugin incorrect hyperlink when file name has # character

Open jenkins-infra-bot opened this issue 10 months ago • 5 comments

If a file name contains a '#', the hyperlink generated for that file in the changes view is incorrect.

The generated hyperlink is:
https://github.com/PROJECT/REPO/blob/c1c9643ce9303e7f3eb10eefdf761a9344e5d7e1/my#file.txt

It should be:
https://github.com/PROJECT/REPO/blob/c1c9643ce9303e7f3eb10eefdf761a9344e5d7e1/my%23file.txt
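
A minimal sketch of why the unescaped link breaks, using only `java.net.URI` from the JDK. The class and method names (`FragmentDemo`, `pathOf`, `fragmentOf`) are made up for illustration and are not part of the git plugin. An unescaped `#` ends the path and starts the fragment, so the path no longer names the file.

```java
import java.net.URI;

final class FragmentDemo {
    // Decoded path component, as java.net.URI parses the URL.
    static String pathOf(String url) {
        return URI.create(url).getPath();
    }

    // Decoded fragment component, or null when the URL has none.
    static String fragmentOf(String url) {
        return URI.create(url).getFragment();
    }

    public static void main(String[] args) {
        String base = "https://github.com/PROJECT/REPO/blob/"
                + "c1c9643ce9303e7f3eb10eefdf761a9344e5d7e1/";

        // Generated link: the path stops at "my" and "file.txt" becomes the fragment.
        System.out.println(pathOf(base + "my#file.txt"));
        System.out.println(fragmentOf(base + "my#file.txt"));

        // Expected link: the whole file name stays in the path and there is no fragment.
        System.out.println(pathOf(base + "my%23file.txt"));
        System.out.println(fragmentOf(base + "my%23file.txt"));
    }
}
```

Browsers typically do not send the fragment to the server, which is consistent with the link resolving to a file named `my` instead of `my#file.txt`.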

Steps to reproduce:
1. Create a GitHub repository
2. Create a file called "my#file.txt"
3. Create a Jenkinsfile:

pipeline {
    agent any

    stages {
        stage('Hello') {
            steps {
                echo 'Hello World'
            }
        }
    }
}

4. Set up a Jenkins job pointing to the above repository
5. Run the above job once
6. Make a change to "my#file.txt" and push it
7. Run the build again
8. Inspect the link to the file in the "Changes". The link will not work.

<td>
  <a href="https://github.com/PROJECT/REPO/blob/c1c9643ce9303e7f3eb10eefdf761a9344e5d7e1/my#file.txt">my#file.txt</a>
  &nbsp;
  <a href="https://github.com/PROJECT/REPO/commit/c1c9643ce9303e7f3eb10eefdf761a9344e5d7e1#diff-0">(diff)</a>
</td>

URL encoding is a pain and a game of whack-a-mole. Unfortunately, the URI trick doesn't work in all cases. URLEncoder is sometimes recommended, and Jenkins core's Functions has it, but it doesn't handle some nuances of how different parts of a URL should be encoded. Worst of all, APIs are inconsistent. Some middleware is more lenient than others and will let incorrectly formatted URLs through. Some have different settings, such as how to treat an encoded or double-encoded '/'.
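
To make the URLEncoder point concrete, here is a small sketch using the JDK's `java.net.URLEncoder`, which implements form encoding (`application/x-www-form-urlencoded`) and not path encoding. The sample path `docs/my file#1.txt` and the names `FormEncodingDemo` and `formEncode` are invented for illustration.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class FormEncodingDemo {
    // Form-encodes the whole string; this is what a generic "urlEncode" usually does.
    static String formEncode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // '#' is escaped as wanted, but the space becomes '+' and the
        // path separator '/' is escaped as well.
        System.out.println(formEncode("docs/my file#1.txt")); // docs%2Fmy+file%231.txt
    }
}
```

In a path, `+` is a literal plus sign and not a space, and `%2F` no longer acts as a separator, so the result is not a usable path even though `#` was handled.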

I've found the best approach is to use Guava's UrlEscapers. It forces the caller to select the correct escaper for the part of the URL being encoded. It also prevents generic "urlEncode" abuse, where people run it on whole strings and then selectively convert certain characters back with a String replace. It's best augmented with a "urlEncodePathExceptSlash" method, which tends to be quite convenient. Finally, it forces the caller to do the encoding instead of relying on a "do-it-all" method that can't always understand user intent (e.g. is '/' a path separator or part of a path element that needs to be encoded? Is the '#' character a genuine fragment separator or part of an element that needs to be encoded?).
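
A rough, JDK-only sketch of the "urlEncodePathExceptSlash" idea. It is an assumption about what such a helper could look like, not code from Guava, Jenkins core, or the git plugin, and the names `PathEncodingSketch`, `encodeSegment`, and `encodePathExceptSlash` are hypothetical. With Guava on the classpath, `UrlEscapers.urlPathSegmentEscaper()` would be the natural per-segment escaper. The stand-in below is deliberately more conservative and percent-encodes everything outside the RFC 3986 unreserved set.

```java
import java.nio.charset.StandardCharsets;

final class PathEncodingSketch {
    // RFC 3986 "unreserved" characters, which never need escaping.
    private static final String UNRESERVED =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~";

    // Percent-encodes one path element; a '/' inside it is treated as data.
    static String encodeSegment(String segment) {
        StringBuilder out = new StringBuilder();
        for (byte b : segment.getBytes(StandardCharsets.UTF_8)) {
            int v = b & 0xFF;
            if (v < 0x80 && UNRESERVED.indexOf(v) >= 0) {
                out.append((char) v);
            } else {
                out.append('%').append(String.format("%02X", v));
            }
        }
        return out.toString();
    }

    // Encodes a repository-relative path, keeping '/' as the separator.
    static String encodePathExceptSlash(String path) {
        String[] segments = path.split("/", -1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < segments.length; i++) {
            if (i > 0) {
                out.append('/');
            }
            out.append(encodeSegment(segments[i]));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encodeSegment("my#file.txt"));               // my%23file.txt
        System.out.println(encodePathExceptSlash("src/my file#1.txt")); // src/my%20file%231.txt
        System.out.println(encodeSegment("a/b"));                       // a%2Fb
    }
}
```

The caller chooses between the two methods, which is the point made above: only the caller knows whether a '/' is a separator or part of an element.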


Originally reported by mrichar2, imported from: Jenkins git plugin incorrect hyperlink when file name has # character
  • assignee: code_arnab
  • status: Open
  • priority: Minor
  • component(s): git-plugin
  • resolution: Unresolved
  • votes: 0
  • watchers: 4
  • imported: 2025-12-02

environment
git-plugin:5.7.0
jenkins-core:2.479.1

jenkins-infra-bot, Feb 25 '25 18:02

basil:

Should likely be resolved with Util#rawEncode as in https://github.com/jenkinsci/junit-plugin/pull/668.

jenkins-infra-bot, Feb 25 '25 23:02

code_arnab:

Hello basil, I've previously worked on a similar issue: https://github.com/jenkinsci/claim-plugin/pull/346
So I'd like to work on this one too.

While going through the files that might need to be changed, I found several instances of such functions: https://github.com/jenkinsci/git-plugin/blob/4da48b6997cd112b6fae44a9c1a516c307002e87/src/main/java/hudson/plugins/git/browser/GitWeb.java#L40

whereas this one uses URI: https://github.com/jenkinsci/git-plugin/blob/4da48b6997cd112b6fae44a9c1a516c307002e87/src/main/java/hudson/plugins/git/browser/GitRepositoryBrowser.java#L142

So, do I need to wrap all of them with Util#rawEncode, or only those that use urlEncode or URI?

jenkins-infra-bot, Feb 26 '25 06:02

basil:

I am not sure offhand. Empirical testing is needed in this case.

jenkins-infra-bot, Feb 26 '25 07:02

code_arnab:

Yeah, that makes sense. I will try to find the methods that need to be modified.

jenkins-infra-bot, Feb 26 '25 08:02

sailedmoon:

I've submitted a PR to fix this issue: https://github.com/jenkinsci/git-plugin/pull/1824

The fix applies Util.rawEncode() to filenames when constructing URLs in all major Git browser implementations (GitWeb, GithubWeb, GitLab, BitbucketServer, BitbucketWeb), following the pattern from junit-plugin PR #668.

This properly URL-encodes special characters like # in filenames, fixing the broken hyperlinks reported in this issue.

jenkins-infra-bot, Oct 18 '25 14:10