feat/enable github enterprise (v 3.10.8) connection

Open DanielBarbosabit opened this issue 5 months ago • 1 comments

Is your feature request related to a problem? Please describe. The GithubRunner works well for extracting data from GitHub, but the same runner cannot be used to extract data from GitHub Enterprise accounts.

Describe the solution you'd like I would like to use the GithubRunner to extract data from a GitHub Enterprise account. To enable this, I believe the SimpleGitHubConfig class should accept a new parameter for the base URL of the GitHub Enterprise API, as shown in the code below:

from unstructured.ingest.connector.git import GitAccessConfig
from unstructured.ingest.connector.github import SimpleGitHubConfig
from unstructured.ingest.interfaces import PartitionConfig, ProcessorConfig, ReadConfig
from unstructured.ingest.runner import GithubRunner

if __name__ == "__main__":
    runner = GithubRunner(
        processor_config=ProcessorConfig(
            verbose=True,
            output_dir="github-ingest-output",
            num_processes=2,
        ),
        read_config=ReadConfig(),
        partition_config=PartitionConfig(),
        connector_config=SimpleGitHubConfig(
            url="<MyOrg>/<MyInternalRepo>",
            branch="main",
            access_config=GitAccessConfig(),
            # Proposed new parameter; SimpleGitHubConfig does not accept it yet.
            base_url="https://<host_of_my_github_enterprise>/api/v3",
        ),
    )
    runner.run()

Describe alternatives you've considered The source code must of course stay compatible with both the GitHub and GitHub Enterprise APIs. I have already tested this, and it would be worth removing the condition on line 32 so that other GitHub hosts are allowed. As it stands, a GitHub Enterprise account, which is served from a different domain, cannot be configured.
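
One way the host condition could be relaxed, rather than removed outright, is sketched below. This is an illustration using only the standard library, not the actual unstructured source: `resolve_repo`, `DEFAULT_API_BASE`, and the `base_url` argument are hypothetical names, and it assumes the existing check rejects any host other than github.com.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse

# Assumption: API root used when no Enterprise base URL is given.
DEFAULT_API_BASE = "https://api.github.com"


def resolve_repo(url: str, base_url: Optional[str] = None) -> Tuple[str, str]:
    """Hypothetical helper: return (api_base, "owner/repo") for a repo reference."""
    parsed = urlparse(url)
    fragments = [part for part in parsed.path.split("/") if part]

    if parsed.scheme and parsed.scheme != "https":
        raise ValueError(f"only https URLs are supported: {url!r}")
    if len(fragments) != 2:
        raise ValueError(f"expected '<owner>/<repo>': {url!r}")

    api_base = (base_url or DEFAULT_API_BASE).rstrip("/")
    # Without base_url only github.com is accepted; with it, the Enterprise host is.
    allowed_host = urlparse(base_url).netloc if base_url else "github.com"
    if parsed.netloc and parsed.netloc != allowed_host:
        raise ValueError(
            f"unexpected host {parsed.netloc!r}, expected {allowed_host!r}"
        )

    return api_base, "/".join(fragments)


if __name__ == "__main__":
    # Short form: no host in the URL, public GitHub API by default.
    print(resolve_repo("my-org/my-repo"))
    # Enterprise form: the host must match the one in base_url.
    print(
        resolve_repo(
            "https://ghe.example.com/my-org/my-repo",
            base_url="https://ghe.example.com/api/v3",
        )
    )
```

Keeping a host check tied to `base_url` means a full Enterprise URL is accepted only when the caller has explicitly opted in, while the short `<owner>/<repo>` form keeps working for both cases.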

Additional context

  • The user should be able to pass a domain other than "github.com".

DanielBarbosabit avatar Mar 22 '24 14:03 DanielBarbosabit

Thanks for creating this issue @DanielBarbosabit :). We're tracking this as an enhancement and will take a look at it more closely as soon as we have bandwidth. In the meantime, if you have an implementation in mind feel free to open a PR and we'd be happy to review!

scanny avatar Mar 22 '24 18:03 scanny