opengrok
OpenGrok for a very large Google Repo based project
Hi, I am having trouble creating an OpenGrok workspace for a very large project with multiple sub-repositories that uses the Google repo tool. I am unable to generate mirror.yml with the opengrok-mirror tool since I am not familiar with the -H HEADER option, so I created mirror.yml manually. I have also created a read-only configuration to customize the config. Can someone suggest an example mirror.yml and readonly.xml file (as input config for the indexer)? I am attaching these files for review. I am unable to generate the index even for the first time and I don't see the project dropdown in the webapp. Can you please help? I am using these files along with this command:
docker run -d \
--name opengrok \
-p 80:8080/tcp \
-e SYNC_PERIOD_MINUTES="10" \
-e NOMIRROR="1" \
-e CHECK_INDEX="1" \
-e INDEXER_OPT="-W /opengrok/etc/configuration.xml -m 256 -H -P -S -G --progress -v -O on -T 3 --depth 9 --assignTags --renamedHistory on -R /opengrok/etc/readonly.xml" \
-v /home/uxxxxx/opengrok_basic/src/:/opengrok/src/ \
-v /home/uxxxxx/opengrok_basic/etc/:/opengrok/etc/ \
-v /home/uxxxxx/opengrok_basic/data/:/opengrok/data/ \
ogk_basic
I have attached the logs and the configuration used.
cfg.zip
Could you please help review my configuration files? I would appreciate your help and input on this. Thanks.
The per-project settings in the read-only configuration look sane; however, in order to use it with the official Docker image it is necessary to use the READONLY_CONFIG_FILE environment variable (see https://github.com/oracle/opengrok/tree/master/docker#environment-variables). Using -R directly will thwart the functionality of the main Docker script.
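For illustration, here is roughly what the command from the question could look like with that change. This is only a sketch: the host paths and the image name ogk_basic are taken from the question, the BASE and EXECUTE variables are hypothetical helpers I made up for the example, and the only functional change is that -R is dropped from INDEXER_OPT in favour of READONLY_CONFIG_FILE. By default it just prints the command so you can inspect it first.

```shell
#!/bin/sh
# Sketch only: same container setup as in the question, except that the
# read-only configuration is passed via READONLY_CONFIG_FILE rather than
# via "-R" inside INDEXER_OPT.
set -eu

# Host directory from the question (hypothetical variable; adjust to your layout).
BASE="${BASE:-/home/uxxxxx/opengrok_basic}"

# Assemble the docker command as positional parameters so quoting is preserved.
set -- docker run -d \
  --name opengrok \
  -p 80:8080/tcp \
  -e SYNC_PERIOD_MINUTES=10 \
  -e NOMIRROR=1 \
  -e CHECK_INDEX=1 \
  -e READONLY_CONFIG_FILE=/opengrok/etc/readonly.xml \
  -e "INDEXER_OPT=-W /opengrok/etc/configuration.xml -m 256 -H -P -S -G --progress -v -O on -T 3 --depth 9 --assignTags --renamedHistory on" \
  -v "$BASE/src/:/opengrok/src/" \
  -v "$BASE/etc/:/opengrok/etc/" \
  -v "$BASE/data/:/opengrok/data/" \
  ogk_basic

# Flattened copy for display only (quoting is lost in this string).
cmd_line="$*"

# Print by default; set EXECUTE=1 (hypothetical switch) to start the container.
if [ "${EXECUTE:-0}" = "1" ]; then
  "$@"
else
  echo "$cmd_line"
fi
```

Whether the remaining INDEXER_OPT values suit your tree is a separate question; they are copied unchanged from the original command.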
As for using opengrok-mirror to sync Android repository tree, I'd like to try it one day, esp. for #3622.
The mirror_v2.yml file has some weird artifact on the last line that causes YAML parsing to fail. Once removed, it loads fine, however still contains some weird indentation which I am not sure will be processed correctly.
The initial section:
commands:
  repo:
    # override repository command
    command: /usr/local/bin/repo
    sync: ['repo', 'sync','-cf']
    # override incoming check with custom command (/my/custom/git is not called for incoming check)
    # incoming: ['/bin/echo', 'Syncing the Repo!']
    incoming: ['repo', 'sync', '-n']
    incoming_check: true
will not work as intended - the commands section is meant to merely replace paths to commands executed.
If you want to specify path to the repo command, it should look like this:
commands:
  repo: /usr/local/bin/repo
If I specify only the path to the repo command, then where do we define the sync command, the incoming command and the incoming_check setting? Do you have a sample yml file which covers all the commands? I am a bit confused about the format of the yml file.
Is it possible to provide the command that runs the Python script to generate a correct yml file? I am unable to work out all the arguments required to generate mirror.yml. Can you suggest a complete example command as a reference for generating mirror.yml? I would also appreciate a complete example command for generating the read-only configuration file.
I am already using -R option under INDEXER_OPT="-R /opengrok/etc/readonly.xml"
Also can you help suggest the mirror yml file to support the below commands for the large google repo?
repo init -u ssh://xxxxxxx -b xyz -m abc.xml -g all --depth=1
repo sync --current-branch --quiet --force-sync
I am already using -R option under INDEXER_OPT="-R /opengrok/etc/readonly.xml"
That will not work as intended. The path to the read-only configuration has to be supplied via the READONLY_CONFIG_FILE env var.
Also can you help suggest the mirror yml file to support the below commands for the large google repo?
repo init -u ssh://xxxxxxx -b xyz -m abc.xml -g all --depth=1
repo sync --current-branch --quiet --force-sync
The opengrok-mirror program will only call the repo sync part of the command: https://github.com/oracle/opengrok/blob/7b938e44688e6f3fd54b3d6c0d3345297ff5f0d5/tools/src/main/python/opengrok_tools/scm/repo.py#L39-L43
The repo init has to be called outside.
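A rough sketch of doing the synchronization outside the container could look like this, with mirroring inside the container left disabled via NOMIRROR=1 as in the question. The repo arguments are the ones from the question; PROJECT_DIR (a guess combining the source path from the question with the project name from the logs), the sync_tree function and the EXECUTE switch are all hypothetical names made up for the example. By default it only prints what it would run.

```shell
#!/bin/sh
# Sketch: run "repo init" once by hand, then "repo sync" on each refresh,
# since opengrok-mirror only issues the sync part for repo trees.
set -eu

# Hypothetical checkout location under the OpenGrok source root.
PROJECT_DIR="${PROJECT_DIR:-/home/uxxxxx/opengrok_basic/src/xyz_project}"

# run: print the command by default; execute it inside PROJECT_DIR
# only when EXECUTE=1 (hypothetical switch).
run() {
  if [ "${EXECUTE:-0}" = "1" ]; then
    (cd "$PROJECT_DIR" && "$@")
  else
    echo "+ (in $PROJECT_DIR) $*"
  fi
}

sync_tree() {
  if [ "${EXECUTE:-0}" = "1" ]; then
    mkdir -p "$PROJECT_DIR"
  fi

  # One-time initialisation: only when there is no .repo directory yet.
  if [ ! -d "$PROJECT_DIR/.repo" ]; then
    run repo init -u ssh://xxxxxxx -b xyz -m abc.xml -g all --depth=1
  fi

  # Periodic part, with the flags requested in the question.
  run repo sync --current-branch --quiet --force-sync
}

sync_tree
```

Something like this could be run from cron on the host before each reindex; how you schedule it is up to you.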
If I specify only the path to the repo command, then where do we define the sync command, the incoming command and the incoming_check setting? Do you have a sample yml file which covers all the commands? I am a bit confused about the format of the yml file.
That's not possible currently. The mirror configuration for the SCM commands only allows specifying the path to a binary. That's useful for cases where the command is in a non-standard location. If you need a higher level of customization then you have to perform the synchronization by other means.
Is it possible to provide the command that runs the Python script to generate a correct yml file? I am unable to work out all the arguments required to generate mirror.yml. Can you suggest a complete example command as a reference for generating mirror.yml? I would also appreciate a complete example command for generating the read-only configuration file.
Such capability does not exist and that's a good thing I'd say. The YAML syntax should be editable by hand and the opengrok-mirror tool should provide sufficient checking to see what is wrong (which it does not in this case, so I just created PR #3673). Also, the documentation should provide enough detail to produce the configuration by hand.
We are extending the main Dockerfile and creating our own Dockerfile to support extra configuration for a large code base. In this case, I think we still need -R readonly.xml under INDEXER_OPT.
One observation: when I use READONLY_CONFIG_FILE="/opengrok/etc/readonly.xml", it keeps waiting for Tomcat. Following are the initial logs:
synchronization period = 6 minutes
Deploying web application
extra indexer options: -W /opengrok/etc/configuration.xml -m 256 -H -P -S -G --progress -v -O on -T 3 --depth 9 --assignTags --renamedHistory on
Checking if index matches current version
Jul 21, 2021 1:54:20 PM org.opengrok.indexer.configuration.Configuration read
INFO: Reading configuration from /opengrok/etc/configuration.xml
Jul 21, 2021 1:54:20 PM org.opengrok.indexer.index.Indexer parseOptions
INFO: Indexer options: [-R, /opengrok/etc/configuration.xml, --checkIndex]
Jul 21, 2021 1:54:20 PM org.opengrok.indexer.configuration.Project getProject
WARNING: Path of project xyz_project is not set
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
Jul 21, 2021 1:54:20 PM org.opengrok.indexer.configuration.Project getProject
WARNING: Path of project xyz_project is not set
Jul 21, 2021 1:54:20 PM org.opengrok.indexer.configuration.Project getProject
WARNING: Path of project xyz_project is not set
Jul 21, 2021 1:54:20 PM org.opengrok.indexer.configuration.Project getProject
WARNING: Path of project xyz_project is not set
Jul 21, 2021 1:54:20 PM org.opengrok.indexer.configuration.Project getProject
Following are the end logs:
Jul 21, 2021 1:54:23 PM org.opengrok.indexer.configuration.Project getProject
WARNING: Path of project xyz_project is not set
Jul 21, 2021 1:54:23 PM org.opengrok.indexer.configuration.Project getProject
WARNING: Path of project xyz_project is not set
Jul 21, 2021 1:54:23 PM org.opengrok.indexer.configuration.Project getProject
WARNING: Path of project xyz_project is not set
Jul 21, 2021 1:54:23 PM org.opengrok.indexer.configuration.Project getProject
WARNING: Path of project xyz_project is not set
Jul 21, 2021 1:54:23 PM org.opengrok.indexer.configuration.Project getProject
WARNING: Path of project xyz_project is not set
Merging read-only configuration from '/opengrok/etc/readonly.xml' with current configuration in '/opengrok/etc/configuration.xml'
Number of sync workers: 40
Waiting for Tomcat to start
Starting REST app on port 5000
Sleeping for 360 seconds
Starting Tomcat
Sleeping for 360 seconds
Sleeping for 360 seconds
The web page at the mapped local port does not show the project or its index.

But if I add the read-only config under INDEXER_OPT, I can see the project with all of its index generated.
Hi @shilpakangya, did you find a solution for that? I'm facing a similar issue here, where the Docker image starts but opengrok-mirror cannot run with the default parameters, since it's a Google repo repository.
Thanks in advance.