opengrok icon indicating copy to clipboard operation
opengrok copied to clipboard

do not attempt to retrieve history using the repository method for files without history cache entry for repos with history for directories

Open vladak opened this issue 11 years ago • 4 comments

I was playing with indexing on OS X (for #687) with one of my repos and discovered that untracked files cause unhandled exceptions in the phase of generating Lucene index (from addFile()). It would be nice to catch these exceptions and generate log message with WARNING or maybe INFO level.

vladak avatar Feb 21 '14 11:02 vladak

What does luke say? Were those files added? If yes, then they were added without history which is desired behaviour - and the stack goes into fine level i think ....

tarzanek avatar Feb 23 '14 19:02 tarzanek

The actual output looks like this:

11:44:51 WARNING: Non-zero exit status 255 from command [/usr/bin/hg, log, -f, --template, changeset: {rev}:{node|short}\n{branches}{tags}{parents}\nuser: {author}\ndate: {date|isodate}\nfiles: {files}\ndescription: {desc|strip|obfuscate}\n, race/test.dat] in directory /Users/vladimirkotal: abort: cannot follow file not in parent revision: "race/test.dat"
11:44:51 WARNING: An error occurred while reading history: 
org.opensolaris.opengrok.history.HistoryException: Failed to get history for: "/private/var/tmp/base/src/unix-prog-src/race/test.dat" Exit code: 255
    at org.opensolaris.opengrok.history.MercurialHistoryParser.parse(MercurialHistoryParser.java:79)
    at org.opensolaris.opengrok.history.MercurialRepository.getHistory(MercurialRepository.java:501)
    at org.opensolaris.opengrok.history.MercurialRepository.getHistory(MercurialRepository.java:494)
    at org.opensolaris.opengrok.history.FileHistoryCache.get(FileHistoryCache.java:307)
    at org.opensolaris.opengrok.history.HistoryGuru.getHistory(HistoryGuru.java:212)
    at org.opensolaris.opengrok.history.HistoryGuru.getHistoryReader(HistoryGuru.java:175)
    at org.opensolaris.opengrok.analysis.AnalyzerGuru.populateDocument(AnalyzerGuru.java:284)
    at org.opensolaris.opengrok.index.IndexDatabase.addFile(IndexDatabase.java:591)
    at org.opensolaris.opengrok.index.IndexDatabase.indexDown(IndexDatabase.java:842)
    at org.opensolaris.opengrok.index.IndexDatabase.indexDown(IndexDatabase.java:812)
    at org.opensolaris.opengrok.index.IndexDatabase.update(IndexDatabase.java:379)
    at org.opensolaris.opengrok.index.IndexDatabase$1.run(IndexDatabase.java:168)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
    at java.util.concurrent.FutureTask.run(FutureTask.java:166)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
    at java.lang.Thread.run(Thread.java:722)

I am not sure why there has to be the exception stack trace at all (for this specific case).

vladak avatar Feb 24 '14 10:02 vladak

With the recent Mercurial (6.1.1) the hg log no longer exits with non-zero code when the file is untracked, even with -f/--follow (which current OpenGrok version does not use unless renamed support is on for given repository) so the exception is no longer reported.

That said, this is a question of efficiency. The hg log command is executed which will slow the indexer down. For such repository types, the fallback should be false when calling getHistory() to avoid it because when -H is used the history cache should be generated for such repositories (i.e. these with the capability to retrieve history for directories) in the 1st phase of indexing.

For repositories with hasHistoryForDirectories() returning false (e.g. SCCS), this has to still work because for these the history will be only stored in the 2nd phase of the indexing via the populateDocumentHistory().

vladak avatar Mar 25 '24 11:03 vladak

Actually, the check is already being done in HistoryGuru#getHistoryFromRepository(): https://github.com/oracle/opengrok/blob/5d55b5f08d930049059a72ad2c2317584e55e94b/opengrok-indexer/src/main/java/org/opengrok/indexer/history/HistoryGuru.java#L547-L560

The issue is that the fetchHistoryWhenNotInCache configuration tunable is set to true by default so the history is retrieved also for repositories which support history per directory. Maybe the tunable should be set to false by default and that is why I am keeping this issue open.

That said, the tunable might go away as suggested in https://github.com/oracle/opengrok/issues/1498#issuecomment-299310389

vladak avatar Mar 25 '24 17:03 vladak