heritrix3
Disk usage is not within je.maxDisk or je.freeDisk limits and write operations are prohibited
I'm seeing errors related to disk usage and configuration, namely "Disk usage is not within je.maxDisk or je.freeDisk limits and write operations are prohibited".
Complete log entry:
SEVERE Failed to start bean 'bdb'; nested exception is java.lang.RuntimeException: com.sleepycat.je.DiskLimitException: (JE 7.5.11) Disk usage is not within je.maxDisk or je.freeDisk limits and write operations are prohibited: maxDiskLimit=0 freeDiskLimit=5,368,709,120 adjustedMaxDiskLimit=0 maxDiskOverage=0 freeDiskShortage=10,711,040 diskFreeSpace=5,357,998,080 availableLogSize=-10,711,040 totalLogSize=1,266 activeLogSize=1,266 reservedLogSize=0 protectedLogSize=0 protectedLogSizeMap={}
Where is je.maxDisk adjusted?
@tchnlgst you can set additional configuration values for the Berkeley DB here: https://github.com/internetarchive/heritrix3/blob/adac067ea74b5a89f631ef771e2f598819bac6c4/commons/src/main/java/org/archive/bdb/BdbModule.java#L257
config.setConfigParam(EnvironmentConfig.FREE_DISK, "0");
for example will disable the checks completely which isn't necessarily recommended.
The docs for the settings are here if you want to take a look over what would be best for your use case: https://docs.oracle.com/cd/E17277_02/html/java/com/sleepycat/je/EnvironmentConfig.html#FREE_DISK
Hi, we have the same error, but we don't have the source version, so we can't change the Berkeley DB configuration this way. Is there any other way to solve the problem? Where are these files stored?
Note that running out of space will corrupt any Berkeley DB instance, which is why the defaults assume there is at least 5GB of space available. I would not recommend running a crawl with less than 5GB of space available.
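To check ahead of time whether a volume clears that 5GB threshold, a minimal sketch along these lines may help. It only mirrors the arithmetic visible in the log entry above (freeDiskLimit minus diskFreeSpace gives freeDiskShortage). The class name `FreeDiskCheck` and the default path are made up for illustration, and `File.getUsableSpace()` is assumed to be close to what JE measures, which may not hold exactly.

```java
import java.io.File;

/** Pre-flight check against the default free-disk limit seen in the log. */
final class FreeDiskCheck {
    // 5 GiB, matching freeDiskLimit=5,368,709,120 in the log entry above.
    static final long DEFAULT_FREE_DISK_LIMIT = 5L * 1024 * 1024 * 1024;

    /** Returns how many bytes short the volume is, or 0 if there is enough room. */
    static long shortage(long freeDiskLimit, long diskFreeSpace) {
        return Math.max(0L, freeDiskLimit - diskFreeSpace);
    }

    public static void main(String[] args) {
        // Hypothetical default: check the current directory unless a path is given.
        File dir = new File(args.length > 0 ? args[0] : ".");
        long free = dir.getUsableSpace();
        long missing = shortage(DEFAULT_FREE_DISK_LIMIT, free);
        if (missing > 0) {
            System.out.printf("Only %,d bytes free; %,d more needed%n", free, missing);
        } else {
            System.out.printf("%,d bytes free; above the default limit%n", free);
        }
    }
}
```

Run it with the directory that holds the BDB files as the only argument. With the numbers from the first log entry, `shortage(5368709120L, 5357998080L)` gives 10,711,040, the same value reported as freeDiskShortage.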
That said, if you really want to do this without code changes, you could try editing the je.properties file that gets created in the BDB folder. Add a line like:
je.maxDisk=0
If you then try to start up a crawl, I think it picks up the options from this file. Note that I've only done this when resuming from a checkpoint, but I think it may work when starting afresh.
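If you would rather script that edit than do it by hand, something like the following could work. It is only a sketch: the class `JePropertiesEditor` and its method are hypothetical helpers, not part of Heritrix or JE, and the location of the BDB folder is left as an argument because it depends on your job configuration. It appends the line only when the key is not already present.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper: adds a setting to je.properties without clobbering existing ones. */
final class JePropertiesEditor {

    /** Appends "key=value" unless the key is already set. Returns true if a line was added. */
    static boolean setIfAbsent(Path bdbDir, String key, String value) throws IOException {
        Path file = bdbDir.resolve("je.properties");
        String existing = Files.exists(file)
                ? new String(Files.readAllBytes(file), StandardCharsets.ISO_8859_1)
                : "";
        for (String line : existing.split("\\R")) {
            // Compare only the part before '=' so "je.maxDisk = 0" also counts as set.
            if (line.split("=", 2)[0].trim().equals(key)) {
                return false;
            }
        }
        // Start on a fresh line if the file does not already end with a newline.
        String prefix = existing.isEmpty() || existing.endsWith("\n") ? "" : System.lineSeparator();
        String entry = prefix + key + "=" + value + System.lineSeparator();
        Files.write(file, entry.getBytes(StandardCharsets.ISO_8859_1),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        return true;
    }

    public static void main(String[] args) throws IOException {
        // args[0]: path to the BDB folder of the job (depends on your setup).
        boolean added = setIfAbsent(Paths.get(args[0]), "je.maxDisk", "0");
        System.out.println(added ? "Added je.maxDisk=0" : "je.maxDisk already set; left unchanged");
    }
}
```

Do this while the crawl is stopped, since the file is presumably only read when the environment is opened.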
Hi! One question related to this issue. (I'm using the default configuration, and I guess the Berkeley DB setup you're talking about is some specific configuration I don't have, so the error I got is a little bit different.) When this issue arises because you ran out of space, is it expected that the contents of the log files are removed? My seeds-report.txt file is empty after this issue happened.
I'm using this version (last commit from master).
When I open the file in the web interface, the exact message I get is:
Cause: com.sleepycat.je.DiskLimitException: (JE 7.5.11) Disk usage is not within je.maxDisk or je.freeDisk limits and write operations are prohibited: maxDiskLimit=0 freeDiskLimit=5.368.709.120 adjustedMaxDiskLimit=0 maxDiskOverage=0 freeDiskShortage=525.475.840 diskFreeSpace=4.843.233.280 availableLogSize=-525.475.840 totalLogSize=6.632 activeLogSize=6.632 reservedLogSize=0 protectedLogSize=0 protectedLogSizeMap={}
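For anyone comparing this message with the earlier one: the numbers appear to be the same fields, just printed with "." instead of "," as the digit-group separator, presumably because of the JVM locale. A rough sketch for pulling the fields out of either variant is below. The class `DiskLimitMessage` is a hypothetical helper, and it assumes every numeric field is a whole number of bytes, so both separators can simply be dropped.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts the numeric fields from a DiskLimitException message. */
final class DiskLimitMessage {
    // name=number, where the number may use "," or "." to group digits.
    private static final Pattern FIELD = Pattern.compile("(\\w+)=(-?\\d[\\d.,]*)");

    static Map<String, Long> parse(String message) {
        Map<String, Long> fields = new LinkedHashMap<>();
        Matcher m = FIELD.matcher(message);
        while (m.find()) {
            // Assumption: all values are integers, so separators carry no meaning.
            fields.put(m.group(1), Long.parseLong(m.group(2).replaceAll("[.,]", "")));
        }
        return fields;
    }

    public static void main(String[] args) {
        String msg = "maxDiskLimit=0 freeDiskLimit=5.368.709.120 freeDiskShortage=525.475.840 "
                + "diskFreeSpace=4.843.233.280 availableLogSize=-525.475.840";
        Map<String, Long> f = parse(msg);
        long computed = f.get("freeDiskLimit") - f.get("diskFreeSpace");
        System.out.println("computed shortage = " + computed);
        System.out.println("reported shortage = " + f.get("freeDiskShortage"));
    }
}
```

In both messages in this thread, freeDiskLimit minus diskFreeSpace equals the reported freeDiskShortage, which suggests the volume was simply below the 5GB threshold when the error was raised.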
Another problem I'm seeing is that when I try to terminate my jobs after this error, they seem to hang in the "finishing" state and never finish properly. From the log, it looks like the status changes to "finishing" automatically when the error occurs, but this wasn't reflected in the web interface until I clicked "terminate".