Continuing increase in response time from mapcache
I am having a problem with mapcache after it has been running for a long time (approx. 2 weeks). Response times go through the roof and the Apache error log is littered with messages like these:
[error] [client 10.7.7.47] tileset ESN: unknown error (another thread/process failed to create the tile I was waiting for), referer: http://10.7.8.10/Map
[error] [client 10.7.7.47] tileset Streets: failed to re-get tile 92 54 6 from cache after set, referer: http://10.7.8.10/Map
[error] [client 10.7.7.47] (70007)The timeout specified has expired: proxy: error reading response
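To see which tilesets and which messages dominate a log like this, the lines can be tallied. The following is a minimal sketch, assuming the log layout matches the three sample lines above; the function name and the coordinate normalization are illustrative and not part of mapcache.

```python
import re
import sys
from collections import Counter

# Pattern for the mapcache tileset errors quoted above; the layout is
# assumed from those sample lines only.
TILESET_ERROR = re.compile(
    r"tileset (?P<tileset>\S+): (?P<message>.*?)(?:, referer: .*)?$"
)
# Collapse tile coordinates so every "failed to re-get" line counts as one kind.
TILE_COORDS = re.compile(r"tile \d+ \d+ \d+")


def tally_tileset_errors(lines):
    """Count (tileset, normalized message) pairs in Apache error log lines."""
    counts = Counter()
    for line in lines:
        match = TILESET_ERROR.search(line)
        if match is None:
            continue  # not a tileset error, e.g. the proxy timeout line
        message = TILE_COORDS.sub("tile X Y Z", match.group("message"))
        counts[(match.group("tileset"), message)] += 1
    return counts


if __name__ == "__main__":
    # Usage: python tally.py /path/to/apache/error.log
    with open(sys.argv[1], errors="replace") as log:
        for (tileset, message), n in tally_tileset_errors(log).most_common():
            print(f"{n:6d}  {tileset}: {message}")
```

Running it once shortly after a restart and again after the slowdown sets in would show whether one tileset or one message type accounts for the growth.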
I can't find much other information about why these errors keep occurring. The environment is Linux x86_64, Apache 2.2.22, MapServer 6.2.2. The image cache resides on a DRBD partition.
I'll be happy to provide any info and time necessary to debug this issue. If there are any suggestions please let me know.
@markathomas you haven't provided the mapcache version. Does the slowdown persist until you restart Apache, or could the filesystem itself be the issue? Try updating to 1.2.1, or at least apply ee0107ad82, which should print more meaningful error messages.
We are using MapCache 1.2.0 with MapServer 6.2.2.
Typically, after we notice a slowdown, if we stop Apache, clear all the tiles, and restart Apache, the maps are snappy again.
I'll try your suggestions.
Hi Thomas, I was wondering what the best cmake build type would be for debugging? I currently have it set to RelWithDebInfo. Is that sufficient?
RelWithDebInfo is OK, but may render debugging more complicated due to the compiler optimizations. Use "Debug" if you need to step through the code line by line.
Thanks for the info. During testing we appear to see a correlation between the cache expiration time and the errors appearing in our logs. I have rebuilt mapcache 1.2.1 with -DCMAKE_BUILD_TYPE=Debug and turned Apache log level to debug. Cache expiration is currently set to 1 hour so I should have more info soon.
To give some more background, here is a sample config:
I am wondering if some of my settings are wrong (in red above with **** at end of line).
- We use a custom grid defined by the extent of all shapefiles present in the map, and we always use EPSG:4326, so I am wondering whether I should use WGS84 instead, or whether that even matters. The resolutions are calculated from the extent and number of zoom levels according to the algorithm in OpenLayers.
- We are currently setting the expires and auto_expire elements to the same value (14 days in production, 1 hour in test). Is this OK, or should they be different? For our purposes, the images only change when we receive new shapefiles/orthography from our customers, so I was planning on setting both elements to 1 year (we produce upgrades approximately every six months).
- We currently only use WMS; does having the other services enabled cause any potential issues?
- We initially saw lock-related errors (timeouts), so we bumped lock_retry from 10000 to 1000000. Is that too high or too low?
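For reference, the numbers in the list above can be worked out as follows. This is a rough sketch under stated assumptions: expires and auto_expire are taken to be in seconds and lock_retry in microseconds (worth checking against the mapcache.xml.sample shipped with your version), and the resolution list assumes the common scheme of halving the resolution at each zoom level, starting from one 256-pixel tile that fits the larger side of the extent. The function names are illustrative only.

```python
from datetime import timedelta


def to_seconds(**duration):
    """Whole seconds for a duration, e.g. to_seconds(days=14)."""
    return int(timedelta(**duration).total_seconds())


def grid_resolutions(extent, zoom_levels, tile_size=256):
    """Resolutions (map units per pixel), halving at each zoom level.

    Assumption: level 0 fits the larger side of the extent into one tile.
    """
    minx, miny, maxx, maxy = extent
    max_resolution = max(maxx - minx, maxy - miny) / tile_size
    return [max_resolution / 2 ** level for level in range(zoom_levels)]


if __name__ == "__main__":
    print("14 days:", to_seconds(days=14))   # production expiry
    print("1 hour:", to_seconds(hours=1))    # test expiry
    print("1 year:", to_seconds(days=365))   # planned expiry
    # lock_retry, assuming microseconds: 10000 is 0.01 s, 1000000 is 1 s
    for lock_retry in (10_000, 1_000_000):
        print(lock_retry, "us =", lock_retry / 1_000_000, "s")
    # Hypothetical whole-world EPSG:4326 extent, three zoom levels
    print(grid_resolutions((-180.0, -90.0, 180.0, 90.0), 3))
```

If the microsecond assumption holds, the bump from 10000 to 1000000 means each waiting request rechecks the lock once per second instead of every 10 ms.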
Thanks for all your help!
I seem to have gotten my mapcache into this same state. I noticed it after trying to do some profiling/benchmarking/stress testing. I issued 100 curl requests for the same tile at roughly the same time. The initial request took under 1 second, then every request after that took longer, until the later requests reached ~7 or 8 seconds. The Apache error log showed a lot of "failed to re-get tile 1 1 6 from cache after set" messages. I figured it was some threading issue, so I restarted the docker container that was running Apache but kept the cache directory, and I still saw the same requests failing to re-get. What is even more confusing is that the disk cache I'm using didn't even have a directory for the TIME I was requesting. It also seems to be tied to all-empty tiles: empty tiles seem to have this failure, but tiles with some valid pixels seem fine. I can't be sure yet.
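The stress test described above can be approximated without curl. The following is a minimal sketch, assuming a tile URL of your own (the one in the example is a placeholder, not a real endpoint); `timed_requests` and its `fetcher` parameter are names introduced here for illustration.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch(url):
    """Download one tile and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=60) as response:
        response.read()
        return response.status


def timed_requests(url, count=100, fetcher=fetch):
    """Issue `count` concurrent requests; return (seconds, result) pairs, fastest first."""
    def one_request(_):
        start = time.perf_counter()
        try:
            result = fetcher(url)
        except Exception as exc:  # keep going so failures show up in the timings
            result = repr(exc)
        return time.perf_counter() - start, result

    with ThreadPoolExecutor(max_workers=count) as pool:
        timings = list(pool.map(one_request, range(count)))
    return sorted(timings, key=lambda pair: pair[0])


if __name__ == "__main__":
    # Placeholder URL: substitute a real request for a single tile.
    tile_url = "http://localhost/mapcache/?SERVICE=WMS&REQUEST=GetMap"
    for seconds, result in timed_requests(tile_url, count=100):
        print(f"{seconds:7.3f}s  {result}")
```

Sorting by elapsed time makes the pattern from the post easy to spot: a few fast responses followed by a long tail of multi-second ones.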
I can try to provide more information if needed. I just recently got MapCache working for my system. I was running off of master for a pre-1.10 commit. I'll try updating, clearing the cache directory, and see how that goes.
Edit: Updated and still noticing a lot of these re-get errors.
Edit 2: I thought this may have been caused by permissions on the lock directory. Changed it to another directory and used inotify and verified the lock is being created.
Just an update. I think I've figured this out. I based my config off the provided ones which include this:
https://github.com/mapserver/mapcache/blob/c467a3bb444cab69c52bc78b91b9bae3a9415f2f/mapcache.xml.sample#L353-L367
Without knowing exactly what these options were doing, I thought detect_blank was needed for symlink_blank to work. It turns out that detect_blank was taking precedence and producing the re-get error mentioned above. Once I removed detect_blank, symlink_blank started working as expected and I now get a blank tile. Sorry to add nonsense to this issue.
We're able to reproduce this at will on the same application that @markathomas originally posted about. As mentioned by @djhoese, it does appear to be related to symlink_blank, although notably we do not use it in combination with detect_blank. Rather, in our case it appears to be an incompatibility between symlink_blank and tileset->auto_expire.

When the blank file timestamps are older than the auto_expire date, the tiles will attempt to regenerate continuously, which introduces contention on the blank file referred to by the symlink. Additionally, a regenerated tile results in an updated timestamp on the symlink, but not on the referred-to file. However, mapcache appears to be using the timestamp of the referred-to file, not that of the symlink, to determine whether or not to regenerate the tile, so the regeneration behavior continues indefinitely.

This can be reproduced by manually setting the modification date on the blank files to a date that is older than the auto_expire, and can be resolved manually by touching the blank files so that their modification date is newer than the auto_expire. Ideally, when using symlink_blank with auto_expire, mapcache should use the timestamp of the symlink file and not of the generated blank file for determining expiry.

Edit: I had documented the relationship between the timestamp and symlink backwards. Corrected.
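The timestamp behavior described in this comment can be illustrated outside mapcache. The sketch below is not mapcache's code; it only shows, on a throwaway directory, how an age check that follows symlinks sees the stale blank file while a check on the link itself does not, and how touching the blank files clears the condition. The directory layout, file names, and function names are hypothetical.

```python
import os
import tempfile
import time


def is_expired(path, auto_expire, follow_symlinks=True):
    """True if the file's mtime is more than auto_expire seconds old.

    follow_symlinks=True reads the blank file a link points to;
    follow_symlinks=False reads the symlink itself.
    """
    mtime = os.stat(path, follow_symlinks=follow_symlinks).st_mtime
    return time.time() - mtime > auto_expire


def touch_blank_tiles(blank_dir):
    """Workaround from the thread: refresh the mtime of every blank file."""
    touched = []
    for entry in os.scandir(blank_dir):
        if entry.is_file(follow_symlinks=False):
            os.utime(entry.path)  # sets atime and mtime to the current time
            touched.append(entry.path)
    return touched


def reproduce(auto_expire=3600):
    """Return (expired via target, expired via link, expired after touch)."""
    with tempfile.TemporaryDirectory() as cache_dir:
        blank_dir = os.path.join(cache_dir, "blanks")  # hypothetical layout
        os.mkdir(blank_dir)
        blank = os.path.join(blank_dir, "blank.png")
        open(blank, "wb").close()
        stale = time.time() - 2 * auto_expire
        os.utime(blank, (stale, stale))  # make the blank file look old

        tile = os.path.join(cache_dir, "tile.png")
        os.symlink(blank, tile)  # the link is brand new, its target is stale

        via_target = is_expired(tile, auto_expire)
        via_link = is_expired(tile, auto_expire, follow_symlinks=False)
        touch_blank_tiles(blank_dir)
        after_touch = is_expired(tile, auto_expire)
        return via_target, via_link, after_touch


if __name__ == "__main__":
    print(reproduce())
```

On a real cache, `touch_blank_tiles` would be pointed at whatever directory holds the blank files that the tile symlinks refer to; that path depends on the cache configuration and is left as a parameter here.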