
[Feat] Implement GRASS_RASTER_TMPDIR_MAPSET like it exists for vector data

Open neteler opened this issue 5 years ago • 3 comments

Is your feature request related to a problem? Please describe.

Since GRASS GIS can process enormous amounts of data, it is important not to slow processing down unnecessarily. During raster processing, a .tmp/ directory is created in the current mapset.

If the parent mapset directory is located on a slow (e.g. network) drive, this slows down the entire processing.

The current workaround, linking location/mapset/.tmp/ to another (fast) drive, isn't really user friendly.

Describe the solution you'd like

The desired solution is support for a new GRASS_RASTER_TMPDIR_MAPSET variable, like the one that already exists for vector data (GRASS_VECTOR_TMPDIR_MAPSET).

This would require changing all hardcoded .tmp/ occurrences to use the variable, especially in lib/init/grass.py, lib/gis/open.c, lib/gis/file_name.c, the raster library, etc.
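To make the request concrete, here is a minimal Python sketch of how a single helper could replace hardcoded .tmp/ paths. It is illustrative only: `raster_tmp_dir` and its `environ` and `hostname` parameters are hypothetical names, and the sketch assumes that GRASS_RASTER_TMPDIR_MAPSET would act as an on/off switch where "0" sends temporary data to the directory named by TMPDIR. The real semantics would be decided by the implementation.

```python
import os
import socket
import tempfile


def raster_tmp_dir(mapset_path, environ=None, hostname=None):
    """Return the directory for temporary raster data (illustrative only)."""
    env = os.environ if environ is None else environ
    host = socket.gethostname() if hostname is None else hostname
    # Assumption: the proposed variable is a switch and "0" means
    # "do not keep temporary raster data inside the mapset".
    if env.get("GRASS_RASTER_TMPDIR_MAPSET", "1") == "0":
        return env.get("TMPDIR", tempfile.gettempdir())
    # Default layout quoted in this issue: $LOCATION/$MAPSET/.tmp/$HOSTNAME
    return os.path.join(mapset_path, ".tmp", host)
```

Callers would then ask the helper for the directory instead of building the .tmp/ path themselves, so the location is decided in one place.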

neteler, Aug 11 '20 10:08

A (cleaned) search for .tmp shows these candidate files that need to be updated:

lib/gis/file_name.c:67:  $LOCATION/$MAPSET/.tmp/$HOSTNAME. If GRASS_VECTOR_TMPDIR_MAPSET is

lib/gis/open.c:62:    is_tmp = (element && strncmp(element, ".tmp", 3) == 0);

lib/gis/tempfile.c:117:    strcpy(element, ".tmp");

lib/vector/Vlib/open.c:582:  <tt>.tmp/<hostname>/vector</tt>).
lib/vector/Vlib/open.c:649:  <tt>.tmp/<hostname>/vector</tt>).
lib/vector/Vlib/open.c:935:  <tt>.tmp/<hostname>/vector</tt>). If the map already exists, it is

lib/raster/rasterlib.dox:1017:Creates a new floating-point raster map (in <tt>.tmp</tt>) and returns
lib/raster/rasterlib.dox:1183:If the map is a new floating point, move the <TT>.tmp</TT> file into
lib/raster/rasterlib.dox:1188:cat = max value (for backwards compatibility). Move the <TT>.tmp</TT>

lib/raster/close.c:86: * If the map is a new floating point, move the <tt>.tmp</tt> file
lib/raster/close.c:92: * the <tt>.tmp</tt> NULL-value bitmap file to the <tt>cell_misc</tt>

lib/init/variables.html:320:  <tt>$LOCATION/$MAPSET/.tmp/$HOSTNAME</tt>. If GRASS_VECTOR_TMPDIR_MAPSET is

lib/init/grass.py:2073:        self.tmp_location = False
lib/init/grass.py:2074:        self.tmp_mapset = False
lib/init/grass.py:2114:            params.tmp_location = True
lib/init/grass.py:2116:            params.tmp_mapset = True
lib/init/grass.py:2123:        if params.tmp_location:
lib/init/grass.py:2136:    if params.tmp_location and params.tmp_mapset:
lib/init/grass.py:2141:    if params.tmp_location and not params.geofile:
lib/init/grass.py:2148:    if params.tmp_location and params.mapset:
lib/init/grass.py:2288:    if not params.mapset and not params.tmp_location:
lib/init/grass.py:2298:        if params.tmp_location:
lib/init/grass.py:2302:                       tmp_location=params.tmp_location, tmpdir=tmpdir)
lib/init/grass.py:2306:        elif params.tmp_mapset:
lib/init/grass.py:2308:                       tmp_mapset=params.tmp_mapset)
lib/init/grass.py:2403:        if not params.tmp_location:

display/d.legend.vect/d.legend.vect.html:51:By default the legend file is stored in grassdata/location/mapset/.tmp/user

display/d.mon/start.c:153:    /* create .tmp/HOSTNAME/u_name directory */

scripts/d.rast.edit/d.rast.edit.html:124:<p>There is no user-interrupt handling. This could leave files in .tmp

vector/v.hull/globals.h:17:#define TMPFILE "voxeltmp.tmp"

gui/wxpython/animation/frame.py:71:        # (stored in MAPSET/.tmp/)

gui/wxpython/gui_core/dialogs.py:2424:        self.tmp_file = grass.tempfile(False) + '.png'
gui/wxpython/gui_core/dialogs.py:2604:        env['GRASS_RENDER_FILE'] = self.tmp_file
gui/wxpython/gui_core/dialogs.py:2609:            self.renderfont.SetBitmap(wx.Bitmap(self.tmp_file))
gui/wxpython/gui_core/dialogs.py:2612:        try_remove(self.tmp_file)
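Several hits in the listing above (for example `params.tmp_location` and `self.tmp_file`) look like attribute names rather than references to the .tmp directory. A narrower search could be sketched in Python as follows. The function name `find_tmp_references` and the extension list are assumptions made for illustration, and this is not the command used to produce the listing.

```python
import os
import re

# Match ".tmp" only when it is not followed by another word character,
# so ".tmp/" and ".tmp" in a string match but ".tmp_location" does not.
PATTERN = re.compile(r"\.tmp\b")


def find_tmp_references(root, extensions=(".c", ".h", ".py", ".html", ".dox")):
    """Return (relative path, line number, line text) for each match."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                for lineno, line in enumerate(handle, start=1):
                    if PATTERN.search(line):
                        rel = os.path.relpath(path, root)
                        hits.append((rel, lineno, line.rstrip("\n")))
    return hits
```

File names such as "voxeltmp.tmp" still match, so the result would need the same manual cleaning as the listing above.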

neteler, Aug 11 '20 12:08

Yes, that would be nice indeed, especially for processes with frequent access to temporary data (e.g. r.watershed). In some cases (e.g. lots of temporary maps, only a limited amount of final results), linking is more efficient because temporary data has to be copied to the other disk (vs. moved if on the same disk)... I guess the trick of linking mapsets to temporary GRASS DBs on SSD is not so well known. Maybe adding the possibility of linking to the data catalog, now that multiple GRASS DBs will be supported soon, would help people discover it? That would be a different issue though...

ninsbl, Aug 14 '20 07:08

See new PR #1786. Note that the workaround of linking location/mapset/.tmp/ to another (fast) drive can cause fatal errors because files cannot be renamed across mount points. Therefore a new routine had to be added to lib/raster.

Creating the entire .tmp folder in a different location thus seems too dangerous to me. Instead, each library or module can make use of these new functions and, as before, must take care of cleaning up, renaming or moving temporary data.

The new functions are fairly generic and can be used by modules. E.g. r.watershed could get a new option tmpdir specifying where temporary data should be stored.
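The cross-mount rename problem mentioned above can be illustrated with a generic Python sketch. This is not the routine added in PR #1786: `move_tmp_file` and its `rename` parameter are hypothetical names, and the ".partial" suffix is an assumption chosen for the example. The idea is to try a plain rename first and fall back to copy-then-delete only when the operating system reports EXDEV (source and destination on different filesystems).

```python
import errno
import os
import shutil


def move_tmp_file(src, dst, rename=os.rename):
    """Move src to dst, copying when a plain rename crosses mount points."""
    try:
        rename(src, dst)
    except OSError as exc:
        if exc.errno != errno.EXDEV:
            raise
        # Copy next to the destination first, so the final replace
        # happens within one filesystem and dst is never half-written.
        partial = dst + ".partial"
        shutil.copy2(src, partial)
        os.replace(partial, dst)
        os.remove(src)
```

Python's `shutil.move` implements a similar fallback. The explicit version is shown only to make visible why a symlinked .tmp/ on another drive breaks code that relies on a bare rename.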

metzm, Aug 11 '21 15:08