geotrellis icon indicating copy to clipboard operation
geotrellis copied to clipboard

OverviewStrategy requires improvements

Open jpolchlo opened this issue 5 years ago • 6 comments

After a brief interaction with the OverviewStrategy "interface", it appears to me that there are significant shortcomings with this feature.

  1. This is not an extensible system for resolution selection.
  2. The strategies that exist are hardly comprehensive.
  3. The semantics are not set by the interface (the responsibility for interpreting these strategies falls on the RasterSource implementation, so we can have inconsistencies between implementations).

I am proposing to improve this interface in the following way: Remove the sealed keyword and add an apply method to OverviewStrategy which takes in the list of available resolutions, the desired resolution, and returns one of the available resolutions (or the index of the best match).

If we think of the available resolution as "stops" on the number line, then we might have

coarse                                                                   fine
<----------*---------*---------------*-----------------------*-------------->
           1         2         ˄     3                       4

where the '*'s indicate the available resolutions and the caret indicates the request.

At the moment, we can choose the finest resolution (4) with Base (or Auto(1)?) or the next highest resolution (3) with AutoHigherResolution (or with Auto(0)?). (The semantics of Auto(n) are not clear from the docs, hence the question marks.) There's not currently a way to move to coarser resolutions (due to an inexplicable requirement that Auto(n) have n ≥ 0), or to "round" to the closest resolution.

More than anything, the lack of set semantics for these strategies strongly suggests that we need to improve this API, lest different users of this interface diverge in behavior.

jpolchlo avatar Mar 03 '20 18:03 jpolchlo

Connects #3196

pomadchin avatar Mar 06 '20 14:03 pomadchin

We discussed this issue at length and the consensus is that this would make good improvement.

The main consequence of changing OverviewStrategy to be extensible is that it will no longer be possible for GDALRasterSource to operate on all sub-classes of it since there is no way to ensure that each possible subclass maps to an actual GDAL parameter. In those case GDALRasterSource will have to throw. This is not ideal but seems "fine".

// client-code.jar
val source: GDALRasterSource = ???
source.resample(OverviewStrategy.Base) // works

case object WeirdStrategy extends OverviewStrategy {
  override def select(candidates: List[CellSize]): Integer = -1
}

source.resample(WeirdStrategy ) // throws

The discussion finished on how many types we are willing to introduce to make that exception, which can't be avoided, as obvious as possible and to reduce the chance of downstream changes making GDAL and GeoTrellis behavior inconsistent.

Currently the majority of people prefer Option 1 because it introduces the least nouns.

Option 1

// geotrellis-raster.jar
// package geotrellis.raster
trait OverviewStrategy {
  def select(candidates: List[CellSize]): Integer
}

object OverviewStrategy {
  /** This strategy matches the GDAL Base strategy */
  case object Base extends OverviewStrategy {
    def select(candidates: List[CellSize]): Integer = 0
  }

  case class Auto(n: Int) extends OverviewStrategy {
    def select(candidates: List[CellSize]): Integer = n
  }
}

trait RasterSource {
  def resample(strategy: OverviewStrategy)
}

// geotrellis-gdal.jar
// package geotrellis.raster.gdal
class GDALRasterSource extends RasterSource {
  override def resample(strategy: OverviewStrategy): Unit = {
    strategy match {
      case OverviewStrategy.Auto(n) => s"-o AUTO$n"
      case OverviewStrategy.Base => "-o BASE"
      case _ => ???
    }
  }
}

In this option GDALRasterSource has a private match that will translate a fixed list of GeoTrellis overview strategies to GDAL cases. The main concern is that if somebody tweaked the behavior of OvervewStrategy.Auto they would not realize that they're potentially making it inconsistent with GDAL behavior. However this is the most succinct option.

Option 2

// geotrellis-raster.jar
// package geotrellis.raster
trait OverviewStrategy {
  def select(candidates: List[CellSize]): Integer
}

sealed trait GDALOverviewStrategy extends OverviewStrategy {
  def gdalParams: String
}

object OverviewStrategy {
  case object Base extends GDALOverviewStrategy {
    def select(candidates: List[CellSize]): Integer = 0
    def gdalParams: String = "-o BASE"
  }

  case class Auto(n: Int) extends GDALOverviewStrategy {
    def select(candidates: List[CellSize]): Integer = n /
    def gdalParms: String = s"-o AUTO$n"
  }
}

trait RasterSource {
  def resample(strategy: OverviewStrategy)
}

// geotrellis-gdal.jar
// package geotrellis.raster.gdal
class GDALRasterSource extends RasterSource {
  override def resample(strategy: OverviewStrategy): Unit = {
    strategy match {
      case s: GDALOverviewStrategy => s.gdalParams
      case _ => sys.error("bad")  
    }
  }
}

Here we introduce geotrellis.raster.GDALOverStrategy. This class has to be defined outside of geotrellis-gdal.jar such that OverviewStrategy.Auto may extend it. If anybody were to change OverviewStrategy.Auto they would have to immediately think about their life and the interface at hand. Additionally it establishes a link in behavior between that implementation and GDAL option which may be well known.

This costs us introducing GDALOverviewStrategy to root GeoTrellis package without implementation of GDALRasterSource which has a heavy dependency. This is "fine" because its definitional but pollutes the namespace some.

Notes

The actual interface for OverviewStrategy is not fully pinned down here. Questions like "is List really the right type here?` and other possible overloads are up in the air a bit.

echeipesh avatar Mar 25 '20 16:03 echeipesh

A few random thoughts, with the caveat that I'm not sure I understand the specific use case, so given that, just ignore my questions if they are not worth the time.

  1. When does select get called, and does the list candidates exist in geotrellis.raster or only in geotrellis.gdal?
  2. I see in GDAL docs an -ovr <level|AUTO|AUTO-n|NONE> option for overview selection. Is this what we're talking about? If so, would exposing the level mode via case objects take care of the possible set of modes? Then you'd end up with:
// geotrellis-raster.jar
// package geotrellis.raster
trait OverviewStrategy {
  def select(candidates: List[CellSize]): Integer
}

object OverviewStrategy {
  /** This strategy matches the GDAL Base strategy */
  case object Base extends OverviewStrategy {
    def select(candidates: List[CellSize]): Integer = 0
  }

  case class Auto(n: Int) extends OverviewStrategy {
    def select(candidates: List[CellSize]): Integer = n
  }

  trait UserSpecified extends OverviewSrategy
}

trait RasterSource {
  def resample(strategy: OverviewStrategy)
}

// geotrellis-gdal.jar
// package geotrellis.raster.gdal
class GDALRasterSource extends RasterSource {
  override def resample(strategy: OverviewStrategy): Unit = {
    strategy match {
      case OverviewStrategy.Auto(n) => s"-o AUTO$n"
      case OverviewStrategy.Base => "-o BASE"
      case s: OverviewStrategy.UserSpecified => 
        val candidates = ???
        val n = s.select(candidates)
        s"-o $n"
    }
  }
}
  1. Why is strategy ignored here? Because of this issue?
  2. IMO, don't pollute the raster namespace with gdal stuff. You'll regret it, and I'm confident there are design patterns that can save us from that (perhaps at the expense of new abstractions). This is the classic expression problem. Typeclasses/type constraints might help. I'd fight hard against option 2.

metasim avatar Mar 26 '20 19:03 metasim

@metasim

  1. because it is a typo, should be options.copy(resampleMethod = Some(method)).resample(gridExtent, resampleTarget)

pomadchin avatar Mar 26 '20 19:03 pomadchin

typo, or bug?

metasim avatar Mar 26 '20 19:03 metasim

@metasim an oversight (bug), should be fixed here https://github.com/locationtech/geotrellis/pull/3211/files#diff-0e95acbe337f0418d570db27a0bfd6e1R117

pomadchin avatar Mar 26 '20 19:03 pomadchin