geotrellis
geotrellis copied to clipboard
OverviewStrategy requires improvements
After a brief interaction with the OverviewStrategy "interface", it appears to me that there are significant shortcomings with this feature.
- This is not an extensible system for resolution selection.
- The strategies that exist are hardly comprehensive.
- The semantics are not set by the interface (the responsibility for interpreting these strategies falls on the RasterSource implementation, so we can have inconsistencies between implementations).
I am proposing to improve this interface in the following way: Remove the sealed keyword and add an apply method to OverviewStrategy which takes in the list of available resolutions, the desired resolution, and returns one of the available resolutions (or the index of the best match).
If we think of the available resolution as "stops" on the number line, then we might have
coarse fine
<----------*---------*---------------*-----------------------*-------------->
1 2 ˄ 3 4
where the '*'s indicate the available resolutions and the caret indicates the request.
At the moment, we can choose the finest resolution (4) with Base (or Auto(1)?) or the next highest resolution (3) with AutoHigherResolution (or with Auto(0)?). (The semantics of Auto(n) are not clear from the docs, hence the question marks.) There's not currently a way to move to coarser resolutions (due to an inexplicable requirement that Auto(n) have n ≥ 0), or to "round" to the closest resolution.
More than anything, the lack of set semantics for these strategies strongly suggests that we need to improve this API, lest different users of this interface diverge in behavior.
Connects #3196
We discussed this issue at length and the consensus is that this would make good improvement.
The main consequence of changing OverviewStrategy to be extensible is that it will no longer be possible for GDALRasterSource to operate on all sub-classes of it since there is no way to ensure that each possible subclass maps to an actual GDAL parameter. In those case GDALRasterSource will have to throw. This is not ideal but seems "fine".
// client-code.jar
val source: GDALRasterSource = ???
source.resample(OverviewStrategy.Base) // works
case object WeirdStrategy extends OverviewStrategy {
override def select(candidates: List[CellSize]): Integer = -1
}
source.resample(WeirdStrategy ) // throws
The discussion finished on how many types we are willing to introduce to make that exception, which can't be avoided, as obvious as possible and to reduce the chance of downstream changes making GDAL and GeoTrellis behavior inconsistent.
Currently the majority of people prefer Option 1 because it introduces the least nouns.
Option 1
// geotrellis-raster.jar
// package geotrellis.raster
trait OverviewStrategy {
def select(candidates: List[CellSize]): Integer
}
object OverviewStrategy {
/** This strategy matches the GDAL Base strategy */
case object Base extends OverviewStrategy {
def select(candidates: List[CellSize]): Integer = 0
}
case class Auto(n: Int) extends OverviewStrategy {
def select(candidates: List[CellSize]): Integer = n
}
}
trait RasterSource {
def resample(strategy: OverviewStrategy)
}
// geotrellis-gdal.jar
// package geotrellis.raster.gdal
class GDALRasterSource extends RasterSource {
override def resample(strategy: OverviewStrategy): Unit = {
strategy match {
case OverviewStrategy.Auto(n) => s"-o AUTO$n"
case OverviewStrategy.Base => "-o BASE"
case _ => ???
}
}
}
In this option GDALRasterSource has a private match that will translate a fixed list of GeoTrellis overview strategies to GDAL cases. The main concern is that if somebody tweaked the behavior of OvervewStrategy.Auto they would not realize that they're potentially making it inconsistent with GDAL behavior. However this is the most succinct option.
Option 2
// geotrellis-raster.jar
// package geotrellis.raster
trait OverviewStrategy {
def select(candidates: List[CellSize]): Integer
}
sealed trait GDALOverviewStrategy extends OverviewStrategy {
def gdalParams: String
}
object OverviewStrategy {
case object Base extends GDALOverviewStrategy {
def select(candidates: List[CellSize]): Integer = 0
def gdalParams: String = "-o BASE"
}
case class Auto(n: Int) extends GDALOverviewStrategy {
def select(candidates: List[CellSize]): Integer = n /
def gdalParms: String = s"-o AUTO$n"
}
}
trait RasterSource {
def resample(strategy: OverviewStrategy)
}
// geotrellis-gdal.jar
// package geotrellis.raster.gdal
class GDALRasterSource extends RasterSource {
override def resample(strategy: OverviewStrategy): Unit = {
strategy match {
case s: GDALOverviewStrategy => s.gdalParams
case _ => sys.error("bad")
}
}
}
Here we introduce geotrellis.raster.GDALOverStrategy. This class has to be defined outside of geotrellis-gdal.jar such that OverviewStrategy.Auto may extend it. If anybody were to change OverviewStrategy.Auto they would have to immediately think about their life and the interface at hand. Additionally it establishes a link in behavior between that implementation and GDAL option which may be well known.
This costs us introducing GDALOverviewStrategy to root GeoTrellis package without implementation of GDALRasterSource which has a heavy dependency. This is "fine" because its definitional but pollutes the namespace some.
Notes
The actual interface for OverviewStrategy is not fully pinned down here. Questions like "is List really the right type here?` and other possible overloads are up in the air a bit.
A few random thoughts, with the caveat that I'm not sure I understand the specific use case, so given that, just ignore my questions if they are not worth the time.
- When does
selectget called, and does the listcandidatesexist ingeotrellis.rasteror only ingeotrellis.gdal? - I see in GDAL docs an
-ovr <level|AUTO|AUTO-n|NONE>option for overview selection. Is this what we're talking about? If so, would exposing thelevelmode viacase objects take care of the possible set of modes? Then you'd end up with:
// geotrellis-raster.jar
// package geotrellis.raster
trait OverviewStrategy {
def select(candidates: List[CellSize]): Integer
}
object OverviewStrategy {
/** This strategy matches the GDAL Base strategy */
case object Base extends OverviewStrategy {
def select(candidates: List[CellSize]): Integer = 0
}
case class Auto(n: Int) extends OverviewStrategy {
def select(candidates: List[CellSize]): Integer = n
}
trait UserSpecified extends OverviewSrategy
}
trait RasterSource {
def resample(strategy: OverviewStrategy)
}
// geotrellis-gdal.jar
// package geotrellis.raster.gdal
class GDALRasterSource extends RasterSource {
override def resample(strategy: OverviewStrategy): Unit = {
strategy match {
case OverviewStrategy.Auto(n) => s"-o AUTO$n"
case OverviewStrategy.Base => "-o BASE"
case s: OverviewStrategy.UserSpecified =>
val candidates = ???
val n = s.select(candidates)
s"-o $n"
}
}
}
- Why is
strategyignored here? Because of this issue? - IMO, don't pollute the
rasternamespace with gdal stuff. You'll regret it, and I'm confident there are design patterns that can save us from that (perhaps at the expense of new abstractions). This is the classic expression problem. Typeclasses/type constraints might help. I'd fight hard against option 2.
@metasim
- because it is a typo, should be
options.copy(resampleMethod = Some(method)).resample(gridExtent, resampleTarget)
typo, or bug?
@metasim an oversight (bug), should be fixed here https://github.com/locationtech/geotrellis/pull/3211/files#diff-0e95acbe337f0418d570db27a0bfd6e1R117