ArchGDAL.jl
ArchGDAL.jl copied to clipboard
Speedup conversion from types `<: DataType` or `<: UnionAll` to `CEnum.Cenum` or `Enum` type ids
@yeesian and @visr, I may have found a way to speedup conversion from OGRFieldType
and OGRFieldSubType
to DataType
x200 times.
This could notably speedup layer to table conversions.
In order to move the Dict
lookup from runtime to compile time, we can create a real OGRFieldType
instead of using directly a type id as we do currently, OGRFieldType
being an Enum
This would allow to use generated convert functions and simplify @convert
macro by the way
I have added the following code mockup to types.jl:
struct Real_OGRFieldType{OGRFieldType} end
Real_OGRFieldType2DataType = Dict(
Real_OGRFieldType{OFTInteger} => Int32,
# ...
)
@generated function convert(
::Type{DataType},
real_oft::Type{T},
) where {T<:Real_OGRFieldType}
return :($(get(Real_OGRFieldType2DataType, T, nothing)))
end
nothing
could be replaced to throw error when lookup fails
This gives:
julia> using ArchGDAL; const AG=ArchGDAL
ArchGDAL
julia> using BenchmarkTools
julia> convert(DataType, AG.Real_OGRFieldType{AG.OFTInteger})
Int32
julia> @code_lowered convert(DataType, AG.Real_OGRFieldType{AG.OFTInteger})
CodeInfo(
@ /Users/Mathieu/.../dev/ArchGDAL/src/types.jl:299 within `convert'
┌ @ /Users/Mathieu/.../dev/ArchGDAL/src/types.jl within `macro expansion'
1 ─│ return Int32
└
)
julia> @benchmark convert(DataType, AG.Real_OGRFieldType{AG.OFTInteger})
BenchmarkTools.Trial: 10000 samples with 1000 evaluations.
Range (min … max): 0.026 ns … 0.050 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 0.037 ns ┊ GC (median): 0.00%
Time (mean ± σ): 0.037 ns ± 0.002 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█ ▁
▂▁▁▂▁▁▁▁▁▁▁▁▁▁▁▂▁▁▄▁▁▁▅▁▁▁▂▁▁▁▂▁▁▁▂▁▁▃▁▁▁█▁▁▁█▁▁▁▃▁▁▁▃▁▁▂ ▂
0.026 ns Histogram: frequency by time 0.041 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia> convert(DataType, AG.OFTInteger)
Int32
julia> @code_lowered convert(DataType, AG.OFTInteger)
CodeInfo(
1 ─ %1 = Base.getindex(ArchGDAL.OGRFieldType_to_DataType_map, ft)
└── return %1
)
julia> @benchmark convert(DataType, AG.OFTInteger)
BenchmarkTools.Trial: 10000 samples with 999 evaluations.
Range (min … max): 8.242 ns … 58.932 ns ┊ GC (min … max): 0.00% … 0.00%
Time (median): 8.314 ns ┊ GC (median): 0.00%
Time (mean ± σ): 8.754 ns ± 1.666 ns ┊ GC (mean ± σ): 0.00% ± 0.00%
█▆▃▃ ▃▃ ▁ ▄ ▁
████▅██▄▄█▁▁▃▁▅█▁▁▃▃▁▁▃▁▁▁▄▁▁▃▁▁▁▁▃▁▁▃▄▁▁▁▁▁▁▁▁▁▁▁▃▁▅▅▆▆▇▇ █
8.24 ns Histogram: log(frequency) by time 18.2 ns <
Memory estimate: 0 bytes, allocs estimate: 0.
julia>
That sounds like a proposition to me with almost no downsides, I'm curious what led to your choice of types (compared with e.g. https://github.com/yeesian/ArchGDAL.jl/issues/246)? Is it for the machinery of generated functions?
Well here is what I did:
- Examined
@code_lowered
output - => need to get the same as the reverse conversion
convert(::OGRFieldType, ::Int32)
- => need to have
OGRFieldType
as a real type to benefit from the Julia shortcutconvert(::Type{Type}, x::Type) = x
at https://github.com/JuliaLang/julia/blob/ae8452a9e0b973991c30f27beb2201db1b0ea0d3/base/essentials.jl#L206 , as in the reverse conversion - Tried to use
Val(oft)
and dispatch on a subtype of something likestruct Val{OGRFieldType} end
=> did not work - Mimicking what has been done with
IGeometry
, I tried withstruct NewOGRFieldType{OGRFieldType} end
=> it worked :-)
The @generated
macro, which only works on types, could do the same as the @convert
macro while being more concise. And therefore could be placed in types.jl file next to each conversion pairs collections.
Regarding the case of issue #246, maybe we could make getfield(::Feature, ::Int)
specialize, if we had types looking like Feature{FD} where FD <: FeatureDefn{NT} where NT <: NamedTuple
.
Testing would tell.
While raising this issue, I had not in mind the usage of convert(DataType, oft::OGRFielType)
, which I guess may only be a the end of asint
, ..., asdatetime
functions. I'm not even sure it is called there...
In those functions, the convert time must be less than 20% of the total time spent in getfield
.
We should then first address the issue #246, shouldn't we ?
If so and if @yeesian, you have not yet started to work on it, shall I auto assign #246 to myself ?
EDIT 1:
Looking at the output of @code_warntype
below, it seems that convert(DataType, oft::OGRFielType)
is not used asint
, ..., asdatetime
functions. I going to go on looking for where these convert(DataType, oft::OGRFielType)
functions are used, to balance to potential gain with the one of fixing #246
julia> AG.read("test/data/point.geojson") do ds
AG.getlayer(ds) do layer
AG.getfeature(layer, 0) do f
@code_warntype GDAL.ogr_f_getfieldasstring(f.ptr, 1)
end
end
end
Variables
#self#::Core.Const(GDAL.ogr_f_getfieldasstring)
arg1::Ptr{Nothing}
arg2::Int64
Body::Union{Nothing, String}
1 ─ %1 = Base.cconvert(GDAL.OGRFeatureH, arg1)::Ptr{Nothing}
│ %2 = Base.cconvert(GDAL.Cint, arg2)::Int32
│ %3 = Base.unsafe_convert(GDAL.OGRFeatureH, %1)::Ptr{Nothing}
│ %4 = Base.unsafe_convert(GDAL.Cint, %2)::Int32
│ %5 = $(Expr(:foreigncall, :(Core.tuple(:OGR_F_GetFieldAsString, GDAL.libgdal)), Cstring, svec(Ptr{Nothing}, Int32), 0, :(:ccall), :(%3), :(%4), :(%2), :(%1)))::Cstring
│ %6 = GDAL.aftercare(%5, false)::Union{Nothing, String}
└── return %6
julia> AG.read("test/data/point.geojson") do ds
AG.getlayer(ds) do layer
AG.getfeature(layer, 0) do f
@code_warntype AG.asstring(f, 1)
end
end
end
Variables
#self#::Core.Const(ArchGDAL.asstring)
feature::ArchGDAL.Feature
i::Int64
Body::String
1 ─ %1 = ArchGDAL.String::Core.Const(String)
│ %2 = GDAL.ogr_f_getfieldasstring::Core.Const(GDAL.ogr_f_getfieldasstring)
│ %3 = Base.getproperty(feature, :ptr)::Ptr{Nothing}
│ %4 = (%2)(%3, i)::Union{Nothing, String}
│ %5 = Base.convert(%1, %4)::String
│ %6 = Core.typeassert(%5, %1)::String
└── return %6
EDIT 2:
I have changed this line in utils.jl in @convert
macro with:
(eval(T2) isa Type{DataType}) && (eval(T1) <: OGRFieldType) || push!(
to drop the convert(DataType, oft::OGRFielType)
functions.
Running Pkg.test("ArchGDAL")
, reveals that only specific unit tests are failing.
https://github.com/yeesian/ArchGDAL.jl/blob/17b7e7d1200a6aead977a0e74ebd6e70d2283111/test/test_convert.jl#L64-L65
https://github.com/yeesian/ArchGDAL.jl/blob/17b7e7d1200a6aead977a0e74ebd6e70d2283111/test/test_types.jl#L24
https://github.com/yeesian/ArchGDAL.jl/blob/17b7e7d1200a6aead977a0e74ebd6e70d2283111/test/test_types.jl#L29
So these conversion functions may be useless. The move of the dictionary lookup from runtime to compile time might be useful for some other conversions, but I propose to tackle the issue #246 first
So sorry for my late response, I was dealing with a lot recently. I have not started working on #246 and have assigned it to you instead, thank you!