Add CUR variants
Three variants of CUR are implemented in PotentialLearning.jl: LinearTimeCUR, DEIMCUR, and LSCUR.
Hi! I updated this branch with the latest changes from main. I also added the PCA-ACE example, which uses the PCAState datatype.
The current CUR code needs to be adapted to the PotentialLearning.jl interface. It should follow the same pattern as PotentialLearning/src/DimensionReduction/pca_state.jl. In particular, functions like fit!(ds::DataSet, ltcur::LinearTimeCUR) and transform!(ds::DataSet, ltcur::LinearTimeCUR) should be implemented. Their usage should mirror PotentialLearning/examples/PCA-ACE-aHfO2/fit-pca-ace-ahfo2.jl:
pca = PCAState(tol = n_desc)
fit!(ds_train, pca)
transform!(ds_train, pca)
Another option is to work with an exposed interface: build the descriptor matrices directly in the example and then apply CUR to them (using functions such as those below). In that case we need to move the current CUR code to the examples folder. We can continue the discussion at the next developer meeting.
# Global energy descriptors
function get_ged(ds)
    ged = sum.(get_values.(get_local_descriptors.(ds)))
    ged_mat = stack(ged)'
    return ged_mat
end
# Local energy descriptors
function get_led(ds)
    led = get_values.(get_local_descriptors.(ds))
    led_mat = vcat([stack(l)' for l in led]...)
    return led_mat
end
# Global force descriptors
function get_gfd(ds)
    gfd = [reduce(vcat, get_values(get_force_descriptors(dsi)))
           for dsi in ds]
    gfd_mat = vcat([stack(f)' for f in gfd]...)
    return gfd_mat
end
# Local force descriptors
using JuLIP
using InteratomicPotentials: convert_system_to_atoms
function get_lfd(ds, basis)
    lfd_mat = []
    for d in ds
        s = get_system(d)
        a = convert_system_to_atoms(s)
        for j = 1:length(a)
            # Derivatives of the site energy of atom j
            lfdj = site_energy_d(basis.rpib, a, j)
            # NOTE: 50 is hard-coded here; presumably the number of basis functions
            aux = reduce(hcat, [vcat(lfdj[k]...) for k = 1:50])
            push!(lfd_mat, aux)
        end
        # Force a garbage collection pass after each configuration
        GC.gc()
    end
    lfd_mat = vcat(lfd_mat...)
    return lfd_mat
end
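
To make the fit/transform pattern concrete for the exposed-interface option, here is a minimal, self-contained sketch that works on a plain descriptor matrix (for example, one returned by get_ged). Everything in it is an assumption for illustration: the type name ColumnSamplingCUR, its fields, and the non-mutating transform are hypothetical and are not the LinearTimeCUR implementation in this branch. It samples distinct columns with probability proportional to their squared norms, which is the sampling idea behind linear-time CUR, but it skips the rescaling and with-replacement sampling of the full algorithm.

```julia
using LinearAlgebra, Random

# Hypothetical state type (not the PotentialLearning.jl API).
mutable struct ColumnSamplingCUR
    n_cols::Int          # number of descriptor columns to keep
    cols::Vector{Int}    # selected column indices, filled in by fit!
end
ColumnSamplingCUR(; n_cols) = ColumnSamplingCUR(n_cols, Int[])

# Select n_cols distinct columns, sampling each draw with probability
# proportional to the squared Euclidean norm of the column.
function fit!(A::AbstractMatrix, cur::ColumnSamplingCUR;
              rng = Random.default_rng())
    p = vec(sum(abs2, A; dims = 1))
    cur.n_cols <= count(>(0), p) ||
        throw(ArgumentError("n_cols exceeds the number of nonzero columns"))
    p ./= sum(p)
    cdf = cumsum(p)
    cols = Int[]
    while length(cols) < cur.n_cols
        # Inverse-CDF sampling; clamp guards against round-off in cdf[end]
        j = min(searchsortedfirst(cdf, rand(rng)), length(cdf))
        j in cols || push!(cols, j)
    end
    cur.cols = sort(cols)
    return cur
end

# Keep only the selected columns (returns a new matrix).
transform(A::AbstractMatrix, cur::ColumnSamplingCUR) = A[:, cur.cols]

# Usage on a toy 3x4 "descriptor matrix" (rows: configurations)
A = [1.0 0.0 2.0 0.0;
     0.0 0.0 3.0 1.0;
     2.0 0.0 1.0 1.0]
cur = ColumnSamplingCUR(n_cols = 2)
fit!(A, cur; rng = MersenneTwister(1))
A_reduced = transform(A, cur)
```

Keeping the selected indices in a state object is what would let the same columns be applied to a test set after fitting on the training set, which is the same role PCAState plays in the PCA example.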