sisl icon indicating copy to clipboard operation
sisl copied to clipboard

Creating DM from EigenstateElectron object

Open zerothi opened this issue 6 years ago • 15 comments

Creating the DM from the eigenstate should be rather easy. Having this would immediately allow the creation of a DM from any Hamiltonian using the BrillouinZone objects.

zerothi avatar Apr 19 '18 13:04 zerothi

My plan is to add an occupation (accepts a distribution, returns array), and occupation_state method to the EigenstateElectron class which returns a new class with the occupations and eigenvalues attached.

zerothi avatar Jul 02 '18 18:07 zerothi

I have implemented the occupations. They are returned, as-is, since I think a class for these values are not that useful...

Now we only need to implement the routines which create the DM.

zerothi avatar Jul 28 '18 00:07 zerothi

Conceptually it would be something like this, right?

def DM_from_eigenstates(eigenstates):
    # Initialize a density matrix
    DM = sisl.DensityMatrix(eigenstates[0].parent)
    # Calculate and store all vectors between atoms
    Rij = DM.Rij("atom")
    
    # Loop through all eigenstates and add their contribution to the DM.
    for eigenstate in eigenstates:
        add_to_DM(DM, eigenstate, Rij)
    
    return DM

def add_to_DM(DM, eigenstate, Rij):
    # Find out which k point are we dealing with
    k = eigenstate.info["k"]
    # And its weight
    weight = eigenstate.info["weight"]
    
    # Now find out the occupations for each wf
    occs = eigenstate.occupation()
    
    # Loop through all orbitals in the unit cell
    for i in range(0, DM.shape[0]):
        # Iterate over all orbitals j that overlap with orbital i
        for j in range(0, DM.shape[1]):
            
            # Calculate c_ui * n_i * c_iv for all wfs at the same time and sum all of them
            c_n_c = eig.state[:, i] * occs * eig.state[:, DM.osc2uc(j)].conj()
            c_n_c = c_n_c.sum()
            
            # Calculate the phase factor
            phase = Rij[DM.o2a(i), DM.o2a(j)].dot(k)
            # Apply the phase factor and take only the real part
            val = np.cos(phase) * c_n_c.real - np.sin(phase) * c_n_c.imag
            
            # Add the contribution
            DM[i,j] += val * weight

pfebrer avatar Oct 29 '21 00:10 pfebrer

Not quite. The eigenstate object may be in 2 different gauges R (supercell distances) or r (atomic distances). And the phase depends on this convention. It is a bit more complicated for NC/SOC.

Also one would generally not want a dense matrix as you do it. In can be rather quick in dense form, but will scale very badly for medium sized systems.

If you want something quick then looping entries in the Rij matrix is the way to go (Rij only contains values where DM contains values. So the above will have some entries very wrong, while all the ones that will be used are correct.

Also, as far as I recall I don't store the weight in the eigenstate object... So you can't access it like that... :(

zerothi avatar Nov 01 '21 08:11 zerothi

Ok, thanks for the clarifications!

Also one would generally not want a dense matrix as you do it. In can be rather quick in dense form, but will scale very badly for medium sized systems.

Yes, I already assumed that the double loop here would be replaced by overlapping orbitals. It was just to draw a quick sketch.

If you want something quick then looping entries in the Rij matrix is the way to go

That is assuming you have the sparse pattern (e.g. you have the Hamiltonian). What if you don't? I'm about to make a PR for finding neighbours efficiently in sisl. It could be useful if you only have the eigenstates and the geometry, right? E.g. imagine in SIESTA you have stored the *.fullBZ.WFSX file but not the hamiltonian.

pfebrer avatar Nov 01 '21 21:11 pfebrer

Yes, I already assumed that the double loop here would be replaced by overlapping orbitals. It was just to draw a quick sketch.

Ok. Note the state[:, i] should be transposed, state[i, :] is correct.

If you want something quick then looping entries in the Rij matrix is the way to go

That is assuming you have the sparse pattern (e.g. you have the Hamiltonian). What if you don't? I'm about to make a PR for finding neighbours efficiently in sisl. It could be useful if you only have the eigenstates and the geometry, right? E.g. imagine in SIESTA you have stored the *.fullBZ.WFSX file but not the hamiltonian.

True. However, that also may prove difficult since you don't know the range of the atoms? So you need some extra information than just the orbitals.

Will your finding neighbours be more efficient than close? I know that kdtree's should perform much better, I just never have had the time to look into it! So if you can make it much faster than great! :)

zerothi avatar Nov 02 '21 08:11 zerothi

Will your finding neighbours be more efficient than close? I know that kdtree's should perform much better, I just never have had the time to look into it! So if you can make it much faster than great! :)

See #393 (for anyone that reads this)

pfebrer avatar Nov 02 '21 12:11 pfebrer

What's the most efficient way to store values in a sparse matrix?

I'm giving this a try and I have a loop like:

DM = sisl.DensityMatrix(geometry, nnzpr=1000)

for i, j in Rij:
    element = element_calculation()
    DM[i,j] += element

About 25s are spent calculating the matrix elements and 40s storing them :sweat_smile:.

pfebrer avatar Jun 09 '22 16:06 pfebrer

Yes, sometimes it may be better to assign multiple values at the same time, so collect data for one row at a time, that should be faster. Otherwise, try and construct it from a scipy.sparse.lil_matrix, and then use fromsp, that should be faster.

zerothi avatar Jun 09 '22 19:06 zerothi

Great, thanks!

My intuition was that I could initialize a sparse density matrix from an already existing sparsity pattern, but I couldn't find a way to do that. Is there any?

If not, I think it could be useful. Then you could simply map from Rij to the DM values without having to actually search/set values in the DM by orbital index, which would be much faster, right?

pfebrer avatar Jun 09 '22 22:06 pfebrer

Fromsp should be able to do that. But beware that rij is the atomic sparse matrix, not the orbital as far as I recall.

zerothi avatar Jun 10 '22 04:06 zerothi

Hm, I can see it doesn't work with my sparse matrix, it should I think... So I need to add that functionality.

zerothi avatar Jun 10 '22 05:06 zerothi

Changing this line

https://github.com/zerothi/sisl/blob/2d976d3ceb4424e57904e67890cbd1adf1559715/sisl/physics/sparse.py#L152

to

if isspmatrix(P) or isinstance(P, SparseOrbital): 

works.

But probably there should be a type dispatcher, like Geometry.new? Because if you provide a SparseOrbital it could be assumed that the geometry and overlap are taken from it unless stated otherwise.

However this does not fully solve my problem, which was to have some kind of way of mapping from one to the other without needing to pass the orbital indices to set the values in the new matrix.

pfebrer avatar Jun 10 '22 13:06 pfebrer

Ok, I understood how to access the csr data array directly and now I have an efficient working code to compute the DM (not for NC and SO yet). I've been testing it with a gold chain with a single atom in the unit cell and comparing to SIESTA's DM, it seems to be fine.

I have a concrete question though: you said that I should transpose the indices in the left side. I.e. to get c_ui here:

Screenshot from 2022-06-13 02-12-07

I should do state[u, i] instead of what I was doing, state[i, u]. Makes sense, however the first one (commented c_n_c line in the code) gives me a completely wrong DM, while the second one (uncommented line) gives me the right result. Could you comment if that makes sense or not? :sweat_smile:

Here's the code:

import sisl
import numpy as np
import tqdm

def compute_dm(bz, elecT=300):
    """Computes the DM from the eigenstates of a Hamiltonian along a BZ.
    
    Parameters
    ----------
    bz: BrillouinZone
        The brillouin zone object containing the Hamiltonian of the system
        and the k-points to be sampled.
    elecT: float, optional
        The electronic temperature to consider, in K.
    """
    H = bz.parent

    # Geometry
    geom = H.geometry

    # Sparsity pattern information
    Rij = H.Rij()
    orb_pairs = np.array(Rij)
    Rij_sc = Rij.o2sc(orb_pairs[:,1])
    neigh_uc = Rij.osc2uc(orb_pairs[:, 1])

    # Initialize the density matrix
    # Keep a reference to its data array so that we can have
    # direct access to it (instead of through orbital indexing).
    DM = sisl.DensityMatrix.fromsp(geom, [H], S=H.S)
    vals = DM._csr.data
    vals[:] = 0

    # Fake spin variables (in spin polarized calculations there should
    # be a spin loop)
    ispin = 0
    spin_factor = 2

    # Compute the eigenstates
    eigenstates = bz.apply.list.eigenstate()

    # Now, loop through all the eigenstates
    for k_weight, k_eigs in tqdm.tqdm(zip(bz.weight, eigenstates), total=len(eigenstates)):
        # Get the k point for which this state has been calculated (in fractional coordinates)
        k = k_eigs.info['k']
        # Convert the k points to 1/Ang
        k = k.dot(geom.rcell)

        # Ensure R gauge so that we can use supercell offsets.
        k_eigs.change_gauge("R")

        # Calculate all phases, this will be a (nnz, ) shaped array.
        phases = Rij_sc.dot(k)

        # Now find out the occupations for each wavefunction
        kT = sisl.unit_convert("K", "eV") * elecT
        distribution = sisl.get_distribution("fermi_dirac", smearing=kT, x0=0)
        occs = k_eigs.occupation(distribution)

        # Create a mask to operate only on wavefunctions with relevant occupation
        mask = occs > 1e-9
        # Alias for the array with the state coefficients
        state = k_eigs.state

        # Calculate c_ui * n_i * c_iv for all wf_i and all combinations of orbitals uv at the same time.
        #c_n_c = state[:, mask][orb_pairs[:, 0], :].T * occs[mask].reshape(-1, 1) * state[mask][:, neigh_uc].conj()
        c_n_c = state[mask][:, orb_pairs[:, 0]] * occs[mask].reshape(-1, 1) * state[mask][:, neigh_uc].conj()
        # Then just sum over the wavefunctions to get the value for each combination of orbitals.
        c_n_c = c_n_c.sum(axis=0)

        # Apply the phase factor and take only the real part. 
        # Add the weighted contribution to the final DM
        vals[:, ispin] += k_weight * (np.cos(phases) * c_n_c.real - np.sin(phases) * c_n_c.imag) * spin_factor
        
    return DM

And by running:

import plotly.express as px
fdf = sisl.get_sile("singleat_2k/RUN.fdf")

H = fdf.read_hamiltonian()

kp = sisl.get_sile(fdf.file.parent / (fdf.get("SystemLabel", "siesta") + ".KP"))
bz = kp.read_brillouinzone(H)
real_DM = fdf.read_density_matrix()

DM = compute_dm(bz)

errors = abs((real_DM - DM))
px.imshow(errors.tocsr().toarray(), color_continuous_scale="gray_r", aspect=10).update_layout(title="DM_siesta - this DM")

with the supposedly right indexing, i.e. state[u, i], commented line, I get these great differences between SIESTA's DM and the calculated one (I generated the density from it and it's complete gibberish):

newplot - 2022-06-13T021835 427

and with the uncommented line I get exactly the same as SIESTA's DM (and the density looks right):

newplot - 2022-06-13T021758 674

Sorry to bother and no rush, I'm just writing it now because it's fresh in my mind :)

pfebrer avatar Jun 13 '22 00:06 pfebrer

Nevermind, I already solved the doubts with discussions at ICN2 :sweat_smile:

pfebrer avatar Jun 16 '22 23:06 pfebrer