RS requires set_index(["H","K","L"]) to create a multi_index using rs.DataSet

Open DHekstra opened this issue 9 months ago • 5 comments

Problem: Derek was trying to construct an rs.DataSet object from NumPy arrays (h, k, l, wave, etc.) for unmerged data. The following code,

    ds = rs.DataSet({"H":h, "K":k, "L":l, "WAVE": wave, "dH":dh, "dK":dk, "dL":dl,
                     "NPIX": npix_col, "BATCH": batch, "SigI": sigI_vals, "I": I_vals, "X":x, "Y":y},
                    cell=args.ucell, spacegroup=sg_num, merged=False)
    ds["PARTIAL"] = True
    ds.write_mtz(args.mtz)

throws the following error:

In [1]: ds.write_mtz("wow.mtz")

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[1], line 1
----> 1 ds.write_mtz("wow.mtz")

File /global/cfs/cdirs/lcls/dermen/postcori/xtal/conda_base/lib/python3.8/site-packages/reciprocalspaceship/dataset.py:612, in DataSet.write_mtz(self, mtzfile, skip_problem_mtztypes, project_name, crystal_name, dataset_name)
    586 """
    587 Write DataSet to MTZ file.
    588 
   (...)
    608     Dataset name to assign to MTZ file
    609 """
    610 from reciprocalspaceship import io
--> 612 return io.write_mtz(
    613     self,
    614     mtzfile,
    615     skip_problem_mtztypes,
    616     project_name,
    617     crystal_name,
    618     dataset_name,
    619 )

File /global/cfs/cdirs/lcls/dermen/postcori/xtal/conda_base/lib/python3.8/site-packages/reciprocalspaceship/io/mtz.py:225, in write_mtz(dataset, mtzfile, skip_problem_mtztypes, project_name, crystal_name, dataset_name)
    191 def write_mtz(
    192     dataset,
    193     mtzfile,
   (...)
    197     dataset_name,
    198 ):
    199     """
    200     Write an MTZ reflection file from the reflection data in a DataSet.
    201 
   (...)
    223         Dataset name to assign to MTZ file
    224     """
--> 225     mtz = to_gemmi(
    226         dataset, skip_problem_mtztypes, project_name, crystal_name, dataset_name
    227     )
    228     mtz.write_to_file(mtzfile)
    229     return

File /global/cfs/cdirs/lcls/dermen/postcori/xtal/conda_base/lib/python3.8/site-packages/reciprocalspaceship/io/mtz.py:151, in to_gemmi(dataset, skip_problem_mtztypes, project_name, crystal_name, dataset_name)
    149         continue
    150     else:
--> 151         raise ValueError(
    152             f"column {c} of type {cseries.dtype} cannot be written to an MTZ file. "
    153             f"To skip columns without explicit MTZ dtypes, set skip_problem_mtztypes=True"
    154         )
    155 mtz.set_data(temp[columns].to_numpy(dtype="float32"))
    157 # Handle Unmerged data

ValueError: column index of type int64 cannot be written to an MTZ file. To skip columns without explicit MTZ dtypes, set skip_problem_mtztypes=True

Changing the last line by adding infer_mtz_dtypes().set_index(["H","K","L"], drop=True) solves the problem:

ds.infer_mtz_dtypes().set_index(["H","K","L"], drop=True).write_mtz(args.mtz)
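
A possible explanation for the "column index of type int64" message, sketched with plain pandas rather than rs itself: if the writer flattens the index with reset_index() before looping over columns (an assumption, since that call is not visible in the traceback above), a DataSet built from a dict carries a default RangeIndex, and that index comes back as an int64 column literally named "index". The array values below are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy Miller indices and intensities; the values are illustrative only.
df = pd.DataFrame({
    "H": np.array([1, 0, 2], dtype="int32"),
    "K": np.array([0, 1, 1], dtype="int32"),
    "L": np.array([0, 0, 3], dtype="int32"),
    "I": np.array([10.0, 20.0, 30.0], dtype="float32"),
})

# Default RangeIndex: reset_index() turns it into a column named "index".
flat = df.reset_index()
print(list(flat.columns))
print(flat["index"].dtype)

# With H, K, L as a MultiIndex, reset_index() just restores those columns,
# so no extra "index" column appears.
indexed = df.set_index(["H", "K", "L"], drop=True)
restored = indexed.reset_index()
print(list(restored.columns))
```

If that reading is right, it would also explain why set_index(["H","K","L"]) is enough to make the stray column disappear, while infer_mtz_dtypes() takes care of assigning MTZ dtypes to the remaining columns.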

I'll leave it to @dermen and @kmdalton to comment further.

DHekstra, May 29 '24 21:05