
[BUG] All arrays must be of the same length

Open Rohit-Satyam opened this issue 1 year ago • 1 comments

Describe the bug
Unable to run pysradb for GSE198257

To Reproduce
Steps to reproduce the behavior:

## Installation: pip install git+https://github.com/saketkc/pysradb
pysradb gse-to-srp  GSE198257

Traceback (most recent call last):
  File "/home/subudhak/miniconda3/bin/pysradb", line 8, in <module>
    sys.exit(parse_args())
             ^^^^^^^^^^^^
  File "/home/subudhak/miniconda3/lib/python3.11/site-packages/pysradb/cli.py", line 1206, in parse_args
    gse_to_srp(args.gse_ids, args.saveto, args.detailed, args.desc, args.expand)
  File "/home/subudhak/miniconda3/lib/python3.11/site-packages/pysradb/cli.py", line 232, in gse_to_srp
    df = sradb.gse_to_srp(
         ^^^^^^^^^^^^^^^^^
  File "/home/subudhak/miniconda3/lib/python3.11/site-packages/pysradb/sraweb.py", line 799, in gse_to_srp
    new_gse_df = pd.DataFrame(
                 ^^^^^^^^^^^^^
  File "/home/subudhak/miniconda3/lib/python3.11/site-packages/pandas/core/frame.py", line 767, in __init__
    mgr = dict_to_mgr(data, index, columns, dtype=dtype, copy=copy, typ=manager)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/subudhak/miniconda3/lib/python3.11/site-packages/pandas/core/internals/construction.py", line 503, in dict_to_mgr
    return arrays_to_mgr(arrays, columns, index, dtype=dtype, typ=typ, consolidate=copy)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/subudhak/miniconda3/lib/python3.11/site-packages/pandas/core/internals/construction.py", line 114, in arrays_to_mgr
    index = _extract_index(arrays)
            ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/subudhak/miniconda3/lib/python3.11/site-packages/pandas/core/internals/construction.py", line 677, in _extract_index
    raise ValueError("All arrays must be of the same length")
ValueError: All arrays must be of the same length
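
For context, pandas raises this ValueError whenever pd.DataFrame is given a dict whose list-like values differ in length. Below is a minimal reproduction that does not involve pysradb at all; the column values are made-up placeholders, and this only illustrates the error message, not where the mismatch originates inside gse_to_srp.

```python
import pandas as pd

# Hypothetical data: two columns with different numbers of entries.
columns = {
    "study_alias": ["GSE_A", "GSE_B"],
    "study_accession": ["SRP_A"],
}

try:
    pd.DataFrame(columns)
    message = None
except ValueError as err:
    # pandas refuses to build a frame from columns of unequal length.
    message = str(err)

print(message)
```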

Desktop (please complete the following information):

  • OS: [ Ubuntu 20.04]
  • Python version [Python 3.11.8]

Rohit-Satyam, Apr 07 '24 15:04

I'm getting the same error for GSE279289. The sradb.gse_to_srp code assumes that all accessions return a dataframe, but some return None, which causes the error:

    def fetch_gds_results(self, gse, **kwargs):
        result = self.get_esummary_response("geo", gse)

        try:
            uids = result["uids"]
        except KeyError:
            print("No results found for {} | Obtained result: {}".format(gse, result))
            return None
        gse_records = []
        for uid in uids:
            record = result[uid]
            del record["uid"]
            if record["extrelations"]:
                extrelations = record["extrelations"]
                for extrelation in extrelations:
                    keys = list(extrelation.keys())
                    values = list(extrelation.values())
                    assert sorted(keys) == sorted(
                        ["relationtype", "targetobject", "targetftplink"]
                    )
                    assert len(values) == 3
                    record[extrelation["relationtype"]] = extrelation["targetobject"]
                del record["extrelations"]
                gse_records.append(record)
        if not len(gse_records):
            print("No results found for {}".format(gse))
            return None
        return pd.DataFrame(gse_records)

The correct type hint for the return is -> Optional[pd.DataFrame].

However, the possible None return is not accounted for:

    def gse_to_srp(self, gse, **kwargs):
        if isinstance(gse, str):
            gse = [gse]
        gse_df = self.fetch_gds_results(gse, **kwargs)
        gse_df = gse_df.rename(
            columns={"accession": "study_alias", "SRA": "study_accession"}
        )
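
One possible way to handle the None case is sketched below. This is not the pysradb implementation: fetch_gds_results_stub and gse_to_srp_guarded are hypothetical stand-ins, the accessions are made up, and returning an empty dataframe with the expected columns is just one assumption about what callers would want.

```python
from typing import Optional

import pandas as pd


def fetch_gds_results_stub(gse) -> Optional[pd.DataFrame]:
    """Hypothetical stand-in for fetch_gds_results; returns None if nothing matches."""
    known = {"GSE_DEMO": "SRP_DEMO"}  # made-up accessions for illustration
    records = [{"accession": g, "SRA": known[g]} for g in gse if g in known]
    return pd.DataFrame(records) if records else None


def gse_to_srp_guarded(gse) -> pd.DataFrame:
    """Same renaming step as above, but with an explicit check for None."""
    if isinstance(gse, str):
        gse = [gse]
    gse_df = fetch_gds_results_stub(gse)
    if gse_df is None:
        # Assumption: an empty frame with the expected columns is an acceptable result.
        return pd.DataFrame(columns=["study_alias", "study_accession"])
    return gse_df.rename(
        columns={"accession": "study_alias", "SRA": "study_accession"}
    )


print(gse_to_srp_guarded("GSE_DEMO"))
print(gse_to_srp_guarded("GSE_MISSING"))
```

With a check like this, a lookup that finds nothing yields an empty result instead of failing when .rename is called on None.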

nick-youngblut, Nov 05 '24 03:11

This is now fixed:

 pysradb gse-to-srp  GSE198257
study_alias     study_accession
GSE198257       SRP363227
GSE198257       SRP363224

saketkc, Oct 07 '25 05:10