
'TypeError: Could not resolve data type labeled:' when conversion from Tab to JSON

Open casper937 opened this issue 2 years ago • 12 comments

Hi,

When writing my ISA object to JSON, I get an error: 'TypeError: Could not resolve data type labeled:'.

The error can be traced back to line 1729 of isajson.py. What I noticed here was that in line 1727, Labeled Extract is named material/labledextract- in the code. However, back in line 522 of isajson.py, it is referred to as labeledextract- (an extra 'e' in the name).

Could this be causing my error? If not, would anyone know what could cause this error?

Thanks in advance,

Casper

casper937, Nov 24 '21 11:11

Hi @casper937 thx for reporting!

I have picked up on that too and had made the correction in a branch (not yet merged). We'll need to add a test checking that element.

Could you confirm the version of ISA-API you are running? All the best Phil

proccaserra, Nov 24 '21 17:11

Hi Phil,

Thanks for your response!

The isatools installation that I have been using is 0.12.0. Is this sufficient information?

casper937, Nov 25 '21 09:11

Hi @casper937, thx for sending the info.

I have now tested with the various isa-api releases from 0.12.0 to the latest commit in the develop branch, and I can serialize to ISA-Tab and ISA-JSON without reproducing the error you are facing. Upon checking the code in isajson.py, while there is indeed a typo, it shouldn't affect the serialisation as far as I can tell. I will push a Jupyter notebook showing an example that uses Labeled Extracts and that the isa-api is ok writing to ISA-Tab or JSON.

However, I am curious as to how you ended up facing that issue. Would you be able to share the code? If that's not ok on this channel, just get in touch via email so we can get to the bottom of it.

ATB P

proccaserra, Nov 25 '21 11:11

Sure,

We have a public Github repo, with Jupyter Notebook: https://github.com/Xomics/ISA-ACTION-Template/blob/main/ISA_ACTION-Template.ipynb. The code for metabolomics is in the last few code blocks (there are three different metabolomics assays, but these are created with similar code).

Looking forward to the Labeled Extracts example! Could you notify me here, when you have pushed this?

Thanks a lot! Casper

casper937, Nov 25 '21 13:11

@casper937 do you happen to have a dummy ACTIONdemonstrator_XOmics_IDs_fake.csv handy that you could push to the repo so I could run the code? thx

proccaserra, Nov 25 '21 13:11

Of course, my mistake. I added it to the repo:

https://github.com/Xomics/ISA-ACTION-Template/blob/main/ACTION_Dummy_file.csv

casper937, Nov 25 '21 14:11

@casper937 ok, so I managed to reproduce the error with your notebook when using isa-api 0.12.2, but I confirm that the error is not related to the typo you saw. The message leads to the confusion (since it uses 'labeled'). The real cause of the problem is the following in the notebook you shared:

            ## Chromatography
            separated_molecules = Material(
                name = "separated_molecules_{0}".format(row["XOmicsmetaboID"])
            )

This creates an ISA Material but does not set its type, and currently the type for otherMaterial can only be Extract Name or Labeled Extract Name (enum in the ISA JSON schema).
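
To catch this before serialization, a check along the following lines could flag materials whose type was never set. This is only a minimal sketch and not part of isa-api: `StubMaterial` and `find_untyped_materials` are hypothetical names, the stub merely mimics a `name` and a `type` attribute, and the allowed values are the two mentioned above.

```python
from dataclasses import dataclass

# The two values named in this thread as allowed for otherMaterials.
ALLOWED_TYPES = {"Extract Name", "Labeled Extract Name"}


@dataclass
class StubMaterial:
    """Hypothetical stand-in for an ISA material (not the isatools class)."""
    name: str
    type: str = ""


def find_untyped_materials(materials):
    """Return the names of materials whose type is not an allowed value."""
    return [m.name for m in materials if m.type not in ALLOWED_TYPES]


materials = [
    StubMaterial("extract_1", "Extract Name"),
    StubMaterial("separated_molecules_1"),  # type never set, as in the notebook
]
print(find_untyped_materials(materials))
```

Running such a check over the materials of each assay before writing JSON would point at the offending object by name, which the TypeError message does not.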

The question is therefore: do you need to identify the fractions coming out of the chromatography columns?

If not, then the fix would be to chain the Chromatography and MS processing without explicitly specifying the output of the chromatography process. The process_sequence object and plink function would allow the graph to be built. (There were also minor issues, such as a missing closing bracket.)
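
The idea of chaining two processes without an intermediate material can be sketched as below. This is an illustration only: `StubProcess` and `link` are hypothetical stand-ins rather than the isatools `Process` class or `plink` function, and the sketch assumes that linking amounts to recording which process comes before and after.

```python
class StubProcess:
    """Hypothetical stand-in for an ISA process (not the isatools class)."""

    def __init__(self, name, inputs=None, outputs=None):
        self.name = name
        self.inputs = inputs or []
        self.outputs = outputs or []
        self.prev_process = None
        self.next_process = None


def link(first, second):
    """Record that `second` follows `first`, with no material in between."""
    first.next_process = second
    second.prev_process = first


# Chromatography has no explicit output; MS has no explicit input.
chromatography = StubProcess("chromatography_1", inputs=["labeled_extract_1"])
mass_spec = StubProcess("mass_spectrometry_1", outputs=["raw_file_1"])
link(chromatography, mass_spec)

# Walk the chain from the first process to list the steps in order.
step, names = chromatography, []
while step is not None:
    names.append(step.name)
    step = step.next_process
print(names)
```

The order of the two steps is carried by the links alone, so no untyped material has to be created to connect them.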

Having said that, when testing again with the most recent code (isa-api 0.13.rcX), the same code fails with another error (Source not iterable) which I am now investigating.

proccaserra, Nov 25 '21 20:11

Hi @casper937 (busy week-> delay): after some digging, I have spotted a couple of issues with the notebooks which caused failure:

If using derives_from on ISA Materials, make sure to pass a list:

urine_sample = Sample(
                name = urine_sample_name, 
                derives_from = [source])

instead of:

urine_sample = Sample(
                name = urine_sample_name, 
                derives_from = source)

The latter caused the "Source not iterable" error.
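
One way to guard against this in notebook code is to normalize the value before passing it on. The helper below is a minimal sketch; `as_list` is a hypothetical name and not something isa-api provides.

```python
def as_list(value):
    """Wrap a single object in a list; return lists and tuples as lists."""
    if value is None:
        return []
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]


# A single object gets wrapped, an existing list is left unchanged.
print(as_list("source_1"))
print(as_list(["source_1", "source_2"]))
```

In the notebook this would be used as `derives_from = as_list(source)`, so the call works whether `source` is one object or already a list.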

However, the initial error kept cropping up so I ended up 'bypassing' the material objects resulting from the 'chromatograph separation' (as I indicated in my previous communication). I also started testing a simpler version of the study (dropping all but one assay).

I have now isolated the issue to a conflict on Node naming between 2 assays (now focusing on the code generating the objects for assay_metabolomics_steroids).

Leaving that assay out, I currently can serialize to ISA-JSON.

I'll update you when I identify the root of the issue.

{
    "@id": "#investigation/5133003936",
    "comments": [],
    "description": "Predict childhood aggression with multi-omics data and demonstrate the FAIRification process and data analysis of a multi-omics project",
    "identifier": "tbd",
    "ontologySourceReferences": [
        {
            "@id": "#ontology/5133320400",
            "comments": [],
            "description": "Allotrope Merged Ontology Suite",
            "file": "",
            "name": "AFO",
            "version": ""
        },
        {
            "@id": "#ontology/5132905344",
            "comments": [],
            "description": "Chemical Entities of Biological Interest",
            "file": "",
            "name": "CHEBI",
            "version": ""
        },
        {
            "@id": "#ontology/5132906304",
            "comments": [],
            "description": "Chemical Methods Ontology",
            "file": "",
            "name": "CHMO",
            "version": ""
        },
        {
            "@id": "#ontology/5132471888",
            "comments": [],
            "description": "Bioinformatics operations, data types, formats, identifiers and topics",
            "file": "",
            "name": "EDAM",
            "version": ""
        },
        {
            "@id": "#ontology/5132471072",
            "comments": [],
            "description": "Experimental Factor Ontology",
            "file": "",
            "name": "EFO",
            "version": ""
        },
        {
            "@id": "#ontology/5132471264",
            "comments": [],
            "description": "An ontology of research resources such as instruments, protocols, reagents, animal models and biospecimens",
            "file": "",
            "name": "eagle-i resource ontology",
            "version": ""
        },
        {
            "@id": "#ontology/5130481824",
            "comments": [],
            "description": "Medical Action Ontology",
            "file": "",
            "name": "MAXO",
            "version": ""
        },
        {
            "@id": "#ontology/5132145568",
            "comments": [],
            "description": "Metabolite Standards Initiative Ontology",
            "file": "",
            "name": "MSIO",
            "version": ""
        },
        {
            "@id": "#ontology/5132147824",
            "comments": [],
            "description": "NCBI organismal classification",
            "file": "",
            "name": "NCBITAXON",
            "version": ""
        },
        {
            "@id": "#ontology/5132146912",
            "comments": [],
            "description": "NCI Thesaurus OBO Edition",
            "file": "",
            "name": "NCIT",
            "version": ""
        },
        {
            "@id": "#ontology/5132147776",
            "comments": [],
            "description": "Ontology for Biomedical Investigations",
            "file": "",
            "name": "OBI",
            "version": ""
        },
        {
            "@id": "#ontology/5132147296",
            "comments": [],
            "description": "PATO - the Phenotype And Trait Ontology",
            "file": "",
            "name": "PATO",
            "version": ""
        },
        {
            "@id": "#ontology/5132147584",
            "comments": [],
            "description": "Uber-anatomy ontology",
            "file": "",
            "name": "UBERON",
            "version": ""
        }
    ],
    "people": [],
    "publicReleaseDate": "",
    "publications": [],
    "studies": [
        {
            "@id": "#study/5132906448",
            "assays": [
                {
                    "@id": "#5132146480",
                    "characteristicCategories": [],
                    "comments": [],
                    "dataFiles": [],
                    "filename": "a_assay_genotype.txt",
                    "materials": {
                        "otherMaterials": [],
                        "samples": [
                            {
                                "@id": "#sample/5132532752",
                                "characteristics": [
                                    {
                                        "category": {
                                            "@id": "#annotation_value/bbd479f1-1ebf-4297-821e-808024226d91"
                                        },
                                        "comments": [],
                                        "value": ""
                                    }
                                ],
                                "comments": [],
                                "factorValues": [],
                                "name": "buccal_mucosa_XOP1"
                            },
                            {
                                "@id": "#sample/5132838560",
                                "characteristics": [
                                    {
                                        "category": {
                                            "@id": "#annotation_value/38b7d729-46d7-416c-b301-e3b41b80c182"
                                        },
                                        "comments": [],
                                        "value": ""
                                    }
                                ],
                                "comments": [],
                                "factorValues": [],
                                "name": "buccal_mucosa_XOP2"
                            },
                            {
                                "@id": "#sample/5132783568",
                                "characteristics": [
                                    {
                                        "category": {
                                            "@id": "#annotation_value/0ccd9bcf-d6e8-4ea3-98f0-0eeb18a2f956"
                                        },
                                        "comments": [],
                                        "value": ""
                                    }
                                ],
                                "comments": [],
                                "factorValues": [],
                                "name": "buccal_mucosa_XOP3"
                            },
                            {
                                "@id": "#sample/5132781552",
                                "characteristics": [
                                    {
                                        "category": {
                                            "@id": "#annotation_value/d7659100-9af2-46d3-92d0-8ffca07bc0b6"
                                        },
                                        "comments": [],
                                        "value": ""
                                    }
                                ],
                                "comments": [],
                                "factorValues": [],
                                "name": "buccal_mucosa_XOP4"
                            },
                            {
                                "@id": "#sample/5132783968",
                                "characteristics": [
                                    {
                                        "category": {
                                            "@id": "#annotation_value/44adb55c-687d-449c-ba59-c6581d1ec35f"
                                        },
                                        "comments": [],
                                        "value": ""
                                    }
                                ],
                                "comments": [],
                                "factorValues": [],
                                "name": "buccal_mucosa_XOP5"
                            },
                            {
                                "@id": "#sample/5132785984",
                                "characteristics": [
                                    {
                                        "category": {
                                            "@id": "#annotation_value/cfa18ea3-2ef8-4fda-b93f-c9dac03174b1"
                                        },
                                        "comments": [],
                                        "value": ""
                                    }
                                ],
                                "comments": [],
                                "factorValues": [],
                                "name": "buccal_mucosa_XOP6"
                            },
                            {
                                "@id": "#sample/5132751728",
                                "characteristics": [
                                    {
                                        "category": {
                                            "@id": "#annotation_value/92e7ba02-7272-4907-b656-83dcc687f850"
                                        },
                                        "comments": [],
                                        "value": ""
                                    }
                                ],
                                "comments": [],
                                "factorValues": [],
                                "name": "buccal_mucosa_XOP7"
                            },
                            {
                                "@id": "#sample/5132753744",
                                "characteristics": [
                                    {
                                        "category": {
                                            "@id": "#annotation_value/564a4fb3-ac3c-424b-a7c3-6b6181c83512"
                                        },
                                        "comments": [],
                                        "value": ""
                                    }
                                ],
                                "comments": [],
                                "factorValues": [],
                                "name": "buccal_muc…

proccaserra, Dec 01 '21 22:12

update: root cause => separated_molecules = Material( name = "separated_molecules_{0}".format(row["XOmicsmetaboID"]) )

Dropping them from each of the MS assays clears the serialization problem:

            ## Chromatography
#             separated_molecules = Material(
#                 name = "new_separated_molecules_{0}".format(row["XOmicsmetaboID"]),
#                 type_ = "Labeled Extract Name"
#             )
            
            instrument = ParameterValue(category = ProtocolParameter(parameter_name=OntologyAnnotation(term="Chromatography Instrument")), value = "Agilent 1290")
            column_model = ParameterValue(category = ProtocolParameter(parameter_name=OntologyAnnotation(term="Column model")), value = "Acquity UPLC CSH C18 column (Waters)")
            column_type = ParameterValue(category = ProtocolParameter(parameter_name=OntologyAnnotation(term="Column type")), value = "reverse phase")

            
            chromatography_process = Process(
                name = "chromatography_{0}".format(row["XOmicsmetaboID"]),
                executes_protocol = chromatography,
                parameter_values = [instrument, column_model, column_type],
                inputs = [labelling_process.outputs[0]], 
                outputs = []  # changed: no explicit chromatography output
                # outputs = [separated_molecules]
            )
            
            ## Mass spectrometry
            scan_polarity = ParameterValue(category = ProtocolParameter(parameter_name=OntologyAnnotation(term="Scan polarity")), value = "switching positive and negative ion mode !! MAYBE SERPARATE INTO NEGATIVE AND POSITIVE ASSAY?")
            scan_range = ParameterValue(category = ProtocolParameter(parameter_name=OntologyAnnotation(term="Scan m/z range")), value = "5-3000?")
            instrument = ParameterValue(category = ProtocolParameter(parameter_name=OntologyAnnotation(term="Instrument")), value = "Agilent 6460")
            ion_source = ParameterValue(category = ProtocolParameter(parameter_name=OntologyAnnotation(term="Ion source")), value = "ESI")
            mass_analyzer = ParameterValue(category = ProtocolParameter(parameter_name=OntologyAnnotation(term="Mass Analyzer")), value = "triple quadrupole")
            
            
            mass_spectrometry_process = Process(
                name = "mass_spectrometry_{0}".format(row["XOmicsmetaboID"]),
                executes_protocol= mass_spectrometry,
                parameter_values = [scan_polarity, scan_range, instrument, ion_source, mass_analyzer],
                # inputs = [separated_molecules],
                inputs = [],  # changed: chromatography fractions no longer passed in
                outputs = [raw_datafile]
            )

and then:

# Assay.other_material.append(separated_molecules)

Since the chromatography fractions aren't stored, not making them explicit does not seem to lose much but you know best.

Let us know if this is critical for your use case and if so why.

proccaserra, Dec 01 '21 22:12

Thank you very much for the great support.

I will leave out the separated molecules as Material. We were already planning to leave this out.

casper937, Dec 02 '21 08:12

I just ran the code without the separated molecules Material and it worked. Many thanks again!

casper937, Dec 02 '21 08:12

I'll send a PR to your repo with the modified notebook when I have a moment. I have a couple of suggestions, for instance simplifying the 'sampling process' and dropping the ParameterValue to report anatomical part, which would fit better as Characteristics of the Sample (which is in fact what you use already). atb P

proccaserra, Dec 02 '21 09:12