connect Helio-KNOW graph as ODIS node

Open jmckenna opened this issue 2 years ago • 19 comments

  • home: https://github.com/rmcgranaghan/Helio-KNOW/

Possible next steps:

  • draft a JSON-LD template together
    • using ODIS pattern templates at https://book.oceaninfohub.org/thematics/index.html
  • Helio-KNOW to create a sitemap.xml pointing to the record page(s) (or directly to the JSON-LD), and place that file on the web (or in GitHub)
  • Helio-KNOW to create an entry in the ODIS Catalogue
    • important fields are
      • Startpoint URL for ODIS-Arch (this is the URL of your sitemap.xml file)
      • Type of the ODIS-Arch URL (select "Sitemap")
  • OIH team to harvest into ODIS
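
The sitemap step above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library: the function name `build_sitemap` and the example.org record URLs are assumptions, not part of any ODIS tooling.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(record_urls, lastmod=None):
    """Return a sitemap.xml document (as a string) listing the given URLs.

    `build_sitemap` is a hypothetical helper written for this sketch.
    """
    # Serialize the sitemap namespace as the default (unprefixed) namespace.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for record_url in record_urls:
        url_el = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url_el, f"{{{SITEMAP_NS}}}loc").text = record_url
        if lastmod:
            ET.SubElement(url_el, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Placeholder URLs; real entries would point at record pages or JSON-LD files.
sitemap_text = build_sitemap(
    ["https://example.org/records/a.json", "https://example.org/records/b.json"],
    lastmod="2024-02-08",
)
print(sitemap_text)
```

The resulting file would then be placed on the web (or in GitHub) and its URL entered as the Startpoint URL in the ODIS Catalogue entry.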

cc @rmcgranaghan

jmckenna avatar Feb 08 '24 16:02 jmckenna

Hi @jmckenna thanks for creating this

I'm now preparing for a working meeting with you. What would be excellent to have ready prior to our discussion?

rmcgranaghan avatar Mar 26 '24 18:03 rmcgranaghan

hi @rmcgranaghan, regarding the first step above ("draft a JSON-LD template together"): it would help to have a sample dataset (or other record type) that we can draft a template for. The different types of templates are listed here; for example, the "dataset/graphs" folder there contains a datasetTemplate.json that we could adapt together for one of your datasets.

jmckenna avatar Mar 26 '24 20:03 jmckenna

@rmcgranaghan in other words, the goal will be to adapt metadata for one of your datasets, into that template.

jmckenna avatar Mar 26 '24 20:03 jmckenna

@rmcgranaghan the next goal will then be to validate your metadata template with the schema.org validator. We can explain all this during that tech meeting.

jmckenna avatar Mar 26 '24 20:03 jmckenna

Thanks for this outstanding guidance @jmckenna

tagging my colleague @lechatpito who will be collaborating with me on this. We will review this thread and schedule a technical meeting with you

rmcgranaghan avatar Mar 29 '24 16:03 rmcgranaghan

@jmckenna

Two quick thoughts and then I think we will be ready for a technical meeting:

  1. We have this for general Helio data https://hpde.io/ - does that meet the requirements you are hoping for?
  2. I'm working on something similar, but specifically to be used for Heliophysics phenomena (the holy grail for science researchers to be able to search and discover by).

rmcgranaghan avatar Apr 22 '24 18:04 rmcgranaghan

@jmckenna please excuse the ping, but a friendly check-in on the comment above to find out if that is something with which we could move forward? Additionally, @lechatpito is now more familiar with the ODIS architecture and we might be in a good place to hold the first technical discussion

rmcgranaghan avatar May 09 '24 14:05 rmcgranaghan

@rmcgranaghan is there a Helio-KNOW dataset you have in mind that has ocean-related data, e.g. the effect of the sun on ocean tides? Once we have identified that dataset I can start working on the JSON-LD template.

lechatpito avatar May 13 '24 14:05 lechatpito

@lechatpito good question - we are interested in exploring this architecture to create more of a Heliophysics data exchange rather than directly tie it to the ocean system right now. That being said, there are clear interconnections, which are spelled out in places like the Environment Ontology (ENVO) https://www.ebi.ac.uk/ols4/ontologies/envo and SWEET (sweetontology.net/)

rmcgranaghan avatar May 17 '24 15:05 rmcgranaghan

@jmckenna friendly reminder about this open issue - would you like to schedule a technical working session?

rmcgranaghan avatar Jul 05 '24 16:07 rmcgranaghan

@rmcgranaghan sincere apologies for the quietness on our side, will send you an email about the next technical working session, to keep the momentum going :)

jmckenna avatar Jul 09 '24 18:07 jmckenna

@lechatpito @rmcgranaghan I was using this entry to check for JSON-LD embedded on the page. Here are some comments to restart the discussion; I have also pasted the raw JSON-LD below so that others, like @pbuttigieg, can give you more feedback:

  • it seems to be missing the top-level @id (this usually points to wherever the JSON-LD lives, which in this case is the same landing page). I would place it before your @type
  • creditText would be the "Recommended Citation" for your dataset, whereas citation is used when you are referring to using someone else's creative work or dataset.
  • specifying sdPublisher would help our ODIS searches for your entries
  • it would be great if you could include spatialCoverage where possible
  • if you haven't done so already, please enter your new endpoint into the ODIS Catalogue, pointing to your sitemap (see the Getting Started with ODIS page for tips on that) thanks!
{
		"@context": "https://schema.org/",
		"@type" :"Dataset",
		"name": "Australian Space Weather Services, SWS, Culgoora Solar Observatory Solar Image",
    
		"dateModified": "2022-02-24",
                "identifier": "spase://ASWS/DisplayData/Solar_Image/Clg_Solar_Image",
                "creditText": "We are thankful to the Culgoora Solar Observatory of Space Weather Network, Bureau of Meteorology of Australia for the observations of Culgoora Solar Image data.",
 		"description": "Australia has made major contributions to research in solar physics for several decades. Solar images are a major kind of records of solar activities. The SWS has monitored and archived solar activity images from Culgoora and Learmonth for more than two decades. They include H-Alpha (2003-2014) and White Light (2004-2013) images from Culgoora, and H-Alpha (2004-2009), GONG H-Alpha (since 2013), GONG White Light (since 2010), and GONG Magnetogram (since 2010) images from Learmonth. A new set of three telescopes, which are on a single mount, has installed at Culgoora. Culgoora solar image data are available since 2003-02-01.",
		"abstract": "Australia has made major contributions to research in solar physics for several decades. Solar images are a major kind of records of solar activities. The SWS has monitored and archived solar activity images from Culgoora and Learmonth for more than two decades. They include H-Alpha (2003-2014) and White Light (2004-2013) images from Culgoora, and H-Alpha (2004-2009), GONG H-Alpha (since 2013), GONG White Light (since 2010), and GONG Magnetogram (since 2010) images from Learmonth. A new set of three telescopes, which are on a single mount, has installed at Culgoora. Culgoora solar image data are available since 2003-02-01.",
		"temporalCoverage": "2003-01-02T05:00:00Z/...",
                 "genre": "Spectrum",
		"keywords": [  "Solar Image", "H-Alpha Image", "White Light Image" ],
		"license": "https://cdla.io/permissive-1-0/",
                "audience":{
                  "@type": "Audience",
                  "audienceType": ["Space Physicist", "Space Community", "Data Scientists", "Machine Learning Users"]
                }
}
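
A quick local check for the properties flagged in the comments above could look like the following sketch. The property list is taken from this review only and is not an official ODIS requirement set; `missing_properties` is a hypothetical helper, and the record is an abbreviated copy of the JSON-LD above.

```python
import json

# Properties flagged in the review comments above; this list is an
# illustration, not an official ODIS requirement set.
RECOMMENDED = ("@id", "sdPublisher", "spatialCoverage")


def missing_properties(record, wanted=RECOMMENDED):
    """Return the names in `wanted` that are absent or empty in `record`."""
    return [name for name in wanted if not record.get(name)]


# Abbreviated copy of the record under review.
record = json.loads("""
{
  "@context": "https://schema.org/",
  "@type": "Dataset",
  "name": "Australian Space Weather Services, SWS, Culgoora Solar Observatory Solar Image",
  "identifier": "spase://ASWS/DisplayData/Solar_Image/Clg_Solar_Image",
  "license": "https://cdla.io/permissive-1-0/"
}
""")

print(missing_properties(record))
```

A check like this only reports which keys are present; it does not replace the schema.org validator mentioned earlier in the thread.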

jmckenna avatar Jul 09 '24 19:07 jmckenna

Thanks @jmckenna

in addition:

  • note that the dateModified property is about the dataset, not the metadata in the JSON-LD. See the other "sd" properties if you're describing the metadata
  • the usage of the genre property is interesting, but I don't think it's semantically precise. I would add a stanza in the distribution property (type DataDownload) and use the properties in there to describe the flavour and format of the dataset
  • in addition to the @id property, you should have a url property with a URL value that points to the human-friendly Web page for the dataset - this is where users will be directed to when they click on your records in Ocean InfoHub and likely other search and discovery systems
  • if the dataset has an identifier like a UUID, DOI, or similar, it's wise to add that as the value of an identifier property
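
As a rough sketch of the second and third suggestions, the record could be adjusted along these lines. The helper name `apply_review_suggestions` and all example.org values are placeholders, and dropping `genre` once a `distribution` stanza exists is only one reading of the suggestion.

```python
import copy


def apply_review_suggestions(record, landing_page, download_url, media_type):
    """Return a copy of `record` with `url` and `distribution` added.

    Hypothetical helper: `genre` is removed on the assumption that the
    DataDownload stanza now carries the format information.
    """
    updated = copy.deepcopy(record)
    # Human-friendly landing page that search systems will link to.
    updated["url"] = landing_page
    # Describe the downloadable form of the dataset.
    updated["distribution"] = {
        "@type": "DataDownload",
        "contentUrl": download_url,
        "encodingFormat": media_type,
    }
    updated.pop("genre", None)
    return updated


original = {"@type": "Dataset", "name": "Example dataset", "genre": "Spectrum"}
revised = apply_review_suggestions(
    original,
    landing_page="https://example.org/dataset.html",   # placeholder
    download_url="https://example.org/dataset.csv",    # placeholder
    media_type="text/csv",
)
print(revised)
```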

pbuttigieg avatar Jul 09 '24 19:07 pbuttigieg

Thank you both for your suggestions, here is an updated version of the example:

{
		"@context": "https://schema.org/",
		"@id": "https://hpde.io/ASWS/DisplayData/Solar_Image/Clg_Solar_Image", 
		"@type" :"Dataset",
		"name": "Australian Space Weather Services, SWS, Culgoora Solar Observatory Solar Image",
        "url": "https://hpde.io/ASWS/DisplayData/Solar_Image/Clg_Solar_Image.html",
		"publisher": {
			"@type": "Person",
			"identifier": "spase://ASWS/Person/Kehe.Wang",
			"url": "http://hpde.io/ASWS/Person/Kehe.Wang"
		},
		"sdDatePublished": "2022-02-24",
		"sdPublisher": {
			"@type": "Organization",
			"name": "hpde.io"
		},
		"includedInDataCatalog": {
			"@type": "DataCatalog",
			"name": "HPDE.io / SPASE"
		},
        "identifier": "spase://ASWS/DisplayData/Solar_Image/Clg_Solar_Image",
        "creditText": "We are thankful to the Culgoora Solar Observatory of Space Weather Network, Bureau of Meteorology of Australia for the observations of Culgoora Solar Image data.",
 		"description": "Australia has made major contributions to research in solar physics for several decades. Solar images are a major kind of records of solar activities. The SWS has monitored and archived solar activity images from Culgoora and Learmonth for more than two decades. They include H-Alpha (2003-2014) and White Light (2004-2013) images from Culgoora, and H-Alpha (2004-2009), GONG H-Alpha (since 2013), GONG White Light (since 2010), and GONG Magnetogram (since 2010) images from Learmonth. A new set of three telescopes, which are on a single mount, has installed at Culgoora. Culgoora solar image data are available since 2003-02-01.",
		"abstract": "Australia has made major contributions to research in solar physics for several decades. Solar images are a major kind of records of solar activities. The SWS has monitored and archived solar activity images from Culgoora and Learmonth for more than two decades. They include H-Alpha (2003-2014) and White Light (2004-2013) images from Culgoora, and H-Alpha (2004-2009), GONG H-Alpha (since 2013), GONG White Light (since 2010), and GONG Magnetogram (since 2010) images from Learmonth. A new set of three telescopes, which are on a single mount, has installed at Culgoora. Culgoora solar image data are available since 2003-02-01.",
		"temporalCoverage": "2003-01-02T05:00:00Z/...",
        "variableMeasured": "Spectrum",
		"spatialCoverage": ["Sun", "Earth.Magnetosphere", "Heliosphere.Inner", "Heliosphere.Outer"],
		"keywords": [  "Solar Image", "H-Alpha Image", "White Light Image" ],
		"license": "https://cdla.io/permissive-1-0/",
                "audience":{
                  "@type": "Audience",
                  "audienceType": ["Space Physicist", "Space Community", "Data Scientists", "Machine Learning Users"]
                }
}
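
Hand-edited JSON-LD easily picks up syntax slips, such as a missing comma between two properties, so a parse check before sending a record to a validator can save a round trip. A minimal sketch with Python's standard `json` module follows; `check_json_syntax` and the two tiny sample documents are made up for illustration.

```python
import json


def check_json_syntax(text):
    """Return None if `text` parses as JSON, else a short error description."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno/colno locate where the parser gave up.
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


# Same content twice; the first is missing the comma after the nested object.
broken = '{\n  "publisher": {"@type": "Person"}\n  "sdDatePublished": "2022-02-24"\n}'
fixed = '{\n  "publisher": {"@type": "Person"},\n  "sdDatePublished": "2022-02-24"\n}'

print(check_json_syntax(broken))
print(check_json_syntax(fixed))
```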

lechatpito avatar Aug 13 '24 16:08 lechatpito

Some refinements (as rough notes during a meeting):

Main ideas:

  • Some syntactic cleanup and use of @vocab to JSONify things
  • clarification that the "sd" properties are about the JSON-LD metadata itself, not the dataset described (other properties available for that)
  • Use of DefinedTerm stanzas in the keywords of a Place to better link semantic resources to metadata graphs (will be very important downstream), while avoiding idiosyncratic syntax, or at least providing more standard syntax (e.g. "Earth magnetosphere" alongside "Earth.Magnetosphere"). Used the Unified Astronomy Thesaurus (UAT) as an example to riff on.
  • Added distribution property and relevant values to accommodate direct downloads shown by @rmcgranaghan and @lechatpito (dummy values for now).
  • Clarified meaning of creditText

{
      "@context": {
        "@vocab": "https://schema.org/"
         },
		"@id": "https://hpde.io/ASWS/DisplayData/Solar_Image/Clg_Solar_Image", 
		"@type" :"Dataset",
		"name": "Australian Space Weather Services, SWS, Culgoora Solar Observatory Solar Image",
        "url": "https://hpde.io/ASWS/DisplayData/Solar_Image/Clg_Solar_Image.html",
		"publisher": {
			"@type": "Person",
			"identifier": "spase://ASWS/Person/Kehe.Wang",
			"url": "http://hpde.io/ASWS/Person/Kehe.Wang"
		},
		"sdDatePublished": "2022-02-24",
		"sdPublisher": {
             "@type": "Organization",
             "name": "hpde.io"
		},
        "distribution": {
            "@type": "DataDownload",
            "contentUrl": "http://urlToDirectDownloadOfThisDataset.org/",
            "encodingFormat": "text/csv"
        },
		"includedInDataCatalog": {
			"@type": "DataCatalog",
			"name": "HPDE.io / SPASE",
            "url": "..."
		},
        "identifier": "spase://ASWS/DisplayData/Solar_Image/Clg_Solar_Image",
        "creditText": "Wang, K. (2024) Australian Space Weather Services, SWS, Culgoora Solar Observatory Solar Image.",
 		"description": "Australia has made major contributions to research in solar physics for several decades. Solar images are a major kind of records of solar activities. The SWS has monitored and archived solar activity images from Culgoora and Learmonth for more than two decades. They include H-Alpha (2003-2014) and White Light (2004-2013) images from Culgoora, and H-Alpha (2004-2009), GONG H-Alpha (since 2013), GONG White Light (since 2010), and GONG Magnetogram (since 2010) images from Learmonth. A new set of three telescopes, which are on a single mount, has installed at Culgoora. Culgoora solar image data are available since 2003-02-01.",
		"abstract": "Australia has made major contributions to research in solar physics for several decades. Solar images are a major kind of records of solar activities. The SWS has monitored and archived solar activity images from Culgoora and Learmonth for more than two decades. They include H-Alpha (2003-2014) and White Light (2004-2013) images from Culgoora, and H-Alpha (2004-2009), GONG H-Alpha (since 2013), GONG White Light (since 2010), and GONG Magnetogram (since 2010) images from Learmonth. A new set of three telescopes, which are on a single mount, has installed at Culgoora. Culgoora solar image data are available since 2003-02-01.",
		"temporalCoverage": "2003-01-02T05:00:00Z/...",
        "variableMeasured": "Spectrum",
		"spatialCoverage": [
            {
            "@type": "Place",
            "name": "Planetary Magnetosphere",
            "keywords":[
            { 
                "@type": "DefinedTerm",
                "url": "https://astrothesaurus.org/",
                "name": "Planetary magnetosphere",
                "inDefinedTermSet": "https://github.com/astrothesaurus/UAT/blob/master/UAT.json",
                "identifier": "http://astrothesaurus.org/uat/997",
                "termCode": "997",
                "description": "Planetary magnetosphere"
            }
            ]
            },
           {
            "@type": "Place",
            "name": "Sun"
           } 
            
            ],
		"keywords": [  "Solar Image", "H-Alpha Image", "White Light Image" ],
		"license": "https://cdla.io/permissive-1-0/",
                "audience":{
                  "@type": "Audience",
                  "audienceType": ["Space Physicist", "Space Community", "Data Scientists", "Machine Learning Users"]
                }
}
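
The Place and DefinedTerm pattern above could be generated from the dotted region strings used earlier in the thread. The sketch below is one possible approach, not an agreed convention: the `PLAIN_NAMES` lookup, the `region_to_place` helper, and the use of `alternateName` to keep the original dotted form are all assumptions.

```python
# Hypothetical lookup from SPASE-style dotted region names to plainer labels.
PLAIN_NAMES = {
    "Earth.Magnetosphere": "Earth magnetosphere",
    "Heliosphere.Inner": "Inner heliosphere",
    "Heliosphere.Outer": "Outer heliosphere",
}


def region_to_place(region, term=None):
    """Build a schema.org Place stanza for a dotted region string.

    `term`, if given, is a dict of DefinedTerm properties (name, identifier,
    inDefinedTermSet, ...) attached under `keywords`.
    """
    # Fall back to replacing dots with spaces when no plain label is known.
    name = PLAIN_NAMES.get(region, region.replace(".", " "))
    place = {"@type": "Place", "name": name}
    if name != region:
        # Keep the original dotted form alongside the plainer label.
        place["alternateName"] = region
    if term is not None:
        place["keywords"] = [{"@type": "DefinedTerm", **term}]
    return place


# UAT term values copied from the example above.
uat_term = {
    "name": "Planetary magnetosphere",
    "identifier": "http://astrothesaurus.org/uat/997",
    "termCode": "997",
}
spatial_coverage = [
    region_to_place("Earth.Magnetosphere", term=uat_term),
    region_to_place("Sun"),
]
print(spatial_coverage)
```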

pbuttigieg avatar Aug 14 '24 15:08 pbuttigieg

For the record: the ODISCat entry is https://catalogue.odis.org/view/3310

jmckenna avatar Sep 12 '24 12:09 jmckenna

Update: the ODIS Dashboard (http://dashboard.oceaninfohub.org/) is showing no issues with the SPASE sitemap and finds 584 JSON-LD records. Those records are not yet harvested into the ODIS graph, which is why the other values are empty.

Screenshot 2024-09-13 at 2 44 54 PM
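
To cross-check a record count like the one reported above, the `<loc>` entries of a sitemap can be listed locally. This is a standard-library sketch only; it says nothing about how the ODIS Dashboard itself counts records, and the sample sitemap with example.org URLs is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace prefix mapping for the sitemaps.org protocol.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def list_sitemap_locations(sitemap_xml):
    """Return the URL of every <loc> entry in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", NS)
        if loc.text
    ]


# Invented two-entry sitemap standing in for a real one.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.org/records/a.json</loc></url>
  <url><loc>https://example.org/records/b.json</loc></url>
</urlset>"""

locations = list_sitemap_locations(sample)
print(len(locations), "entries")
```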

jmckenna avatar Sep 13 '24 17:09 jmckenna

🎉

lechatpito avatar Sep 13 '24 17:09 lechatpito

Plan for November technical meeting: go through an example from Madrigal http://cedar.openmadrigal.org/ with Pier Luigi. These data would be very meaningful for the ocean and ocean-adjacent communities, and they are currently disconnected from the data indexed by SPASE, which is what we have been connecting to ODIS so far.

rmcgranaghan avatar Oct 09 '24 14:10 rmcgranaghan