connect INCOIS catalogue as ODIS node
- catalogue home: https://incois.gov.in/essdp/
- sample metadata record
- catalogue architecture (text found in help files):
  "Metadata portal is developed using Java enterprise technologies and is deployed on Apache Tomcat web application server. MySQL database is used for the archival of the metadata information."
  Features:
  - ISO 19115 standards compliant representation of metadata information
  - GCMD Science Keywords for controlled keyword search
  - Spatial, Temporal, Keyword & Free Text Search
  - Simple interface for metadata submission, update and search
  - Java EE technologies based cross platform solution
Possible next steps:
- INCOIS team to embed JSON-LD on each record page in the catalogue
- INCOIS to create a sitemap.xml pointing to each record, and place the sitemap on the web
- INCOIS to create an entry in the ODIS Catalogue
  - important fields are:
    - Startpoint URL for ODIS-Arch (this is the URL to your sitemap.xml file)
    - Type of the ODIS-Arch URL (select "Sitemap")
Update from INCOIS team:
The following steps for INCOIS to become an ODIS node on OIH are completed.
1. Modify ViewMetadata page to generate and include JSON-LD for each metadata entry → Embedded JSON-LD in the view metadata page
2. Create sitemap.xml including links to all metadata entries → https://incois.gov.in/essdp/xml/sitemap.xml
We can proceed with the third step [ODISCat entry].
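As a rough illustration of step 1, here is a minimal Python sketch of producing the embedded JSON-LD with a JSON serializer instead of string concatenation, which avoids hand-written punctuation errors. The INCOIS portal is Java-based, so this is not their implementation; the function name `jsonld_script_tag` and the record fields are hypothetical.

```python
import json


def jsonld_script_tag(record):
    """Serialize a metadata record (a dict) into a JSON-LD <script> element.

    Hypothetical helper: the serializer takes care of commas, quotes and
    backslash escaping, so the output is always syntactically valid JSON.
    """
    payload = json.dumps(record, indent=2, ensure_ascii=False)
    # Keep a literal "</script>" inside a value from closing the element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


if __name__ == "__main__":
    # Illustrative record only; the real field values come from the catalogue.
    example = {
        "@context": {"@vocab": "https://schema.org/"},
        "@type": "Dataset",
        "name": "Example dataset title",
    }
    print(jsonld_script_tag(example))
```

The same idea applies in Java: build the object and let a JSON library write it out.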
Latest feedback for the INCOIS team regarding the ODISCat entry, sitemap, and the embedded JSON-LD:
- the ODISCat entry looks good, the ODIS-arch fields are set properly
- some of the records' JSON-LD does not validate, such as this record
  - you can use the schema.org validator to check the JSON-LD embedded in your pages
  - the creator property has an extra comma at the end, causing an error:

      "creator": [
        {
          "@type": "Role",
          "roleName": "PointOfContact",
          "creator": {
            "@type": "Person",
            "name": "Johnny Konjarla",
            "jobTitle": "Project Scientist",
            "email": "johnny.konjarla [ at ] gmail.com",
            "telephone": "9949836662",
            "affiliation": {
              "@type": "Organization",
              "name": "Centre for Marine Living Resources and Ecology (CMLRE)"
            },
            "address": {
              "@type": "PostalAddress",
              "streetAddress": "CMLRE, Atal Bhavan, LNG Road, Puthuvypin South,Ochanthuruthu PO",
              "addressLocality": "Kochi",
              "addressRegion": "Kerala",
              "postalCode": "682508",
              "addressCountry": "IND"
            }
          }
        },   <---------- here is the extra comma
      ]
- here is another record that fails to validate the JSON-LD
  - notice the extra comma in the address property:

      "address": {
        "@type": "PostalAddress",
        ,   <------------------- extra comma
        "addressLocality": "Vasco-da-Gama",
        "addressRegion": "Goa",
        "postalCode": "403804",
        "addressCountry": "IND"
      }
- the <meta> and <link> HTML tags are not closed in the pages, causing validation errors.

      <meta charset="utf-8">
      <link href="essdp4/assets/vendor/bootstrap/css/bootstrap.min.css" rel="stylesheet">

  should instead be:

      <meta charset="utf-8" />
      <link href="essdp4/assets/vendor/bootstrap/css/bootstrap.min.css" rel="stylesheet" />
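The trailing-comma errors described above can also be caught locally before publishing. Below is a minimal Python sketch that pulls every `application/ld+json` script block out of a page and tries to parse it as JSON. It only checks JSON syntax, not schema.org semantics, so it complements the schema.org validator and does not replace it. The names `JsonLdExtractor` and `check_jsonld` are hypothetical.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the text of every <script type="application/ld+json"> block."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self.blocks.append("")

    def handle_endtag(self, tag):
        if tag == "script":
            self._in_jsonld = False

    def handle_data(self, data):
        # Script content may arrive in several chunks, so append to the last block.
        if self._in_jsonld:
            self.blocks[-1] += data


def check_jsonld(html_text):
    """Return (block_index, message) for each JSON-LD block that fails to parse."""
    parser = JsonLdExtractor()
    parser.feed(html_text)
    parser.close()
    errors = []
    for index, block in enumerate(parser.blocks):
        try:
            json.loads(block)
        except json.JSONDecodeError as exc:
            errors.append((index, f"line {exc.lineno}, column {exc.colno}: {exc.msg}"))
    return errors
```

Calling `check_jsonld(page_html)` on the saved HTML of a record page returns an empty list when every block parses, and otherwise the line and column of the first syntax error in each failing block.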
@pbuttigieg reminder of my question about the creator property with an embedded creator in the pull request at https://github.com/iodepo/odis-in/pull/14 (or see my previous comment's snippet here, to see what I had meant).
Thanks for the quick fixes by the INCOIS team.
There is another error in the HTML: the <img> tag is not closed, see this record
<img src="images/logo.png" alt="" class="img-fluid" width="320" height="120" >
should instead be:
<img src="images/logo.png" alt="logo" class="img-fluid" width="320" height="120" />
Please also add a value for alt (see above).
We're really close with this one. It seems that the INCOIS team could benefit from validation checks for their HTML.
@fils does invalid HTML affect Gleaner? Or can the ODIS harvest ignore this?
Latest results when trying to harvest from the INCOIS sitemap:
- Gleaner error "syntax error on line 289: unquoted or missing attribute value in element" in record https://incois.gov.in/essdp/ViewMetadata?fileid=5f7f56a5-2868-4baf-acc4-d37595774ce2
  - appears to be related to <input type=button which should instead be <input type="button"
Similar to other HTML tags mentioned above, that same input tag needs to be closed: <input type="button" ... />
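The wording of that error suggests the page is being read by a strict, XML-style parser, although that is an assumption and not something confirmed in this thread. Under that assumption, a small Python sketch like the one below can show which fragments such a parser would reject. The function name `xml_strict_error` is hypothetical, and Python's parser messages will differ from Gleaner's.

```python
import xml.etree.ElementTree as ET


def xml_strict_error(fragment):
    """Return None if the fragment is well-formed XML, else the parser's message."""
    try:
        ET.fromstring(fragment)
    except ET.ParseError as exc:
        return str(exc)
    return None


if __name__ == "__main__":
    samples = [
        '<form><input type=button value="Go" /></form>',    # unquoted attribute value
        '<form><input type="button" value="Go"></form>',    # tag not closed
        '<form><input type="button" value="Go" /></form>',  # quoted and closed
    ]
    for sample in samples:
        print(sample, "->", xml_strict_error(sample) or "ok")
```

Only the last sample passes; the first two fail for the two reasons raised in the feedback (unquoted attribute value, unclosed tag).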
update needed from INCOIS team:
- change the syntax in your sitemap.xml file
  - remove mentions of sitemapindex, so that your new sitemap looks like:

      <?xml version="1.0" encoding="UTF-8"?>
      <urlset xmlns="https://www.sitemaps.org/schemas/sitemap/0.9">
        <url>
          <loc>https://incois.gov.in/essdp/ViewMetadata?fileid=5f7f56a5-2868-4baf-acc4-d37595774ce2</loc>
          <lastmod>2024-05-30</lastmod>
        </url>
        <url>
          <loc>https://incois.gov.in/essdp/ViewMetadata?fileid=64e955b6-bba3-4176-ac97-7b5f543bf0c7</loc>
          <lastmod>2024-05-30</lastmod>
        </url>
        <url>
          <loc>https://incois.gov.in/essdp/ViewMetadata?fileid=a818c658-251c-4fb8-934b-1ec91e3995f7</loc>
          <lastmod>2024-05-29</lastmod>
        </url>
        ...
      </urlset>

(generally, a sitemapindex is used when you have over 50k records) This is also described in our ODIS Book.
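For reference, a flat urlset like the one requested above can be generated with a few lines of code. This is a minimal Python sketch, not the INCOIS implementation (their portal is Java-based). The function name `build_sitemap` is hypothetical, and the default namespace string is copied from the snippet in this thread.

```python
import xml.etree.ElementTree as ET

# Namespace string as requested in this thread; passed as a parameter so it can be changed.
THREAD_NAMESPACE = "https://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries, namespace=THREAD_NAMESPACE):
    """Build a flat <urlset> sitemap (no sitemapindex) from (loc, lastmod) pairs."""
    urlset = ET.Element("urlset", {"xmlns": namespace})
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc          # special characters are escaped on output
        ET.SubElement(url, "lastmod").text = lastmod  # expected as YYYY-MM-DD
    data = ET.tostring(urlset, encoding="UTF-8", xml_declaration=True)
    return data.decode("utf-8")


if __name__ == "__main__":
    records = [
        ("https://incois.gov.in/essdp/ViewMetadata?fileid=5f7f56a5-2868-4baf-acc4-d37595774ce2", "2024-05-30"),
        ("https://incois.gov.in/essdp/ViewMetadata?fileid=64e955b6-bba3-4176-ac97-7b5f543bf0c7", "2024-05-30"),
    ]
    print(build_sitemap(records))
```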
- additional changes needed by INCOIS team:
  - in the sitemap.xml file, change line#2 to: <urlset xmlns="https://www.sitemaps.org/schemas/sitemap/0.9"> (notice the https in the link, instead of http)
  - in this record
    - on line#93 change the text Synedra(\) to Synedra(\\) (in other words, the backslash must be escaped in the JSON-LD)
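To illustrate the backslash point: in JSON, a backslash inside a string must be followed by a valid escape character, so `\)` is rejected while `\\)` is read as a literal backslash followed by `)`. A minimal Python sketch follows; the property name `name` is a placeholder, since the thread does not say which property holds this text.

```python
import json


def is_valid_json(text):
    """Return True if the text parses as JSON, False otherwise."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Raw strings keep the backslashes exactly as they would appear in the page source.
unescaped = r'{"name": "Synedra(\)"}'   # a single backslash: invalid escape sequence
escaped = r'{"name": "Synedra(\\)"}'    # an escaped backslash: valid JSON

if __name__ == "__main__":
    print(is_valid_json(unescaped))
    print(is_valid_json(escaped))
    # A JSON serializer applies the escaping automatically:
    print(json.dumps({"name": "Synedra(\\)"}))
```

Generating the JSON-LD through a serializer rather than by string concatenation avoids this class of error altogether.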
@jmckenna looks like the dashboard reports all datasets from the Marine Science and Oceans facets of the INCOIS resource (1047)