Wrong performance.csv

Open FelixFrizzy opened this issue 4 months ago • 3 comments

Describe the bug

When using the evaluation-client to run logmap-bio on the DH tracks, I get a wrong performance.csv for the "tadirah-unesco" test case. It reports 0 TP, but the true number is higher, as can be seen by comparing systemAlignment.rdf with the reference.rdf of the test case. It would be important to find out whether this is an issue on the matcher's side (unproblematic) or an issue in MELT. The latter would be more problematic, since the evaluation relies heavily on the numbers in the performance.csv files.

To Reproduce

Steps to reproduce the behavior:

  • Version of MELT: evaluation client from documentation

  • Java version: openjdk version "1.8.0_422"

  • Python version: 3.8.18

  • Operating system: macOS 14.4

  • Run java -jar matching-eval-client-latest.jar --systems ../Matcher/DockerMatcher/logmap-bio-melt-oaei-2021-web-latest.tar.gz --track http://oaei.webdatacommons.org/tdrs/ dh 2024all --results oaei2024_logmapbio_oaeidh_$(date +"%Y-%m-%d_%H-%M-%S")

performance.csv:

Type,Precision (P),Recall (R),Residual Recall (R+),F1,"# of TP","# of FP","# of FN","# of Correspondences",Time,Time (HH:MM:SS)
ALL,0.0,0.0,0.0,0.0,0,0,15,0,3763809750,00:00:03
CLASSES,0.0,0.0,0.0,0.0,0,0,0,0,-,-
PROPERTIES,0.0,0.0,0.0,0.0,0,0,0,0,-,-
INSTANCES,0.0,0.0,0.0,0.0,0,0,15,0,-,-

systemAlignment.rdf (unneeded alignments removed for readability):
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xmlns="http://knowledgeweb.semanticweb.org/heterogeneity/alignment"
	xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
	xmlns:xsd="http://www.w3.org/2001/XMLSchema#">

<Alignment>
<xml>yes</xml>
<level>0</level>
<type>??</type>
<onto1>https://vocabs.dariah.eu/tadirah/</onto1>
<onto2>http://logmap-tests/oaei/target.owl</onto2>
<uri1>https://vocabs.dariah.eu/tadirah/</uri1>
<uri2>http://logmap-tests/oaei/target.owl</uri2>
<map>
	<Cell>
		<entity1 rdf:resource="https://vocabs.dariah.eu/tadirah/cataloging"/>
		<entity2 rdf:resource="http://vocabularies.unesco.org/thesaurus/concept3799"/>
		<measure rdf:datatype="xsd:float">1.0</measure>
		<relation>=</relation>
	</Cell>
</map>
...
</Alignment>
</rdf:RDF>

[reference.rdf](https://github.com/FelixFrizzy/DH-benchmark/blob/main/dhcs2_tadirah-unesco/reference.rdf) of the tadirah-unesco test case (unneeded alignments removed for readability):
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xmlns="http://knowledgeweb.semanticweb.org/heterogeneity/alignment"
	xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
	xmlns:xsd="http://www.w3.org/2001/XMLSchema#">

<Alignment>
<xml>yes</xml>
<level>0</level>
<type>**</type>
<onto1>https://vocabs.dariah.eu/tadirah/</onto1>
<onto2>http://vocabularies.unesco.org/thesaurus/</onto2>
<uri1>https://vocabs.dariah.eu/tadirah/</uri1>
<uri2>http://vocabularies.unesco.org/thesaurus/</uri2>
<map>
	<Cell>
		<entity1 rdf:resource="https://vocabs.dariah.eu/tadirah/cataloging"/>
		<entity2 rdf:resource="http://vocabularies.unesco.org/thesaurus/concept3799"/>
		<measure rdf:datatype="xsd:float">1.0</measure>
		<relation>=</relation>
	</Cell>
</map>
...
</Alignment>
</rdf:RDF>

The correctly identified correspondence shown above is not reflected in the performance.csv, and neither are the other TPs.
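
The manual comparison of the two files can be reproduced with a small script. The following is a minimal sketch, not MELT's own evaluation logic: it assumes both files use the Alignment format shown above, and it treats two cells as equal only if entity1, entity2, and the relation match exactly. The helper names `read_cells` and `confusion_counts` are made up for this illustration.

```python
import xml.etree.ElementTree as ET

# Namespaces taken from the RDF headers shown above.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
ALIGN_NS = "http://knowledgeweb.semanticweb.org/heterogeneity/alignment"


def read_cells(source):
    """Return the set of (entity1, entity2, relation) triples in an alignment.

    `source` is a file path or a binary file object (hypothetical helper).
    """
    root = ET.parse(source).getroot()
    cells = set()
    for cell in root.iter("{%s}Cell" % ALIGN_NS):
        e1 = cell.find("{%s}entity1" % ALIGN_NS).get("{%s}resource" % RDF_NS)
        e2 = cell.find("{%s}entity2" % ALIGN_NS).get("{%s}resource" % RDF_NS)
        # Assumption: a missing <relation> is treated as equivalence.
        rel = cell.findtext("{%s}relation" % ALIGN_NS, default="=").strip()
        cells.add((e1, e2, rel))
    return cells


def confusion_counts(system_source, reference_source):
    """Count TP, FP, and FN by exact set comparison of the cells."""
    system = read_cells(system_source)
    reference = read_cells(reference_source)
    return {
        "tp": len(system & reference),   # in both alignments
        "fp": len(system - reference),   # only in the system alignment
        "fn": len(reference - system),   # only in the reference
    }
```

Calling `confusion_counts("systemAlignment.rdf", "reference.rdf")` on the files of the test case gives counts that can be compared against the ALL row of performance.csv.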

Full log output

issue.log

Expected behavior

The performance.csv should list 10 TP and 5 FN instead of 0 TP and 15 FN.

FelixFrizzy, Oct 08 '24 12:10