inspire-next

CitationAnalysis: citeable not found

Open ksachs opened this issue 6 years ago • 6 comments

I don't see why labs doesn't find these citations:

recid   legacy labs
231707	9113	0	{'reference': {'publication_info': {'journal_title': 'Z.Phys.', 'artid': '714', 'year': '1936', 'page_start': '714', 'journal_volume': '98'}, 'label': '101', 'authors': [{'full_name': 'Heisenberg, W.'}, {'full_name': 'Euler, H.'}]}, 'record': {'$ref': 'http://localhost:5000/api/literature/9113'}, 'recid': 9113, 'curated_relation': False}
403689	54961	0	{'reference': {'publication_info': {'journal_title': 'Theor.Math.Phys.', 'artid': '1', 'page_start': '1', 'journal_volume': '1'}}, 'record': {'$ref': 'http://localhost:5000/api/literature/54961'}, 'recid': 54961, 'curated_relation': False}
1186037	652597	0	{'reference': {'publication_info': {'journal_title': 'Eur.Phys.J.C', 'artid': '1', 'year': '2005', 'page_start': '1', 'journal_volume': '41'}, 'label': '10', 'authors': [{'full_name': 'Charles, J.'}], 'urls': [{'value': 'http://ckmfitter.in2p3.fr'}], 'misc': ['CKMfitter Group HEP-PH/0406184], updated results and plots available at']}, 'record': {'$ref': 'http://localhost:5000/api/literature/652597'}, 'recid': 652597, 'curated_relation': False}
1511941	47300	0	{'reference': {'publication_info': {'journal_title': 'Phys.Rev.', 'artid': '1766', 'year': '1949', 'page_start': '1766', 'journal_volume': '75'}, 'label': '78', 'authors': [{'full_name': 'Haxel, O.'}, {'full_name': 'Jensen, J.H. D.'}], 'misc': ['e H. E. Suess']}, 'record': {'$ref': 'http://localhost:5000/api/literature/47300'}, 'recid': 47300, 'curated_relation': False}
634193	61163	0	{'reference': {'publication_info': {'journal_title': 'Nuovo Cim.A', 'artid': '457', 'page_start': '457', 'journal_volume': '69'}}, 'record': {'$ref': 'http://localhost:5000/api/literature/61163'}, 'recid': 61163, 'curated_relation': False}
879364	642815	0	{'reference': {'publication_info': {'journal_title': 'Nucl.Instrum.Meth.A', 'artid': '1', 'page_start': '1', 'journal_volume': '530'}}, 'record': {'$ref': 'http://localhost:5000/api/literature/642815'}, 'recid': 642815, 'curated_relation': False}
793389	716060	0	{'reference': {'publication_info': {'journal_title': 'Nucl.Instrum.Meth.A', 'artid': '1', 'page_start': '1', 'journal_volume': '560'}}, 'record': {'$ref': 'http://localhost:5000/api/literature/716060'}, 'recid': 716060, 'curated_relation': False}
924622	83793	0	{'reference': {'publication_info': {'journal_title': 'Sov.Phys.Usp.', 'artid': '777', 'page_start': '777', 'journal_volume': '16'}}, 'record': {'$ref': 'http://localhost:5000/api/literature/83793'}, 'recid': 83793, 'curated_relation': False}



1382176	810300	0	{'reference': {'collaborations': ['ATLAS Collaboration'], 'arxiv_eprint': '0901.0512', 'authors': [{'full_name': 'Aad, G.'}], 'report_numbers': ['CERN-OPEN-2008-020'], 'label': '27', 'misc': ['and']}, 'record': {'$ref': 'http://localhost:5000/api/literature/810300'}, 'recid': 810300, 'curated_relation': False}
1321755	1241571	0	{'reference': {'arxiv_eprint': '1307.1347', 'authors': [{'full_name': 'Heinemeyer, S.'}, {'full_name': 'Mariotti, C.'}], 'publication_info': {'year': '2013'}, 'report_numbers': ['CERN-2013-004'], 'label': '84', 'misc': ['and G. Passarino, and R. Tanaka (eds.) (LHC Higgs Cross Section Working Group), Handbook of LHC Higgs Cross Sections: 3. Higgs Properties (CERN, Geneva,)']}, 'record': {'$ref': 'http://localhost:5000/api/literature/1241571'}, 'recid': 1241571, 'curated_relation': False}

ksachs, Jun 05 '18 13:06

I recognize the first one in the list; we investigated it a bit with @salmanmaq and @michamos. The matcher does not ignore deleted records, so it sees an ambiguous match between https://labs.inspirehep.net/literature/9113 and https://labs.inspirehep.net/literature/431037 and assigns the citation to neither of them.
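
To illustrate the failure mode, here is a minimal sketch, not the actual matcher code. It assumes each candidate is a dict with a `control_number` and an optional `deleted` flag, and that 431037 is the deleted record of the pair; the function name `resolve_citation` is hypothetical.

```python
def resolve_citation(candidates, ignore_deleted=True):
    """Return the recid of the single matching record, or None if ambiguous.

    Hypothetical helper: `candidates` is a list of dicts with a
    'control_number' key and an optional boolean 'deleted' key
    (both field names are assumptions for this sketch).
    """
    if ignore_deleted:
        # Drop deleted records before checking for ambiguity.
        candidates = [c for c in candidates if not c.get("deleted", False)]
    if len(candidates) == 1:
        return candidates[0]["control_number"]
    # Zero or several candidates: do not assign the citation to anyone.
    return None


# Assumed example data: which of the two records is deleted is a guess.
candidates = [
    {"control_number": 9113},
    {"control_number": 431037, "deleted": True},
]

buggy = resolve_citation(candidates, ignore_deleted=False)  # ambiguous match
fixed = resolve_citation(candidates)                        # deleted record skipped
```

With `ignore_deleted=False` both records count as candidates and the citation is dropped; filtering first leaves a single candidate.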

jacquerie, Jun 05 '18 14:06

That can't be the only problem. E.g.,

In [3]: search_pattern(p='773__p:"Eur.Phys.J." 773__v:"C75" 773__c:"1"')
intbitset([1300380])

In [7]: search_pattern(p="037:'1307.1347'")
intbitset([1241571])
In [8]: search_pattern(p='037:"CERN-2013-004"')
intbitset([1241571])

each return only one record (as far as I can see).

ksachs, Jun 06 '18 11:06

Hmm. My best guess is that the cited record had not been migrated successfully at the time of the experiment, so the matcher was not able to find it.

jacquerie, Jun 06 '18 12:06

Can we make sure these citations are resolvable if everything goes OK? Just to make sure there are no hidden bugs. And please fix the problem with the deleted records.

ksachs, Jun 06 '18 12:06

To be honest, I can't really tell why these are not matching. The cited records are present on the localhost instance on which I ran the experiment. :neutral_face:

I'll look into it in more detail, but I have nothing concrete so far.

salmanmaq, Jun 06 '18 14:06

Part of this issue (specifically the problem I mentioned in https://github.com/inspirehep/inspire-next/issues/3448#issuecomment-394729685) is addressed by https://github.com/inspirehep/inspire-next/pull/3462.

jacquerie, Jun 13 '18 11:06