anystyle-cli

et-al and collaborator not identified: how can I add additional tags?

Open iKomettech opened this issue 6 years ago • 5 comments

iKomettech · Jan 18 '19 12:01

You can add any tags you like by adding them to your training set and creating a new model from it.

Note that et-al is currently parsed as part of other fields (e.g., author) and then interpreted by normalizers; that's why we don't need a dedicated tag for it. You get normalized results if you use the JSON output format, for example.

inukshuk · Jan 18 '19 13:01

I have added a collab tag to my dataset, but it is still being identified as title. See the example dataset below:

<sequence>
  <collab>Action to Control Cardiovascular Risk in Diabetes Study Group</collab>
  <author>Gerstein HC, Miller ME, Byington RP, Goff DC Jr, Bigger JT</author>
  <et-al>et al</et-al>
  <title>Effects of intensive glucose lowering in type 2 diabetes</title>
  <container-title>N Engl J Med</container-title>
  <date>2008</date>
  <volume>358</volume>
  <pages>2545-2559</pages>
</sequence>

iKomettech · Jan 18 '19 15:01

You'll have to add sufficient samples to your training set.

Do your references really lack all punctuation? That makes them particularly hard to parse.

inukshuk · Jan 18 '19 19:01

No, they do have punctuation. Here is my sample reference:

Action to Control Cardiovascular Risk in Diabetes Study Group, Gerstein HC, Miller ME, Byington RP, Goff DC Jr, Bigger JT, et al. Effects of intensive glucose lowering in type 2 diabetes. N Engl J Med 2008;358:2545-2559.

Okay, thanks. I will try with punctuation and let you know.

I have one more question: does anystyle have an option for front-matter styling, such as title, body text, keywords, author groups, abstract, and affiliations?

iKomettech · Jan 19 '19 03:01

Note that the XML format is used mostly for training purposes. It is extremely important that you keep all punctuation; otherwise the model will not work very well with your input. That is to say, you must be able to reconstruct the original input from the XML.

I'm not sure what you mean by front-matter styling. You can certainly extract those fields from, e.g., the JSON output.

inukshuk · Jan 20 '19 10:01
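
The round-trip requirement described in the last comment (the original reference must be recoverable from the XML) can be checked before training. The following is only a minimal sketch in plain Python, not part of AnyStyle: the helper names `reconstruct` and `round_trips` are hypothetical, and the punctuated tagging (in particular, folding "et al." into the author field and splitting the unspaced "2008;358:2545-2559." run across date, volume, and pages) is an assumption made for illustration. How AnyStyle itself tokenizes that unspaced run is not covered here.

```python
import xml.etree.ElementTree as ET

# The sample reference exactly as quoted in the thread.
ORIGINAL = (
    "Action to Control Cardiovascular Risk in Diabetes Study Group, "
    "Gerstein HC, Miller ME, Byington RP, Goff DC Jr, Bigger JT, et al. "
    "Effects of intensive glucose lowering in type 2 diabetes. "
    "N Engl J Med 2008;358:2545-2559."
)

# The training sequence as posted in the thread: punctuation was stripped.
STRIPPED = """<sequence>
  <collab>Action to Control Cardiovascular Risk in Diabetes Study Group</collab>
  <author>Gerstein HC, Miller ME, Byington RP, Goff DC Jr, Bigger JT</author>
  <et-al>et al</et-al>
  <title>Effects of intensive glucose lowering in type 2 diabetes</title>
  <container-title>N Engl J Med</container-title>
  <date>2008</date>
  <volume>358</volume>
  <pages>2545-2559</pages>
</sequence>"""

# Hypothetical re-tagging that keeps every character of the original.
# "et al." stays inside <author>, since et-al is parsed as part of other fields.
PUNCTUATED = """<sequence>
  <collab>Action to Control Cardiovascular Risk in Diabetes Study Group,</collab>
  <author>Gerstein HC, Miller ME, Byington RP, Goff DC Jr, Bigger JT, et al.</author>
  <title>Effects of intensive glucose lowering in type 2 diabetes.</title>
  <container-title>N Engl J Med</container-title>
  <date>2008;</date><volume>358:</volume><pages>2545-2559.</pages>
</sequence>"""


def reconstruct(sequence_xml):
    """Concatenate all text in the sequence and collapse runs of whitespace."""
    root = ET.fromstring(sequence_xml)
    return " ".join("".join(root.itertext()).split())


def round_trips(sequence_xml, original):
    """True if the tagged sequence reproduces the original reference text."""
    return reconstruct(sequence_xml) == " ".join(original.split())


if __name__ == "__main__":
    for name, xml in (("stripped", STRIPPED), ("punctuated", PUNCTUATED)):
        print(name, "round-trips:", round_trips(xml, ORIGINAL))
```

Running a check like this over every sequence in a training file flags entries where punctuation was lost during tagging, which is the situation described in this thread.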