zotero-cita icon indicating copy to clipboard operation
zotero-cita copied to clipboard

Add api endpoint to add citations to an item by libraryID and itemID

Open Dominic-DallOsto opened this issue 3 years ago • 26 comments

Fixes #144

Adds the endpoint /cita/citation/add by which citations (listed by itemKey) can be added to an item (identified by libraryID and itemKey)

An example request in python:

requests.post('http://127.0.0.1:23119/cita/citation/add', json={'libraryID':'1','itemKey':"5ZKBTFW2",'citedItemKeys':["HQPJ4N8V","SCN78LKP"]})
  • [ ] Move addCitationsByKey and deleteCitationsByIndex to SourceItemWrapper
  • [ ] Resolve #157
  • [ ] Call SourceItemWrapper.deleteCitations with explicit sync=false
  • [ ] Decided on endpoint architecture (thoughts in https://github.com/Dominic-DallOsto/zotero-api-endpoint/issues/14)
  • [ ] Basic API documentation

Dominic-DallOsto avatar Jan 06 '22 12:01 Dominic-DallOsto

Unfortunately, I am having CORS issues: trying to access the connector endpoint from a python script running in a virtual machine (Windows WSL2) fails because the connector server only seems to accept connections from localhost. Similarly, I cannot connect to the connector server from a script running in another window served from a different port. I fear that's probably nothing that can be changed, since the Zotero people will probably not add a more liberal CORS header.

cboulanger avatar Jan 07 '22 08:01 cboulanger

Web browsers cannot connect, but a python script should be fine, I do it all the time.

retorquere avatar Jan 07 '22 08:01 retorquere

Web browsers cannot connect, but a python script should be fine, I do it all the time.

My python cgi script tries to connect from 127.0.1.1 (WSL2 Ubuntu VM) to the Zotero server listening on 127.0.0.1 and it refuses the connection. Maybe it is a firewall issue, though. I'll switch to a Mac where it should work.

cboulanger avatar Jan 07 '22 08:01 cboulanger

Try setting another agent header

retorquere avatar Jan 07 '22 09:01 retorquere

Ok, I got it working but soon discovered that the existing connector API does not provide the methods I need for programmatically adding the reference data that I want to link with cita (or I didn't find them in docs or in the source), such as methods to lookup items in order to avoid duplicates, or to create them if they don't exist already.

Would you consider adding the endpoints that I put here to your PR? I tested them by pasting the code into the "Run JavaScript" console, which works well. Although they do what they should, they are not 100% ready yet because I want to test them in my workflow first (maybe I need another method or additional data). But maybe you can already have a look now.

cboulanger avatar Jan 08 '22 23:01 cboulanger

@Dominic-DallOsto One comment regarding parameter names: Maybe it would be better to rename itemID => itemKey and citedItemsIDs => citedItemsKeys because internally, Zotero uses ID for the internal primary ids stored in the Sqlite database, whereas the alphanumeric "keys" are part of the item data.

cboulanger avatar Jan 09 '22 19:01 cboulanger

@Dominic-DallOsto I have added more connector endpoints: https://github.com/cboulanger/excite-docker/blob/main/zotero/zoteroOverlay.jsx - maybe it makes sense that I create a PR to add them after your PR has been merged. They add capabilities that are essential to work programmatically with the cita plugin from outside Zotero.

cboulanger avatar Jan 09 '22 19:01 cboulanger

@Dominic-DallOsto One comment regarding parameter names: Maybe it would be better to rename itemID => itemKey and citedItemsIDs => citedItemsKeys because internally, Zotero uses ID for the internal primary ids stored in the Sqlite database, whereas the alphanumeric "keys" are part of the item data.

That makes sense!

@Dominic-DallOsto I have added more connector endpoints: https://github.com/cboulanger/excite-docker/blob/main/zotero/zoteroOverlay.jsx - maybe it makes sense that I create a PR to add them after your PR has been merged. They add capabilities that are essential to work programmatically with the cita plugin from outside Zotero.

Thanks, that looks great! Actually I was thinking maybe this would make more sense as a separate extension, because most of the functionality is for working with Zotero data generally and it might be something others who don't necessarily use Cita would be interested in? Then I think anything specific to citations should be part of Cita. What do you think? Would you like to do up an extension? Otherwise I can quickly do it tomorrow.

Dominic-DallOsto avatar Jan 09 '22 22:01 Dominic-DallOsto

Thanks, that looks great! Actually I was thinking maybe this would make more sense as a separate extension, because most of the functionality is for working with Zotero data generally and it might be something others who don't necessarily use Cita would be interested in? Then I think anything specific to citations should be part of Cita. What do you think? Would you like to do up an extension? Otherwise I can quickly do it tomorrow.

You are right that these features go beyond Cita and should really be part of a more general set of query features for the connector (and there are lots more that I would wish to be available). I found the prerequisites of plugin development a bit involved and have shied away from it so far, so if you would set up the infrastructure for a separate plugin, that would be really great and I would be happy to contribute to it!

cboulanger avatar Jan 10 '22 07:01 cboulanger

If you don't mind typescript, I have a yeoman generator for zotero plugins which scaffolds a plugin plus the infrastructure to do releases.

retorquere avatar Jan 10 '22 08:01 retorquere

Thanks! I'll whip this up quickly.

Dominic-DallOsto avatar Jan 10 '22 08:01 Dominic-DallOsto

If you don't mind typescript, I have a yeoman generator for zotero plugins which scaffolds a plugin plus the infrastructure to do releases.

That sounds wonderful! I really like TypeScript because it catches so many bugs in advance and is self-documenting.

cboulanger avatar Jan 10 '22 09:01 cboulanger

Thanks! I'll whip this up quickly.

Thank you. We can then open up issues discussing which endpoints should be available and what data they should return.

cboulanger avatar Jan 10 '22 09:01 cboulanger

If you don't mind typescript, I have a yeoman generator for zotero plugins which scaffolds a plugin plus the infrastructure to do releases.

I just posted this issue https://github.com/retorquere/generator-zotero-plugin/issues/20 I'm having with running the plugin

Dominic-DallOsto avatar Jan 10 '22 13:01 Dominic-DallOsto

@cboulanger There's an initial version up here. Hopefully I didn't break anything. And it should be possible to tidy up a lot of the checks with Typescript.

Dominic-DallOsto avatar Jan 10 '22 18:01 Dominic-DallOsto

@Dominic-DallOsto Really great job - so fast and beautiful code, too! Looking forward to trying it out and I'll certainly contribute some more endpoints. It is great to have a place to add them. I have always been missing a fast and reliable way of scripting Zotero in a cross-platform way. The zotero.org web API is great and everything, but this is much more versatile (and extensible, too).

cboulanger avatar Jan 10 '22 18:01 cboulanger

Updated now with new methods - see what you think / if it works for your workflow

Dominic-DallOsto avatar Jan 22 '22 14:01 Dominic-DallOsto

Ok, I think this is finished and ready for review now. The only additional change is that when adding a citation to a Zotero item, that Zotero item will be automatically linked in the citation.

Dominic-DallOsto avatar Jan 26 '22 10:01 Dominic-DallOsto

Hi all! Thanks for the work you've put into this. I wasn't aware of the extensibility of the Zotero HTTP server, and I like to see that Cita can be integrated into automated workflows through it. And I really like the brand new zotero-api-endpoint plugin. Congratulations!

I have reviewed the changes in the pull request and I generally like them. I do have some comments, though:

First, I like that you've put the initialization code in the init function of the zoteroOverlay object (src/zoteroOverlay.jsx). It seems to be the earliest point in the extension's bootstrapping process where we have access to our modules.

Then, I think that the specifics of the addItemCitations and deleteItemCitations should be implemented elsewhere, in the SourceItemWrapper class, where they can be reused in the future, for example to address #39. I think we should move most of the code to a couple of functions addCitationsByKey and deleteCitationsByIndex there.

Regarding the addCitationsByKey method, it creates Citation objects and adds them to the citing item using its addCitations method. However, because these citations are linked to Zotero items by passing their key to the Citation constructor, rather than using the Citation's linkToZoteroItem method, the citing and cited items are not marked as "related" in Zotero. We may call the Citation's linkToZoteroItem method (instead of passing the cited item key to the Citation constructor), but I think doing this before adding the Citations to the source/citing item (via addCitations) may be a bad idea. This is probably a design flaw in Cita (I'll post a separate issue: #156)

To solve this, I would suggest that we change the SourceItemWrapper class' addCitations method to:

  1. create a temporary array of Zotero keys in the Citation objects being added, and
  2. use these keys collected to mark citing and cited items as related (if not yet marked so).

This would also work for links that may have been established by manually editing (discouraged) the Citations note attachment (instead of using the GUI). I'll post a separate issue for this too: #157.

On the other hand, regarding the deleteCitationsByIndex method, I see that it uses the citing item's deleteCitation method. This is OK. However, when this method is called with sync=true it also removes the remote citation link in Wikidata (if known). This may be unexpected by the user. For example, when citations are removed using the GUI, a confirmation dialog is shown asking the user whether they want to remove the remote citation link as well. I think that in this case we should make explicit that we are calling deleteCitation with sync=false.

Regarding the endpoints, I think we should:

  • have a single endpoint for citations (/cita/citations) accepting GET and POST (and DELETE?) methods
  • specify source/citing item's library ID and Zotero key as query string parameters: ?libraryID=xx&itemKey=xx
  • to get citations for an item one would GET /cita/citations and specify the library ID and item key as query string parameters. We may have an additional format parameter indicating the citation list format desired. The current implementation may be the cita format, but future additions may include bibtex, etc.
  • to add citations for an item one would POST the citations to be added to /cita/citations, again specifying library ID and (citing) item key as query string parameters. The citations to be added may still be specified in the body of the request (as in your current implementation). In the future we may use the same endpoint to add citations in other ways, depending on the parameters provided in the request body. For example, POSTing cited item full metadata, instead of cited item Zotero key.
  • to delete citations I thought of using a DELETE request, but I understand that whether DELETE requests should have request bodies or not is debatable. Instead, we may use a POST request as for adding citations (above) and maybe have an action query string parameter indicating whether we want to add or delete citations. In this case, the indices of the citations to be deleted may be sent in the request body (as in your current implementation), leaving room in the future for other ways of indicating citations to be deleted (e.g., by cited item Zotero key).

Finally, do you think we may provide basic API documentation in a root /cita endpoint? Ideally I was thinking of something like Swagger, but I don't know if this is easy or even possible. At least, I would include some basic documentation in the README, so these API features are somehow discoverable.

I think that pretty much covers everything I thought while reviewing this PR. In short:

  • [ ] Move addCitationsByKey and deleteCitationsByIndex to SourceItemWrapper
  • [ ] Update Zotero item relations inside SourceItemWrapper's addCitations
  • [ ] Call SourceItemWrapper's deleteCitations with explicit sync=false
  • [ ] Update endpoint paths
  • [ ] Basic API documentation

I'll try to reply to future comments and changes as soon as possible! Thank you again!!

diegodlh avatar Jan 26 '22 16:01 diegodlh

@diegodlh Thank you for your thoughful review, I think it is a very good idea to integrate the target feature better into the existing cita infrastructure. As to your suggestion to align the requests better with the HTTP methods, that is certainly better practice than putting everything into the POST data. However, in terms of API documention/service discovery I think separating request data into a GET querystring and POST data makes validation and documentation more complicated.

In zotero-api-endpoint we use some wizardry to convert the typescript information into json-schema files which can then be used to validate the input and could further be transformed into OpenApi/Swagger information with little cost. I wonder if that's something that could be interesting for Cita, although it might be an overkill for three service methods.

cboulanger avatar Jan 26 '22 17:01 cboulanger

Thinking about it some more, the question for me is if the effort of making the API RESTful, given a server that doesn't let you parse out dynamic URI components (such as /cita/citations/{libraryID}/{itemKey}) really provides major advantages over treating the enpoints more like RPC endpoints which receive POSTed JSON data without caring so much about the HTTP protocol. Then all you have to care about is the structure of the JSON data, which makes validation much easier. In Cita, and in the case of the present PR it's really not so much of deal, since we're talking about three endpoints, but I am thinking about it because it's a very relevant question for the zotero-api-endpoint add-on, which potentially could have a larger number of endpoints exposing major parts Zotero's javascript API.

cboulanger avatar Jan 26 '22 21:01 cboulanger

I think separating request data into a GET querystring and POST data makes validation and documentation more complicated.

Regarding this, I also liked the idea of using /cita/citations/{libraryID}/{itemKey} endpoints, but I also assumed it wouldn't be possible with Zotero's builtin endpoint registration.

it might be an overkill for three service methods.

Yes, regarding documentation I agree that doing something very complicated would not be urgent right now. I would at least provide some docs in the Readme or in a landing page at the root /cita endpoint.

the endpoints are separated from the implementations, so the API itself can be versioned and improved.

I'm quoting part of your email yesterday, concerning the zotero-api-endpoint plugin, I believe. I think it applies here too. I think I prefer an initial API that looks more like how we want it to be in the long term. However, if making some of the changes I suggested would delay merging this PR too long, I guess we may open a separate issue and address these in the future.

diegodlh avatar Jan 27 '22 12:01 diegodlh

It does in fact not add the owl:related relationship between citing and cited items. Since I am already processing data with it, it would be good if

  • this relationship could be added programmatically with a separate endpoint
  • or if re-linking an item was possible without creating a new link (which currently happens), just adding the relationship

cboulanger avatar Jan 27 '22 15:01 cboulanger

Thanks for the super detailed review - I'll go through step by step:

Then, I think that the specifics of the addItemCitations and deleteItemCitations should be implemented elsewhere, in the SourceItemWrapper class, where they can be reused in the future, for example to address #39. I think we should move most of the code to a couple of functions addCitationsByKey and deleteCitationsByIndex there.

That makes a lot of sense, yep!

Regarding the addCitationsByKey method, it creates Citation objects and adds them to the citing item using its addCitations method. However, because these citations are linked to Zotero items by passing their key to the Citation constructor, rather than using the Citation's linkToZoteroItem method, the citing and cited items are not marked as "related" in Zotero. We may call the Citation's linkToZoteroItem method (instead of passing the cited item key to the Citation constructor), but I think doing this before adding the Citations to the source/citing item (via addCitations) may be a bad idea. This is probably a design flaw in Cita (I'll post a separate issue: #156)

Yep - will follow up in #157. A solution that works when passing in Zotero keys to the constructor would be most idiomatic if possible I think.

On the other hand, regarding the deleteCitationsByIndex method, I see that it uses the citing item's deleteCitation method. This is OK. However, when this method is called with sync=true it also removes the remote citation link in Wikidata (if known). This may be unexpected by the user. For example, when citations are removed using the GUI, a confirmation dialog is shown asking the user whether they want to remove the remote citation link as well. I think that in this case we should make explicit that we are calling deleteCitation with sync=false.

That makes sense as well!

  • have a single endpoint for citations (/cita/citations) accepting GET and POST (and DELETE?) methods
  • specify source/citing item's library ID and Zotero key as query string parameters: ?libraryID=xx&itemKey=xx

I'm not sure if it's possible to mix query string parameters and post data. I'll investigate this more here (https://github.com/Dominic-DallOsto/zotero-api-endpoint/issues/14) - but initially reading the docs, it looks like we can only get either the query string (if it's a GET request) or the post data (if it's a POST request) but not both. I'm not sure if DELETE requests are supported - I will have to check.

  • to get citations for an item one would GET /cita/citations and specify the library ID and item key as query string parameters. We may have an additional format parameter indicating the citation list format desired. The current implementation may be the cita format, but future additions may include bibtex, etc.

That sounds good. Would relate to #4 #125.

Finally, do you think we may provide basic API documentation in a root /cita endpoint? Ideally I was thinking of something like Swagger, but I don't know if this is easy or even possible. At least, I would include some basic documentation in the README, so these API features are somehow discoverable.

Definitely docs would be a good idea. We'll think about what this should look like.

Dominic-DallOsto avatar Jan 30 '22 10:01 Dominic-DallOsto

I'm not sure if it's possible to mix query string parameters and post data. I'll investigate this more here (Dominic-DallOsto/zotero-api-endpoint#14) - but initially reading the docs, it looks like we can only get either the query string (if it's a GET request) or the post data (if it's a POST request) but not both. I'm not sure if DELETE requests are supported - I will have to check.

Given the limitations of the Zotero server, I would suggest that we stick to GET/POST as it is now and improve in the case that the Zotero devs improve the server to process requests in a more RESTful way.

cboulanger avatar Jan 31 '22 10:01 cboulanger

or if re-linking an item was possible without creating a new link (which currently happens), just adding the relationship

@cboulanger, this should be addressed with #157.

I think we should move most of the code to a couple of functions addCitationsByKey and deleteCitationsByIndex there.

I think I was partly wrong here: deleteCitationsByIndex is already a source item method (deleteCitation). Maybe we should just make it accept multiple indices. Calling it multiple times may have unexpected consequences, because citation indices may change after a citation has been deleted.

I'm not sure if it's possible to mix query string parameters and post data.

Apparently it is. See my last comment in Dominic-DallOsto/zotero-api-endpoint/issues/14

I'm not sure if DELETE requests are supported - I will have to check.

Just checked: they are not supported.

diegodlh avatar Feb 07 '22 00:02 diegodlh