
Annotate/highlight web pages without other services

Open andrewchenshx opened this issue 3 years ago • 2 comments

Is it possible to annotate/highlight web pages without integrating with Hypothesis or other third-party services? I would like to save all the data locally, or self-host it on my own server.

andrewchenshx, Jul 10 '21 16:07

@karlicoss Is it a good idea to use a JSON file as annotation storage? A few weeks ago I implemented a web annotator using RecogitoJS, in order to make a Promnesia plugin for Joplin.

RecogitoJS sends each annotation or highlight to the Promnesia backend, and the backend saves it to a JSON file for HPI and Orger. The downsides of this implementation are that RecogitoJS doesn't support Markdown, and that a local JSON file can grow into one large annotation file, unlike a DB (e.g. SQLite).

In addition, when Promnesia handles the local annotations coming out of HPI, both RecogitoJS and Promnesia handle the same highlights and annotations on the same DOM. I suddenly felt this was not what I wanted. I stopped the work once it seemed that Promnesia doesn't have an annotator role of its own.

Can you clarify this for me?

hwiorn, Mar 20 '22 15:03

Hi! Sorry for the late reply! These are good questions that people sometimes ask, and I should probably put them in the FAQ at some point.

So TLDR -- creating new annotations is out of scope for Promnesia (a possibly related issue: https://github.com/karlicoss/promnesia/issues/111).

The main reasons it's out of scope:

  • It's kind of against the goal of Promnesia, which tries to do only the minimal work needed to aggregate data from silos and other data sources, instead of trying to reinvent the wheel :)

  • Implementing an annotator is no easy task, in particular in terms of providing a proper interface for it etc. Even if someone does implement it, I'd much rather see a good annotator implemented as a separate extension, so one can use it without Promnesia (some people might prefer that). It's then possible to feed the data from the annotator into Promnesia, so the overall result would still be the same.

  • Even if you implement the annotator in the Promnesia extension, the Promnesia backend architecture is going to become more complicated. Currently the indexer database is basically just a cache -- if you delete it, it will just be recreated the next time you run the indexer. The extension only reads from it; writing to it would mean we'd need to somehow properly retain it between indexer runs, Promnesia upgrades etc.
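To illustrate the last point, here is a minimal sketch of why writing into a rebuildable cache is fragile. This is not Promnesia's actual code: the `visits` table, the `run_indexer` function and the sample data are all made up for the example, and the only assumption is that the indexer recreates its database from the data sources on every run.

```python
import sqlite3
from contextlib import closing


def run_indexer(conn: sqlite3.Connection, sources: list[tuple[str, str]]) -> None:
    """Hypothetical indexer: rebuild the whole table from the data sources."""
    with conn:  # commit on success, roll back on error
        conn.execute("DROP TABLE IF EXISTS visits")
        conn.execute("CREATE TABLE visits (url TEXT, context TEXT)")
        conn.executemany("INSERT INTO visits VALUES (?, ?)", sources)


def contexts(conn: sqlite3.Connection, url: str) -> list[str]:
    """Return every stored context for a URL, in insertion order."""
    rows = conn.execute(
        "SELECT context FROM visits WHERE url = ? ORDER BY rowid", (url,)
    )
    return [row[0] for row in rows]


# Made-up data source, standing in for whatever the indexer aggregates.
sources = [("https://example.com", "bookmarked")]

with closing(sqlite3.connect(":memory:")) as conn:
    run_indexer(conn, sources)

    # An annotator writing straight into the cache...
    with conn:
        conn.execute(
            "INSERT INTO visits VALUES (?, ?)",
            ("https://example.com", "my highlight"),
        )
    before = contexts(conn, "https://example.com")

    # ...loses its data on the next indexer run, because the cache is rebuilt.
    run_indexer(conn, sources)
    after = contexts(conn, "https://example.com")
```

In this sketch the highlight is present in `before` and gone in `after`. Keeping it would require either a separate, persistent store for annotations or an indexer that merges instead of rebuilding, which is the extra complexity described above.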

@hwiorn in terms of using JSON for storage -- it depends on how much data this annotator is keeping. As an example, my whole Hypothesis export is in JSON and it's just 10MB (500KB compressed) -- so definitely not a problem. Even if it was 100MB or even 1GB it would be fine IMO (although that would depend on how it's parsed and used). In addition, it's always possible to add an extra layer of sqlite caching to speed things up. It's possible to use sqlite from the beginning, but it's a bit harder, e.g. you might need to be careful about transactions, choosing the database schema, migrations etc. So I'd say if you want to experiment, start with JSON and maybe switch to sqlite later if you want to speed it up.
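A rough sketch of that "JSON first, sqlite cache later" approach, using only the Python standard library. Everything here is an assumption made for illustration: the file layout (one JSON list of objects), the `url`/`quote`/`comment` fields and all function names are hypothetical, not the format of Hypothesis, HPI or any existing annotator.

```python
import json
import sqlite3
from contextlib import closing
from pathlib import Path


def load_annotations(json_path: Path) -> list[dict]:
    """Read all annotations from the JSON file, which is the source of truth."""
    if not json_path.exists():
        return []
    return json.loads(json_path.read_text(encoding="utf-8"))


def save_annotation(json_path: Path, annotation: dict) -> None:
    """Append one annotation by rewriting the whole file (fine for small data)."""
    annotations = load_annotations(json_path)
    annotations.append(annotation)
    tmp = json_path.with_suffix(".tmp")
    tmp.write_text(json.dumps(annotations, indent=2), encoding="utf-8")
    tmp.replace(json_path)  # swap in the new file only once it is fully written


def rebuild_cache(json_path: Path, db_path: Path) -> None:
    """Recreate the sqlite cache from the JSON file; it can be deleted at any time."""
    rows = [
        (a["url"], a.get("quote", ""), a.get("comment", ""))
        for a in load_annotations(json_path)
    ]
    with closing(sqlite3.connect(db_path)) as conn, conn:
        conn.execute("DROP TABLE IF EXISTS annotations")
        conn.execute("CREATE TABLE annotations (url TEXT, quote TEXT, comment TEXT)")
        conn.executemany("INSERT INTO annotations VALUES (?, ?, ?)", rows)
        conn.execute("CREATE INDEX annotations_url ON annotations (url)")


def annotations_for(db_path: Path, url: str) -> list[tuple[str, str]]:
    """Look up (quote, comment) pairs for one URL via the indexed cache."""
    with closing(sqlite3.connect(db_path)) as conn:
        cursor = conn.execute(
            "SELECT quote, comment FROM annotations WHERE url = ? ORDER BY rowid",
            (url,),
        )
        return cursor.fetchall()
```

Because the cache is derived entirely from the JSON file, there are no migrations to worry about: changing the schema just means deleting the database and calling `rebuild_cache` again.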

@andrewchenshx -- for local annotations it's possible, for example, to self-host Hypothesis, Wallabag (although I haven't used it much), or Worldbrain Memex (same, haven't used it much, and IIRC lately they've closed some of the code; I fell a bit out of the loop). There might be other tools that can be self-hosted that I'm not aware of.

I personally sometimes use my own extension for that, although it's very barebones compared to a proper annotator: https://github.com/karlicoss/grasp

I'm trying to maintain a list of the tools I use on this page, which you might find useful: https://beepb00p.xyz/annotating.html#web

karlicoss, Apr 10 '22 16:04