Add more structured data when importing clippings & books

Open mammuth opened this issue 5 years ago • 8 comments

We're currently not storing all data which can be found in My Clippings.txt.

  • More book-related information, if available (ISBN, separate title & author, language)
  • user added notes
  • Rethink the book-user mapping (currently a book is always owned by a user, but maybe that's dumb)
  • Add more clippings data (location, date, language, ...)

mammuth avatar Jul 22 '19 10:07 mammuth

Note that things like date are localized and seem to use different formats (maybe depending on the Kindle device?).

Examples

  • Paperwhite in German: Hinzugefügt am Samstag, 13. April 2019 10:25:27
  • ? in Portuguese: Data de adição: terça-feira, 19 de março de 2019 23h32min06s GMT-03:00
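
To make the format problem concrete, here is a minimal sketch of a tolerant date parser. It is an assumption-heavy illustration, not the project's parser: `MONTHS`, `DATE_RE` and `parse_added_on` are hypothetical names, and only German and English month names are covered. Anything it does not recognise comes back as `None`, so the raw line can be kept for later inspection.

```python
import re
from datetime import datetime
from typing import Optional

# Month names per locale (hypothetical table; other locales need their own entries).
MONTHS = {
    "de": ["Januar", "Februar", "März", "April", "Mai", "Juni", "Juli",
           "August", "September", "Oktober", "November", "Dezember"],
    "en": ["January", "February", "March", "April", "May", "June", "July",
           "August", "September", "October", "November", "December"],
}

# Matches "13. April 2019 10:25:27" as well as "31 March 2021 18:57:00".
DATE_RE = re.compile(
    r"(\d{1,2})\.?\s+(\w+)\s+(\d{4})\s+(\d{1,2}):(\d{2}):(\d{2})"
)


def parse_added_on(line: str) -> Optional[datetime]:
    """Return the timestamp found in a metadata line, or None if unrecognised."""
    match = DATE_RE.search(line)
    if match is None:
        return None
    day, month_name, year, hour, minute, second = match.groups()
    for names in MONTHS.values():
        if month_name in names:
            month = names.index(month_name) + 1
            return datetime(int(year), month, int(day),
                            int(hour), int(minute), int(second))
    return None  # Month name from a locale that is not in the table.
```

With this sketch the Paperwhite line parses, while the second example (the `23h32min06s` format) returns `None` and would need its own pattern.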

mammuth avatar Sep 22 '19 08:09 mammuth

Why is there an author_name field in the Clipping model when the same field exists in the Book model, and the Book model is used as a foreign key?

JSerwatka avatar Jun 28 '21 10:06 JSerwatka

Note that things like date are localized and seem to use different formats (maybe depending on the Kindle device?).

Do you know if the part which differentiates between a highlight, a bookmark, and a note is also localized? E.g.

- Your Highlight on page 119-119 | Added on Wednesday, 31 March 2021 18:57:00
- Your Note on page 119 | Added on Wednesday, 31 March 2021 18:57:13
- Your Bookmark at location 1607 | Added on Wednesday, 2 June 2021 11:50:54
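
For reference, a minimal sketch of how the English variant of these lines could be split into structured fields. It assumes exactly the three shapes shown above; `ClippingMeta`, `META_RE` and `parse_meta_line` are hypothetical names, and a localized line simply yields `None`.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ClippingMeta:
    kind: str            # "highlight", "note" or "bookmark"
    position_type: str   # "page" or "location"
    position: str        # e.g. "119-119" or "1607", kept as text
    added_on_raw: str    # unparsed date part, since its format is localized


# Only covers the English wording from the examples above.
META_RE = re.compile(
    r"^- Your (?P<kind>Highlight|Note|Bookmark) (?:on|at) "
    r"(?P<position_type>page|location) (?P<position>\d+(?:-\d+)?)"
    r" \| Added on (?P<added_on>.+)$"
)


def parse_meta_line(line: str) -> Optional[ClippingMeta]:
    """Parse one metadata line; return None if the wording is not recognised."""
    match = META_RE.match(line.strip())
    if match is None:
        return None
    return ClippingMeta(
        kind=match.group("kind").lower(),
        position_type=match.group("position_type"),
        position=match.group("position"),
        added_on_raw=match.group("added_on"),
    )
```

If the type keyword turns out to be localized as well, the alternation in `META_RE` would need one entry per language.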

JSerwatka avatar Jun 28 '21 11:06 JSerwatka

Why is there an author_name field in the Clipping model when the same field exists in the Book model, and the Book model is used as a foreign key?

I don't quite remember. I think books were introduced later, and there are plaintext (non-Kindle) clippings where we simply don't know the book (but do know the author).

mammuth avatar Jul 11 '21 19:07 mammuth

Do you know if the part which differentiates between a highlight, a bookmark, and a note is also localized?

No, I don't know. I think I noticed the above localization difference in error logs or something like that. This is certainly also a reason to store the actual raw files: then we could learn more about how they work (I think they also differ based on the device model...).

mammuth avatar Jul 11 '21 19:07 mammuth

So do you want to store each MyClippings file as plaintext inside the DB?

JSerwatka avatar Jul 22 '21 14:07 JSerwatka

Would there be a reason to prefer something else like the filesystem over the DB? 🤔 One reason would be DB size. I'm not sure whether there are any limits on the current hosting provider.

But I guess going with the DB makes the most sense.

We could have a model for uploads which stores the timestamps and the file contents.

Going forward, created clippings could reference this model if we want to keep track of this information (not sure whether this might be a useful feature in the future?).

mammuth avatar Jul 22 '21 16:07 mammuth

Would there be a reason to prefer something else like the filesystem over the DB? 🤔

I think not. Besides, exporting all data from one model to txt files can be done with just a few lines of code.

We could have a model for uploads which stores the timestamps and the file contents.

Exactly, the MyClippings model would just have a TextField and a timestamp.
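
A rough sketch of that idea, including the export to txt files. In the project this would presumably be a Django model; to keep the example self-contained it uses the standard-library `sqlite3` module as a stand-in, and the table name `clippings_upload` and all function names are hypothetical.

```python
import sqlite3
from datetime import datetime, timezone
from pathlib import Path
from typing import List


def create_upload_table(conn: sqlite3.Connection) -> None:
    """Create a table holding one row per uploaded raw file."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS clippings_upload ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "uploaded_at TEXT NOT NULL, "
        "content TEXT NOT NULL)"
    )


def store_upload(conn: sqlite3.Connection, content: str) -> int:
    """Store the raw file contents with a UTC timestamp; return the row id."""
    uploaded_at = datetime.now(timezone.utc).isoformat()
    cursor = conn.execute(
        "INSERT INTO clippings_upload (uploaded_at, content) VALUES (?, ?)",
        (uploaded_at, content),
    )
    conn.commit()
    return cursor.lastrowid


def export_uploads(conn: sqlite3.Connection, target_dir: Path) -> List[Path]:
    """Write every stored upload to its own txt file and return the paths."""
    target_dir.mkdir(parents=True, exist_ok=True)
    written = []
    rows = conn.execute("SELECT id, content FROM clippings_upload ORDER BY id")
    for upload_id, content in rows:
        path = target_dir / f"upload_{upload_id}.txt"
        path.write_text(content, encoding="utf-8")
        written.append(path)
    return written
```

Clippings created from an upload could later reference the row id returned by `store_upload`, if we decide to track that.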

I would use the stored MyClippings files just to understand how they work. For the additional fields (author, note, date) and the overall data update process, I would choose the solution mentioned here: https://github.com/mammuth/kindle-clippings/issues/16#issuecomment-874852617

JSerwatka avatar Jul 25 '21 13:07 JSerwatka