kindle-clippings
Add more structured data when importing clippings & books
We're currently not storing all the data that can be found in My Clippings.txt.
- More book-related information, if available (ISBN, separate title & author, language)
- User-added notes
- Rethink the book-user mapping (currently a book is always owned by a user, but maybe that's dumb)
- Add more clippings data (location, date, language, ...)
Note that things like date are localized and seem to use different formats (maybe depending on the Kindle device?).
Examples
- Paperwhite in German:
Hinzugefügt am Samstag, 13. April 2019 10:25:27
- ? in Portuguese:
Data de adição: terça-feira, 19 de março de 2019 23h32min06s GMT-03:00
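To make the format difference concrete, here is a minimal sketch that parses the two date lines above with one regular expression per observed format. The names `MONTHS`, `PATTERNS` and `parse_added_date` are made up for illustration and are not part of the project; the patterns cover only these two examples, and the `GMT-03:00` offset in the second one is ignored.

```python
import re
from datetime import datetime

# Month names for the two languages seen in the examples above.
MONTHS = {
    "de": ["januar", "februar", "märz", "april", "mai", "juni", "juli",
           "august", "september", "oktober", "november", "dezember"],
    "pt": ["janeiro", "fevereiro", "março", "abril", "maio", "junho", "julho",
           "agosto", "setembro", "outubro", "novembro", "dezembro"],
}

# One pattern per observed format (assumption: other devices/locales differ).
PATTERNS = {
    "de": re.compile(
        r"Hinzugefügt am \w+, (?P<day>\d{1,2})\. (?P<month>\w+) (?P<year>\d{4}) "
        r"(?P<h>\d{1,2}):(?P<m>\d{2}):(?P<s>\d{2})"
    ),
    "pt": re.compile(
        r"Data de adição: [\w-]+, (?P<day>\d{1,2}) de (?P<month>\w+) "
        r"de (?P<year>\d{4}) (?P<h>\d{1,2})h(?P<m>\d{2})min(?P<s>\d{2})s"
    ),
}


def parse_added_date(line):
    """Return (language, naive datetime), or None if no known format matches."""
    for lang, pattern in PATTERNS.items():
        match = pattern.search(line)
        if match is None:
            continue
        month_name = match["month"].lower()
        if month_name not in MONTHS[lang]:
            continue
        month = MONTHS[lang].index(month_name) + 1
        return lang, datetime(
            int(match["year"]), month, int(match["day"]),
            int(match["h"]), int(match["m"]), int(match["s"]),
        )
    return None
```

Lines that match no pattern return `None`, so unknown formats can be logged and collected instead of failing the whole import.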
Why is there author_name as a separate field in the Clipping model when the Book model has the same field and is already referenced as an FK?
> Note that things like date are localized and seem to use different formats (maybe depending on the Kindle device?).
Do you know if the part that differentiates between a highlight, a bookmark, and a note is also localized? E.g.:
- Your Highlight on page 119-119 | Added on Wednesday, 31 March 2021 18:57:00
- Your Note on page 119 | Added on Wednesday, 31 March 2021 18:57:13
- Your Bookmark at location 1607 | Added on Wednesday, 2 June 2021 11:50:54
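For reference, a rough sketch of how the three English lines above could be split into clipping type, position, and raw date. `META_RE` and `parse_meta_line` are hypothetical names, and the pattern only covers the English wording shown here; whether other locales translate "Highlight", "Note", and "Bookmark" is exactly the open question.

```python
import re

# Matches only the English wording from the three examples above.
# Localized files would need their own keywords (unknown so far).
META_RE = re.compile(
    r"^-?\s*Your (?P<kind>Highlight|Note|Bookmark) "
    r"(?:on|at) (?P<unit>page|location) (?P<start>\d+)(?:-(?P<end>\d+))?"
    r"\s*\|\s*Added on (?P<added>.+)$"
)


def parse_meta_line(line):
    """Split a metadata line into its parts, or return None if it doesn't match."""
    match = META_RE.match(line.strip())
    if match is None:
        return None
    start = int(match["start"])
    # A single position ("page 119") is treated as a range of length one.
    end = int(match["end"]) if match["end"] else start
    return {
        "kind": match["kind"].lower(),
        "unit": match["unit"],
        "start": start,
        "end": end,
        "added_raw": match["added"].strip(),
    }
```

The date part is kept as a raw string on purpose, since its format is localized and would need separate handling.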
> Why is there author_name as a separate field in the Clipping model when the Book model has the same field and is already referenced as an FK?
I don't quite remember. I think books were introduced later, and there are plaintext (non-Kindle) clippings where we simply don't know the book (only the author).
> Do you know if the part that differentiates between a highlight, a bookmark, and a note is also localized?
No. I think I noticed the localization difference above in error logs or something like that. This is certainly also a reason to store the actual raw files: then we could learn more about how they work (I think they also differ based on the device model...).
So do you want to store each MyClippings file as plaintext in the DB?
Would there be a reason to prefer something else like the filesystem over the DB? 🤔 One reason would be DB size. I'm not sure whether there are any limits on the current hosting provider.
But I guess going with the DB makes the most sense.
We could have a model for uploads which stores the timestamps and the file contents.
Going forward, created clippings could reference this model if we want to keep track of this information (not sure whether this might be a useful feature in the future?).
> Would there be a reason to prefer something else like the filesystem over the DB? 🤔
I think not. Besides, exporting all data from one model to txt files can be done with just a few lines of code.
> We could have a model for uploads which stores the timestamps and the file contents.
Exactly, the MyClippings model would just have a TextField and a timestamp.
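As a rough illustration of the idea, here is a sketch using plain `sqlite3` as a stand-in for the Django model (so it runs without a project set up). The table and column names (`upload`, `uploaded_at`, `content`, `upload_id`) and the helper functions are assumptions for this example, not the project's actual schema. It also shows the "export to txt files in a few lines" point.

```python
import sqlite3
from datetime import datetime, timezone
from pathlib import Path


def create_schema(conn):
    # "upload" holds the raw file text plus a timestamp; "clipping" can
    # optionally point back at the upload it was created from.
    conn.executescript("""
        CREATE TABLE upload (
            id INTEGER PRIMARY KEY,
            uploaded_at TEXT NOT NULL,
            content TEXT NOT NULL
        );
        CREATE TABLE clipping (
            id INTEGER PRIMARY KEY,
            content TEXT NOT NULL,
            upload_id INTEGER REFERENCES upload(id)
        );
    """)


def store_upload(conn, raw_text):
    """Store the raw file contents and return the new upload id."""
    cur = conn.execute(
        "INSERT INTO upload (uploaded_at, content) VALUES (?, ?)",
        (datetime.now(timezone.utc).isoformat(), raw_text),
    )
    return cur.lastrowid


def export_uploads(conn, target_dir):
    """Write every stored upload to its own .txt file and return the paths."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    query = "SELECT id, content FROM upload ORDER BY id"
    for upload_id, content in conn.execute(query):
        path = target / f"upload_{upload_id}.txt"
        path.write_text(content, encoding="utf-8")
        paths.append(path)
    return paths
```

In Django terms this would roughly correspond to a model with a `TextField` and a `DateTimeField`, plus an optional `ForeignKey` on the clipping.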
I would use the MyClippings files only to understand how they work. For the additional fields (author, note, date) and the overall data update process, I would go with the solution mentioned here: https://github.com/mammuth/kindle-clippings/issues/16#issuecomment-874852617