
[FEATURE_REQUEST] Add option to export database as a Goodreads CSV file

smlpt opened this issue on Jul 29, 2022 · 3 comments

Is your feature request related to a problem? Please describe. Since there's already a CSV import option, it should be possible to implement an exporter as well. It would be a great way to transfer your library to other services like Open Library or Bookwyrm, which accept Goodreads CSV files as an import option.

Describe the solution you'd like Another option under the "backup" settings page, called "Export Goodreads CSV" or similar. Renaming the settings page to "Import/Export" might make sense.

Describe alternatives you've considered I tried to write a script that converts the backup SQL files, but that endeavor wasn't successful.

smlpt commented on Jul 29, 2022

This would be a great improvement! I would love to see it too. An export would also be helpful for seeing the expected format for CSV imports. I migrated from another app, but didn't know how to format the data for importing into Openreads, so I had to manually enter hundreds of books, a tedious process! One question with a CSV export is whether it would be possible to keep the cover images. The app I migrated from stored the images in a separate folder and recorded the relative file paths in the export; but that would probably be a much bigger change than simply exporting a CSV without images and dropping the blob column.
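
A rough, untested sketch of what that image side could look like, assuming the cover blobs (the item_bookCoverImg column of the Book table, as used in the script below) hold plain JPEG bytes; the covers/ folder layout is purely illustrative:

import os
import sqlite3

# untested sketch: dump the cover blobs from an Openreads backup into a
# covers/ folder, so covers/<rowid>.jpg could serve as the relative path
conn = sqlite3.connect('books.sql')
os.makedirs('covers', exist_ok=True)
rows = conn.execute(
    "SELECT rowid, item_bookCoverImg FROM Book "
    "WHERE item_bookCoverImg IS NOT NULL"
)
for rowid, blob in rows:
    # assumption: the blob is raw image data, written out unchanged
    with open(os.path.join('covers', f'{rowid}.jpg'), 'wb') as f:
        f.write(blob)
conn.close()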

biicwlim commented on Aug 8, 2022

@biicwlim As a workaround I wrote a short Python script that takes the books.sql file from the local backup, does some conversion, and exports it as a CSV. It's hacky and held together by spit and duct tape, but it works for my needs, which means importing into Bookwyrm.

edit: removed mapping from 'item_bookNotes' to 'My Review' because that was incorrect

import sqlite3 as sq
import pandas as pd

# extract the book data from the SQLite backup as a dataframe
def sqlite_to_df(file):

    # connect to database
    conn = sq.connect(file)
    c = conn.cursor()

    # use a temporary table to drop the cover image blobs
    # (ALTER TABLE ... DROP COLUMN requires SQLite >= 3.35)
    c.execute("CREATE TABLE Temp AS SELECT * FROM Book")
    c.execute("ALTER TABLE Temp DROP COLUMN item_bookCoverImg")

    df = pd.read_sql_query("SELECT * FROM Temp", conn)

    c.execute("DROP TABLE Temp;")
    conn.commit()
    return df

# rename columns and convert values to make the data Goodreads-compatible
def openreads_to_goodreads(df):

    col_new = [
        'Title', 'Author', 'ISBN',
        'ISBN13', 'Number of Pages',
        'Year Published', 'Date Read', 'Date Started',
        'My Rating', 'Exclusive Shelf'
    ]
    col_old = [
        'item_bookTitle', 'item_bookAuthor', 'item_bookISBN10',
        'item_bookISBN13', 'item_bookNumberOfPages',
        'item_bookPublishYear', 'item_bookFinishDate', 'item_bookStartDate',
        'item_bookRating', 'item_bookStatus'
    ]
    # rename the columns to their Goodreads equivalents
    dfnew = df[col_old].rename(columns=dict(zip(col_old, col_new)))

    # fix shelf names that differ from Goodreads' naming scheme
    dfnew['Exclusive Shelf'] = dfnew['Exclusive Shelf'].replace({
        'to_read': 'to-read',
        'in_progress': 'currently-reading'
    })

    # fix unix epoch time
    for k in ['Date Read', 'Date Started']:
        # coerce the epoch strings to numbers
        dfnew[k] = pd.to_numeric(dfnew[k], errors='coerce')
        # Openreads stores dates as milliseconds since the Unix epoch
        dfnew[k] = pd.to_datetime(dfnew[k], unit='ms')
        # reformat to the YYYY/MM/DD format Goodreads uses
        dfnew[k] = dfnew[k].dt.strftime('%Y/%m/%d')

    return dfnew

# path to the books.sql file in your unpacked openreads backup zip archive
file = 'books.sql'
df = sqlite_to_df(file)
df = openreads_to_goodreads(df)
# export to CSV without writing the index as a first column
df.to_csv("openreads_export.csv", index=False)

smlpt commented on Oct 26, 2022

*didn't mean to close this

smlpt commented on Oct 26, 2022

A more general request is in #175

apobrt commented on Sep 5, 2023

Yeah, closing, as #175 makes more sense than preparing CSVs specifically for migrating from Openreads to other apps.

mateusz-bak commented on Sep 20, 2023