gritttt-rss icon indicating copy to clipboard operation
gritttt-rss copied to clipboard

Support postgres DBs in tt-rss

Open nhoening opened this issue 12 years ago • 17 comments

The inserting code currently only speaks to MySQL databases.

We have MySQL-specific code that sets the charset to be utf-8 and that gets the last inserted ID. For postgres, we would need equivalents.

Also, we would need a new option in the config file to select MySQL or postgres.

nhoening avatar Aug 16 '12 21:08 nhoening

I would like this with Google reader shutting down in July getting this data into postgresql would be nice.

salyavin avatar Mar 22 '13 09:03 salyavin

Well have you tried it on postgres yet? I didn't ever, so it might very well be that the import actually works on postgres as-is. As far as I can tell, the SQL code I generate is not fancy on any way (well it uses sub selects, but postgres has them, too).

I'd suggest you backup your database (or use a copy for testing) and give it a try...

salyavin [email protected] schrieb:

I would like this with Google reader shutting down in July getting this data into postgresql would be nice.


Reply to this email directly or view it on GitHub: https://github.com/nhoening/gritttt-rss/issues/37#issuecomment-15288656

nhoening avatar Mar 22 '13 10:03 nhoening

When I try it I get a lot of invalid commands and various things like this (I ran it with LC_ALL=C but my system language is Japanese ^ ERROR: 列"published"は型booleanですが、式は型integerでした LINE 1: ... SELECT '', '', '', max(id), NULL, 2, 0, 1, ' ',... ^ HINT: 式を書き換えるかキャストしなければなりません ERROR: 時間帯の置換が範囲外です: "2007-07-31 07-23-30" LINE 1: ...annewbie.com/2007/07/31/asaba-ryokan-izu-japan/', '2007-07-3... ^ ERROR: 列"published"は型booleanですが、式は型integerでした LINE 1: ... SELECT '', '', '', max(id), NULL, 2, 0, 1, ' ',... ^ HINT: 式を書き換えるかキャストしなければなりません ERROR: "timestamp"型の入力構文が無効です: "2007-08-02 06-07-45" LINE 1: ...08/02/japanese-inn-meal-at-asaba-ryokan-in-ise/', '2007-08-0... ^ ERROR: 列"published"は型booleanですが、式は型integerでした LINE 1: ... SELECT '', '', '', max(id), NULL, 2, 0, 1, ' ',... ^ HINT: 式を書き換えるかキャストしなければなりません ERROR: "timestamp"型の入力構文が無効です: "2007-08-03 12-04-21" LINE 1: ...m/2007/08/03/6-superfoods-that-prevent-disease/', '2007-08-0... ^ ERROR: 列"published"は型booleanですが、式は型integerでした LINE 1: ... SELECT '', '', '', max(id), NULL, 2, 0, 1, ' ',... ^ HINT: 式を書き換えるかキャストしなければなりません

salyavin avatar Mar 24 '13 06:03 salyavin

I really can't say if this is a problem that has to do with postgres or with the fact that my tool doesn't handle Japanese correctly (or that something went wrong on your end). The fact that I can't read Japanese doesn't help :) I mean, the end result should be valid utf-8, but without understanding postgres complaints, I don't know what to do. Cab you shed light on them?

salyavin [email protected] schrieb:

When I try it I get a lot of invalid commands and various things like this (I ran it with LC_ALL=C but my system language is Japanese ^ ERROR: 列"published"は型booleanですが、式は型integerでした LINE 1: ... SELECT '', '', '', max(id), NULL, 2, 0, 1, ' ',... ^ HINT: 式を書き換えるかキャストしなければなりません ERROR: 時間帯の置換が範囲外です: "2007-07-31 07-23-30" LINE 1: ...annewbie.com/2007/07/31/asaba-ryokan-izu-japan/', '2007-07-3... ^ ERROR: 列"published"は型booleanですが、式は型integerでした LINE 1: ... SELECT '', '', '', max(id), NULL, 2, 0, 1, ' ',... ^ HINT: 式を書き換えるかキャストしなければなりません ERROR: "timestamp"型の入力構文が無効です: "2007-08-02 06-07-45" LINE 1: ...08/02/japanese-inn-meal-at-asaba-ryokan-in-ise/', '2007-08-0... ^ ERROR: 列"published"は型booleanですが、式は型integerでした LINE 1: ... SELECT '', '', '', max(id), NULL, 2, 0, 1, ' ',... ^ HINT: 式を書き換えるかキャストしなければなりません ERROR: "timestamp"型の入力構文が無効です: "2007-08-03 12-04-21" LINE 1: ...m/2007/08/03/6-superfoods-that-prevent-disease/', '2007-08-0... ^ ERROR: 列"published"は型booleanですが、式は型integerでした LINE 1: ... SELECT '', '', '', max(id), NULL, 2, 0, 1, ' ',... ^ HINT: 式を書き換えるかキャストしなければなりません


Reply to this email directly or view it on GitHub: https://github.com/nhoening/gritttt-rss/issues/37#issuecomment-15352152

nhoening avatar Mar 24 '13 11:03 nhoening

Well, the most probable explanation is that the sub SELECT statement is giving postgres trouble (a syntax error). Still not sure. I don't have a postgres database with tt-rss installed nor time to get one. Would you be willing to import a test snippet from my starred items (say, one or two articles) to see if the problem is independent of japanese?

nhoening avatar Mar 25 '13 10:03 nhoening

Sure. I would be happy to.

salyavin avatar Mar 25 '13 10:03 salyavin

okay, I think I can send you something tonight (it's 11:45am where I am, as a reference).

----- Original Message ----- From: [email protected] To: [email protected] Date: 25.03.2013 11:32:41 Subject: Re: [gritttt-rss] Support postgres DBs in tt-rss (#37)

Sure. I would be happy to.


Reply to this email directly or view it on GitHub: https://github.com/nhoening/gritttt-rss/issues/37#issuecomment-15386202

nhoening avatar Mar 25 '13 10:03 nhoening

ERROR: 時間帯の置換が範囲外です: "2007-07-30 00-23-16" LINE 1: ...://feeds.feedburner.com/~r/tofugu/~3/139358638/', '2007-07-3... ^ ERROR: 列"published"は型booleanですが、式は型integerでした LINE 1: ... SELECT '', '', '', max(id), NULL, 2, 0, 1, ' ',...

I am seeing errors like replacement date out of range and published row is a boolean when it should be an integer

salyavin avatar Mar 25 '13 10:03 salyavin

Here we go ERROR: time zone displacement out of range: "2007-07-30 00-23-16" 行 1: ...://feeds.feedburner.com/~r/tofugu/~3/139358638/', '2007-07-3... ^ ERROR: column "published" is of type boolean but expression is of type integer 行 1: ... SELECT '', '', '', max(id), NULL, 2, 0, 1, ' ',... ^ HINT: You will need to rewrite or cast the expression. ERROR: time zone displacement out of range: "2007-07-31 07-23-30" 行 1: ...annewbie.com/2007/07/31/asaba-ryokan-izu-japan/', '2007-07-3... ^ ERROR: column "published" is of type boolean but expression is of type integer 行 1: ... SELECT '', '', '', max(id), NULL, 2, 0, 1, ' ',... ^ HINT: You will need to rewrite or cast the expression. ERROR: invalid input syntax for type timestamp: "2007-08-02 06-07-45" 行 1: ...08/02/japanese-inn-meal-at-asaba-ryokan-in-ise/', '2007-08-0... ^ ERROR: column "published" is of type boolean but expression is of type integer 行 1: ... SELECT '', '', '', max(id), NULL, 2, 0, 1, ' ',... ^ HINT: You will need to rewrite or cast the expression. ERROR: invalid input syntax for type timestamp: "2007-08-03 12-04-21" 行 1: ...m/2007/08/03/6-superfoods-that-prevent-disease/', '2007-08-0... ^ ERROR: column "published" is of type boolean but expression is of type integer 行 1: ... SELECT '', '', '', max(id), NULL, 2, 0, 1, ' ',... ^ HINT: You will need to rewrite or cast the expression.

salyavin avatar Mar 25 '13 10:03 salyavin

Thanks. So maybe postgres wants a different kind of date format than MySQL. I used '%Y-%m-%d %H-%M-%S',but postgres probably wants the time separated with colons, not with dashes (e.g. see Section 8.5.1.3), so I should use '%Y-%m-%d %H:%M:%S', but I'm still just guessing bcs I have limited time.

And in MySQL I use the published flag as 0 or 1 (by casting the Python boolean), but Postgres wants a boolean expression (I think postgres uses true and false).

I think I can push out a version of the script tonight that would treat these two things differently when it's a postgres database. Then we can see how far that got us. Thanks for standing by :)

nhoening avatar Mar 25 '13 11:03 nhoening

Thanks a bunch, you are sure trying quickly. It is 8:20PM in here Japan, I will probably be able to try it later in the morning.

salyavin avatar Mar 25 '13 11:03 salyavin

I found a lot of issues with postgres. I managed to fix a few but it still doesn't quite work. I'll submit a pull req if I get it working completely.

Added:

I didn't see the later comments. The first problem I found is the multi-line inserts don't track 's properly, and (as mentioned above) true/false vs 0/1.

disconn3ct avatar Mar 25 '13 16:03 disconn3ct

Awesome! Then I'll hold back my blind flight efforts for now.

Dis McCarthy [email protected] schrieb:

I found a lot of issues with postgres. I managed to fix a few but it still doesn't quite work. I'll submit a pull req if I get it working completely.


Reply to this email directly or view it on GitHub: https://github.com/nhoening/gritttt-rss/issues/37#issuecomment-15403993

nhoening avatar Mar 25 '13 16:03 nhoening

I accepted a pull request (see above) by @tafryn which makes the MySQL/Postgres distinction nicely. What still could be done is to

  • [ ] deal with \n symbols and the like (@disconn3ct has been working on this, first trying E strings, a Postgres specialty)
  • [x] transaction support, such that if one INSERT statement fails for some reason, the previously imported ones are rolled back. That is also something @disconn3ct proposed. Edit: I made a new ticket for that, see #50.

nhoening avatar Mar 28 '13 11:03 nhoening

I'd like to hear from Postgres users if they are successful with this, as I didn't test it myself.

nhoening avatar Mar 28 '13 11:03 nhoening

This appeared to work perfectly for me, thank you very much!

salyavin avatar Mar 28 '13 11:03 salyavin

Wow, that was fast, thanks. Hey guys, someone successfully imported freaking Japanese articles into Postgres. Woohoo :)

nhoening avatar Mar 28 '13 11:03 nhoening