bitcannon

Switch to an embedded database

Open stephen304 opened this issue 9 years ago • 19 comments

This seems to be what everyone wants. The replacement needs text search and unique/distinct value functions.

stephen304 avatar Jan 16 '15 04:01 stephen304

SQLite?

brandongalbraith avatar Jan 16 '15 04:01 brandongalbraith

It's a possibility. I'm looking at things like bolt, tiedot, and similar. I'd like to find something that deals with JSON documents, has easy indexing built in, and supports text search and unique/distinct queries, but so far it looks like I may not find anything that perfect. If I'm not mistaken, SQLite should be able to handle easy indexing, text search, and distinct queries, but it would complicate inserting and retrieving, because I would have to convert each row to and from a struct.

With MongoDB currently, I take structs and pass them straight into mgo, and when I retrieve them, I can take what's returned (a struct) and pass it straight to martini.

I'll look into sqlite and see if I can make it work. This issue is a bit big and I worry it may delay other tasks, so if you have any other thoughts about it I'm glad to hear them.

stephen304 avatar Jan 16 '15 13:01 stephen304

I think SQLite may just work. My only concern is performance, as all the torrents would go in a single table, and I've seen a few tests that show slowdowns on inserts when the table nears 7GB, which is about the same as both Demonoid and Kickass' full archive combined.

Hopefully the average user won't intend to keep a historical database of every torrent in existence. BitCannon is meant more for a selection of recent (and non-dead) torrents.

I've managed to get bitcannon to create an empty sqlite database, so it compiles and works at least. I may need help getting full text search working though, so if anybody knows anything about sqlite and text searches, any pointers would be helpful.

stephen304 avatar Jan 16 '15 15:01 stephen304

Something like CouchDB, or anything that supports its protocol, would allow syncing of the database ;-)

gbraad avatar Jan 19 '15 11:01 gbraad

Why not an abstraction layer, like an ORM? You could then use whatever SQL database you want.
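
As a rough sketch of the abstraction-layer idea, the operations this issue asks for (insert, text search, distinct values) could sit behind a small Go interface so that the backend can be swapped. Everything named here (`Torrent`, `Store`, `memStore`, and the field names) is a hypothetical illustration, not BitCannon's actual code, and the in-memory backend only stands in for a real SQLite or MongoDB implementation.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Torrent is a hypothetical record type; the field names are assumptions.
type Torrent struct {
	Btih     string // info hash, used here as the unique key
	Title    string
	Category string
}

// Store is a hypothetical backend interface covering the operations
// mentioned in this issue: insert, text search, and distinct values.
type Store interface {
	Insert(t Torrent) error
	Search(query string) ([]Torrent, error)
	Categories() ([]string, error)
}

// memStore is an in-memory stand-in for a real database backend.
type memStore struct {
	byHash map[string]Torrent
}

func newMemStore() *memStore {
	return &memStore{byHash: make(map[string]Torrent)}
}

// Insert rejects a torrent whose hash is already stored.
func (m *memStore) Insert(t Torrent) error {
	if _, dup := m.byHash[t.Btih]; dup {
		return fmt.Errorf("torrent %s already exists", t.Btih)
	}
	m.byHash[t.Btih] = t
	return nil
}

// Search does a case-insensitive substring match on the title and
// returns the matches sorted by hash.
func (m *memStore) Search(query string) ([]Torrent, error) {
	q := strings.ToLower(query)
	var out []Torrent
	for _, t := range m.byHash {
		if strings.Contains(strings.ToLower(t.Title), q) {
			out = append(out, t)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Btih < out[j].Btih })
	return out, nil
}

// Categories returns the distinct category values in sorted order.
func (m *memStore) Categories() ([]string, error) {
	seen := make(map[string]bool)
	var out []string
	for _, t := range m.byHash {
		if !seen[t.Category] {
			seen[t.Category] = true
			out = append(out, t.Category)
		}
	}
	sort.Strings(out)
	return out, nil
}

func main() {
	var s Store = newMemStore() // callers depend only on the interface
	samples := []Torrent{
		{Btih: "a1", Title: "Ubuntu 14.10 Desktop", Category: "Software"},
		{Btih: "b2", Title: "Debian 7.8 netinst", Category: "Software"},
		{Btih: "c3", Title: "Public Domain Film", Category: "Movies"},
	}
	for _, t := range samples {
		if err := s.Insert(t); err != nil {
			fmt.Println("insert failed:", err)
		}
	}
	hits, _ := s.Search("ubuntu")
	cats, _ := s.Categories()
	fmt.Println(len(hits), cats)
}
```

With this shape, a SQLite-backed or MongoDB-backed type would only need to implement the same three methods, and the rest of the program would not have to change.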

Yamakaky avatar Jan 19 '15 14:01 Yamakaky

Your "market target" seems to be personal use, so I think SQLite is fine.

Yamakaky avatar Jan 19 '15 14:01 Yamakaky

Maybe this will work: https://github.com/HouzuoGuo/tiedot

Just brainstorming here (and yes, it's for golang!)

nwgat avatar Jan 19 '15 23:01 nwgat

I saw that before and I got excited, but I'm not sure if it has loose matching search. From what I can tell it looks like if you wanted to find a torrent, you'd have to type the exact title in. Maybe I'm wrong?
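
To make the distinction concrete, here is a minimal sketch of what "loose matching" could mean: a tiny token-based inverted index where a query matches any title containing all of its words, regardless of case or punctuation. This is a stdlib-only illustration, not how tiedot or any other database mentioned here works, and the names `Index`, `Add`, `Search`, and `tokenize` are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"unicode"
)

// tokenize lowercases s and splits it on anything that is not a
// letter or a digit, so "Ubuntu.14.10" becomes ["ubuntu", "14", "10"].
func tokenize(s string) []string {
	return strings.FieldsFunc(strings.ToLower(s), func(r rune) bool {
		return !unicode.IsLetter(r) && !unicode.IsDigit(r)
	})
}

// Index is a hypothetical in-memory inverted index over titles.
type Index struct {
	titles   []string
	postings map[string]map[int]bool // token -> set of title IDs
}

func NewIndex() *Index {
	return &Index{postings: make(map[string]map[int]bool)}
}

// Add stores a title and records each of its tokens in the index.
func (ix *Index) Add(title string) {
	id := len(ix.titles)
	ix.titles = append(ix.titles, title)
	for _, tok := range tokenize(title) {
		if ix.postings[tok] == nil {
			ix.postings[tok] = make(map[int]bool)
		}
		ix.postings[tok][id] = true
	}
}

// Search returns the titles that contain every token of the query,
// in the order they were added.
func (ix *Index) Search(query string) []string {
	toks := tokenize(query)
	if len(toks) == 0 {
		return nil
	}
	var ids []int
	for id := range ix.postings[toks[0]] {
		match := true
		for _, tok := range toks[1:] {
			if !ix.postings[tok][id] {
				match = false
				break
			}
		}
		if match {
			ids = append(ids, id)
		}
	}
	sort.Ints(ids)
	out := make([]string, 0, len(ids))
	for _, id := range ids {
		out = append(out, ix.titles[id])
	}
	return out
}

func main() {
	ix := NewIndex()
	ix.Add("Ubuntu.14.10.Desktop.amd64")
	ix.Add("Debian 7.8 netinst")
	ix.Add("ubuntu-server-14.04")

	// A partial, differently cased query still finds the full title.
	fmt.Println(ix.Search("ubuntu desktop"))
	fmt.Println(ix.Search("UBUNTU"))
}
```

An exact-match lookup would require typing the whole title, whereas this kind of token index only needs a few words from it. Real full-text engines add stemming, ranking, and on-disk storage on top of the same basic idea.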

stephen304 avatar Jan 19 '15 23:01 stephen304

Maybe it would be wise to ask the developer about that; I'm unsure myself.

nwgat avatar Jan 20 '15 00:01 nwgat

Hopefully the average user won't intend to keep a historical database of every torrent in existence. BitCannon is meant more for a selection of recent (and non-dead) torrents.

But some of us plan to do just this...

ohhdemgirls avatar Jan 20 '15 00:01 ohhdemgirls

Yes, I realize that, which is why I'm not sure I will be able to find something as performant and easy to use as Mongo. I am working on a performance fix that should make keeping a historical copy of every torrent ever much more feasible.

I'm still working on SQLite on and off because I think it would be the most viable alternative even if it isn't as fast. If SQLite is horribly slow for you data hoarders, I may end up having two branches of BitCannon: the self-contained one and a high-performance one with MongoDB.

Only time will tell.

stephen304 avatar Jan 20 '15 00:01 stephen304

I agree with @ohhdemgirls: a historical record :P

nwgat avatar Jan 20 '15 01:01 nwgat

Have you looked at UnQLite (embedded NoSQL by the guys from SQLite)? http://unqlite.org/ It is mostly for C/C++, though, but there are some golang bindings in this repo: https://github.com/nobonobo/unqlitego

hellst0rm avatar Jan 21 '15 20:01 hellst0rm

UnQLite looks interesting. Do you happen to know if it wraps C++ or if it's pure Go?

I tried another database that was just a C++ wrapper, and it ended up not being able to cross-compile.

I'll be working on BitCannon later tonight to hopefully get manual tracker refresh done, and probably a bit of work on this too.

stephen304 avatar Jan 21 '15 22:01 stephen304

UnQLite is written completely in ANSI C.

UnQLiteGo is just wrapper code, three files in total: UnQLite.h, UnQlite.c and UnQLite.go

There is also a NodeJS UnQLite package but that is only for the Key/Value Store.

hellst0rm avatar Jan 22 '15 08:01 hellst0rm

@Stephen304 I was kicking around the idea of using Elasticsearch in a Docker container to accomplish this. Size limitations go away (until we're in the TBs of local space usage), and exporting/dumping the data out is stupid simple. Also, you can use a RESTful API to query/load the datastore. Thoughts?

brandongalbraith avatar Feb 12 '15 20:02 brandongalbraith

That could be a possibility. The only thing I worry about is installation complexity, which is one of the main reasons for this issue.

I may be able to look at this over the weekend, when I plan to make the next release with auto tracker scraping.

stephen304 avatar Feb 12 '15 21:02 stephen304

:+1:

brandongalbraith avatar Feb 12 '15 21:02 brandongalbraith

If you're still working on adding other backends, look at using http://www.blevesearch.com/ instead of Elasticsearch. It's pure Go and easily embeddable.

fortytw2 avatar Nov 11 '15 15:11 fortytw2