
[WIP] Added Cassandra-Backend

Open MichaelVIU opened this issue 9 years ago • 11 comments

Added Cassandra as a backend, based on the SQLAlchemy backend code.

MichaelVIU avatar Mar 30 '16 14:03 MichaelVIU

In general looks awesome! It definitely requires some small changes, but this is already a great contribution!

sibiryakov avatar Mar 31 '16 09:03 sibiryakov

So the tests are broken. We either have to find a way to test it with Travis CI, or disable this test for now.

sibiryakov avatar Mar 31 '16 09:03 sibiryakov

How can I trigger a re-test in Travis after I've made changes?

MichaelVIU avatar Apr 01 '16 11:04 MichaelVIU

OK, my fork now passes on Travis without errors: https://travis-ci.org/wpxgit/frontera/builds/120262078

MichaelVIU avatar Apr 02 '16 16:04 MichaelVIU

I'm marking this PR as WIP, meaning it's a work in progress and we shouldn't merge it yet. OK?

sibiryakov avatar Apr 04 '16 16:04 sibiryakov

Hey @wpxgit what do you think of all that? Do you plan to contribute more?

sibiryakov avatar Apr 22 '16 11:04 sibiryakov

Hello @sibiryakov @wpxgit, is there any plan to continue development?

bnopacheco avatar Nov 29 '17 19:11 bnopacheco

I haven't heard anything, @maisumbruno. At Scrapinghub we're fine with HBase so far.

sibiryakov avatar Nov 30 '17 12:11 sibiryakov

We are comfortable with how Cassandra works. If there are no plans to implement it, @sibiryakov, do you have any hints on how I could do this myself?

bnopacheco avatar Dec 04 '17 15:12 bnopacheco

@maisumbruno Definitely. I would recommend taking inspiration from HBaseBackend, where you can find a queue suitable for large-scale crawling. You can also implement it in parts: say, States first, then Queue, and Metadata if needed. You can send a PR any time and I'll have a look.

But you know, the most important part is battle testing: at large volumes, storage starts to work slower, and this often requires refactoring, schema changes, or various optimizations.
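For illustration only, here is a rough sketch of what "States first" could look like. It is not Frontera's actual API and not the code from this PR: `KeyValueStore`, `InMemoryStore`, `SketchStates`, and the state constants are all hypothetical names, and the in-memory store merely stands in for a Cassandra table so the example runs without a cluster.

```python
from typing import Dict, Iterable, Protocol, Set


class KeyValueStore(Protocol):
    """Hypothetical storage interface; a real backend would wrap a Cassandra session."""

    def put_many(self, rows: Dict[str, int]) -> None: ...

    def get_many(self, keys: Iterable[str]) -> Dict[str, int]: ...


class InMemoryStore:
    """Stand-in for a Cassandra table (assumption: fingerprint -> state rows)."""

    def __init__(self) -> None:
        self._rows: Dict[str, int] = {}

    def put_many(self, rows: Dict[str, int]) -> None:
        self._rows.update(rows)

    def get_many(self, keys: Iterable[str]) -> Dict[str, int]:
        return {k: self._rows[k] for k in keys if k in self._rows}


class SketchStates:
    """Step 1 of the staged approach: per-URL crawl state with a local cache."""

    NOT_CRAWLED = 0  # hypothetical constants, not Frontera's values
    CRAWLED = 1

    def __init__(self, store: KeyValueStore) -> None:
        self._store = store
        self._cache: Dict[str, int] = {}
        self._dirty: Set[str] = set()

    def set_state(self, fingerprint: str, state: int) -> None:
        # Writes go to the cache first and are persisted on flush().
        self._cache[fingerprint] = state
        self._dirty.add(fingerprint)

    def get_state(self, fingerprint: str) -> int:
        # Unknown fingerprints are treated as not yet crawled.
        return self._cache.get(fingerprint, self.NOT_CRAWLED)

    def fetch(self, fingerprints: Iterable[str]) -> None:
        # Load only the fingerprints that are not cached yet, in one batch.
        missing = [f for f in fingerprints if f not in self._cache]
        self._cache.update(self._store.get_many(missing))

    def flush(self) -> None:
        # Persist only the entries changed since the last flush.
        self._store.put_many({f: self._cache[f] for f in self._dirty})
        self._dirty.clear()
```

Keeping storage behind a small interface like this means the Queue and Metadata parts can be added later against the same store, and the schema can change during battle testing without touching the calling code.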

sibiryakov avatar Dec 05 '17 09:12 sibiryakov

Thanks @sibiryakov

bnopacheco avatar Dec 05 '17 17:12 bnopacheco