
Consolidate indexes. Discard legacy stores and use CDXServer

Open kris-sigur opened this issue 9 years ago • 5 comments

The CDX server should become the default option, and the other search result sources (CDX and BDB) should be discontinued.

The CDX server is already fully functional so this is largely a case of changing the defaults, making sure that it is easy to get up and running with CDX server and updating documentation.

The goal here is, to a large extent, separation of concerns. The CDX server (which should perhaps be renamed, since its use of CDX files is incidental) should be solely responsible for translating URL+Timestamp searches into results. The OpenWayback webapp should focus on presenting the results of those searches.

We would like to maintain compatibility with the equivalent separation in pywb.
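To make the split concrete, here is a minimal sketch of how a front end might turn a URL+Timestamp lookup into a request to a separate CDX server. The base address and the helper name `build_cdx_query` are hypothetical, and the parameter names (`url`, `closest`, `sort`, `limit`) are assumptions modeled on the pywb-style CDX server query API; check them against the server you actually deploy.

```python
from urllib.parse import urlencode


def build_cdx_query(base, url, timestamp=None, limit=None):
    """Build a query URL for a CDX server (hypothetical helper).

    Parameter names are assumed, modeled on the pywb-style CDX API.
    """
    params = {"url": url}
    if timestamp is not None:
        # Ask for captures ordered by distance from the timestamp.
        params["closest"] = timestamp
        params["sort"] = "closest"
    if limit is not None:
        params["limit"] = limit
    # urlencode percent-encodes the target URL so it survives as one value.
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    # Hypothetical local endpoint; the webapp would fetch this URL and
    # only render whatever result lines come back.
    print(build_cdx_query("http://localhost:8080/cdx",
                          "example.com/", "20150528150500", limit=1))
```

With this shape the webapp never touches index files directly, so the index backend could change without affecting the presentation layer.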

kris-sigur avatar May 28 '15 15:05 kris-sigur

As part of this we should move entirely to SURT-ordered CDXs: make them the default for CDX generation etc., and warn when loading non-SURT CDXs.
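For readers unfamiliar with SURT keys, the sketch below shows the basic idea (host labels reversed and comma-joined, then `)` and the path) together with one possible heuristic for the "warn on non-SURT" check. Both `to_surt` and `looks_like_surt` are hypothetical helpers, not OpenWayback code, and the transformation is deliberately simplified: real canonicalizers also do things like stripping `www.` and normalizing query strings.

```python
from urllib.parse import urlsplit


def to_surt(url):
    """Simplified SURT-style key: reversed host labels, ')' and the path."""
    if "://" not in url:
        url = "http://" + url  # urlsplit needs a scheme to find the host
    parts = urlsplit(url)
    host = parts.hostname or ""  # hostname is already lowercased
    key = ",".join(reversed(host.split("."))) + ")" + (parts.path or "/")
    if parts.query:
        key += "?" + parts.query
    return key


def looks_like_surt(cdx_line):
    """Heuristic: in a SURT key the host part ends with ')'."""
    urlkey = cdx_line.split(" ", 1)[0]   # first CDX field is the URL key
    host_part = urlkey.split("/", 1)[0]  # everything before the path
    return host_part.endswith(")")


if __name__ == "__main__":
    print(to_surt("http://www.example.com/path?a=1"))
    sample = "example.com/path 20150528150500 http://example.com/path"
    if not looks_like_surt(sample):
        print("warning: CDX does not appear to be SURT-ordered")
```

A loader could run such a check on the first non-header line of each CDX file and log a warning rather than refusing to load it.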

kris-sigur avatar Oct 01 '15 15:10 kris-sigur

Make sure that the CDX Server supports ZipNum clusters for compressed CDXs.

kris-sigur avatar Oct 01 '15 15:10 kris-sigur

Is this still relevant? I'm asking because I'm starting a new project and was going to use CDXCollections. Is it deprecated? Should I start with CDX Server?

eleclerc avatar Feb 11 '16 19:02 eleclerc

I don't know that it has been formally deprecated at this point, but it continues to sound like CDX Server will be required for OpenWayback 3.0, so it would probably be a good idea to start using CDX Server if you are starting something new.

ldko avatar Feb 11 '16 20:02 ldko

@johnerikhalse is working on CDX Server. At Bibliotheca Alexandrina, we are still using CDXCollection.xml and it is working well.

MohammedElsayyed avatar Feb 14 '16 08:02 MohammedElsayyed