openwayback
Improved Canonicalization
The current status of Wayback's canonicalization is documented here.
This should be amended to resemble Heritrix's approach, i.e. a configurable series of steps, broadly outlined here. Ideally, Wayback and Heritrix would share the same codebase for this.
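To illustrate what "a configurable series of steps" could look like, here is a minimal sketch in Python. It is not the Heritrix or Wayback implementation: the step functions (`lowercase_host`, `strip_www`, `strip_fragment`, `sort_query`) and the `canonicalize` helper are hypothetical names chosen for this example, and the rules themselves are only plausible stand-ins.

```python
from urllib.parse import urlsplit, urlunsplit

# Each step is a plain function from URL string to URL string.
# The rules below are illustrative assumptions, not Heritrix's actual rule set.

def lowercase_host(url):
    """Lowercase the network location (host and port) part of the URL."""
    p = urlsplit(url)
    return urlunsplit(p._replace(netloc=p.netloc.lower()))

def strip_www(url):
    """Drop a leading 'www.' from the network location (case-sensitive match)."""
    p = urlsplit(url)
    netloc = p.netloc[4:] if p.netloc.startswith("www.") else p.netloc
    return urlunsplit(p._replace(netloc=netloc))

def strip_fragment(url):
    """Remove the '#fragment' part."""
    return urlunsplit(urlsplit(url)._replace(fragment=""))

def sort_query(url):
    """Sort the '&'-separated query parameters alphabetically."""
    p = urlsplit(url)
    query = "&".join(sorted(p.query.split("&"))) if p.query else ""
    return urlunsplit(p._replace(query=query))

def canonicalize(url, steps):
    """Apply the configured steps in order."""
    for step in steps:
        url = step(url)
    return url

# The "configuration" is just an ordered list of steps.
STEPS = [lowercase_host, strip_www, strip_fragment, sort_query]

print(canonicalize("http://WWW.Example.com/Path?b=2&a=1#top", STEPS))
```

Because the configuration is an ordered list, the order of steps matters: in this sketch `strip_www` only matches a lowercase prefix, so it has to run after `lowercase_host`.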
Heritrix and Wayback will need to differ on one element of canonicalization: Wayback will need to handle canonicalization rules that change over time.
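One possible way to handle rules that change over time is to keep a dated history of rule sets and pick the one that was in effect at a given date. The sketch below assumes this approach; the page itself does not prescribe one. `RULE_HISTORY`, `rules_at`, `canonicalize_at` and `candidate_keys` are hypothetical names, the dates are invented, and the two rules are deliberately crude.

```python
from bisect import bisect_right
from datetime import date

# Deliberately crude example rules (assumptions for illustration only).
def lowercase(url):
    return url.lower()  # lowercases the whole URL, including the path

def strip_www(url):
    return url.replace("://www.", "://", 1)

# Hypothetical history: (date the rule set took effect, ordered steps),
# kept sorted by date.
RULE_HISTORY = [
    (date(2010, 1, 1), [lowercase]),
    (date(2014, 6, 1), [lowercase, strip_www]),
]

def rules_at(when):
    """Return the rule set in effect on the given date."""
    dates = [d for d, _ in RULE_HISTORY]
    i = bisect_right(dates, when) - 1
    if i < 0:
        raise ValueError("no rule set in effect on %s" % when)
    return RULE_HISTORY[i][1]

def canonicalize_at(url, when):
    """Canonicalize using the rules that applied on the given date."""
    for step in rules_at(when):
        url = step(url)
    return url

def candidate_keys(url):
    """Every distinct key the URL could have been stored under, oldest rules first."""
    keys = []
    for _, steps in RULE_HISTORY:
        key = url
        for step in steps:
            key = step(key)
        if key not in keys:
            keys.append(key)
    return keys

print(canonicalize_at("http://WWW.Example.com/", date(2012, 1, 1)))
print(candidate_keys("http://WWW.Example.com/"))
```

In this sketch, a lookup would try each key from `candidate_keys`, so that entries indexed under older rules can still be found without re-indexing. Whether that or re-indexing is preferable is left open here.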