Add tests

Open Lukas0907 opened this issue 8 years ago • 1 comments

What primarily can go wrong (and already did in the past):

1. While trying to refine content extraction for a spider, a regression is introduced. E.g. the profil.at website has different templates for the author. Supporting all of them is a prime use case for tests.
2. Sites change and thus the spiders don't work any more.

Ad 1: This can be tested by creating a corpus of articles and running integration tests against it. Ad 2: This only makes sense to test against live websites, which is harder since some sources don't preserve content indefinitely.
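
To make the corpus idea from Ad 1 more concrete, here is a minimal sketch of what such a regression test could look like. Everything in it is hypothetical: `extract_author`, the two author templates and the corpus entries are made up for illustration and are not taken from the actual spiders. A real corpus would more likely consist of saved article pages on disk, each paired with its expected extraction result.

```python
import re

# Hypothetical author templates; the real ones for a site like profil.at
# would have to be taken from saved article pages.
AUTHOR_PATTERNS = [
    re.compile(r'<span class="author">\s*([^<]+?)\s*</span>'),
    re.compile(r'<meta name="author" content="([^"]+)"'),
]


def extract_author(html):
    """Hypothetical extractor: try each known template in turn."""
    for pattern in AUTHOR_PATTERNS:
        match = pattern.search(html)
        if match:
            return match.group(1).strip()
    return None


# Hypothetical corpus: (saved page snippet, expected author).
CORPUS = [
    ('<span class="author"> Jane Doe </span>', "Jane Doe"),
    ('<meta name="author" content="John Roe">', "John Roe"),
    ("<p>No byline here</p>", None),
]


def run_corpus(corpus=CORPUS):
    """Return (html, expected, actual) for every entry that mismatches."""
    failures = []
    for html, expected in corpus:
        actual = extract_author(html)
        if actual != expected:
            failures.append((html, expected, actual))
    return failures


if __name__ == "__main__":
    failures = run_corpus()
    for html, expected, actual in failures:
        print(f"expected {expected!r}, got {actual!r} for {html!r}")
    raise SystemExit(1 if failures else 0)
```

Whenever a new template shows up, a snippet for it would be added to the corpus, so refining the extraction for one template can no longer silently break another.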

Testing the glue code, as well as the code for feed generation, caching, etc., would also be nice. However, content extraction has the highest priority since it can regress easily and such regressions are not immediately noticeable.

May 10 '17 13:05 Lukas0907