scrapemark
custom filters
Reported by toshiba13, Jun 23, 2009
A new encoding filter might be useful for international languages, for example:
    if f == 'utf-8':
        if issubclass(type(s), basestring):
            s = s.encode('utf8')
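The snippet above is Python 2 (it relies on basestring). As a rough sketch only, a standalone Python 3 version of the same idea could look like the following. The name encode_filter and its encoding parameter are hypothetical and are not part of scrapemark.

```python
def encode_filter(value, encoding="utf-8"):
    """Encode text to bytes; pass any non-text value through unchanged.

    Hypothetical helper for illustration, not scrapemark's own code.
    """
    if isinstance(value, str):
        return value.encode(encoding)
    return value
```

Values that are not text (numbers, None, bytes) are returned as they are, mirroring the type check in the original snippet.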
Could one supply a custom filter as a function?
Could a base href be added to URLs automatically, within a filter?
This is what I am trying, and this time I liked it. Sorry I did not elaborate, but my English is very bad; I am using Google Translate.
Thanks
Comment 1 by project member adamrshaw, Oct 22, 2009
yeah, this would be cool if you could pass an optional parameter to the scrape function. a dictionary of filter names -> filter functions
thanks
ps- adding a base href to a url is already possible with the 'abs' filter (though it might be better to rename this to be the 'url' filter; i'll think about it)
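For readers unfamiliar with what resolving a link against a base href means, here is a small sketch using only the Python 3 standard library. It is not the implementation of scrapemark's 'abs' filter; the function name abs_url is made up for this example.

```python
from urllib.parse import urljoin


def abs_url(base_href, link):
    """Resolve a possibly relative link against a base URL.

    Hypothetical helper for illustration only.
    """
    return urljoin(base_href, link)


base = "http://example.com/a/b.html"
relative = abs_url(base, "c.html")        # sibling of b.html
rooted = abs_url(base, "/x")              # relative to the site root
already_abs = abs_url(base, "http://other.org/y")  # left unchanged
```

Relative links are resolved against the directory of the base URL, root-relative links against its host, and links that are already absolute come back untouched.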
Done in an ugly way in my fork:
    data = ['{{ foo|adda }}', '<a>hello</a>', {'foo':'helloa'}, {'processors':{'adda':self.adda} }]

    def adda(self, string):
        return string + "a"

    def assertScrape(self, pattern, input, output, kwargs={}):
        return self.assertEqual(scrape(pattern, html=input, **kwargs), output)
Currently the processors work on the data after the HTML has been removed.
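The idea in this thread (a dictionary of filter names -> filter functions, applied once the markup is gone) can be sketched on its own roughly as follows. This is only an illustration under assumptions: apply_filters and strip_tags are hypothetical names, the tag stripping is a naive regular expression, and none of it is taken from scrapemark or from the fork.

```python
import re


def strip_tags(fragment):
    """Naively remove anything that looks like an HTML tag."""
    return re.sub(r"<[^>]+>", "", fragment)


def apply_filters(raw, filter_names, processors=None):
    """Strip markup, then run the named filters in order.

    `processors` maps a filter name to a one-argument function.
    Hypothetical sketch, not scrapemark's API.
    """
    processors = processors or {}
    value = strip_tags(raw)
    for name in filter_names:
        if name not in processors:
            raise ValueError("unknown filter: %r" % name)
        value = processors[name](value)
    return value


def adda(string):
    """Same toy filter as in the fork's test: append an "a"."""
    return string + "a"


result = apply_filters("<a>hello</a>", ["adda"], {"adda": adda})
```

With the same input as the fork's test data, the filter receives "hello" (tags already stripped) and returns "helloa".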