scrapyrt
How to access meta values?
Hi,
I'm doing this request:
curl -XPOST -d '{"spider_name": "quotes", "start_requests": true, "request": {"meta": {"test": "1"}}}' "http://138.219.228.215:9080/crawl.json"
Then I try to access from my spider by print(response.meta) and this is what it shows:
{'depth': 0, 'download_latency': 0.03323054313659668, 'download_slot': 'URL', 'download_timeout': 180.0}
Of course, response.meta["test"] raises a KeyError.
I need this "test" parameter to fill in the form request.
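For debugging, note that response.meta behaves like a plain dict, so .get avoids the KeyError while the key is missing. A small sketch using the exact meta dict shown above:

```python
# The meta dict the spider actually received (no "test" key present)
meta = {'depth': 0, 'download_latency': 0.03323054313659668,
        'download_slot': 'URL', 'download_timeout': 180.0}

# meta["test"] would raise KeyError; .get returns None instead
value = meta.get("test")
print(value)
```

This only makes the failure graceful; it doesn't solve the underlying problem of the meta not arriving.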
EDIT: spider : https://pastebin.com/EFz818qL
thanks!
Have you been able to solve this issue? Can you share the method?
Indeed, that doesn't work; I'm going to fix it.
edit: it only happens when you set start_requests: true and provide a request object with meta. In this case the meta is not passed to the spider.
This workaround might help you for now: you can patch scrapyrt to take parameters from the JSON body directly, i.e. not nested under the meta key, but as top-level keys. This requires modifying resources.py. The tl;dr version: paste
crawler_params = api_params.copy()
for api_param in ['max_requests', 'start_requests', 'spider_name', 'url']:
    crawler_params.pop(api_param, None)
kwargs.update(crawler_params)
below the first try/except block in the prepare_crawl method, so it looks as in the attached screenshot.
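In isolation, the snippet just strips scrapyrt's own control keys and forwards everything else to the spider as keyword arguments. A quick sketch with illustrative values (the api_params dict here is only a stand-in for what scrapyrt parses from the request body):

```python
# Stand-in for the parsed JSON body of the API request
api_params = {"spider_name": "quotes", "start_requests": True, "test": "1"}
kwargs = {}

# Keep everything except scrapyrt's own control parameters
crawler_params = api_params.copy()
for api_param in ['max_requests', 'start_requests', 'spider_name', 'url']:
    crawler_params.pop(api_param, None)
kwargs.update(crawler_params)

print(kwargs)  # only the spider-bound keys remain, e.g. {'test': '1'}
```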
After that when you set attributes in spider constructor like this:
def __init__(self, name=name, **kwargs):
    super().__init__(name=name, **kwargs)
    # store every GET/POST argument as a spider attribute
    for k, v in kwargs.items():
        setattr(self, k, v)
the test parameter should be available as self.test.
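To see the pattern end to end without running scrapy, here is a minimal sketch in which BaseSpider is a hypothetical stand-in for scrapy.Spider, just to show how the extra kwargs become attributes:

```python
class BaseSpider:
    # stand-in for scrapy.Spider's constructor
    def __init__(self, name=None, **kwargs):
        self.name = name

class QuotesSpider(BaseSpider):
    name = "quotes"

    def __init__(self, name=name, **kwargs):
        super().__init__(name=name)
        # store every extra GET/POST argument as a spider attribute
        for k, v in kwargs.items():
            setattr(self, k, v)

# With the resources.py patch, "test" arrives here as a keyword argument
spider = QuotesSpider(test="1")
print(spider.test)  # "1"
```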
I'm not sure, but I think I found this solution in an open PR. @pawelmhm, maybe it is a good idea to merge it? I have been using this patch for about half a year and it seems to work fine.