Why the Python Scrapy library class isn't executed





I'm trying to run Scrapy as a function on IBM Cloud. My __main__.py is as follows:




import scrapy
from scrapy.crawler import CrawlerProcess


class AutoscoutListSpider(scrapy.Spider):
    name = "vehicles list"

    def __init__(self, params, *args, **kwargs):
        super(AutoscoutListSpider, self).__init__(*args, **kwargs)
        make = params.get("make", None)
        model = params.get("model", None)
        mileage = params.get("mileage", None)

        init_url = "https://www.autoscout24.be/nl/resultaten?sort=standard&desc=0&ustate=N%2CU&size=20&page=1&cy=B&mmvmd0={0}&mmvmk0={1}&kmto={2}&atype=C&".format(
            model, make, mileage)
        self.start_urls = [init_url]

    def parse(self, response):
        # Get total result on list load
        init_total_results = int(response.css('.cl-filters-summary-counter::text').extract_first().replace('.', ''))
        if init_total_results > 400:
            yield {"message": "There are MORE then 400 results"}
        else:
            yield {"message": "There are LESS then 400 results"}


def main(params):
    process = CrawlerProcess()
    try:
        process.crawl(AutoscoutListSpider, params)
        process.start()
        return {"Success ": "The crawler (make: {0}, model: {1}, mileage: {2}) is successfully executed.".format(
            params['make'], params['model'], params['mileage'])}
    except Exception as e:
        return {"Error ": e, "params ": params}



The whole process to add this function is as follows:


1. zip -r ascrawler.zip __main__.py common.py

2. ibmcloud wsk action create ascrawler --kind python:3 ascrawler.zip

3. ibmcloud wsk action invoke --blocking --result ascrawler --param make 9 --param model 1624 --param mileage 2500



After executing step three, I get the following result:


{"Success ": "The crawler (make: 9, model: 1624, mileage: 2500) is successfully executed."}



So I don't get any errors, but the AutoscoutListSpider class doesn't seem to be executed at all. Why?

It should also return {"message": "There are MORE then 400 results"}. Any idea?



When I run it from the Python console as follows:


main({"make":"9", "model":"1624", "mileage":"2500"})



it returns the correct result:


{"message": "There are MORE then 400 results"}
{"Success ": "The crawler (make: 9, model: 1624, mileage: 2500) is successfully executed."}





Thanks for the detailed write-up, I'll take a look at this tomorrow.
– James Thomas
Jun 28 at 17:06





@JamesThomas Ok. Thanks
– Boky
Jun 29 at 5:05





Good news, everything is working! That information is written to the console logs, not returned from the function.
– James Thomas
Jun 29 at 8:34




1 Answer



{"message": "There are MORE then 400 results"} is available in the activation logs for the invocation, not the action result.
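The distinction can be illustrated without Scrapy or IBM Cloud. The sketch below is only an analogy under stated assumptions: `action` is a hypothetical stand-in for the cloud function, and a standard-library logger plays the role of Scrapy's log output. What the function logs and what it returns travel through two separate channels, so the logged message never shows up in the returned dict.

```python
import io
import logging


def action(params, log_stream):
    """Hypothetical stand-in for an action: logs one detail, returns a separate result."""
    logger = logging.getLogger("demo_action")
    logger.setLevel(logging.DEBUG)
    handler = logging.StreamHandler(log_stream)
    logger.addHandler(handler)
    try:
        # This goes to the log stream only, much like an item yielded by a spider.
        logger.debug({"message": "There are MORE then 400 results"})
        # This is the only thing the caller receives as the result.
        return {"Success": "done for make {0}".format(params["make"])}
    finally:
        logger.removeHandler(handler)


log_stream = io.StringIO()
result = action({"make": "9"}, log_stream)
print("result:", result)
print("logs:", log_stream.getvalue().strip())
```

In the question's code, main plays the role of `action`: it returns only the "Success" dict, while the item yielded by parse ends up in the log output.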





Once you have run the ibmcloud wsk action invoke command, retrieve the activation identifier for the previous invocation.




$ ibmcloud wsk activation list
activations
d13bd19b196d420dbbd19b196dc20d59 ascrawler
...



This activation identifier can then be used to retrieve all console logs from stdout and stderr written during the invocation.


$ ibmcloud wsk activation logs d13bd19b196d420dbbd19b196dc20d59 | grep LESS
2018-06-29T08:27:11.094873294Z stderr: {'message': 'There are LESS then 400 results'}
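If the scraped items are needed in the action result itself rather than in the logs, one possible approach (not part of the original answer) is to collect them through Scrapy's `item_scraped` signal and return them from main. The following is a minimal sketch under that assumption; the helper names `run_and_collect` and `build_result` are hypothetical, and the spider class is passed in as an argument so the block stays self-contained.

```python
def build_result(params, items):
    """Combine the input params and the scraped items into one JSON-friendly dict."""
    return {"params": dict(params), "items": [dict(item) for item in items]}


def run_and_collect(spider_cls, params):
    """Run spider_cls once and return the params together with every item it yields."""
    # Lazy imports: Scrapy is only required when a crawl is actually started.
    from scrapy import signals
    from scrapy.crawler import CrawlerProcess

    items = []

    def on_item_scraped(item, response, spider):
        # Called by Scrapy for each item that passes through the item pipelines.
        items.append(item)

    process = CrawlerProcess()
    crawler = process.create_crawler(spider_cls)
    crawler.signals.connect(on_item_scraped, signal=signals.item_scraped)
    process.crawl(crawler, params)  # params is forwarded to the spider's __init__
    process.start()  # blocks until the crawl finishes
    return build_result(params, items)
```

With this in place, main(params) in __main__.py could return run_and_collect(AutoscoutListSpider, params), so the {"message": ...} item would appear under "items" in the action result.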





