Scrapy Tutorial on Scrapy Web Services

Description

A running Scrapy web crawler can be controlled via JSON-RPC. This web service is enabled by the JSONRPC_ENABLED setting and provides access to the main crawler object via the JSON-RPC 2.0 protocol. The endpoint for accessing the crawler object is:

http://localhost:6080/crawler
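
As a rough illustration of what a JSON-RPC 2.0 call to this endpoint might look like, here is a minimal sketch that uses only the Python standard library. The helper names (build_jsonrpc_payload, call_crawler) and the method name "get_stats" are hypothetical placeholders chosen for the example; which methods are actually exposed depends on the running service.

```python
import json
import urllib.request

# Endpoint quoted in the text above.
CRAWLER_URL = "http://localhost:6080/crawler"


def build_jsonrpc_payload(method, params=None, request_id=1):
    """Return a JSON-RPC 2.0 request object as a plain dict."""
    payload = {"jsonrpc": "2.0", "method": method, "id": request_id}
    if params is not None:
        # "params" is optional in JSON-RPC 2.0, so it is only added when given.
        payload["params"] = params
    return payload


def call_crawler(method, params=None, url=CRAWLER_URL, timeout=5):
    """POST a JSON-RPC 2.0 request to the web service and decode the reply."""
    body = json.dumps(build_jsonrpc_payload(method, params)).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # "get_stats" is an assumed method name used only for illustration.
    print(json.dumps(build_jsonrpc_payload("get_stats"), indent=2))
```

Running the file only prints the request body; calling call_crawler("get_stats") would send it, and that requires a crawler to be running with the web service enabled.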

The following settings control the behavior of the web service:

1. JSONRPC_ENABLED
   A boolean that decides whether the web service, along with its extension, is enabled.
   Default value: True

2. JSONRPC_LOGFILE
   The file used for logging HTTP requests made to the web service. If it is not set, the standard Scrapy log is used.
   Default value: None

3. JSONRPC_PORT
   The port range for the web service. If it is set to None, the port is assigned dynamically.
   Default value: [6080, 7030]

4. JSONRPC_HOST
   The interface the web service should listen on.
   Default value: '127.0.0.1'
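
The settings above would normally go in a project's settings.py, where Scrapy setting names are written in uppercase. The fragment below is a sketch that simply restates the default values listed above; it assumes the web service extension itself is already installed and enabled for the project.

```python
# settings.py (fragment): web service settings with the defaults listed above.

# Turn the JSON-RPC web service on or off.
JSONRPC_ENABLED = True

# File for logging HTTP requests to the web service;
# None falls back to the standard Scrapy log.
JSONRPC_LOGFILE = None

# Port range for the web service; None means the port is assigned dynamically.
JSONRPC_PORT = [6080, 7030]

# Interface the web service listens on (local connections only here).
JSONRPC_HOST = "127.0.0.1"
```

Keeping JSONRPC_HOST at the loopback address limits access to the local machine, which is a sensible choice for a service that can control a running crawler.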