Description
To execute your spider, run the following command from within your first_scrapy directory:
scrapy crawl first
Here, first is the name given to the spider when it was created.
Once the spider has finished crawling, you should see output similar to the following:
2016-08-09 18:13:07-0400 [scrapy] info: scrapy started (bot: tutorial)
2016-08-09 18:13:07-0400 [scrapy] info: optional features available: ...
2016-08-09 18:13:07-0400 [scrapy] info: overridden settings: {}
2016-08-09 18:13:07-0400 [scrapy] info: enabled extensions: ...
2016-08-09 18:13:07-0400 [scrapy] info: enabled downloader middlewares: ...
2016-08-09 18:13:07-0400 [scrapy] info: enabled spider middlewares: ...
2016-08-09 18:13:07-0400 [scrapy] info: enabled item pipelines: ...
2016-08-09 18:13:07-0400 [scrapy] info: spider opened
2016-08-09 18:13:08-0400 [scrapy] debug: crawled (200) <get http://www.dmoz.org/computers/programming/languages/python/resources/> (referer: none)
2016-08-09 18:13:09-0400 [scrapy] debug: crawled (200) <get http://www.dmoz.org/computers/programming/languages/python/books/> (referer: none)
2016-08-09 18:13:09-0400 [scrapy] info: closing spider (finished)
As you can see in the output, there is one log line for each URL. The (referer: none) at the end of these lines indicates that the URLs are start URLs and therefore have no referrers. Next, you should see that two new files named books.html and resources.html have been created in your first_scrapy directory.
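The file names books.html and resources.html match the last path segment of each crawled URL. The following is a minimal sketch of how such names could be derived, assuming the spider's callback names each saved page after that segment. The helper filename_for is hypothetical and is not part of Scrapy; the spider's actual code is not shown in this section.

```python
from urllib.parse import urlparse


def filename_for(url: str) -> str:
    """Return '<last path segment>.html' for a URL (hypothetical helper)."""
    # Drop empty pieces so a trailing slash does not produce an empty name.
    segments = [part for part in urlparse(url).path.split("/") if part]
    return segments[-1] + ".html"


# The two start URLs that appear in the log output above.
start_urls = [
    "http://www.dmoz.org/computers/programming/languages/python/books/",
    "http://www.dmoz.org/computers/programming/languages/python/resources/",
]

for url in start_urls:
    print(filename_for(url))
```

Run on its own, this prints books.html and resources.html. Inside a spider, the same name could be used to open a file and write the response body to it. Note that the sketch assumes every URL has at least one path segment.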