Description
To execute your spider, run the following command within your first_scrapy directory −
scrapy crawl first
where first is the name of the spider specified while creating it.
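Besides the command line, you can also start the same crawl from a Python script using Scrapy's CrawlerProcess API. The following is a minimal sketch, assuming it is run from inside the first_scrapy project so that get_project_settings() can locate the spider by its name −

from scrapy.crawler import CrawlerProcess
from scrapy.utils.project import get_project_settings

# Load the project settings so the spider named "first" can be looked up.
process = CrawlerProcess(get_project_settings())
process.crawl("first")   # same spider name used by `scrapy crawl first`
process.start()          # blocks here until the crawl is finished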
Once the spider crawls, you can see the following output −
2016-08-09 18:13:07-0400 [scrapy] INFO: Scrapy started (bot: tutorial)
2016-08-09 18:13:07-0400 [scrapy] INFO: Optional features available: ...
2016-08-09 18:13:07-0400 [scrapy] INFO: Overridden settings: {}
2016-08-09 18:13:07-0400 [scrapy] INFO: Enabled extensions: ...
2016-08-09 18:13:07-0400 [scrapy] INFO: Enabled downloader middlewares: ...
2016-08-09 18:13:07-0400 [scrapy] INFO: Enabled spider middlewares: ...
2016-08-09 18:13:07-0400 [scrapy] INFO: Enabled item pipelines: ...
2016-08-09 18:13:07-0400 [scrapy] INFO: Spider opened
2016-08-09 18:13:08-0400 [scrapy] DEBUG: Crawled (200)
<GET http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/> (referer: None)
2016-08-09 18:13:09-0400 [scrapy] DEBUG: Crawled (200)
<GET http://www.dmoz.org/Computers/Programming/Languages/Python/Books/> (referer: None)
2016-08-09 18:13:09-0400 [scrapy] INFO: Closing spider (finished)
As you can see in the output, each URL has a log line ending in (referer: None), which indicates that these are start URLs and have no referrers. Next, you should see two new files named Books.html and Resources.html created in your first_scrapy directory.
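These files are written by the spider's parse() callback. The spider code itself is not repeated in this section; the sketch below shows roughly what it might look like, assuming the callback saves each response body to disk (the class name FirstSpider and the exact parse logic are assumptions, while the name attribute and start URLs match the crawl shown above) −

import scrapy

class FirstSpider(scrapy.Spider):
    name = "first"
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/",
    ]

    def parse(self, response):
        # "Books" or "Resources" is the second-to-last URL segment,
        # so the files saved are Books.html and Resources.html.
        filename = response.url.split("/")[-2] + ".html"
        with open(filename, "wb") as f:
            f.write(response.body)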