Error when downloading images with Scrapy: exceptions.ValueError?


items code:


from scrapy.item import Item, Field


class DmozItem(Item):
    pic = Field()      # image URL(s) scraped from the page
    pic_src = Field()  # local paths of the downloaded images

pipeline code:


import json
import scrapy
from scrapy.contrib.pipeline.images import ImagesPipeline
from scrapy.exceptions import DropItem
from scrapy.http import Request

class TutorialPipeline(ImagesPipeline):

    def get_media_requests(self, item, info):
        for image_url in item['pic']:
            yield Request(image_url)

    def item_completed(self, results, item, info):
        image_paths = [x['path'] for ok, x in results if ok]
        if not image_paths:
            raise DropItem("Item contains no images")
        item['pic_src'] = image_paths
        return item

settings code:


BOT_NAME = 'tutorial'

SPIDER_MODULES = ['tutorial.spiders']
NEWSPIDER_MODULE = 'tutorial.spiders'

IMAGES_MIN_HEIGHT = 50
IMAGES_MIN_WIDTH = 50
IMAGES_STORE = '/ChemPic/'   # same directory as pipelines.py
DOWNLOAD_TIMEOUT = 1200
ITEM_PIPELINES = ['scrapy.contrib.pipeline.images.ImagesPipeline',
'tutorial.pipelines.TutorialPipeline']
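
A side note on IMAGES_STORE: the comment says the folder sits in the same directory as pipelines.py, but '/ChemPic/' is an absolute path starting at the filesystem root. Below is a minimal sketch of building the path relative to the settings file instead, assuming settings.py and pipelines.py live in the same package directory (the default Scrapy project layout). The helper name `store_beside` and the example path are hypothetical and not part of Scrapy.

```python
import os


def store_beside(module_file, folder='ChemPic'):
    """Return an absolute path to `folder` in the same directory as `module_file`.

    Hypothetical helper for illustration; it is not part of Scrapy.
    """
    base_dir = os.path.dirname(os.path.abspath(module_file))
    return os.path.join(base_dir, folder)


# In settings.py this would be called with __file__; a literal path is used
# here (an assumed project location) so the sketch runs on its own.
IMAGES_STORE = store_beside('/home/user/tutorial/tutorial/settings.py')
print(IMAGES_STORE)
```

This is probably unrelated to the ValueError itself, but it would keep the downloaded images where the comment expects them.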

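Here, item['pic'] already contains the absolute URL of the image. If there is no image, the message "Item contains no images" is reported, as shown in the image below:

[Image]
If pic does contain an image URL, an error is raised instead, as shown in the image below:
[Image]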
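
Since the traceback image is not available, the exact message is unknown. One common way to hit a ValueError here, offered only as a guess, is that item['pic'] holds a single URL string rather than a list: `for image_url in item['pic']` then walks the string one character at a time, and a Scrapy Request built from a URL without a scheme (such as 'h') raises ValueError. The sketch below uses only the standard library to show the effect. The helpers `as_url_list` and `has_scheme` and the example URL are hypothetical.

```python
from urllib.parse import urlparse


def as_url_list(value):
    """Return `value` as a list of URL strings (hypothetical helper, not Scrapy API)."""
    if value is None:
        return []
    if isinstance(value, str):
        # A bare string would otherwise be iterated character by character.
        return [value]
    return list(value)


def has_scheme(url):
    """True when the URL carries an explicit scheme such as http or https."""
    return bool(urlparse(url).scheme)


pic = "http://example.com/a.png"   # assumed shape of item['pic']: one string

chars = [u for u in pic]           # what a plain for-loop over the string sees
print(chars[:4], has_scheme(chars[0]))

urls = as_url_list(pic)            # the whole URL, kept in one piece
print(urls, all(has_scheme(u) for u in urls))
```

If this is the cause, wrapping the value in a list in the spider, or normalizing it in get_media_requests before yielding requests, should avoid the error.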

scrapy images

Gaomon 11 years, 2 months ago
