CSV files are empty even though items are scraped from the site

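I need to dump the scraped items into two different CSV files. The spider scrapes the data, but the CSV files end up empty. Can anyone help with this? Below are the code from pipeline.py and the console log: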

Pipeline code:

# -*- coding: utf-8 -*-



# Define your item pipelines here
#
# Don't forget to add your pipeline to the ITEM_PIPELINES setting
# See: https://docs.scrapy.org/en/latest/topics/item-pipeline.html
from scrapy.exporters import CsvItemExporter
from scrapy import signals
from pydispatch import dispatcher



def item_type(item):
    # The CSV file names are used (imported) from the scrapy spider.
    return type(item)



class SecfilingsPipeline(object):
    fileNamesCsv = ['NonDerivatives','Derivatives']

    def __init__(self):
        self.files = {}
        self.exporters = {}
        dispatcher.connect(self.spider_opened, signal=signals.spider_opened)
        dispatcher.connect(self.spider_closed, signal=signals.spider_closed)

    def spider_opened(self, spider):
        # One binary file handle and one exporter per CSV file name.
        self.files = {name: open(name + '.csv', 'wb') for name in self.fileNamesCsv}
        for name in self.fileNamesCsv:
            self.exporters[name] = CsvItemExporter(self.files[name])

            if name == "NonDerivatives":
                print("File Name 1" + name)
                self.exporters[name].fields_to_export = ['TitleofSecurity','TransactionDate','TransactionCode','Amount','SecuritiesAcquirednDisposed','AmountOfSecurityOwned','OwnershipForm']
                self.exporters[name].start_exporting()

            if name == "Derivatives":
                print("File Name 2" + name)
                self.exporters[name].fields_to_export = ['TitleofDerivativeSecurity','TransactionDate','TransactionCode','SecuritiesAcquired','SecuritiesDisposed','TitleOfSecurity','Amount','AmountOfSecurityOwned','OwnershipForm']
                self.exporters[name].start_exporting()


    def spider_closed(self, spider):
        for exporter in self.exporters.values():
            exporter.finish_exporting()
        for f in self.files.values():
            f.close()


    def process_item(self, item, spider):
        typesItem = item_type(item)
        if typesItem in set(self.fileNamesCsv):
            self.exporters[typesItem].export_item(item)
        return item

I have also enabled the pipeline in settings.py.

Source: StackOverflow, /questions/59387321/csv-files-are-empty-even-items-are-scraped-from-site
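One possible cause, offered as a sketch and not a confirmed diagnosis: `item_type` returns the item's class object, while `fileNamesCsv` holds strings, so the membership test in `process_item` would never succeed and `export_item` would never be called. The snippet below compares the two lookups without Scrapy. The two classes are hypothetical stand-ins, assumed to be named after the CSV files; the real item definitions are not shown in the question.

```python
# Hypothetical stand-ins for the spider's item classes.
# Assumption: the real item classes are named like the CSV files.
class NonDerivatives(dict):
    pass


class Derivatives(dict):
    pass


file_names_csv = ['NonDerivatives', 'Derivatives']


def item_type(item):
    # As posted in the question: returns the class object itself.
    return type(item)


def item_type_name(item):
    # Alternative: returns the class name as a string.
    return type(item).__name__


item = NonDerivatives(TransactionCode='P')

# A class object never equals a string, so this lookup fails.
matches_as_posted = item_type(item) in set(file_names_csv)
# The class name is a string and can match a file name.
matches_by_name = item_type_name(item) in set(file_names_csv)

print(matches_as_posted, matches_by_name)  # prints: False True
```

If this is what is happening, returning `type(item).__name__` from `item_type` would produce keys that match `self.exporters`, provided the item class names really are `NonDerivatives` and `Derivatives`.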

kun坤 2019-12-25 21:35:47