Example: Parsing JavaScript Pages with Scrapy in Python

  • Date: 2020-04-14 02:29
Abstract: An example of parsing JavaScript with Scrapy in Python
The code is as follows:
from scrapy.contrib.spiders import CrawlSpider, Rule
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
from scrapy.selector import Selector
from selenium import selenium  # Selenium RC client
import time

class MySpider(CrawlSpider):
    name = 'cnbeta'
    allowed_domains = ['cnbeta.com']
    start_urls = ['http://www.1sucai.cn']
    rules = (
        # Extract links matching '/articles/*.htm' and follow them
        # (each matched page is handed to parse_page).
        Rule(SgmlLinkExtractor(allow=('/articles/.*\.htm', )),
             callback='parse_page', follow=True),
    )

    def __init__(self):
        CrawlSpider.__init__(self)
        self.verificationErrors = []
        # Connect to a Selenium RC server on localhost:4444 driving Firefox
        self.selenium = selenium("localhost", 4444, "*firefox",
                                 "http://www.1sucai.cn")
        self.selenium.start()

    def __del__(self):
        self.selenium.stop()
        print self.verificationErrors

    def parse_page(self, response):
        self.log('Hi, this is an item page! %s' % response.url)
        sel = Selector(response)
        from webproxy.items import WebproxyItem
        # Re-open the page in the browser so its JavaScript actually runs
        sel = self.selenium
        sel.open(response.url)
        sel.wait_for_page_to_load("30000")
        # Give any remaining scripts a moment to finish
        time.sleep(2.5)
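The Rule above decides which links the spider follows using the regex passed to the link extractor's allow argument. As a minimal sketch of how that filtering behaves (using plain re, independent of Scrapy; the helper name is ours, not part of the original code):

```python
import re

# The allow pattern from the Rule above; the link extractor applies it
# as an unanchored regex against each extracted URL.
ALLOW = re.compile(r'/articles/.*\.htm')

def is_article_link(url):
    """Return True if the URL would match the spider's allow pattern."""
    return ALLOW.search(url) is not None

print(is_article_link('http://www.1sucai.cn/articles/12345.htm'))  # True
print(is_article_link('http://www.1sucai.cn/category.php'))        # False
```

Only URLs matching the pattern reach parse_page; everything else on a page is ignored, which keeps the Selenium round-trip (the slow part) limited to article pages.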