[Help] The simplest possible crawler: no error is shown, but it produces no output

剩爪 posted on 2019-5-8 22:33:45
from lxml import etree
import requests
import csv
import time

def data_writer(item):
    # Append one row to qfang.csv; newline='' avoids blank lines between rows on Windows.
    with open('qfang.csv', 'a', encoding='utf-8', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(item)


def spider():
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.80 Safari/537.36'}
    pre_url = 'http://shenzhen.qfang.com/sale/f'
    # Request list pages f1 through f10, pausing one second after each request.
    for x in range(1, 11):
        html = requests.get(pre_url + str(x), headers=headers)
        time.sleep(1)
        selector = etree.HTML(html.text)
        # One <li> per listing; if nothing matches, this list is empty and the loop below never runs.
        house_list = selector.xpath("//*[@id='cycleListings']/ul/li")
        for house in house_list:
            # xiaoqu: estate name, huxing: layout, mianji: floor area, quyu: district, zongjia: total price
            xiaoqu = house.xpath("div[1]/p[1]/a/text()")[0]
            huxing = house.xpath("div[1]/p[2]/span[2]/text()")[0]
            mianji = house.xpath("div[1]/p[2]/span[4]/text()")[0]
            quyu = house.xpath("div[1]/p[3]/span[2]/a[1]/text()")[0]
            zongjia = house.xpath("div[2]/span[1]/text()")[0]
            item = [xiaoqu, huxing, mianji, quyu, zongjia]
            data_writer(item)
            print('Scraping', xiaoqu)

if __name__ == '__main__':
    spider()
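I'm a beginner and have been debugging this for several days. The print call shows nothing, no error is raised, and no data comes out.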
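A likely explanation for the silence, offered as a sketch rather than a confirmed diagnosis: if the XPath query for the listing container matches nothing, `selector.xpath(...)` returns an empty list, and a `for` loop over an empty list never runs its body, so neither `print` nor `data_writer` is called. The helper `require_matches` below is a hypothetical name, not part of the original script; it turns that silent case into a visible error.

```python
def require_matches(matches, description):
    """Return matches unchanged, or raise if the XPath query found nothing."""
    if not matches:
        raise ValueError(
            f"no nodes matched {description!r}; check the URL and the page source"
        )
    return matches


# An empty result behaves like this: the loop body is skipped without any error.
house_list = []  # stand-in for what selector.xpath(...) returns when nothing matches
visited = 0
for house in house_list:
    visited += 1  # never reached
print("rows visited:", visited)

# With the guard, the same situation becomes a visible error.
try:
    require_matches(house_list, "//*[@id='cycleListings']/ul/li")
except ValueError as exc:
    print("guard raised:", exc)
```

In the original script the guard would wrap the `house_list = selector.xpath(...)` line, so an empty page stops the run with a message instead of finishing quietly.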


你的一_LtH95 posted on 2019-5-17 13:49:47
The URL is wrong.
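One way to check whether the URL is the problem is to look at what the server actually returned before parsing it. The sketch below is an assumption-laden illustration: `summarize_response` and `fetch_and_summarize` are hypothetical names, and `cycleListings` is simply the container id taken from the XPath in the original post. A redirect, an unexpected final URL, or a body without that id would each explain an empty result.

```python
from types import SimpleNamespace


def summarize_response(resp):
    """Collect the fields that usually reveal a wrong URL, a redirect, or a block page."""
    return {
        "status": resp.status_code,
        "final_url": resp.url,
        "redirected": bool(resp.history),
        "body_length": len(resp.text),
        "has_list_container": "cycleListings" in resp.text,
    }


def fetch_and_summarize(url, headers=None):
    """Fetch url with requests (imported lazily) and summarize the response."""
    import requests

    resp = requests.get(url, headers=headers, timeout=10)
    return summarize_response(resp)


# Offline demonstration with a stand-in object; the values are invented for illustration.
fake = SimpleNamespace(
    status_code=200,
    url="https://example.com/other-page",
    history=["<Response [302]>"],
    text="<html><body>no listings here</body></html>",
)
print(summarize_response(fake))
```

Against the real site, calling `fetch_and_summarize(pre_url + '1', headers)` for the first page would show whether the request lands where the script expects.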
你的一_LtH95 posted on 2019-5-17 14:00:47
import requests


def spider():
    data = {
        'referer': '',
        'width': '1920',
        'height': '1080',
        'response_time': '49',
        'roomIds': '100563307,100574694,100574539,100569994,100567400,100567428,100562468,100563414,100567359,100573539,100563418,100568151,100555502,100568688,100563283,100574040,100553424,100567723,100574784,100561728,100569712,100564872,100575085,100570618,100573058,100574313,100575669,100573301,100573901,100562318',
    }
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/46.0.2490.80 Safari/537.36'}
    # pre_url = 'http://shenzhen.qfang.com/sale/f'
    url = 'https://m.qfang.com/shenzhen/sale/f/2'  # use this URL
    r = requests.post(url, data=data, headers=headers)
    print(r.text)
    # for x in range(1, 11):
    #     html = requests.get(pre_url + str(x), headers=headers)
    #     time.sleep(1)
    #     selector = etree.HTML(html.text)
    #     house_list = selector.xpath("//*[@id='cycleListings']/ul/li")
    #     for house in house_list:
    #         xiaoqu = house.xpath("div[1]/p[1]/a/text()")[0]
    #         huxing = house.xpath("div[1]/p[2]/span[2]/text()")[0]
    #         mianji = house.xpath("div[1]/p[2]/span[4]/text()")[0]
    #         quyu = house.xpath("div[1]/p[3]/span[2]/a[1]/text()")[0]
    #         zongjia = house.xpath("div[2]/span[1]/text()")[0]
    #         item = [xiaoqu, huxing, mianji, quyu, zongjia]
    #         data_writer(item)
    #         print('Scraping', xiaoqu)

if __name__ == '__main__':
    spider()
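If the printed response turns out to contain the listings and the commented-out parsing is re-enabled, two details may be worth hardening. This is a sketch under stated assumptions, not the poster's method: `first_text` and `write_rows` are hypothetical helpers, and the sample row is invented. Indexing with `[0]` raises `IndexError` when a field is missing from one listing, and reopening the CSV file for every row is avoidable.

```python
import csv
import io


def first_text(values, default=""):
    """Return the first XPath text result with whitespace stripped, or default if none."""
    return values[0].strip() if values else default


def write_rows(rows, csvfile):
    """Write all rows through one csv.writer instead of reopening the file per row."""
    writer = csv.writer(csvfile)
    writer.writerows(rows)


# Stand-in values shaped like house.xpath(...) results; the data is invented.
rows = [
    [first_text(["  Example Estate "]), first_text(["3 rooms"]), first_text([])],
]
buffer = io.StringIO()  # an in-memory file so the example needs no disk access
write_rows(rows, buffer)
print(buffer.getvalue())
```

With a real file, `write_rows(rows, open('qfang.csv', 'a', encoding='utf-8', newline=''))` inside a `with` block would match the options used by the original `data_writer`.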