Python crawler program hangs


import requests
import cchardet
import sqlite3
import time
import logging
from multiprocessing.pool import ThreadPool

def execute(url):
    total_time = time.time()
    try:
        # Time the download
        start_time = time.time()
        res = requests.get(url, timeout=1)
        download_time = time.time() - start_time

        # Time the encoding detection on the raw bytes
        start_time = time.time()
        res.encoding = cchardet.detect(res.content)['encoding']
        dencode_time = time.time() - start_time

    except:
        logging.warning(url)

pool = ThreadPool(12)
# links is a list of more than 60,000 URLs from different sites
pool.map(execute, links)
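Above is part of a crawler I wrote in Python 3. The program always hangs partway through a run. I traced the problem to the line res = requests.get(url,timeout=1), because once I remove that line the program runs happily to completion. I am new to Python and would be grateful if someone could explain what is causing this. links is a list holding more than 60,000 URLs from many different websites.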
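One possible contributing factor, offered as a guess rather than a diagnosis: the timeout argument of requests.get limits the connection attempt and each individual socket read, not the total download time, so a server that sends data very slowly can keep a worker thread busy far longer than 1 second. Below is a minimal sketch of an overall deadline. The names read_with_deadline, fetch and max_seconds are hypothetical and not part of the code above, and the deadline only starts once the response headers have arrived.

```python
import time


def read_with_deadline(chunks, max_seconds, clock=time.monotonic):
    """Join byte chunks; raise TimeoutError once the total time exceeds max_seconds.

    The deadline is checked between chunks, so each single read still has to be
    bounded separately (for example by the read timeout passed to requests).
    """
    deadline = clock() + max_seconds
    body = bytearray()
    for chunk in chunks:
        if clock() > deadline:
            raise TimeoutError("download exceeded %.1f s" % max_seconds)
        body.extend(chunk)
    return bytes(body)


def fetch(url, max_seconds=5.0):
    """Hypothetical wrapper: stream the body and stop at an overall deadline."""
    import requests  # imported here so the helper above needs only the stdlib

    # timeout=(connect, read): 3 s to connect, 1 s for each socket read
    with requests.get(url, timeout=(3, 1), stream=True) as res:
        return read_with_deadline(res.iter_content(chunk_size=8192), max_seconds)
```

In execute, the body returned by fetch(url) would take the place of res.content. The TimeoutError it raises would be caught by the existing except clause, so the URL would be logged like any other failure.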

python3.x python-crawler

寂寞的槑槑姬 9 years, 3 months ago
