Python crawler program hangs midway
import requests
import cchardet
import sqlite3
import time
import logging
from multiprocessing.pool import ThreadPool

def execute(url):
    total_time = time.time()  # start of overall timing (not used further in this excerpt)
    try:
        start_time = time.time()
        res = requests.get(url, timeout=1)
        download_time = time.time() - start_time
        start_time = time.time()
        # detect the encoding from the raw bytes, then decode accordingly
        res.encoding = cchardet.detect(res.content)['encoding']
        decode_time = time.time() - start_time
    except Exception:
        logging.warning(url)

pool = ThreadPool(12)
pool.map(execute, links)  # links: a list of 60,000+ URLs from many different sites
The above is part of a crawler I wrote in Python 3. The program always freezes partway through a run. I traced the problem to the line res = requests.get(url, timeout=1): if I remove that line, the program runs happily to completion. I'm new to Python, so I'd really appreciate an explanation of what is going on here. links is a list holding 60,000+ URLs from many different websites.
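(For context on the symptom: in requests, timeout=1 bounds the connection attempt and each individual socket read, not the total request time. A server that keeps trickling bytes, arriving less than a second apart, can therefore hold a worker thread far longer than one second. The sketch below is one way to enforce a hard wall-clock deadline by streaming the body in chunks; get_with_deadline, deadline, and chunk_size are illustrative names, not part of the original code.)

import time
import requests

def get_with_deadline(url, deadline=5.0, chunk_size=8192):
    # requests' timeout caps the connect phase and each single socket
    # read; it is NOT a cap on total request time. Streaming the body
    # lets us check a wall-clock deadline between chunks ourselves.
    start = time.time()
    # timeout=(connect, read): up to 3.05 s to connect, 1 s per read
    with requests.get(url, timeout=(3.05, 1), stream=True) as res:
        body = bytearray()
        for chunk in res.iter_content(chunk_size=chunk_size):
            body.extend(chunk)
            if time.time() - start > deadline:
                raise TimeoutError('total deadline exceeded for ' + url)
    return bytes(body)

This is a sketch, not a drop-in fix for every hang: it only checks the deadline between chunks, so a single stalled read can still take up to the read timeout before the check fires.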
寂寞的槑槑姬
9 years, 3 months ago