Question

I already have a URL and need to extract the title of that page. What methods are there, and which one is the fastest?

10 years, 2 months ago

ce9bxkf 10 years, 2 months ago

Answer 1

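There are many ways to do this. Overall speed depends mainly on your network speed and on how fast the page is parsed.

If you only care about parsing speed, I recommend lxml. Here is some code for reference: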

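The lxml way

import lxml.html
from urllib.request import urlopen  # Python 3 (urllib2.urlopen on Python 2)
url = "https://example.com/"  # placeholder, use your own URL
t = lxml.html.parse(urlopen(url))  # fetch with urlopen, which also handles https
print(t.find(".//title").text)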

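The BS way (BeautifulSoup)

from urllib.request import urlopen  # Python 3 (urllib2.urlopen on Python 2)
from bs4 import BeautifulSoup  # BeautifulSoup 4; version 3 used "from BeautifulSoup import BeautifulSoup"
url = "https://example.com/"  # placeholder, use your own URL
soup = BeautifulSoup(urlopen(url), "html.parser")  # name the parser explicitly
print(soup.title.string)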

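The Mechanize way

You can also use mechanize; its Browser object has a title() method.

from mechanize import Browser
url = "https://example.com/"  # placeholder, use your own URL
br = Browser()
br.open(url)
print(br.title())  # title of the page that was just opened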
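
Since network time often matters more than parsing, another option is to avoid downloading the whole page. The following is only a rough sketch using the standard library: it reads the response in chunks and stops as soon as a complete title tag has arrived. The names read_title_bytes, fetch_title_bytes, chunk_size and max_bytes are made up for this example, and the regular expression is a crude shortcut, not a real HTML parser.

```python
import io
import re
from urllib.request import urlopen

# Crude pattern for <title>...</title>; good enough for a sketch, not a full parser.
TITLE_RE = re.compile(rb"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)

def read_title_bytes(stream, chunk_size=4096, max_bytes=65536):
    """Read a binary stream chunk by chunk; return the raw title bytes or None."""
    buf = b""
    while len(buf) < max_bytes:
        chunk = stream.read(chunk_size)
        if not chunk:          # end of stream, no title found
            break
        buf += chunk
        match = TITLE_RE.search(buf)
        if match:              # stop reading as soon as the title is complete
            return match.group(1).strip()
    return None

def fetch_title_bytes(url):
    """Hypothetical helper: open the URL and read only as much as needed."""
    with urlopen(url, timeout=10) as resp:
        return read_title_bytes(resp)
```

The result is still raw bytes, so it has to be decoded with the right character encoding before it is displayed.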

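Note: watch out for the page's character encoding. Pages are not always UTF-8, so the title may come out garbled if it is decoded with the wrong charset.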
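
As a rough illustration of the encoding point, here is one way it could be handled when you have the raw bytes of a page: try the charset from the HTTP header first, then a charset declared in a meta tag, then a fallback. This is a sketch under my own assumptions; decode_html, header_charset and fallback are hypothetical names, and real pages can declare their encoding in ways this does not cover.

```python
import re

# Looks for charset=... inside a <meta> tag, e.g. <meta charset="gbk">.
META_CHARSET_RE = re.compile(rb"<meta[^>]+charset=[\"']?([\w-]+)", re.IGNORECASE)

def decode_html(raw, header_charset=None, fallback="utf-8"):
    """Decode raw page bytes, trying header charset, meta charset, then fallback."""
    candidates = []
    if header_charset:
        candidates.append(header_charset)
    match = META_CHARSET_RE.search(raw)
    if match:
        candidates.append(match.group(1).decode("ascii"))
    candidates.append(fallback)
    for encoding in candidates:
        try:
            return raw.decode(encoding)
        except (LookupError, UnicodeDecodeError):
            continue  # unknown or wrong charset, try the next candidate
    # Last resort: keep going with replacement characters instead of failing.
    return raw.decode(fallback, errors="replace")
```

With urllib.request on Python 3, the header charset can be read from the response with resp.headers.get_content_charset() and passed in as header_charset.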

answered 10 years, 2 months ago

天天搓管子 answered 10 years, 2 months ago