Why can't /s/blog_4701280b0102elmo be generated in the regular expression generator?


start_urls = ['http://blog.sina.com.cn']
rules = [Rule(LinkExtractor(allow=[r'/s/blog_4701280b0102e[\da-zA-Z]+']), callback='parse_torrent')]

Is the regular expression in rules (the part in bold italics) correct?


skykain 11 years, 7 months ago
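
One quick way to check a pattern like the one in the question is to run it through Python's re module with a search rather than a full match, which, as far as I know, is also how Scrapy's LinkExtractor applies its allow patterns to URLs. This is only a minimal sketch: the helper name find_post_path and the sample URL ending in .html are assumptions for illustration, not taken from the question.

```python
import re

# Pattern copied from the question, written as a raw string so that
# \d reaches the regex engine unchanged.
PATTERN = re.compile(r'/s/blog_4701280b0102e[\da-zA-Z]+')


def find_post_path(url):
    """Return the part of url matched by PATTERN, or None if nothing matches.

    Hypothetical helper for illustration only.
    """
    m = PATTERN.search(url)
    return m.group(0) if m else None


# Assumed sample URL; the real link format on the site may differ.
sample = 'http://blog.sina.com.cn/s/blog_4701280b0102elmo.html'
print(find_post_path(sample))
```

Because the character class does not include a dot, the match stops just before .html, so the pattern finds the path from the question inside a longer URL.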

It's not clear what content you are trying to match.


善解人衣的狼 answered 11 years, 7 months ago

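The pattern does not match the trailing .html, so it cannot match the complete URL.
Add .html to the end, with the dot escaped so that it matches a literal period rather than any character. The regex then becomes:
/s/blog_4701280b0102e[\da-zA-Z]+\.html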


星光伴我心 answered 11 years, 7 months ago
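
To illustrate why the dot before html matters, the sketch below compares the pattern with an unescaped dot against one with an escaped dot. The helper matches and both sample paths are made up for illustration; only the prefix /s/blog_4701280b0102e comes from the thread.

```python
import re

# Unescaped dot: "." matches any single character.
unescaped = re.compile(r'/s/blog_4701280b0102e[\da-zA-Z]+.html')
# Escaped dot: "\." matches only a literal period.
escaped = re.compile(r'/s/blog_4701280b0102e[\da-zA-Z]+\.html')


def matches(pattern, path):
    """Return True if pattern is found anywhere in path (hypothetical helper)."""
    return pattern.search(path) is not None


good = '/s/blog_4701280b0102elmo.html'  # assumed real-looking post path
bad = '/s/blog_4701280b0102elmoXhtml'   # made-up path with no period

for name, pattern in (('unescaped', unescaped), ('escaped', escaped)):
    print(name, matches(pattern, good), matches(pattern, bad))
```

Both patterns accept the path that ends in .html, but only the escaped version rejects the path where another character stands in for the period.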