Saturday, December 27, 2008

Trying Out urllib in Python 3.0

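First of all, I recommend that everyone try Python 3.0. Anything controversial is usually worth a look, after all.
You can download Python 3.0 from http://python.org/ftp/python/3.0/Python-3.0.tar.bz2 and play with it. If you are worried about interfering with your current Python environment, don't be: it can be installed under its own prefix. Unpack the archive, enter the directory, and run: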

sudo su
mkdir /opt/python3.0
./configure --prefix=/opt/python3.0 && make && make install
ln -s /opt/python3.0/bin/python3.0 /usr/bin/python3.0
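
To confirm that the side-by-side install is the interpreter actually being run, a quick check along these lines may help. This is only a sketch: the helper name `describe_interpreter` is made up for illustration, and the values it reports depend on the prefix you chose above.

```python
import sys


def describe_interpreter():
    """Report which Python is running and where it is installed."""
    return {
        "executable": sys.executable,            # path of the running binary
        "prefix": sys.prefix,                    # install prefix, e.g. /opt/python3.0
        "version": tuple(sys.version_info[:3]),  # (major, minor, micro)
    }


info = describe_interpreter()
for key, value in info.items():
    print(key, "=", value)
```

Run it with the new binary (for example `python3.0 check.py`, where `check.py` is whatever you named the file); the prefix should point at the separate directory rather than the system Python.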



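After Python 3.0 came out, what a crawler fanatic like me cared about most was its urllib.
Have a look at the documentation on python.org: urllib is one of the more heavily reworked modules. urllib2 and urllib have been merged into a single urllib package, and their functions have been sorted into separate submodules. In the code below, urlopen comes from urllib.request and urlencode comes from urllib.parse.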
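
To make the split concrete, here is a small sketch that runs without any network access. The URL `http://example.com/login` and the parameter values are placeholders, not taken from a real site. One thing worth noting: current Python 3 releases expect POST data as bytes, so the output of `urlencode` is encoded before it is attached to the request.

```python
from urllib import parse, request

# Placeholder parameters, for illustration only.
params = {"name": "freefis", "tag": "signin"}

# urllib.parse: what used to be urllib.urlencode in Python 2.
query = parse.urlencode(params)   # a str such as "name=freefis&tag=signin"
body = query.encode("utf-8")      # bytes, as expected for POST data

# urllib.request: what used to live in urllib2.
req = request.Request("http://example.com/login", data=body)
print(req.get_method())           # a request that carries data is sent as POST

# urllib.parse also holds the URL-splitting helpers from the old urlparse module.
parts = parse.urlsplit("http://example.com/login?" + query)
print(parts.netloc, parts.path, parse.parse_qs(parts.query))
```

Nothing is sent here; the `Request` object is only built and inspected.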




#!/usr/bin/python3.0
# -*- coding:utf-8 -*-

#-------------------------------------------
# Designer : Free.Wang
# E-mail : freefis@Gmail.com
# Licence : Licensed under the GPL
# Archived : Dec 27 2008
#-------------------------------------------

from urllib import request, parse


class Makesocket:
    """Build a full HTTP request via POST/GET."""

    def __init__(self, loginurl, param):
        # loginurl: URL that receives the parameters; param: POST/GET parameters
        self.loginurl = loginurl
        self.param = param

    def cookie(self):
        """Make a container for cookies."""
        self.cookies = request.HTTPCookieProcessor()
        # Build an opener that stores and resends cookies,
        # then install it so that request.urlopen() uses it as well.
        self.opener = request.build_opener(self.cookies)
        request.install_opener(self.opener)

    def run(self):
        """Log in, then fetch a page using the session cookie."""
        # Serialize the parameters; current Python 3 releases need bytes for POST data.
        self.encodeparam = parse.urlencode(self.param).encode("utf-8")
        # Log in to get the cookie, which is stored in self.opener.
        request.urlopen(self.loginurl, self.encodeparam)
        # Request the profile page with the cookie attached.
        conn = self.opener.open("http://www.hellofreefis.com/profile/")
        print(conn.read())


if __name__ == '__main__':
    loginurl = "http://www.hellofreefis.com/info_oper.php?tag=signin&pageref="
    param = {
        'name': "freefis",
        'pwd': "freefis' password"
    }
    conn = Makesocket(loginurl, param)
    conn.cookie()
    conn.run()
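
If you want to see what the login actually stored, the cookie jar behind `HTTPCookieProcessor` can be inspected directly. The following is a minimal sketch and makes no network request; the helper `cookie_names` is a made-up name for illustration, and the jar stays empty until a server sends a `Set-Cookie` header.

```python
from http import cookiejar
from urllib import request

# Create the jar explicitly so it is easy to inspect later.
jar = cookiejar.CookieJar()
processor = request.HTTPCookieProcessor(jar)
opener = request.build_opener(processor)


def cookie_names(jar):
    """Return the names of all cookies currently held in the jar."""
    return sorted(cookie.name for cookie in jar)


# Nothing has been fetched yet, so no cookies are stored.
print(cookie_names(jar))
```

When `HTTPCookieProcessor()` is called without an argument, as in the class above, it creates its own jar, which is available as the `cookiejar` attribute of the processor.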


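The test succeeded. The changes in Py3.0 are substantial, and I hope that when Python 4.0 arrives there won't be another revolution. Personally, I still approve of this round of changes. Thanks to Guido for keeping the reduce function around in the functools module.
I hope the major companies and *nix distributors will support Python 3.0 soon.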
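
On the reduce point above: it is no longer a builtin in Python 3 and has to be imported from functools. A tiny sketch, with made-up numbers:

```python
from functools import reduce
import operator

# Fold a list into a single value, starting from 0.
numbers = [1, 2, 3, 4]
total = reduce(operator.add, numbers, 0)
print(total)
```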

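P.S.: Posting the code above was a bit "tragic" for me. Blogspot does not handle leading whitespace well, so if you copy the code, please double-check the indentation yourself.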
