Testing AES encryption speed with Python Crypto.Cipher

Posted on 2012/06/04 by qing

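Crypto provides AES encryption and decryption. A few points to note:

The key must be 16 (AES-128), 24 (AES-192), or 32 (AES-256) bytes long.
In ECB and CBC modes, the data passed to each encrypt() call must be a multiple of 16 bytes long.
Crypto's AES supports modes such as ECB, CBC, and CFB. ECB takes no iv parameter; chained modes such as CBC require one.
ECB is the simplest block mode: every block is encrypted independently. In CBC, each ciphertext block depends on all the blocks before it, which makes it resistant to frequency analysis.
iv is the initialization vector used by the chained modes. Its length must equal AES.block_size (16 bytes); only in MODE_OPENPGP is it block_size bytes for encryption and block_size + 2 bytes for decryption.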
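The length rules above are easy to check before any cipher object is created. Below is a minimal sketch in plain Python 3 that does not import Crypto at all: it validates the key length and pads data to the 16-byte block size using PKCS#7 padding. The padding scheme is my assumption (the post does not say how the test data was sized), and the helper names check_key, pkcs7_pad, and pkcs7_unpad are hypothetical.

```python
BLOCK_SIZE = 16  # same value as Crypto.Cipher.AES.block_size


def check_key(key):
    """Return the AES variant in bits (128, 192 or 256) or raise ValueError."""
    if len(key) not in (16, 24, 32):
        raise ValueError("AES key must be 16, 24 or 32 bytes, got %d" % len(key))
    return len(key) * 8


def pkcs7_pad(data, block_size=BLOCK_SIZE):
    """Append n bytes of value n so the length becomes a multiple of block_size."""
    n = block_size - len(data) % block_size
    return data + bytes([n]) * n


def pkcs7_unpad(data, block_size=BLOCK_SIZE):
    """Strip PKCS#7 padding, raising ValueError if it is malformed."""
    if not data or len(data) % block_size:
        raise ValueError("invalid padded length")
    n = data[-1]
    if n < 1 or n > block_size or data[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding")
    return data[:-n]


padded = pkcs7_pad(b"hello world")
print(check_key(b"k" * 16))   # 128
print(len(padded))            # 16
print(pkcs7_unpad(padded))    # b'hello world'
```

Input that is already a multiple of 16 bytes still gets a full extra block of padding, so unpadding is never ambiguous.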

ConfigParser in Python

Posted on 2012/05/25 by qing

I. Introduction to ConfigParser

ConfigParser is a module for reading configuration files. A configuration file looks like this:

    [db]
    db_host = 127.0.0.1
    db_port = 22
    db_user = root
    db_pass = rootroot

    [concurrent]
    thread = 10
    processor = 20

Names inside square brackets "[ ]" are sections. Under each section come its options, written as key-value pairs.

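II. Setting up ConfigParser

To use ConfigParser, first create an instance and read the configuration file:

    import ConfigParser    # Python 2 module name; it is configparser in Python 3

    cf = ConfigParser.ConfigParser()
    cf.read("config_file_name")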
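One detail worth knowing about read(): it skips files it cannot open instead of raising an error, and returns the list of files it actually parsed. The sketch below shows this in Python 3 (module name configparser); the temporary directory and the file name no_such_file.conf are made up for the demonstration.

```python
import configparser
import os
import tempfile

# Write a small sample configuration to a temporary file.
tmpdir = tempfile.mkdtemp()
conf_path = os.path.join(tmpdir, "test.conf")
with open(conf_path, "w") as f:
    f.write("[db]\ndb_host = 127.0.0.1\ndb_port = 22\n")

cf = configparser.ConfigParser()
missing = os.path.join(tmpdir, "no_such_file.conf")  # deliberately absent
loaded = cf.read([missing, conf_path])

# read() returns only the files it managed to parse.
print(loaded == [conf_path])   # True
print(cf.sections())           # ['db']
```

Checking the return value is a cheap way to catch a mistyped file name, which would otherwise only show up later as a missing section.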

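III. Common ConfigParser methods

1. Get all sections, that is, read every "[ ]" name in the configuration file into a list:

    s = cf.sections()
    print 'section:', s

This prints (all examples below use the configuration file from the introduction):

    section: ['db', 'concurrent']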

2. Get the options of a given section, that is, read the keys of one section into a list:

    o = cf.options("db")
    print 'options:', o

This prints:

    options: ['db_host', 'db_port', 'db_user', 'db_pass']

3. Get all the settings of a given section as (key, value) pairs:

    v = cf.items("db")
    print 'db:', v

This prints:

    db: [('db_host', '127.0.0.1'), ('db_port', '22'), ('db_user', 'root'), ('db_pass', 'rootroot')]

4. Read an option of a given section as a specific type. get returns a string and getint returns an integer; getfloat and getboolean work the same way.

    # values can be read with a specific type
    db_host = cf.get("db", "db_host")
    db_port = cf.getint("db", "db_port")
    db_user = cf.get("db", "db_user")
    db_pass = cf.get("db", "db_pass")

    # these return integers
    threads = cf.getint("concurrent", "thread")
    processors = cf.getint("concurrent", "processor")

    print "db_host:", db_host
    print "db_port:", db_port
    print "db_user:", db_user
    print "db_pass:", db_pass
    print "thread:", threads
    print "processor:", processors

This prints:

    db_host: 127.0.0.1
    db_port: 22
    db_user: root
    db_pass: rootroot
    thread: 10
    processor: 20
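For completeness, here is a small sketch of the other typed getters, written for Python 3 (configparser). The options ratio, verbose, and timeout are hypothetical and are not part of the sample file above; the fallback keyword exists only in the Python 3 module.

```python
import configparser

cf = configparser.ConfigParser()
cf.read_string("""
[concurrent]
thread = 10
ratio = 0.75
verbose = yes
""")

threads = cf.getint("concurrent", "thread")        # int
ratio = cf.getfloat("concurrent", "ratio")         # float
verbose = cf.getboolean("concurrent", "verbose")   # yes/no, on/off, true/false, 1/0
timeout = cf.getint("concurrent", "timeout", fallback=30)  # option is missing

print(threads, ratio, verbose, timeout)   # 10 0.75 True 30

# A value that cannot be converted raises ValueError.
try:
    cf.getint("concurrent", "ratio")
except ValueError as exc:
    print("ValueError:", exc)
```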

5. Set the value of an option. (Remember to write the file back at the end.)

    cf.set("db", "db_pass", "zhaowei")
    with open("test.conf", "w") as f:    # the with block closes the file
        cf.write(f)
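To see that the change really lands on disk, the following Python 3 sketch sets a value, writes the file, and reads it back with a second parser. It writes to a temporary directory rather than the test.conf used above, so it can be run without touching any real file.

```python
import configparser
import os
import tempfile

cf = configparser.ConfigParser()
cf.read_string("[db]\ndb_pass = rootroot\n")
cf.set("db", "db_pass", "zhaowei")

# Until write() is called the change exists only in memory.
conf_path = os.path.join(tempfile.mkdtemp(), "test.conf")
with open(conf_path, "w") as f:
    cf.write(f)

check = configparser.ConfigParser()
check.read(conf_path)
print(check.get("db", "db_pass"))   # zhaowei
```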

6. Add a section. (Again, write it back.)

    cf.add_section('liuqing')
    cf.set('liuqing', 'int', '15')
    cf.set('liuqing', 'bool', 'true')
    cf.set('liuqing', 'float', '3.1415')
    cf.set('liuqing', 'baz', 'fun')
    cf.set('liuqing', 'bar', 'Python')
    cf.set('liuqing', 'foo', '%(bar)s is %(baz)s!')
    with open("test.conf", "w") as f:
        cf.write(f)
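The foo option above uses %(name)s placeholders, which ConfigParser expands from other options in the same section when the value is read. A short Python 3 sketch of that behaviour, using the same section and option names:

```python
import configparser

cf = configparser.ConfigParser()
cf.add_section("liuqing")
cf.set("liuqing", "baz", "fun")
cf.set("liuqing", "bar", "Python")
cf.set("liuqing", "foo", "%(bar)s is %(baz)s!")

expanded = cf.get("liuqing", "foo")            # placeholders are filled in
literal = cf.get("liuqing", "foo", raw=True)   # raw=True skips interpolation

print(expanded)   # Python is fun!
print(literal)    # %(bar)s is %(baz)s!
```

The file written by write() stores the literal form; expansion happens only on get().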

7. Remove a section or an option. (Any modification has to be written back.)

    cf.remove_option('liuqing', 'int')
    cf.remove_section('liuqing')
    with open("test.conf", "w") as f:
        cf.write(f)
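Both removal methods report whether anything was actually removed, which is handy when the file may already have been cleaned up. A minimal Python 3 sketch, with the variable names chosen only for illustration:

```python
import configparser

cf = configparser.ConfigParser()
cf.read_string("[liuqing]\nint = 15\nbool = true\n")

removed_opt = cf.remove_option("liuqing", "int")     # True: the option existed
removed_again = cf.remove_option("liuqing", "int")   # False: already gone
removed_sec = cf.remove_section("liuqing")           # True: the section existed
missing_sec = cf.remove_section("liuqing")           # False: already gone

# Removing an option from a section that does not exist is an error.
try:
    cf.remove_option("liuqing", "bool")
    raised = False
except configparser.NoSectionError:
    raised = True

print(removed_opt, removed_again, removed_sec, missing_sec, raised)
# True False True False True
```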

IV. Miscellaneous

Lines starting with # or ; are treated as comments.
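A quick way to confirm the comment rule is to parse a string that contains both kinds of comment line. The sketch below uses Python 3, where a trailing "; ..." on a value line is kept as part of the value unless inline_comment_prefixes is set; the sample text is made up.

```python
import configparser

text = """
# a full-line comment
; another full-line comment
[db]
db_host = 127.0.0.1
db_user = root ; kept in Python 3
"""

cf = configparser.ConfigParser()
cf.read_string(text)
print(cf.options("db"))          # ['db_host', 'db_user']
print(cf.get("db", "db_user"))   # root ; kept in Python 3

# Opt in to inline comments explicitly.
cf_inline = configparser.ConfigParser(inline_comment_prefixes=(";",))
cf_inline.read_string(text)
print(cf_inline.get("db", "db_user"))   # root
```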

Double underscores in Python

Posted on 2012/05/23 by qing

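Members whose names start with a single underscore "_" are protected by convention: they are meant to be used only inside the class and its subclasses, although Python does not enforce this.

Members whose names start with a double underscore "__" are private: code inside the class can use the name directly, but subclasses and outside code cannot.

"from xxx import *" does not import variables or functions whose names start with "_" (unless the module lists them in __all__).

Before the code runs, a private name is rewritten into a longer form (which effectively turns it into a protected name). The rule: prefix the name with the class name, then put one more underscore in front.

For example, a method or variable __private in class A is replaced with _A__private when the class is compiled (similar to macro substitution in C).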
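The renaming rule is easiest to believe after seeing it fail in a subclass. Here is a minimal sketch; the classes A and B and their method names are only for illustration.

```python
class A(object):
    def __init__(self):
        self._protected = "convention only"
        self.__private = "mangled"      # stored as _A__private

    def reveal(self):
        return self.__private           # compiled as self._A__private


class B(A):
    def peek(self):
        # Inside B the same spelling is compiled as self._B__private,
        # which was never set, so this raises AttributeError.
        return self.__private


a = A()
b = B()
print(a._protected)              # convention only
print(a.reveal())                # mangled
print(a._A__private)             # mangled (the long name works from anywhere)
print(hasattr(a, "__private"))   # False

try:
    b.peek()
    peek_failed = False
except AttributeError:
    peek_failed = True
print(peek_failed)               # True
```

So "private" here means "renamed", not "inaccessible": anyone who knows the class name can still reach the attribute.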

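If the above makes sense, you can test it here.

Names of the form "__xxxx__", which begin and end with double underscores, are special Python variables; common ones are "__name__", "__file__", "__loader__", and "__package__". Take __name__: if a file is run as the main program, its value is set to "__main__"; if the file is imported as a module by another file, its value is the module name (the file name without the .py extension). This is often used for self-tests built into a module. The official Python documentation explains these variables as follows: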

The special global variables __name__, __file__, __loader__ and __package__ are set in the globals dictionary before the module code is executed (Note that this is a minimal set of variables – other variables may be set implicitly as an interpreter implementation detail).

__name__ is set to run_name if this optional argument is not None, to mod_name + '.__main__' if the named module is a package and to the mod_name argument otherwise.

__file__ is set to the name provided by the module loader. If the loader does not make filename information available, this variable is set to None.

__loader__ is set to the PEP 302 module loader used to retrieve the code for the module (This loader may be a wrapper around the standard import mechanism).

__package__ is set to mod_name if the named module is a package and to mod_name.rpartition('.')[0] otherwise.

If the argument alter_sys is supplied and evaluates to True, then sys.argv[0] is updated with the value of __file__ and sys.modules[__name__] is updated with a temporary module object for the module being executed. Both sys.argv[0] and sys.modules[__name__] are restored to their original values before the function returns.
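The passage quoted above appears to describe runpy.run_module. The effect of run_name on __name__ can be tried with its sibling function runpy.run_path, which runs a file by path and returns the resulting globals. The sketch below assumes Python 3; the file name demo_mod.py and the variables who and ran_as_main are made up.

```python
import os
import runpy
import tempfile

# A tiny module that records what __name__ it was executed under.
source = 'who = __name__\nran_as_main = (__name__ == "__main__")\n'
path = os.path.join(tempfile.mkdtemp(), "demo_mod.py")
with open(path, "w") as f:
    f.write(source)

as_script = runpy.run_path(path, run_name="__main__")
as_default = runpy.run_path(path)   # run_name defaults to "<run_path>"

print(as_script["who"], as_script["ran_as_main"])     # __main__ True
print(as_default["who"], as_default["ran_as_main"])   # <run_path> False
```

This is the same switch that makes the usual if __name__ == "__main__": guard fire only when a file is run directly.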

A script for batch-downloading files of the same type

Posted on 2012/05/06 by qing

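Sometimes I want to download all the papers of a conference, and doing it by hand is tedious. A browser plugin could probably do this, but I did not look for one and wrote a Python script instead. It downloads every file of a given type linked from the page at a URL.

Usage: python xx.py url file_type

Argument 1: the URL; argument 2: the file type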

    #!/usr/bin/env python
    # encoding=utf-8
    # Python 2 script: download every file of a given type linked from a page.

    import os
    import re
    import sys
    import urllib2
    import urlparse

    SAVE_DIR = "E:\\temp\\"    # download directory; change it to suit your machine


    def download(url, save_dir):
        # Fetch one URL and save it under the last component of its path.
        try:
            print url + "\tdownloading ......"
            filedata = urllib2.urlopen(url).read()
            print "read " + url
            filestr = os.path.join(save_dir, url[url.rindex("/") + 1:])
            print "file is " + filestr
            with open(filestr, 'wb') as fp:
                fp.write(filedata)
        except IOError as e:    # urllib2.URLError is a subclass of IOError
            print "can't get " + url + ": " + str(e)


    def get_files(ourl, file_type):
        print "The URL is " + ourl
        print "The File Type is " + file_type
        if not os.path.exists(SAVE_DIR):
            os.makedirs(SAVE_DIR)
        print "accessing " + ourl
        htmldata = urllib2.urlopen(ourl).read()    # fetch the page only once
        # Scan both href="..." and src="..." attributes with the same logic.
        for attr in ("href", "src"):
            print "===>>>" + attr + "<<<==="
            pattern = attr + r'="(\S{3,50}\.' + re.escape(file_type) + r'\w{0,2})"'
            fileslist = re.findall(pattern, htmldata)
            if not fileslist:
                print "no ." + file_type + " files"
            for app in fileslist:
                # urljoin leaves absolute links alone and resolves relative
                # links against the page URL.
                download(urlparse.urljoin(ourl, app), SAVE_DIR)


    if __name__ == "__main__":
        if len(sys.argv) != 3:
            print "usage: python xx.py url file_type"
            sys.exit(1)
        get_files(sys.argv[1], sys.argv[2])
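The core of the script is turning the links found in the page into absolute URLs. That step can be tried offline with a small Python 3 sketch; find_links is a hypothetical helper, the HTML snippet and the example.com/example.org addresses are made up, and the regular expression is a simplified version of the one in the script.

```python
import re
from urllib.parse import urljoin


def find_links(html, base_url, file_type):
    """Return absolute URLs of href/src targets that end in .file_type."""
    pattern = r'(?:href|src)="([^"\s]+\.' + re.escape(file_type) + r')"'
    return [urljoin(base_url, link) for link in re.findall(pattern, html)]


page = '''
<a href="papers/p1.pdf">one</a>
<a href="http://example.org/files/p2.pdf">two</a>
<img src="logo.png">
'''

links = find_links(page, "http://example.com/conf/index.html", "pdf")
for link in links:
    print(link)
# http://example.com/conf/papers/p1.pdf
# http://example.org/files/p2.pdf
```

urljoin handles relative paths, absolute URLs, and https links alike, so no prefix check on "http://" is needed.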

A Python implementation of consistent hashing

Posted on 2012/04/27 by qing

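Part of the code builds on the earlier post "How simple is it to write a distributed storage system?". The data servers stay unchanged; only the way the client chooses a node is modified.

Changes: 1. Add a HashRing class that manages the consistent hash ring. It lives in the client, so the node/data server side needs no modification (a hello test was added). 2. The client now chooses nodes with consistent hashing.

There are already plenty of Chinese-language introductions to consistent hashing; you can refer to this one.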
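The post's own HashRing code is not shown on this page, so the following is only a sketch of what such a client-side class might look like, written for Python 3. The use of SHA-256, the 100 virtual points per node, and the method names add_node, remove_node, and get_node are all my assumptions, not the author's implementation.

```python
import bisect
import hashlib


class HashRing(object):
    """Client-side consistent hash ring with virtual nodes (illustrative)."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas   # virtual points per real node
        self._keys = []            # sorted hash positions on the ring
        self._ring = {}            # hash position -> node name
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def add_node(self, node):
        for i in range(self.replicas):
            pos = self._hash("%s#%d" % (node, i))
            self._ring[pos] = node
            bisect.insort(self._keys, pos)

    def remove_node(self, node):
        for i in range(self.replicas):
            pos = self._hash("%s#%d" % (node, i))
            del self._ring[pos]
            self._keys.remove(pos)

    def get_node(self, key):
        """Return the first node clockwise from the key's position."""
        if not self._keys:
            return None
        idx = bisect.bisect(self._keys, self._hash(key)) % len(self._keys)
        return self._ring[self._keys[idx]]


ring = HashRing(["node1", "node2", "node3"])
keys = ["key%d" % i for i in range(1000)]
before = dict((k, ring.get_node(k)) for k in keys)

ring.add_node("node4")
after = dict((k, ring.get_node(k)) for k in keys)
moved = [k for k in keys if before[k] != after[k]]

# Every key that changed owner moved to the new node; the rest stayed put.
print(all(after[k] == "node4" for k in moved))   # True
```

That last property is the point of the technique: adding a server relocates only the keys the new server takes over, whereas hash(key) % n would reshuffle most of them.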


Renren relationship crawler

Posted on 2012/03/22 by qing

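I wanted to write a small crawler for Renren friend relationships that starts from my own account and crawls my friends, their friends, their friends' friends, and so on. When a friend's privacy settings hide their friend list, I hope to later infer that person's friends by going through everyone else's friend lists, which means crawling the relationships of the vast majority of Renren users. In a single-threaded test it crawled about 10,000 users per hour with traffic of around 200kb, which is clearly not fast enough; I hope multithreading will help later. There is also a bug: after crawling about 70,000 users the program fell into an infinite loop, and I have not found the cause yet.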
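The reverse-inference idea can be sketched in a few lines of Python 3: if friendship is mutual (an assumption about the site), a hidden user's friends are exactly the users whose visible lists mention them. The function name infer_hidden_friends and the sample IDs are made up.

```python
from collections import defaultdict


def infer_hidden_friends(crawled):
    """crawled maps a user id to the friend ids visible on that user's page.

    Returns a mapping from each mentioned id to the set of users who list it.
    """
    inferred = defaultdict(set)
    for user, friends in crawled.items():
        for friend in friends:
            inferred[friend].add(user)
    return inferred


# User 1003 hides their friend list, so there is no "1003" key here.
crawled = {
    "1001": ["1002", "1003"],
    "1002": ["1001", "1003"],
    "1004": ["1003"],
}

print(sorted(infer_hidden_friends(crawled)["1003"]))
# ['1001', '1002', '1004']
```

The result is only as complete as the crawl, which is why this approach needs most of the network to be covered.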

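Points to optimize later:

Multithreading

Efficiency (switch to another language?)

Crawling IDs and deduplicating IDs should be handled separately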
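The first and third points fit together naturally: worker threads only fetch, and the main thread alone does the deduplication, so no locking is needed. Below is a rough Python 3 sketch of that split. FAKE_GRAPH stands in for the website, and fetch_friends, crawl_batch, and new_ids are hypothetical names; nothing here talks to Renren.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for the site: user id -> friend ids.
FAKE_GRAPH = {
    "1001": ["1002", "1003"],
    "1002": ["1001", "1003"],
    "1003": ["1001", "1002", "1004"],
    "1004": ["1003"],
}


def fetch_friends(user_id):
    """Stand-in for the slow HTTP request a real crawler would make."""
    return FAKE_GRAPH.get(user_id, [])


def crawl_batch(user_ids, workers=4):
    """Fetch the friend lists of several users in parallel threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in the same order as user_ids.
        return dict(zip(user_ids, pool.map(fetch_friends, user_ids)))


def new_ids(batch_result, done):
    """Deduplicate in the main thread: ids seen in the batch but not yet crawled."""
    found = set()
    for friends in batch_result.values():
        found.update(friends)
    return found - done


done = {"1001", "1002"}
batch = crawl_batch(["1001", "1002"])
print(sorted(new_ids(batch, done)))   # ['1003']
```

Threads suit this job because the time goes into waiting for the network, not into computation.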

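How the program works: getone() takes one uncrawled user from todolist, and savenews() saves the newly crawled user IDs and removes duplicates. Every thousand users are saved to one file, one line per user, in the form: user ID@friend 1 ID friend 2 ID ...

Source code: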

    #!/usr/bin/env python
    # encoding=utf-8
    # Python 2 script: crawl Renren friend lists starting from one account.

    import urllib, urllib2, cookielib, re


    class Renren(object):

        def __init__(self, email, password):
            self.email = email
            self.password = password
            self.origURL = 'http://www.renren.com/Home.do'
            self.domain = 'renren.com'
            self.todolist = []      # ids waiting to be crawled
            self.donelist = set()   # ids already crawled (a set makes lookups fast)
            # With a local cookie file, login needs no verification.
            self.cj = cookielib.LWPCookieJar()
            try:
                self.cj.revert('renren.cookie')
            except IOError:
                pass                # no usable cookie file yet
            self.opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(self.cj))
            urllib2.install_opener(self.opener)

        def login(self):
            params = {'email': self.email, 'password': self.password,
                      'origURL': self.origURL, 'domain': self.domain}
            req = urllib2.Request(
                'http://www.renren.com/PLogin.do',
                urllib.urlencode(params)
            )
            self.opener.open(req)

        def friends(self):
            # URL of my own friend list
            req = 'http://friend.renren.com/myfriendlistx.do'
            print "Get my friends"
            data = self.opener.open(req).read()
            f = re.findall(r'"id":(\d{6,15}),', data)
            print "friends list"
            print f, len(f)
            self.todolist = f
            # write my own friend list
            fdata = open('data0.txt', 'w')
            fdata.write(' '.join(f))
            fdata.close()
            sernum = 0
            while self.todolist:            # stop once nothing is left to crawl
                temp1w = {}
                sernum = sernum + 1
                print "data" + str(sernum)
                count = 1000
                while count > 0 and self.todolist:
                    count = count - 1
                    rrid = self.getone()        # take one id from todo
                    print count, rrid
                    f = self.getfriends(rrid)
                    self.savenews(f)            # queue the new ids in todo
                    temp1w[rrid] = ' '.join(f)
                # write this batch of results to a file
                fp = open("data_" + str(sernum) + ".txt", 'w')
                for each in temp1w:
                    fp.write(each + '@' + temp1w[each] + '\n')
                fp.close()

        def getfriends(self, rrid):
            friends = []
            count = 0
            while True:
                req = "http://friend.renren.com/GetFriendList.do?curpage=" + str(count) + '&id=' + rrid
                print 'Get', req
                data = self.opener.open(req).read()
                f = re.findall(r'profile\.do\?id=(\d{7,15})"><img', data)
                if not f:
                    break       # an empty page means the list is finished
                friends = friends + f
                count = count + 1
            return friends

        def getone(self):
            if not self.todolist:
                print "Empty todo list"
                return None
            popup = self.todolist.pop(0)    # take the first id and remove it from todo
            self.donelist.add(popup)        # mark it as done
            return popup

        def savenews(self, newlist):
            # keep only ids that have not been crawled yet
            fresh = [item for item in newlist if item not in self.donelist]
            # add them to todo and drop duplicates
            self.todolist = list(set(self.todolist + fresh))


    if __name__ == "__main__":
        semail = '[email protected]'
        spassword = 'xxxxxxxxxx'
        a = Renren(semail, spassword)
        print "your account and password are %s %s" % (semail, spassword)
        a.login()
        a.friends()
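The todo/done bookkeeping is the part most likely to cause repeated crawling, so it is worth testing on its own, away from the network. Here is a minimal Python 3 sketch of the same loop with a queue and a single "seen" set, run against a small made-up graph that contains cycles. The names crawl, format_line, and graph are hypothetical, and fetch_friends is any callable that returns a list of IDs.

```python
from collections import deque


def crawl(start_ids, fetch_friends, limit=None):
    """Breadth-first crawl; each user is fetched at most once."""
    todo = deque()
    seen = set()    # every id that was ever queued
    for uid in start_ids:
        if uid not in seen:
            seen.add(uid)
            todo.append(uid)
    result = {}
    while todo and (limit is None or len(result) < limit):
        uid = todo.popleft()
        friends = fetch_friends(uid)
        result[uid] = friends
        for fid in friends:
            if fid not in seen:   # dedup at queue time, so no id repeats
                seen.add(fid)
                todo.append(fid)
    return result


def format_line(uid, friends):
    """One output line in the form described above: ID@friend1 friend2 ..."""
    return uid + "@" + " ".join(friends)


graph = {
    "1": ["2", "3"],
    "2": ["1", "3"],
    "3": ["1", "2", "4"],
    "4": ["3"],
}

crawled = crawl(["1"], lambda uid: graph.get(uid, []))
for uid in sorted(crawled):
    print(format_line(uid, crawled[uid]))
# 1@2 3
# 2@1 3
# 3@1 2 4
# 4@3
```

Because an ID enters the queue only the first time it is seen, the loop must end once every reachable user has been fetched, even when friend lists point back at each other.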

呆鸥

Brains first and then Hard Work

Tag archive: python
