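This article gives a brief introduction to using the Aliyun OSS Python API. It covers the basic concepts of Aliyun storage, the Python API itself, and a few examples, so that newcomers can get started with Aliyun cloud storage quickly.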
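Basic concepts of Aliyun storage
Object
In Aliyun storage, every stored file is an object. An object consists of a key, data, and user meta, which correspond to the object's name, its contents, and its description. Object names are UTF-8 encoded and must have a length of 1 to 1023. The operations on objects are get, range query, delete, and list.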
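The naming rule above can be checked locally before uploading. Here is a minimal sketch, assuming the 1 to 1023 limit is counted in UTF-8 bytes (the text does not say whether it means bytes or characters). The helper name is_valid_object_key is hypothetical and not part of the SDK.

```python
def is_valid_object_key(key):
    """Return True if `key` fits the object-name rule described above.

    Assumption: the 1-1023 limit applies to the UTF-8 encoded byte length.
    """
    encoded = key.encode("utf-8")
    return 1 <= len(encoded) <= 1023


print(is_valid_object_key("photos/2012/1.jpg"))  # a typical key
print(is_valid_object_key(""))                   # empty names are rejected
```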
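Object Group
A loose collection of multiple objects. An object group can be operated on in the same way as a single object.
Bucket
A bucket holds object data. Its name is globally unique across the whole Aliyun Open Storage Service (OSS), which makes it roughly equivalent to a second-level domain that Aliyun gives each user. Each user can create at most 10 buckets. Every object lives in a bucket, and there is no limit on how many objects a bucket can hold. A bucket name may contain lowercase letters, digits, underscores, and hyphens, and must start with a lowercase letter or a digit. When a bucket is created through the browser, its name must be 6 to 16 characters long; through the Python API it can be 3 to 255 characters long.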
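The bucket naming rules can be expressed as a single regular expression. The sketch below encodes only the rules stated above, using the 3 to 255 character limit quoted for the Python API. The helper name is_valid_bucket_name is hypothetical, and the service may enforce further rules that are not listed here.

```python
import re

# Lowercase letters, digits, underscores, and hyphens; the first character
# must be a lowercase letter or a digit; 3-255 characters in total.
_BUCKET_NAME_RE = re.compile(r"[a-z0-9][a-z0-9_-]{2,254}")


def is_valid_bucket_name(name):
    """Return True if `name` satisfies the rules quoted in the text."""
    return _BUCKET_NAME_RE.fullmatch(name) is not None


print(is_valid_bucket_name("hust"))   # lowercase letters only
print(is_valid_bucket_name("_temp"))  # starts with an underscore
```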
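Access ID and Access Key
When you sign up for OSS, the service assigns you an Access ID and an Access Key, which identify you and authenticate you respectively. The Access Key is a secret shared between you and the service, and it is never sent along with a request. Instead, the client uses it to compute a signature over each request, and the server recomputes that signature with its own copy of the key; if the two match, the request is authenticated. If your Access Key leaks, anyone can sign requests in your name, so it must be replaced. [Feel free to skip this paragraph if it is unclear.]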
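As far as I can tell, the SDK never sends the Access Key itself; it signs each request with a keyed hash. Below is a minimal sketch of that idea, assuming HMAC-SHA1 and a simplified string-to-sign. The function name sign_request, the field layout, and the placeholder credentials are illustrative assumptions, not the SDK's exact code.

```python
import base64
import hashlib
import hmac


def sign_request(access_key, method, content_md5, content_type, date, resource):
    """Return a base64-encoded HMAC-SHA1 signature for one request.

    Assumption: this string-to-sign layout is simplified for illustration
    and leaves out any OSS-specific headers.
    """
    string_to_sign = "\n".join([method, content_md5, content_type, date, resource])
    digest = hmac.new(access_key.encode("utf-8"),
                      string_to_sign.encode("utf-8"),
                      hashlib.sha1).digest()
    return base64.b64encode(digest).decode("ascii")


# Placeholder credentials and resource, for illustration only
signature = sign_request("my-secret-key", "GET", "", "",
                         "Thu, 01 Jan 1970 00:00:00 GMT", "/hust/1.jpg")
# The client sends the Access ID plus the signature; the secret stays local.
print("Authorization: OSS %s:%s" % ("my-access-id", signature))
```

Because the signature depends on both the key and the request fields, changing either one produces a different value, which is what lets the server detect a wrong key or a tampered request.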
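To summarize:
- Buckets, Objects, and Object Groups all support these operations: create (upload), view, list, and delete
- A bucket's access permission can be changed (three levels: private, public-read, and public-read-write)
- Requests support HTTP parameters such as If-Modified-Since and If-Match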
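Conditional parameters such as If-Modified-Since and If-Match are ordinary HTTP request headers. The sketch below builds such a header dict; the helper name conditional_headers is hypothetical, and the ETag value is a placeholder.

```python
from email.utils import formatdate


def conditional_headers(modified_since=None, etag=None):
    """Build HTTP headers for a conditional request.

    modified_since: Unix timestamp in seconds, or None to omit the header.
    etag: ETag string to match, or None to omit the header.
    """
    headers = {}
    if modified_since is not None:
        # HTTP dates are always expressed in GMT
        headers["If-Modified-Since"] = formatdate(modified_since, usegmt=True)
    if etag is not None:
        headers["If-Match"] = etag
    return headers


print(conditional_headers(modified_since=0, etag='"abc"'))
```

A dict like this could then be passed as the headers argument that the SDK's object methods accept.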
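OSS API SDK
Python files in the SDK
This part looks at which operations the Python API provided by Aliyun OSS supports. The downloaded Python SDK archive contains the following files: oss_api.py, oss_cmd.py, oss_fs.py, oss_sample.py, oss_util.py, and oss_xml_handler.py. Their roles are:
oss_api.py: the classes and methods that users call directly to operate on Objects, Buckets, and so on
oss_util.py: utility module with classes and methods that are commonly used but not called directly
oss_xml_handler.py: handles XML, since Aliyun storage returns its error messages in XML format
oss_cmd.py: lets users operate on Objects and Buckets from the command line [optional]
oss_fs.py: a simple cloud-storage file system built on OSS [optional]
oss_sample.py: simple examples of some useful API calls [optional]
Note that some of the classes and methods used in these files are rather dated and may fail at run time. For example, oss_util.py uses the md5 module, which has since been folded into hashlib, so md5.new() is no longer available; you will hit this error when running oss_sample.py.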
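If you run into the md5 error, the usual fix is to switch to hashlib. A minimal sketch of the replacement call (the wrapper name md5_hexdigest is hypothetical; only hashlib.md5 comes from the standard library):

```python
import hashlib


def md5_hexdigest(data):
    """Hex MD5 digest of a bytes object, using hashlib instead of md5.new()."""
    return hashlib.md5(data).hexdigest()


print(md5_hexdigest(b"hello"))
```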
Main methods
Let's see which classes and methods oss_api.py provides. The whole file contains a single class, OssAPI. To use the API, first create an instance of this class:
oss = OssAPI(HOST, ACCESS_ID, SECRET_ACCESS_KEY)
The class exposes the following public methods:
sign_url_auth_with_expire_time(self, method, url, headers = {}, resource="/", timeout = 60)
Creates the authorization from the given method, URL, and headers and returns a signed link. method is one of PUT, GET, DELETE, or HEAD. This amounts to generating a public, time-limited signed URL.
bucket_operation(self, method, bucket, headers={}, params={})
Sends a bucket operation request. method is one of PUT, GET, DELETE, or HEAD.
object_operation(self, method, bucket, object, headers = {}, data="")
Sends an object operation request. method is one of PUT, GET, DELETE, or HEAD.
get_service(self)
Lists all buckets; equivalent to calling list_all_my_buckets(self).
get_bucket_acl(self, bucket)
Gets the bucket's access control permission, which is one of public-read-write, public-read, or private.
get_bucket(self, bucket, prefix='', marker='', delimiter='', maxkeys='', headers = {})
Lists all objects in the bucket; equivalent to calling list_bucket(), so the parameters and results are the same.
create_bucket(self, bucket, acl='', headers = {})
Creates a bucket; internally this calls put_bucket().
delete_bucket(self, bucket)
Deletes a bucket.
put_object_with_data(self, bucket, object, input_content, content_type=DefaultContentType, headers = {})
Writes data into an object; equivalent to calling put_object_from_string().
put_object_from_file(self, bucket, object, filename, content_type=DefaultContentType, headers = {})
Reads a file and writes it into an object, i.e. uploads the file; filename is the file's absolute path.
put_object_from_fp(self, bucket, object, fp, content_type=DefaultContentType, headers = {})
Writes data from a file pointer into an object; unlike the method above, it takes a file pointer rather than a path.
get_object(self, bucket, object, headers = {})
Gets an object by calling object_operation() with GET.
get_object_to_file(self, bucket, object, filename, headers = {})
Gets an object and writes it to a file, i.e. downloads it, by calling object_operation() with GET.
head_object(self, bucket, object, headers = {})
Gets the object's header information without downloading its full contents.
post_object_group(self, bucket, object, object_group_msg_xml, headers = {}, params = {})
Uploads an object group, merging all the objects listed in object_group_msg_xml into a single object.
get_object_group_index(self, bucket, object, headers = {})
Gets the object group index by calling object_operation() with GET.
put_object_from_file_given_pos(self, bucket, object, filename, offset, partsize, content_type=DefaultContentType, headers = {})
Reads partsize bytes starting at position offset from the file filename into an object and uploads it to the bucket.
upload_large_file(self, bucket, object, filename, thread_num = 10, max_part_num = 1000)
Uploads a large file by splitting it into many parts (up to max_part_num, 1000 by default), uploading each part to the bucket, and finally merging the parts into one large object.
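The last two methods both rely on cutting a file into (offset, partsize) pieces. The sketch below shows one way such a split might be computed. It is not the SDK's actual code, and split_into_parts is a hypothetical helper.

```python
def split_into_parts(file_size, max_part_num=1000):
    """Return a list of (offset, partsize) tuples covering `file_size` bytes."""
    if file_size <= 0:
        return []
    # Ceiling division: the smallest part size that needs at most
    # max_part_num parts to cover the whole file
    part_size = (file_size + max_part_num - 1) // max_part_num
    parts = []
    offset = 0
    while offset < file_size:
        # The final part may be shorter than the others
        parts.append((offset, min(part_size, file_size - offset)))
        offset += part_size
    return parts


print(split_into_parts(10, max_part_num=3))
```

Each resulting pair could serve as the offset and partsize arguments of put_object_from_file_given_pos().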
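Response information
conn is the connection returned by httplib.HTTPConnection(self.host). By specifying a method and headers you can operate on the self.host site, and res = conn.getresponse() gives you the response. res has the following attributes and methods:
res.version
res.reason
res.status
res.msg
res.getheaders()
Every method that talks to the server returns a response whose HTTP status code (res.status) is a three-digit number. The first digit indicates the class of the response:
1XX: informational; the request was received and processing continues
2XX: success; the request was received, understood, and accepted
3XX: redirection; further action is needed to complete the request
4XX: client error; the request has a syntax error or cannot be fulfilled
5XX: server error; the server failed to fulfil a valid request
This is only a rough overview; for more detail, see the error response documentation.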
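The first-digit rule is easy to turn into a small helper. A minimal sketch, with hypothetical function names status_class and is_success:

```python
def status_class(status):
    """Map an HTTP status code to the response class it belongs to."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    # Integer division keeps only the first digit; // stays an int on
    # both Python 2 and Python 3
    return classes.get(status // 100, "unknown")


def is_success(status):
    """True for any 2XX status code."""
    return status // 100 == 2


print(status_class(404))
print(is_success(204))
```

Writing the check as status // 100 rather than status / 100 matters on Python 3, where / returns a float and a code such as 204 would no longer compare equal to 2.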
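A few examples
The bundled oss_sample.py already contains examples of the main operations and is worth consulting. Below are samples of a few common operations: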
List all buckets
# Python 2, like the SDK itself
from oss_api import *
from oss_xml_handler import *

HOST = "storage.aliyun.com"
ACCESS_ID = ""
SECRET_ACCESS_KEY = ""

oss = OssAPI(HOST, ACCESS_ID, SECRET_ACCESS_KEY)

# Ask the service for every bucket owned by this account
res = oss.list_all_my_buckets()
if res.status // 100 == 2:
    http_body = res.read()
    bucket_list = GetServiceXml(http_body)
    for bucket in bucket_list.list():
        print bucket
else:
    print "ERROR"
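To see what a handler like GetServiceXml has to do, here is a standalone sketch that pulls bucket names out of a response body with the standard library. The sample XML is hypothetical and only illustrates the idea; real responses may be shaped differently (for instance by using XML namespaces), and bucket_names is not an SDK function.

```python
import xml.etree.ElementTree as ET

# Hypothetical response body, for illustration only
SAMPLE_BODY = """<ListAllMyBucketsResult>
  <Buckets>
    <Bucket><Name>hust</Name><CreationDate>2012-01-01T00:00:00.000Z</CreationDate></Bucket>
    <Bucket><Name>photos</Name><CreationDate>2012-02-01T00:00:00.000Z</CreationDate></Bucket>
  </Buckets>
</ListAllMyBucketsResult>"""


def bucket_names(body):
    """Return the text of every <Name> found under a <Bucket> element."""
    root = ET.fromstring(body)
    return [bucket.findtext("Name") for bucket in root.iter("Bucket")]


print(bucket_names(SAMPLE_BODY))
```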
List all objects in a bucket
from oss_api import *
from oss_xml_handler import *
import sys

HOST = "storage.aliyun.com"
ACCESS_ID = ""
SECRET_ACCESS_KEY = ""
bucket_name = sys.argv[1]  # bucket to list, given on the command line

oss = OssAPI(HOST, ACCESS_ID, SECRET_ACCESS_KEY)

res = oss.list_bucket(bucket_name)
if res.status // 100 == 2:
    data = res.read()
    h = GetBucketXml(data)
    (file_list, common_list) = h.list()
    for each in file_list:
        print each
Create a bucket
from oss_api import *
from oss_xml_handler import *
import sys

HOST = "storage.aliyun.com"
ACCESS_ID = ""
SECRET_ACCESS_KEY = ""

oss = OssAPI(HOST, ACCESS_ID, SECRET_ACCESS_KEY)
new_bucket = sys.argv[1]  # name of the bucket to create
res = oss.put_bucket(new_bucket)
if res.status // 100 == 2:
    print "Succeed"
else:
    print "Fail\n%s" % res.read()
Change a bucket's access control permission
A newly created bucket is private by default, so you have to set the permission yourself.
from oss_api import *
from oss_xml_handler import *
import sys

HOST = "storage.aliyun.com"
ACCESS_ID = ""
SECRET_ACCESS_KEY = ""
bucket_name = sys.argv[1]

oss = OssAPI(HOST, ACCESS_ID, SECRET_ACCESS_KEY)
# Calling put_bucket() on an existing bucket with an ACL updates its permission
res = oss.put_bucket(bucket_name, "public-read")

if res.status // 100 == 2:
    print "Succeed"
else:
    print "Fail\n%s" % res.read()
Upload a file to a bucket
from oss_api import *
from oss_xml_handler import *
import sys

HOST = "storage.aliyun.com"
ACCESS_ID = ""
SECRET_ACCESS_KEY = ""
bucket_name = sys.argv[1]
file_name = sys.argv[2]

oss = OssAPI(HOST, ACCESS_ID, SECRET_ACCESS_KEY)

# Store the local file under the object name "1.jpg"
res = oss.put_object_from_file(bucket_name, "1.jpg", file_name, "image/jpeg")
if res.status // 100 == 2:
    print "Succeed"
else:
    print "Fail\n%s" % res.read()
Download a file from a bucket
from oss_api import *
from oss_xml_handler import *

HOST = "storage.aliyun.com"
ACCESS_ID = ""
SECRET_ACCESS_KEY = ""

oss = OssAPI(HOST, ACCESS_ID, SECRET_ACCESS_KEY)
# Save the object "1.jpg" from the bucket "hust" as ./new.JPEG
res = oss.get_object_to_file("hust", "1.jpg", "./new.JPEG")
if res.status // 100 == 2:
    print "Succeed"
else:
    print "Fail\n%s" % res.read()
Delete a file from a bucket
The API does not provide a delete_object() counterpart to delete_bucket(), but object_operation() does the same job. In fact, delete_bucket() itself merely calls bucket_operation() with DELETE.
from oss_api import *
from oss_xml_handler import *

HOST = "storage.aliyun.com"
ACCESS_ID = ""
SECRET_ACCESS_KEY = ""

oss = OssAPI(HOST, ACCESS_ID, SECRET_ACCESS_KEY)
res = oss.object_operation("DELETE", "hust", "1.jpg")
if res.status // 100 == 2:
    print "Succeed!"
else:
    print res.read()
Merge objects into an object group
from oss_api import *
from oss_xml_handler import *

HOST = "storage.aliyun.com"
ACCESS_ID = ""
SECRET_ACCESS_KEY = ""

# Each entry is [part number, object name, ETag of that object]
obs_msg_list = [[0, "800.png", "43B0177C78BF904970F7B1C9005001F3"],
                [1, "new.JPEG", "5C6E71FDF26A82FBD42BC5D31787FE20"]]

# Build the XML message that describes the group
xml_string = r'<CreateFileGroup>'
for part in obs_msg_list:
    if isinstance(part[1], unicode):
        file_path = part[1].encode('utf-8')
    else:
        file_path = part[1]
    xml_string += r'<Part>'
    xml_string += r'<PartNumber>' + str(part[0]) + r'</PartNumber>'
    xml_string += r'<PartName>' + str(file_path) + r'</PartName>'
    xml_string += r'<ETag>"' + str(part[2]).upper() + r'"</ETag>'
    xml_string += r'</Part>'
xml_string += r'</CreateFileGroup>'

oss = OssAPI(HOST, ACCESS_ID, SECRET_ACCESS_KEY)
res = oss.post_object_group('hust', 'group.msh', xml_string)
if res.status // 100 == 2:
    print "Succeeded!"
else:
    print res.read()
At first I tried to write the XML <CreateFileGroup>...</CreateFileGroup> by hand, but it kept returning InvalidXMLFormat; I suspect it was an encoding problem.
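One way to reduce the risk of malformed XML is to let a library build the message instead of concatenating strings. The sketch below uses the standard library's ElementTree to produce the same element layout as the example above. build_file_group_xml is a hypothetical helper, and I have not verified whether the service accepts its output.

```python
import xml.etree.ElementTree as ET


def build_file_group_xml(parts):
    """Build a CreateFileGroup message.

    parts: iterable of (part_number, object_name, etag) tuples.
    """
    root = ET.Element("CreateFileGroup")
    for number, name, etag in parts:
        part = ET.SubElement(root, "Part")
        ET.SubElement(part, "PartNumber").text = str(number)
        # ElementTree escapes characters such as & and < in the name
        ET.SubElement(part, "PartName").text = name
        # Same convention as the example above: upper-case ETag in double quotes
        ET.SubElement(part, "ETag").text = '"%s"' % etag.upper()
    return ET.tostring(root, encoding="unicode")


print(build_file_group_xml([(0, "800.png", "43b0177c78bf904970f7b1c9005001f3")]))
```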
A more complex example
The program below wraps the operations above in a simple command-line tool. The code is a bit messy.
#coding=utf8
from oss_api import *
from oss_xml_handler import *
import os
import sys

HOST = "storage.aliyun.com"
ACCESS_ID = ""
SECRET_ACCESS_KEY = ""

class OssFS:
    def __init__(self, oHost, oId="", oKey=""):
        self.oHost = oHost
        self.oId = oId
        self.oKey = oKey
        self.oss = OssAPI(oHost, oId, oKey)
        self.buckets = []
        # Cache the names of all buckets owned by this account
        res = self.oss.list_all_my_buckets()
        http_body = res.read()
        if res.status // 100 == 2:
            bucket_list = GetServiceXml(http_body)
            for bucket in bucket_list.list():
                (h1, h2) = bucket
                self.buckets.append(str(h1))
        else:
            print http_body

    def show_buckets(self):
        '''Show all buckets'''
        for each in self.buckets:
            print each

    def sizetoKMGT(self, snum):
        '''Convert a size in bytes to Bytes/KB/MB/GB/TB'''
        if 0 == int(snum):
            return "obj grp"
        lsize = ['Bytes', 'KB', 'MB', 'GB', 'TB']
        i = 0
        snum = int(snum) * 1.0
        # Stop at the last unit so huge sizes cannot index past the list
        while snum > 1024 and i < len(lsize) - 1:
            snum = snum / 1024.0
            i += 1
        return ('%.2f' + " " + lsize[i]) % (snum)

    def show_objects(self, bucket='', path=''):
        '''Show all objects in the path from the bucket'''
        res = self.oss.list_bucket(bucket)
        http_body = res.read()
        if res.status // 100 == 2:
            h = GetBucketXml(http_body)
            (file_list, common_list) = h.list()
            for each in file_list:
                (fname, ctime, etag, fsize, owner, owner2, fstyle) = each
                print "%10s\t %s\t %s" % (str(fname), self.sizetoKMGT(fsize), etag)
        else:
            print http_body

    def file_upload(self, bucket='', path='', file_name=''):
        '''Upload a file from local PC to the OSS'''
        res = self.oss.put_object_from_file(bucket, file_name, file_name)
        if res.status // 100 == 2:
            print "Success to upload the file: " + file_name
            return True
        else:
            print res.read()
            return False

    def is_file(self, bucket, file_name):
        '''Whether the file_name is in the bucket or not'''
        res = self.oss.list_bucket(bucket)
        http_body = res.read()
        if res.status // 100 == 2:
            h = GetBucketXml(http_body)
            (file_list, common_list) = h.list()
            for each in file_list:
                fname = str(each[0])
                if fname == file_name:
                    return True
        return False

    def file_download(self, bucket, file_name):
        '''Download the file(file_name) from the bucket'''
        res = self.oss.get_object_to_file(bucket, file_name, file_name)
        if res.status // 100 == 2:
            print "Get file: " + file_name
        else:
            print res.read()

    def file_delete(self, bucket, file_name):
        '''Delete the file(file_name) from the bucket'''
        res = self.oss.object_operation("DELETE", bucket, file_name)
        if res.status // 100 == 2:
            print "Success to delete the file: " + file_name
            return True
        else:
            print res.read()
            return False

if __name__ == "__main__":
    # isRoot: whether we are at the root (bucket list) level
    isRoot = True
    uBucket = ''
    uPath = ''
    uFs = OssFS(HOST, ACCESS_ID, SECRET_ACCESS_KEY)

    print "==============================================="
    while True:
        strCmd = raw_input("Input your command:")
        cmd = strCmd.split(' ')

        if "ls" == cmd[0]:
            if 1 == len(cmd):
                if isRoot:
                    uFs.show_buckets()
                    continue
                else:
                    uFs.show_objects(uBucket, uPath)
                    continue
            elif 2 == len(cmd):
                if "local" == cmd[1]:
                    for each in os.listdir('.'):
                        print each
                    continue
                else:
                    print ">>>ERROR: BAD option! (Try \"ls local\")"
                    continue

        elif "cd" == cmd[0]:
            if len(cmd) != 2:
                print ">>>ERROR: BAD option! (Bucket name or file name need)"
                continue
            else:
                if '.' == cmd[1]:
                    continue
                elif '..' == cmd[1]:
                    if isRoot:
                        print "Root already"
                    else:
                        isRoot = True
                    continue
                else:
                    if not isRoot:
                        print ">>>ERROR BAD bucket, you are in bucket already"
                        continue
                    uBucket = cmd[1]
                    if uBucket not in uFs.buckets:
                        print ">>>ERROR BAD bucket, " + uBucket + " is not in your buckets"
                        continue
                    isRoot = False
                    continue

        elif "put" == cmd[0]:
            if isRoot:
                print ">>>ERROR: BAD PATH (you must cd a bucket first )"
                continue
            if len(cmd) != 2:
                print ">>>ERROR: BAD option! (put a file one time)"
                continue
            else:
                file_name = cmd[1]
                if os.path.isfile(file_name):  # the local file exists
                    uFs.file_upload(uBucket, uPath, file_name)
                    continue
                else:
                    print ">>>ERROR: BAD file name (file not exist!)"
                    continue

        elif "get" == cmd[0]:
            if len(cmd) != 2:
                print ">>>ERROR: BAD option"
                continue
            if isRoot:
                print ">>>ERROR: BAD get, access a bucket first"
                continue
            file_name = cmd[1]
            if not uFs.is_file(uBucket, file_name):
                print file_name + " is not in your bucket " + uBucket
                continue
            # Does not check whether the local file already exists
            uFs.file_download(uBucket, file_name)
            continue

        elif "delete" == cmd[0]:
            if isRoot:
                print ">>>BAD delete, I will not let you delete your bucket!"
                continue
            if len(cmd) != 2:
                print ">>>BAD options"
                continue
            file_name = cmd[1]
            if not uFs.is_file(uBucket, file_name):
                print file_name + " is not in your bucket " + uBucket
                continue
            uFs.file_delete(uBucket, file_name)
            continue

        elif "help" == cmd[0]:
            pass
            continue

        elif "quit" == cmd[0]:
            break

        else:
            print "Unable to recognize your command!"
            continue
    print "==============================================="
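The size-formatting step in the program above can also be tried on its own. Here is a standalone sketch that runs without the SDK. format_size is a hypothetical name, and unlike sizetoKMGT it treats exactly 1024 bytes as 1 KB and has no special case for zero.

```python
def format_size(num_bytes):
    """Format a byte count using Bytes/KB/MB/GB/TB with two decimals."""
    units = ["Bytes", "KB", "MB", "GB", "TB"]
    size = float(num_bytes)
    index = 0
    # Stop at the last unit so very large sizes cannot run off the list
    while size >= 1024 and index < len(units) - 1:
        size /= 1024.0
        index += 1
    return "%.2f %s" % (size, units[index])


print(format_size(1536))
```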
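This is a fairly old API. I suggest downloading the latest API and documentation from the official site, which should be much more detailed than what I have here.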