Aliyun Open Storage Service (OSS) API (Python)

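This article gives a brief introduction to Aliyun's Python API in three parts: the basic concepts of Aliyun storage, the Python API itself, and a few examples, so that newcomers can get started with Aliyun cloud storage quickly.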


Basic concepts in Aliyun storage

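Object

In Aliyun storage, every stored file is an object. An object consists of a key, data and user meta, which correspond to the object's name, its content and its description. Object names are UTF-8 encoded and must be between 1 and 1023 in length. The operations available on objects are: get, range query, delete and list.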
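The naming rule above can be checked before an upload is attempted. The following is a minimal sketch in Python 3, not part of the SDK: the helper name is_valid_object_key is hypothetical, and it assumes the 1-1023 limit is counted in UTF-8 bytes rather than characters.

```python
def is_valid_object_key(key):
    """Return True if `key` encodes to between 1 and 1023 bytes of UTF-8.

    Assumption: the length limit is measured in encoded bytes.
    """
    if not isinstance(key, str):
        return False
    try:
        encoded = key.encode("utf-8")
    except UnicodeEncodeError:
        # Lone surrogates, for example, cannot be encoded as UTF-8.
        return False
    return 1 <= len(encoded) <= 1023
```

A key made of CJK characters reaches the limit much sooner than an ASCII one, since each such character takes three bytes.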

Object Group

A loose collection made up of multiple objects. An object group can be operated on just like an object.

Bucket

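A bucket is the container that holds object data. Its name is globally unique across the whole Aliyun Open Storage Service (OSS), which makes it roughly equivalent to a second-level domain that Aliyun gives each user. Each user can create at most 10 buckets. Every object lives in a bucket, and there is no limit on how many objects a bucket can hold. A bucket name may contain lowercase letters, digits, underscores and hyphens, and must start with a lowercase letter or a digit. When a bucket is created through the browser, its name must be 6 to 16 characters long, whereas through the Python API it can be 3 to 255 characters long.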
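As an illustration, the bucket naming rule for the Python API can be written as a single regular expression. This is a sketch in Python 3 based only on the rule as stated in this article; the helper name is_valid_bucket_name is hypothetical, and the service itself may enforce further restrictions.

```python
import re

# Assumed rule, taken from the text: lowercase letters, digits, '_' and '-',
# first character a lowercase letter or digit, 3 to 255 characters in total.
_BUCKET_NAME_RE = re.compile(r"[a-z0-9][a-z0-9_-]{2,254}")


def is_valid_bucket_name(name):
    """Return True if `name` satisfies the naming rule described above."""
    return isinstance(name, str) and _BUCKET_NAME_RE.fullmatch(name) is not None
```

For the browser limit, the same pattern could be used with the length bounds changed to 6 and 16.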

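Access ID and Access Key

When a user registers for OSS, the system assigns an Access ID and an Access Key, used to identify and to authenticate the user respectively. The Access Key is a secret shared between the user and OSS and is never sent with a request. Instead, the client uses the Access Key to compute an HMAC-SHA1 signature over the key parts of each request (method, date, resource and so on) and sends that signature together with the Access ID. The server looks up the key for that Access ID, recomputes the signature, and accepts the request only if the two match. If the Access Key leaks, it has to be replaced. [If the paragraph above is unclear, feel free to skip it.]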
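To make the signing step concrete, here is a minimal sketch in Python 3. It assumes OSS-style HMAC-SHA1 signing with a simplified string-to-sign. The function names sign_request and authorization_header are mine, not the SDK's, and the exact canonical string and header format used by the real service should be checked against the official documentation.

```python
import base64
import hashlib
import hmac


def sign_request(secret_access_key, method, content_md5, content_type, date, resource):
    """Return a base64 HMAC-SHA1 signature over a simplified string-to-sign.

    Assumption: the fields are joined with newlines in this order; any
    canonicalized extension headers are left out of this sketch.
    """
    string_to_sign = "\n".join([method, content_md5, content_type, date, resource])
    digest = hmac.new(
        secret_access_key.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha1,
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def authorization_header(access_id, signature):
    """Assumed header layout: 'OSS <access id>:<signature>'."""
    return "OSS %s:%s" % (access_id, signature)
```

The secret key never appears in the request itself; only the Access ID and the signature do, which is why a leaked signature is far less serious than a leaked key.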

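To summarize:

  • Bucket, Object and Object Group all support these operations: create (upload), view, list, delete
  • A bucket's access permission can be changed (three levels: private, public read-only, public read-write)
  • Requests support HTTP parameters such as If-Modified-Since and If-Match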
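The conditional HTTP parameters in the last point are ordinary request headers. The sketch below (Python 3) builds such a headers dictionary; the helper name conditional_headers is hypothetical, and it assumes the resulting dict is passed as the headers argument of a call such as get_object().

```python
from email.utils import formatdate


def conditional_headers(modified_since=None, etag=None):
    """Build a headers dict for a conditional GET.

    modified_since: Unix timestamp in seconds, or None to omit the header.
    etag: entity tag with or without surrounding double quotes, or None.
    """
    headers = {}
    if modified_since is not None:
        # HTTP dates are always expressed in GMT.
        headers["If-Modified-Since"] = formatdate(modified_since, usegmt=True)
    if etag is not None:
        headers["If-Match"] = '"%s"' % etag.strip('"')
    return headers
```

With If-Modified-Since set, a server that honors the header answers 304 instead of resending an unchanged object.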

OSS API SDK

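Included .py files

This part looks at which operations the Python API provided by Aliyun storage supports. The downloaded Python SDK archive contains the following files: oss_api.py  oss_cmd.py  oss_fs.py  oss_sample.py  oss_util.py  oss_xml_handler.py. Their roles are: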

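oss_api.py    The classes and methods users call directly to operate on Objects, Buckets and so on

oss_util.py    Utility code: commonly used classes and methods that are not called directly

oss_xml_handler.py   Aliyun storage returns its error messages as XML; this file handles that XML

oss_cmd.py   Lets users operate on Objects and Buckets from the command line [optional]

oss_fs.py       A simple cloud storage file system built on OSS [optional]

oss_sample.py   A few simple, practical API examples [optional]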

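Note that some of the classes and methods used in these files are rather dated and may fail when run. For example, oss_util.py uses the md5 module, which has been merged into hashlib, so md5.new() is no longer available on newer Python versions; you will hit this error as soon as you run oss_sample.py.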
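If you run into the md5 problem, the usual fix is to switch to hashlib.md5(). A minimal sketch of the replacement follows; the helper names md5_hexdigest and file_md5 are my own, not names from oss_util.py.

```python
import hashlib


def md5_hexdigest(data):
    """Hex MD5 of a bytes object; replaces md5.new(data).hexdigest()."""
    return hashlib.md5(data).hexdigest()


def file_md5(path, chunk_size=64 * 1024):
    """Hex MD5 of a file, read in chunks so large files are not loaded whole."""
    h = hashlib.md5()
    with open(path, "rb") as fp:
        for chunk in iter(lambda: fp.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

hashlib has been available since Python 2.5, so the same calls work on both old and new interpreters.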

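Main methods

Now for the classes and methods that oss_api.py provides. The whole file contains a single class, OssAPI. To use the API, first create an instance of it with

oss = OssAPI(HOST, ACCESS_ID, SECRET_ACCESS_KEY)

The class exposes the following public methods: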

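sign_url_auth_with_expire_time(self, method, url, headers = {}, resource="/", timeout = 60)

Builds the authentication signature from the given method, url, headers and resource, and returns a signed URL. method is one of PUT, GET, DELETE, HEAD. The result is effectively a public, time-limited signed link.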

bucket_operation(self, method, bucket, headers={}, params={})

Sends a bucket operation request. method is one of PUT, GET, DELETE, HEAD.

object_operation(self, method, bucket, object, headers = {}, data="")

Sends an object operation request. method is one of PUT, GET, DELETE, HEAD.

get_service(self)

Lists all buckets; equivalent to calling list_all_my_buckets(self).

get_bucket_acl(self, bucket)

Gets the bucket's access control permission, which is one of public-read-write, public-read and private.

get_bucket(self, bucket, prefix='', marker='', delimiter='', maxkeys='', headers = {})

Lists all objects in the bucket; equivalent to calling list_bucket(), so the parameters and the result are the same.

create_bucket(self, bucket, acl='', headers = {})

Creates a bucket; this actually calls put_bucket().

delete_bucket(self, bucket)

Deletes a bucket.

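put_object_with_data(self, bucket, object, input_content, content_type=DefaultContentType, headers = {})

Writes data into an object; equivalent to calling put_object_from_string().

put_object_from_file(self, bucket, object, filename, content_type=DefaultContentType, headers = {})

Reads from a file and writes it into an object, in other words uploads the file. The filename parameter is the absolute path of the file.

put_object_from_fp(self, bucket, object, fp, content_type=DefaultContentType, headers = {})

Writes data into an object from a file pointer. The only difference from the previous method is that the parameter is a file pointer.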

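get_object(self, bucket, object, headers = {})

Gets an object, by calling object_operation() with the GET method.

get_object_to_file(self, bucket, object, filename, headers = {})

Gets an object and writes it to a file, in other words downloads the file; this also calls object_operation() with the GET method.

head_object(self, bucket, object, headers = {})

Gets the object's header information without downloading the whole object.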

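post_object_group(self, bucket, object, object_group_msg_xml, headers = {}, params = {})

Uploads an object group: all objects listed in object_group_msg_xml are combined into one object.

get_object_group_index(self, bucket, object, headers = {})

Gets the index of an object group, by calling object_operation() with the GET method.

put_object_from_file_given_pos(self, bucket, object, filename, offset, partsize, content_type=DefaultContentType, headers = {})

Reads partsize bytes starting at position offset of the file filename into an object and uploads it to the bucket.

upload_large_file(self, bucket, object, filename, thread_num = 10, max_part_num = 1000)

Uploads a large file: the file is split into up to max_part_num parts (1000 by default), each part is uploaded to the bucket separately, and the parts are finally merged into one large object.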
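The splitting step behind a multi-part upload can be sketched as follows. This is not the SDK's actual implementation: the function split_parts and its min_part_size parameter are assumptions of mine, and the code only shows how a file size could be turned into (offset, partsize) pairs of the kind put_object_from_file_given_pos() takes.

```python
def split_parts(file_size, max_part_num=1000, min_part_size=1024 * 1024):
    """Return a list of (offset, partsize) pairs covering [0, file_size).

    The part size is the larger of min_part_size and the smallest size
    that keeps the number of parts at or below max_part_num.
    """
    if file_size <= 0:
        return []
    # Ceiling division without floating point.
    part_size = max(min_part_size, -(-file_size // max_part_num))
    parts = []
    offset = 0
    while offset < file_size:
        size = min(part_size, file_size - offset)
        parts.append((offset, size))
        offset += size
    return parts
```

Each pair can then be uploaded independently, for instance from a pool of worker threads, before the parts are merged.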

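Response information

conn is the connection returned by httplib.HTTPConnection(self.host). By specifying the method and headers you can operate on the site self.host, and res = conn.getresponse() returns the response. res has the following attributes and methods: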

    res.version

    res.reason

    res.status

    res.msg

    res.getheaders()

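Methods that talk to the server return a response whose status (res.status) is the HTTP response status code. It has three digits, and the first digit gives the response class:

    1XX: informational, the request was received and processing continues

    2XX: success, the request was received, understood and accepted

    3XX: redirection, further action is needed to complete the request

    4XX: client error, the request has a syntax error or cannot be fulfilled

    5XX: server error, the server failed to fulfil a valid request

This is only a rough overview of the response information; for more detail, see the error response documentation.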
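The class check used throughout the examples in this article comes down to an integer division by 100. A small sketch follows; the helper names status_class and is_success are mine.

```python
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def status_class(status):
    """Map an HTTP status code to the name of its response class."""
    return STATUS_CLASSES.get(status // 100, "unknown")


def is_success(status):
    """True for any 2XX status code."""
    return status // 100 == 2
```

Using // rather than / matters on Python 3, where 204 / 100 is 2.04 and a comparison with 2 would fail.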


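A few examples

The bundled oss_sample.py already contains examples of the main operations and is worth consulting. Below are samples of a few common operations: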

list buckets: list all buckets

    # Python 2 code (the SDK relies on httplib)
    from oss_api import *
    from oss_xml_handler import *

    HOST = "storage.aliyun.com"
    ACCESS_ID = ""
    SECRET_ACCESS_KEY = ""

    oss = OssAPI(HOST, ACCESS_ID, SECRET_ACCESS_KEY)

    res = oss.list_all_my_buckets()
    if res.status // 100 == 2:
        http_body = res.read()
        bucket_list = GetServiceXml(http_body)
        for bucket in bucket_list.list():
            print(bucket)
    else:
        print("ERROR")

 

list objects: list all objects in a bucket

    from oss_api import *
    from oss_xml_handler import *
    import sys

    HOST = "storage.aliyun.com"
    ACCESS_ID = ""
    SECRET_ACCESS_KEY = ""
    # bucket name taken from the command line
    bucket_name = sys.argv[1]

    oss = OssAPI(HOST, ACCESS_ID, SECRET_ACCESS_KEY)

    res = oss.list_bucket(bucket_name)
    if res.status // 100 == 2:
        data = res.read()
        h = GetBucketXml(data)
        (file_list, common_list) = h.list()
        for each in file_list:
            print(each)

 

create a bucket: create a new bucket

    from oss_api import *
    from oss_xml_handler import *
    import sys

    HOST = "storage.aliyun.com"
    ACCESS_ID = ""
    SECRET_ACCESS_KEY = ""

    oss = OssAPI(HOST, ACCESS_ID, SECRET_ACCESS_KEY)
    new_bucket = sys.argv[1]
    res = oss.put_bucket(new_bucket)
    if res.status // 100 == 2:
        print("Succeed")
    else:
        print("Fail\n%s" % res.read())

 

Change the access control permission of a bucket

A newly created bucket is private by default, so you need to set the permission yourself.

    from oss_api import *
    from oss_xml_handler import *
    import sys

    HOST = "storage.aliyun.com"
    ACCESS_ID = ""
    SECRET_ACCESS_KEY = ""
    bucket_name = sys.argv[1]

    oss = OssAPI(HOST, ACCESS_ID, SECRET_ACCESS_KEY)
    res = oss.put_bucket(bucket_name, "public-read")

    if res.status // 100 == 2:
        print("Succeed")
    else:
        print("Fail\n%s" % res.read())

 

Upload a file to a bucket

    from oss_api import *
    from oss_xml_handler import *
    import sys

    HOST = "storage.aliyun.com"
    ACCESS_ID = ""
    SECRET_ACCESS_KEY = ""
    bucket_name = sys.argv[1]
    file_name = sys.argv[2]

    oss = OssAPI(HOST, ACCESS_ID, SECRET_ACCESS_KEY)

    # the object is stored as "1.jpg"; "image/jpeg" is the registered MIME type
    res = oss.put_object_from_file(bucket_name, "1.jpg", file_name, "image/jpeg")
    if res.status // 100 == 2:
        print("Succeed")
    else:
        print("Fail\n%s" % res.read())

 

Download a given file from a bucket

    from oss_api import *
    from oss_xml_handler import *

    HOST = "storage.aliyun.com"
    ACCESS_ID = ""
    SECRET_ACCESS_KEY = ""

    oss = OssAPI(HOST, ACCESS_ID, SECRET_ACCESS_KEY)
    # download object "1.jpg" from bucket "hust" to ./new.JPEG
    res = oss.get_object_to_file("hust", "1.jpg", "./new.JPEG")
    if res.status // 100 == 2:
        print("Succeed")
    else:
        print("Fail\n%s" % res.read())

 

Delete a given file from a bucket

The API does not directly provide a delete_object() to go with delete_bucket(), but object_operation() does the same job. In fact, delete_bucket() itself just calls bucket_operation() with the DELETE method.
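If you want the missing convenience function, a thin wrapper is enough. The sketch below is hypothetical: delete_object is my own name, and FakeOss and FakeResponse are stand-ins so that the snippet runs without network access; with the real SDK you would pass an OssAPI instance instead.

```python
def delete_object(oss, bucket, object_name):
    """Delete one object via object_operation(); True on a 2XX status."""
    res = oss.object_operation("DELETE", bucket, object_name)
    return res.status // 100 == 2


class FakeResponse(object):
    """Stand-in for the HTTP response object (only .status is used)."""

    def __init__(self, status):
        self.status = status


class FakeOss(object):
    """Stand-in for OssAPI that records calls instead of sending them."""

    def __init__(self, status):
        self.status = status
        self.calls = []

    def object_operation(self, method, bucket, object_name, headers=None, data=""):
        self.calls.append((method, bucket, object_name))
        return FakeResponse(self.status)
```

For example, delete_object(FakeOss(204), "hust", "1.jpg") reports success, while a fake configured with 404 reports failure.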

    from oss_api import *
    from oss_xml_handler import *

    HOST = "storage.aliyun.com"
    ACCESS_ID = ""
    SECRET_ACCESS_KEY = ""

    oss = OssAPI(HOST, ACCESS_ID, SECRET_ACCESS_KEY)
    # delete object "1.jpg" from bucket "hust"
    res = oss.object_operation("DELETE", "hust", "1.jpg")
    if res.status // 100 == 2:
        print("Succeed!")
    else:
        print(res.read())

 

Merge objects into an object group

    from oss_api import *
    from oss_xml_handler import *

    HOST = "storage.aliyun.com"
    ACCESS_ID = ""
    SECRET_ACCESS_KEY = ""

    # each entry: [part number, object name, MD5 of the object]
    obs_msg_list = [[0,"800.png","43B0177C78BF904970F7B1C9005001F3"],[1,"new.JPEG","5C6E71FDF26A82FBD42BC5D31787FE20"]]

    xml_string = r'<CreateFileGroup>'
    for part in obs_msg_list:
        if isinstance(part[1], unicode):
            file_path = part[1].encode('utf-8')
        else:
            file_path = part[1]
        # one <Part> element per object, whichever branch was taken above
        xml_string += r'<Part>'
        xml_string += r'<PartNumber>' + str(part[0]) + r'</PartNumber>'
        xml_string += r'<PartName>' + str(file_path) + r'</PartName>'
        xml_string += r'<ETag>"' + str(part[2]).upper() + r'"</ETag>'
        xml_string += r'</Part>'
    xml_string += r'</CreateFileGroup>'

    oss = OssAPI(HOST, ACCESS_ID, SECRET_ACCESS_KEY)
    res = oss.post_object_group('hust', 'group.msh', xml_string)
    if res.status // 100 == 2:
        print("Succeeded!")
    else:
        print(res.read())

At first I tried to write the XML <CreateFileGroup>...</CreateFileGroup> by hand, but the server kept returning InvalidXMLFormat; my guess is that it was an encoding problem.
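One way to avoid hand-written XML is to let a library do the escaping. The following sketch (Python 3) builds the same CreateFileGroup document with xml.etree.ElementTree. The function name build_file_group_xml is mine, and I have not verified that the server accepts this exact output.

```python
import xml.etree.ElementTree as ET


def build_file_group_xml(parts):
    """Build a <CreateFileGroup> document.

    parts: iterable of (part_number, object_name, md5_hex) tuples.
    """
    root = ET.Element("CreateFileGroup")
    for number, name, md5_hex in parts:
        part = ET.SubElement(root, "Part")
        ET.SubElement(part, "PartNumber").text = str(number)
        ET.SubElement(part, "PartName").text = name
        # The ETag is the upper-case MD5 wrapped in double quotes.
        ET.SubElement(part, "ETag").text = '"%s"' % md5_hex.upper()
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")
```

Characters such as & or < in an object name are escaped automatically, which is easy to get wrong with string concatenation.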

Something a bit more complex

The script below wraps the operations above and offers a simple command-line interface. It is a little messy.

    # coding=utf8
    from oss_api import *
    from oss_xml_handler import *
    import os
    import sys

    HOST = "storage.aliyun.com"
    ACCESS_ID = ""
    SECRET_ACCESS_KEY = ""

    class OssFS:
        def __init__(self, oHost, oId="", oKey=""):
            self.oHost = oHost
            self.oId = oId
            self.oKey = oKey
            self.oss = OssAPI(oHost, oId, oKey)
            self.buckets = []
            # cache the bucket names once at start-up
            res = self.oss.list_all_my_buckets()
            http_body = res.read()
            if res.status // 100 == 2:
                bucket_list = GetServiceXml(http_body)
                for bucket in bucket_list.list():
                    (h1, h2) = bucket
                    self.buckets.append(str(h1))
            else:
                print(http_body)

        def show_buckets(self):
            '''Show all buckets'''
            for each in self.buckets:
                print(each)

        def sizetoKMGT(self, snum):
            '''Convert a size in bytes to Bytes/KB/MB/GB/TB'''
            if 0 == int(snum):
                return "obj grp"
            lsize = ['Bytes', 'KB', 'MB', 'GB', 'TB']
            i = 0
            snum = int(snum) * 1.0
            # stop at TB so the index never runs past the unit list
            while snum > 1024 and i < len(lsize) - 1:
                snum = snum / 1024.0
                i += 1
            return '%.2f %s' % (snum, lsize[i])

        def show_objects(self, bucket='', path=''):
            '''Show all objects in the path from the bucket'''
            res = self.oss.list_bucket(bucket)
            http_body = res.read()
            if res.status // 100 == 2:
                h = GetBucketXml(http_body)
                (file_list, common_list) = h.list()
                for each in file_list:
                    (fname, ctime, etag, fsize, owner, owner2, fstyle) = each
                    print("%10s\t %s\t %s" % (str(fname), self.sizetoKMGT(fsize), etag))
            else:
                print(http_body)

        def file_upload(self, bucket='', path='', file_name=''):
            '''Upload a file from local PC to the OSS'''
            res = self.oss.put_object_from_file(bucket, file_name, file_name)
            if res.status // 100 == 2:
                print("Success to upload the file: " + file_name)
                return True
            else:
                print(res.read())
                return False

        def is_file(self, bucket, file_name):
            '''Whether the file_name is in the bucket or not'''
            res = self.oss.list_bucket(bucket)
            http_body = res.read()
            if res.status // 100 == 2:
                h = GetBucketXml(http_body)
                (file_list, common_list) = h.list()
                for each in file_list:
                    fname = str(each[0])
                    if fname == file_name:
                        return True
            # not found, or the listing request itself failed
            return False

        def file_download(self, bucket, file_name):
            '''Download the file(file_name) from the bucket'''
            # the call goes through self.oss, not self
            res = self.oss.get_object_to_file(bucket, file_name, file_name)
            if res.status // 100 == 2:
                print("Get file: " + file_name)
            else:
                print(res.read())

        def file_delete(self, bucket, file_name):
            '''Delete the file(file_name) from the bucket'''
            res = self.oss.object_operation("DELETE", bucket, file_name)
            if res.status // 100 == 2:
                print("Success to delete the file: " + file_name)
                return True
            else:
                print(res.read())
                return False


    if __name__ == "__main__":
        # isRoot: whether we are at the root level (the list of buckets)
        isRoot = True
        uBucket = ''
        uPath = ''
        uFs = OssFS(HOST, ACCESS_ID, SECRET_ACCESS_KEY)

        print("===============================================")
        while True:
            strCmd = raw_input("Input your command:")
            cmd = strCmd.split(' ')

            if "ls" == cmd[0]:
                if 1 == len(cmd):
                    if isRoot:
                        uFs.show_buckets()
                    else:
                        uFs.show_objects(uBucket, uPath)
                    continue
                elif 2 == len(cmd):
                    if "local" == cmd[1]:
                        for each in os.listdir('.'):
                            print(each)
                    else:
                        print(">>>ERROR: BAD option! (Try \"ls local\")")
                    continue

            elif "cd" == cmd[0]:
                if len(cmd) != 2:
                    print(">>>ERROR: BAD option! (Bucket name or file name need)")
                    continue
                if '.' == cmd[1]:
                    continue
                elif '..' == cmd[1]:
                    if isRoot:
                        print("Root already")
                    else:
                        isRoot = True
                    continue
                else:
                    if not isRoot:
                        print(">>>ERROR BAD bucket, you are in bucket already")
                        continue
                    uBucket = cmd[1]
                    if uBucket not in uFs.buckets:
                        print(">>>ERROR BAD bucket " + uBucket + " is not in your buckets")
                        continue
                    isRoot = False
                    continue

            elif "put" == cmd[0]:
                if isRoot:
                    print(">>>ERROR: BAD PATH (you must cd a bucket first )")
                    continue
                if len(cmd) != 2:
                    print(">>>ERROR: BAD option! (put a file one time)")
                    continue
                file_name = cmd[1]
                if os.path.isfile(file_name):    # the local file exists
                    uFs.file_upload(uBucket, uPath, file_name)
                else:
                    print(">>>ERROR: BAD file name (file not exist!)")
                continue

            elif "get" == cmd[0]:
                if len(cmd) != 2:
                    print(">>>ERROR: BAD option")
                    continue
                if isRoot:
                    print(">>>ERROR: BAD get, access a bucket first")
                    continue
                file_name = cmd[1]
                if not uFs.is_file(uBucket, file_name):
                    print(file_name + " is not in your bucket " + uBucket)
                    continue
                # does not check whether the local file already exists
                uFs.file_download(uBucket, file_name)
                continue

            elif "delete" == cmd[0]:
                if isRoot:
                    print(">>>BAD delete, I will not let you delete your bucket!")
                    continue
                if len(cmd) != 2:
                    print(">>>BAD options")
                    continue
                file_name = cmd[1]
                if not uFs.is_file(uBucket, file_name):
                    print(file_name + " is not in your bucket " + uBucket)
                    continue
                uFs.file_delete(uBucket, file_name)
                continue

            elif "help" == cmd[0]:
                # not implemented yet
                continue

            elif "quit" == cmd[0]:
                break

            else:
                print("Unable to recognize your command!")
                continue
        print("===============================================")
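One weakness of splitting the input on single spaces is that file names containing spaces cannot be entered, and a blank line yields an empty command. A possible alternative, sketched here in Python 3 with a hypothetical helper parse_command, is to use shlex so that quoted arguments stay together.

```python
import shlex


def parse_command(line):
    """Split an input line into (command, args).

    Quoted arguments are kept together; blank input gives ('', []).
    Raises ValueError if a quote is left unclosed.
    """
    tokens = shlex.split(line)
    if not tokens:
        return "", []
    return tokens[0], tokens[1:]
```

With this helper, typing put "my file.jpg" would give the command put with the single argument my file.jpg.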
