Differentiated Storage Services

Original paper link

  • Main idea

We propose an I/O classification architecture to close the widening semantic gap between computer systems and storage systems.

(My understanding: the paper's contribution is to assign files to classes and cache them according to class, which improves efficiency.)

 

  • Workload

Filesystem prototypes and a database proof-of-concept that classify all disk I/O, with very little modification to the filesystem, database, and operating system.

 

  • Problems

The block-based storage interface makes it difficult for computer systems to optimize for increasingly complex storage system internals, and storage systems do not have the semantic information to optimize independently. The block interface is fairly stable, but the price is that storage has no upper-layer semantic information to guide its optimizations.

 

  • Existing solutions
    1. Obtain more knowledge of storage system internals and use this information to guide block allocation (the computer system learns more about the storage system).
    2. Discover more about on-disk data structures and optimize I/O accesses to these structures (I/O optimization per on-disk data structure).
    3. The I/O interface can evolve and become more expressive; object-based storage (oh my, this reminds me of my advisor) and type-safe disks fall into this category (an enhanced I/O interface).

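Each of these solutions runs into a corresponding problem: 1. it is hard for the computer to obtain storage system internals reliably; 2. it increases the complexity of the computer system; 3. it is constrained by the block device.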

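The paper mentions one more approach, which I had also considered before: improving I/O performance with a prediction model plus prefetching. However, a good prediction model is hard to obtain, and the approach can easily backfire.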

 

  • Solution provided by the paper

We modify the OS block layer so that every I/O request carries a classifier. In this way, a storage system can provide block-level differentiated services—and do so on a class-by-class basis.
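To make the data flow concrete, here is a minimal Python sketch of a request that carries its class from the block layer into the command seen by storage. All names (`BlockRequest`, `StorageCommand`, `to_command`, `ClassAwareStorage`) and the policy strings are hypothetical and only model the idea; the real mechanism lives in kernel data structures and SCSI/ATA commands, which this sketch does not reproduce.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockRequest:
    """OS-level block I/O request (hypothetical stand-in for a kernel structure)."""
    lba: int            # logical block address
    length: int         # number of blocks
    is_write: bool
    class_id: int = 0   # classifier set by the filesystem; 0 = unclassified here


@dataclass(frozen=True)
class StorageCommand:
    """Command sent to the storage system (hypothetical stand-in for SCSI/ATA)."""
    lba: int
    length: int
    is_write: bool
    class_id: int


def to_command(req: BlockRequest) -> StorageCommand:
    """Copy the classifier into the command so it travels with every request."""
    return StorageCommand(req.lba, req.length, req.is_write, req.class_id)


class ClassAwareStorage:
    """Toy storage system that picks a policy per class, one command at a time."""

    def __init__(self, policy_by_class, default_policy="bypass_cache"):
        self.policy_by_class = dict(policy_by_class)
        self.default_policy = default_policy
        self.served = Counter()   # how many commands each policy handled

    def submit(self, cmd: StorageCommand) -> str:
        # The decision uses only the class carried by this command.
        policy = self.policy_by_class.get(cmd.class_id, self.default_policy)
        self.served[policy] += 1
        return policy


storage = ClassAwareStorage({1: "cache_first"})
metadata_write = BlockRequest(lba=0, length=8, is_write=True, class_id=1)
plain_read = BlockRequest(lba=4096, length=64, is_write=False)
print(storage.submit(to_command(metadata_write)))
print(storage.submit(to_command(plain_read)))
```

Because the class rides along with each command, the toy storage side needs no per-block bookkeeping to choose a policy; it only needs the class-to-policy table.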

 

  • OS/FS/APP requirements
    1. In Linux and Windows, we add a classification field to the OS data structure for block I/O (the Linux “BIO” and the Windows “IRP”) and we copy this field into the actual I/O command (SCSI or ATA) before it is sent to the storage system.
    2. The FS has a classification scheme for its I/O and assigns a policy to each class.
    3. Differentiated Storage Services is a stateless protocol ... (I don't fully understand this point. Since every read and write carries a classifier, the classification should be independent of the storage device. Is it because, in the implementation, classes are written to different storage media and the block locations are expected to stay fixed afterwards, whereas in other cases the block allocation might change?)
    4. A classifier is attached to every read and write.
  • Classifications

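Class IDs 1-18 and 0 correspond to 19 classes: Ext3 superblock, group descriptor, bitmap, inode, indirect block, directory entry, journal entry, File <= 4KB, File <= 16KB, ..., File > 1GB, and unclassified. The corresponding priorities are 0 for each of the first seven classes, then 1 through 12 for the remaining ones (0 is the highest).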
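As an illustration, the size-based part of this scheme could be sketched as below. The sketch assumes a reading in which the seven priority-0 classes take IDs 1 to 7, file classes take IDs 8 to 18 with priorities 1 to 11, and unclassified I/O is class 0 with priority 12. The intermediate size thresholds are elided in these notes, so the 4x spacing between buckets is an assumption, and `classify_file` is a hypothetical helper.

```python
KB = 1024
UNCLASSIFIED_ID, UNCLASSIFIED_PRIORITY = 0, 12
FIRST_FILE_CLASS_ID = 8      # assumed: IDs 1-7 are the priority-0 classes
FIRST_FILE_PRIORITY = 1

# Assumed upper bounds: 4KB, 16KB, ... growing 4x up to 1GB (10 buckets).
SIZE_LIMITS = [4 * KB * 4 ** i for i in range(10)]


def classify_file(size_bytes):
    """Return (class_id, priority) for file data of the given size."""
    if size_bytes is None or size_bytes < 0:
        # Size unknown: fall back to the unclassified class.
        return UNCLASSIFIED_ID, UNCLASSIFIED_PRIORITY
    for i, limit in enumerate(SIZE_LIMITS):
        if size_bytes <= limit:
            return FIRST_FILE_CLASS_ID + i, FIRST_FILE_PRIORITY + i
    # Larger than the last bound (1GB): the final file class.
    n = len(SIZE_LIMITS)
    return FIRST_FILE_CLASS_ID + n, FIRST_FILE_PRIORITY + n


for size in (1 * KB, 10 * KB, 2 * 1024 ** 3):
    print(size, classify_file(size))
```

Smaller files get a smaller priority number, so under this reading they are treated as more important than large files and are ranked just below metadata.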

 

  • Evaluation

Workload: file server, e-mail server, and PostgreSQL database

Method: an in-house, file-based workload generator drives the file server and e-mail server workloads and records the number of files written/read per second. The performance of three storage configurations is compared: no SSD cache, an LRU cache, and an enhanced LRU cache (LRU-S) that uses selective allocation and selective eviction.
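The two LRU-S mechanisms can be sketched as follows. This is only one plausible reading of "selective allocation" and "selective eviction", not the paper's implementation. The sketch assumes that a block is admitted when there is free space or when it is at least as important as the block it would displace, and that eviction always picks the least recently used block of the least important class present. `LRUSCache` and its methods are hypothetical names, and priority 0 is the highest.

```python
from collections import OrderedDict


class LRUSCache:
    """Toy priority-aware LRU cache; lower priority number = more important."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.blocks = OrderedDict()   # block -> priority, least recent first

    def _victim(self):
        """Selective eviction: LRU block within the least important class."""
        worst = max(self.blocks.values())
        for block, prio in self.blocks.items():   # iterates least recent first
            if prio == worst:
                return block, prio

    def access(self, block, priority):
        """Return True on a cache hit, False on a miss."""
        if block in self.blocks:
            self.blocks[block] = priority      # refresh the recorded class
            self.blocks.move_to_end(block)     # mark as most recently used
            return True
        if len(self.blocks) >= self.capacity:
            victim, victim_prio = self._victim()
            # Selective allocation: never displace more important data.
            if priority > victim_prio:
                return False
            del self.blocks[victim]
        self.blocks[block] = priority
        return False


cache = LRUSCache(capacity=2)
cache.access("inode-7", 0)        # metadata, admitted
cache.access("big-file-1", 11)    # large file, admitted while space is free
cache.access("small-file-3", 1)   # displaces the large-file block
cache.access("big-file-2", 11)    # rejected: everything cached is more important
print(list(cache.blocks))
```

With a single class the sketch degenerates to plain LRU, which makes the contrast with the LRU configuration easy to see: the only difference is which blocks are allowed in and which are pushed out first.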

Result: LRU-S caches many more small files and much more metadata than plain LRU, and all of the measured results are consistent with that.

 

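The paper proposes a way to narrow the semantic gap between storage systems and computer systems: attach an identifier to every read and write so that the data being accessed is classified by its attributes, which improves system performance. The drawbacks are just as obvious. First, it requires changes to the application, the operating system, the storage system, and even the storage device, although the paper stresses that each of these changes is small. Second, the experiments used a 10GB SSD as the cache; the paper does not report results for larger caches, so it is hard to say whether the approach carries over to the caches in use today, though it may suit hybrid drives such as the Momentus XT. Finally, whichever way you look at it, the paper attaches an attribute to block I/O (or to files), and isn't that exactly the idea behind objects? Since moving block devices to object-based storage is difficult, building a simple form of object storage as this paper does may be a good direction.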