来自dirkmeister 关于存储系统的总结

原文:http://dirkmeister.blogspot.com/2010/01/storage-systems-course-my-own-idea.html 需翻墙

 

我上一篇文章总结了国际顶级学校及其存储系统实验室的一些存储课程。

在这篇文章中,我将提取我自己对存储课程的一些观点和想法并组织起来。我假设有个15 周的课程并且每周有1 个半小时讲演(就像我们在德国):

引言,概述,磁盘驱动器结构

Material: Ruemmler, Wilkes An introduction to disk drive modeling

  1. Disk Scheduling / SSD 磁盘调度/固态硬盘
    Material: Iyer, Druschel. Anticipatory scheduling: A disk scheduling framework to overcome deceptive idleness in synchronous I/O, Agrawal et al. Design Tradeoffs for SSD Performance
  2. RAID  冗余磁盘整列
    Material: Patterson et al. Introduction to Redundant Arrays of Inexpensive Disk (RAID), Corbett. Row-Diagonal Parity for Double Disk Failure Correction
  3. Local File Systems  本地文件系统
  4. Local File System Case Studies: ext3, btrfs
    Material: Valerie Aurora. A short history of btrfs, Card et al. Design and Implementation of the Second Extended Filesystem
  5. Local File Structures (Sequential, Hashing, B-Tree)  本地文件结构
    Material: Comer. The Ubiquitous B-Tree
  6. SAN / NAS / Object-based Storage
    Material: Sacks. Demystifying DAS, SAN, NAS, NAS Gateways, Fibre Channel, and iSCSI
  7. Examples: NFS, Ceph, GoogleFS/Hadoop DFS
    Material: Weil. Ceph, A scalable, high-performance distributed file system, Ghemawat et al. The Google File System
  8. Snapshots and Log-based Storage Designs  快照和基于日志存储设计
    Material: Brinkmann, Effert. Snapshots and Continuous Data Replication in Cluster Storage Environments, Hitz et al.File System Design for an NFS File Server Appliance, Rosenblum, Ousterhout. The Design and Implementation of a Log-Structured File System
  9. Fault Tolerance, Journaling, and Soft Updates 容错,日志和软件升级
    Material: Prabhakaran et al. Analysis and Evolution of Journaling File Systems, Seltzer et al. Journaling Versus Soft Updates: Asynchronous Meta-data Protection in File Systems
  10. Advanced Hashing: Consistent Hashing, Share, and Crush
    Material: Karger et al. Consistent hashing and random trees: distributed caching protocols for relieving hot spots on the World Wide Web, Weil et al. CRUSH: controlled, scalable, decentralized placement of replicated data
  11. Caching, Replication  缓存,副本管理
    Material: Nelson et al. Caching in the Sprite network file system, Kistler et al. Disconnected operation in the Coda File System
  12. Consistency, Availability, and Partition Tolerance  一致性,可用性和分区容忍性
    Material: DeCandia et al. Dynamo: Amazon’s Highly Available Key-value Store, Helland, Life beyond Distributed Transaction: An Apostate’s Opinion
  13. Data Deduplication  重复数据删除
    Material: Muthitacharoen et al., A Low-bandwidth Network File System, Douglis, Iyengar. Application-specific Delta-encoding via Resemblance Detection
  14. Performance Analysis  新能分析
    Material: Traeger, A nine year study of file system and storage benchmarking (at least parts of it)

我推荐的一些书:

对于我来说,有些关键点非常重要:

  • To clearly separate between classes of file systems and a concrete example. The best example is the class of network file systems vs. NFS. At the end there should be no much question if something is an inherent property of a class of file systems or of the concrete implementation  能清楚的分辨不同类型文件系统和具体的例子。最好的例子就是网络文件系统类和NFS。最后应该对某个东西是某类文件系统的相关性质还是具体应用不存在疑问
  • To have enough time to handle the basic concepts independently from concrete usages. For example explaining B-Trees as an important file structures independent from the usage in e.g. BTRFS.    独立于具体应用,花足够的时间掌握基本概念。比如解释B-树是一个重要的文件结构,独立于比如在BTRFS 中的使用
  • The concepts are more important than the current technology or standards.  概念远比当前的一些技术和标准重要。