CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data (Part 1)

deltamaster posted @ 13 years ago in Uncategorized with tags Ceph CRUSH distributed storage decentralization

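** The following content is a translation

** Translator: softrank.net@gmail.com

** Please retain the above information when reposting

** Readers are welcome to point out errors in the translation or to offer suggestions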

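CRUSH: Controlled, Scalable, Decentralized Placement of Replicated Data

Authors: Sage A. Weil, Scott A. Brandt, Ethan L. Miller, Carlos Maltzahn

Storage Systems Research Center, University of California, Santa Cruz

{sage, scott, elm, carlosm}@cs.ucsc.edu

Abstract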

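Emerging large-scale distributed storage systems face the challenge of spreading petabytes of data across tens of thousands of storage devices. Such systems must distribute data and workload evenly so that system resources and performance are used as fully as possible, while also accommodating system growth and hardware failures. We have developed CRUSH, a scalable pseudo-random data distribution function for object-based storage systems that efficiently maps data objects to storage devices without consulting a central directory. Because large systems are inherently dynamic, CRUSH is designed to facilitate the addition and removal of storage devices while minimizing unnecessary data movement. The algorithm accommodates a wide range of data redundancy and reliability mechanisms, and it distributes data according to user-defined policies that enforce the separation of replicas across failure domains.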

1 Introduction

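Object-based storage is an emerging storage architecture that improves manageability, scalability, and performance [Azagury et al. 2003]. Unlike conventional block-based hard drives, object-based storage devices manage the allocation of physical blocks internally and expose only an interface for reading and writing named objects of varying size. In such a system, each file's data is typically striped across a relatively small number of named objects, which are spread throughout the storage cluster. Objects are replicated across multiple devices (or protected by some other data redundancy scheme) to guard against data loss when failures occur. Object-based storage systems replace large block maps with small object lists and distribute the low-level block allocation problem, which simplifies data layout. Although reducing the complexity of file allocation metadata greatly improves scalability, the fundamental problem of distributing data across thousands of devices, especially devices with differing capacities and performance characteristics, remains.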

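Most systems simply write new data to devices that still have free space. The fundamental problem with this approach is that data is rarely, if ever, moved once it has been written. Even a perfect distribution becomes imbalanced as the system grows, because new disks are either empty or hold only new data. Depending on the workload, either the old disks or the new disks may be busy, but only under the rarest of conditions will both be used equally so that the available resources are fully exploited.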

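A robust solution is to distribute all data randomly among the available storage devices. This yields a probabilistically balanced distribution and mixes old and new data together uniformly. When new devices are added to the system, a random sample of existing data is migrated onto them to restore balance. The advantage of this approach is that, on average, all devices carry a similar load, which lets the system perform well under any potential workload [Santos et al. 2000]. Furthermore, in a large storage system a single large file is spread randomly across many of the available devices, providing a high degree of parallelism and aggregate bandwidth. However, simple hash-based distribution cannot cope with changes in the number of devices and causes a massive reshuffling of data. In addition, existing randomized distribution schemes that spread each disk's replicas across many other devices increase the probability of data loss from coincident device failures.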
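The reshuffling problem with simple hash-based distribution can be illustrated with a short sketch. It assumes the naive scheme `hash(key) % N`, which is only one possible reading of "simple hash-based distribution"; the helper names (`stable_hash`, `place_mod_n`, `fraction_moved`) and the object naming scheme are hypothetical and chosen for illustration.

```python
import hashlib


def stable_hash(key: str) -> int:
    """Deterministic 64-bit hash of a key (SHA-256 prefix; an arbitrary choice)."""
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


def place_mod_n(key: str, num_devices: int) -> int:
    """Naive placement: device index = hash(key) mod number of devices."""
    return stable_hash(key) % num_devices


def fraction_moved(num_objects: int, old_n: int, new_n: int) -> float:
    """Fraction of objects whose device changes when the cluster is resized."""
    moved = sum(
        1
        for i in range(num_objects)
        if place_mod_n(f"obj-{i}", old_n) != place_mod_n(f"obj-{i}", new_n)
    )
    return moved / num_objects


# Growing a 10-device cluster by one device.
naive = fraction_moved(10_000, 10, 11)
# Rebalancing only needs to move the new device's fair share of the data.
ideal = 1 / 11

print(f"moved with hash mod N: {naive:.1%}")
print(f"minimum needed for balance: {ideal:.1%}")
```

Under these assumptions most objects change device when one device is added, even though only about one eleventh of the data needs to move to restore balance.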

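We have developed CRUSH (Controlled Replication Under Scalable Hashing), a pseudo-random data distribution algorithm that efficiently and reliably distributes data across a heterogeneous, structured storage cluster. CRUSH is a pseudo-random, deterministic algorithm that maps an input value, typically an object or object group identifier, to the set of devices that store the replicas of that object. In clear contrast to conventional approaches, it does not require any kind of file or object directory: CRUSH needs only a compact, hierarchical description of the devices that make up the storage cluster, together with the data placement policy. This approach has two key advantages. First, it is completely distributed, so any party in a large system can independently calculate the location of an object. Second, the small amount of metadata that is required is mostly static and changes only when devices are added or removed.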
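The idea of directory-free, deterministic placement can be sketched as follows. This is not the CRUSH algorithm, which this part of the article does not yet describe; it uses rendezvous (highest-random-weight) hashing as a stand-in, over a flat device list rather than a hierarchy. The function names and the `osdN` device names are hypothetical.

```python
import hashlib
from typing import List, Sequence


def score(key: str, device: str) -> int:
    """Pseudo-random but deterministic score for a (key, device) pair."""
    digest = hashlib.sha256(f"{key}|{device}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def locate(key: str, devices: Sequence[str], replicas: int = 3) -> List[str]:
    """Return the `replicas` devices with the highest scores for this key.

    Stand-in for a placement function: the only shared state is the
    device list, so no per-object directory is consulted.
    """
    return sorted(devices, key=lambda d: score(key, d), reverse=True)[:replicas]


cluster = [f"osd{i}" for i in range(10)]

# Two independent clients holding the same device list (in any order)
# compute the same replica set without talking to each other.
client_a = locate("object-42", cluster)
client_b = locate("object-42", list(reversed(cluster)))

# Adding a device changes the primary location of only some objects,
# and every object that moves lands on the new device.
grown = cluster + ["osd10"]
keys = [f"obj-{i}" for i in range(2000)]
changed = sum(locate(k, cluster, 1) != locate(k, grown, 1) for k in keys)

print(client_a == client_b)
print(f"primaries moved after adding one device: {changed / len(keys):.1%}")
```

The sketch shows the two properties named above: any party can compute an object's location from a small shared description, and that description changes only when the device list does.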

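CRUSH is designed to distribute data optimally so that available resources are fully used, to reorganize data efficiently when devices are added or removed, and to enforce flexible constraints on replica placement that maximize data safety in the presence of coincident or correlated device failures. A wide variety of data safety mechanisms are supported, including n-way replication (mirroring), RAID parity schemes or other forms of erasure coding, and hybrid approaches (for example, RAID-10). These features make CRUSH well suited to managing very large volumes of data (petabytes) in storage systems where scalability, performance, and reliability are all critical.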

* This article is published under the CC BY-SA (Attribution-ShareAlike) license.
