DNACloud: A Tool for Storing Big Data on DNA
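Still wondering where all that data should be stored? Goldman and his colleagues at the European Bioinformatics Institute have shown that DNA can serve as a storage medium, holding 1 PB of information in 1 gram of DNA, and the DNACloud software makes this kind of storage easier to use.
Where are we going to store the flood of data that everyone is so concerned about? How about in a DNACloud that stores 1 petabyte of information per gram of DNA?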
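Writing is a little slow. You convert your data file into a DNA description and send it to a biotech company, which synthesizes the DNA for you. You can then store it in your refrigerator.
Reading is a little slow too. The data can apparently be read back very accurately, but to read it you first have to sequence the DNA, and that may take a while.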
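The term big data usually describes the huge amounts of data that people generate through digital media such as cameras, the internet, phones, and sensors. By building advanced analytics on top of big data, one can predict many things about users, such as their behavior and interests. Before the data can be used, however, several storage problems must be solved. The two main ones are the need for large storage devices and the cost associated with them.
Synthetic DNA storage appears to be a suitable answer to these problems. In 2013, Goldman and his colleagues at the European Bioinformatics Institute demonstrated synthetic DNA as a storage medium with a capacity of 1 PB of information per gram of DNA, and retrieved the data successfully with a low error rate. This significant step shows that synthetic DNA is a promising technology for future data storage. The DNACloud software was developed to make it easier to store data on DNA.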
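DNACloud 1.0 can encode data files of any format (.text, .PDF, .PNG, .MKV, .MP3, and so on) into DNA and decode them back into the original files. You can use it to store your Facebook data or your videos in DNA.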
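To make the encode/decode round trip concrete, here is a minimal Python sketch. It is not DNACloud's actual codec. It is loosely modeled on the rotating base-3 idea from Goldman et al.'s 2013 work, where each base-3 digit selects one of the three nucleotides that differ from the previous one, so no base ever repeats back to back. The function names, the fixed six digits per byte, and the virtual starting base `"A"` are all assumptions made for illustration; a real pipeline would also add compression, indexing, overlapping fragments, and error detection.

```python
# Illustrative sketch only: NOT DNACloud's real encoder.
# All names and parameters below are hypothetical.

BASES = "ACGT"
TRITS_PER_BYTE = 6  # 3**6 = 729 >= 256, so six base-3 digits cover one byte


def bytes_to_trits(data: bytes) -> list:
    """Expand each byte into six base-3 digits, most significant first."""
    trits = []
    for byte in data:
        digits = []
        for _ in range(TRITS_PER_BYTE):
            byte, digit = divmod(byte, 3)
            digits.append(digit)
        trits.extend(reversed(digits))
    return trits


def trits_to_bytes(trits: list) -> bytes:
    """Inverse of bytes_to_trits."""
    if len(trits) % TRITS_PER_BYTE:
        raise ValueError("trit count must be a multiple of six")
    out = bytearray()
    for i in range(0, len(trits), TRITS_PER_BYTE):
        value = 0
        for trit in trits[i:i + TRITS_PER_BYTE]:
            value = value * 3 + trit
        if value > 255:
            raise ValueError("trit group does not represent a byte")
        out.append(value)
    return bytes(out)


def encode(data: bytes, start: str = "A") -> str:
    """Map bytes to a DNA string in which no base repeats consecutively."""
    prev = start  # virtual previous base, not written to the strand
    strand = []
    for trit in bytes_to_trits(data):
        choices = [b for b in BASES if b != prev]  # three candidates
        prev = choices[trit]
        strand.append(prev)
    return "".join(strand)


def decode(strand: str, start: str = "A") -> bytes:
    """Recover the original bytes from a strand produced by encode()."""
    prev = start
    trits = []
    for base in strand:
        choices = [b for b in BASES if b != prev]
        trits.append(choices.index(base))  # ValueError on a repeated or invalid base
        prev = base
    return trits_to_bytes(trits)
```

With these assumptions, `encode(b"A")` yields `"CATCAT"`, and `decode("CATCAT")` returns `b"A"`. Six nucleotides per byte is wasteful compared with a real scheme; the fixed width is chosen here only to keep the round trip easy to follow.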
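The latest version, DNACloud 2.0, lets users encode data into DNA with an efficient technique, nonlinear coding. DNA Golay codes of different lengths can be used for encoding, depending on the required error correction and code rate.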
DNACloud: A Tool For Storing Big Data On DNA
“From the dawn of civilization until 2003, humankind generated five exabytes (1 exabyte = 1 billion gigabytes) of data. Now we produce five exabytes every two days and the pace is accelerating.”
— Eric Schmidt, Executive Chairman, Google, August 4, 2010.
Where are we going to store the deluge of data everyone is warning us about? How about in a DNACloud that can store 1 petabyte of information per gram of DNA?
Writing is a little slow. You have to convert your data file to a DNA description that is sent to a biotech company that will send you back a vial of synthetic DNA. Where do you store it? Your refrigerator.
Reading is a little slow too. The data can apparently be read with great accuracy, but to read it you have to sequence the DNA first, and that might take a while.
The how of it is explained in DNACloud: A Tool for Storing Big Data on DNA (poster). Abstract:
The term Big Data is usually used to describe the huge amount of data that is generated by humans from digital media such as cameras, internet, phones, sensors etc. By building advanced analytics on top of big data, one can predict many things about the user such as behavior, interest etc. However, before one can use the data, one has to address many issues for big data storage. Two main issues are the need for large storage devices and the cost associated with them. Synthetic DNA storage seems to be an appropriate solution to address these issues of big data. Recently, in 2013, Goldman and his colleagues from the European Bioinformatics Institute demonstrated the use of DNA as a storage medium with the capacity to store 1 petabyte of information on one gram of DNA and retrieved the data successfully with a low error rate [1]. This significant step shows promise for synthetic DNA storage as a useful technology for future data storage. Motivated by this, we have developed software called DNACloud which makes it easy to store data on DNA. In this work, we present a detailed description of the software.
via: CSDN, translated by Wang Hui