现在的位置: 首页 > 综合 > 正文

HDFS

2013年07月28日 ⁄ 综合 ⁄ 共 1781字 ⁄ 字号 评论关闭
------------------------------------------------------------------------------------
简介
------------------------------------------------------------------------------------
(1)Files are stored in a redundant fashion across multiple machines to ensure their durability to failure and high availability to very parallel applications.
------------------------------------------------------------------------------------
Distributed File System Basics
------------------------------------------------------------------------------------
(2)NFS
While its design is straightforward, it is also very constrained. NFS provides remote access to a single logical volume stored on a single machine.
An NFS server makes a portion of its local file system visible to external clients. The clients can then mount this remote file system directly into their own Linux file system, and interact with it as though
it were part of the local drive.
One of the primary advantages of this model is its transparency.
缺点:(a)The files in an NFS volume all reside on a single machine
             (b)  all the clients must go to this machine to retrieve their data. This can overload the server if a large number of clients must be handled
------------------------------------------------------------------------------------
(3)HDFS
  (a)HDFS is designed to store a very large amount of information (terabytes or petabytes). This requires spreading the data across a large number of machines. It also supports much larger file sizes than NFS.
  (b)HDFS should store data reliably. If individual machines in the cluster malfunction, data should still be available.

  (c)HDFS should provide fast, scalable access to this information. It should be possible to serve a larger number of clients by simply adding more machines to the cluster.
  (d)HDFS should integrate well with Hadoop MapReduce, allowing data to be read and computed upon locally when possible.
(4) HDFS三特征: 分片(block64MB), 副本(3 replica),同步
(5) NameNode 存元数据,主要放在主存中, DataNode存数据片
(6) NameNode的可靠性非常重要,通过冗余来完成

抱歉!评论已关闭.