
Installing and Configuring Hadoop on Windows

--Install Cygwin

Install Cygwin and include sshd (select all packages related to openssh during package selection).
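
A quick way to verify the ssh packages were installed (cygcheck ships with the Cygwin base packages):

$ cygcheck -c | grep ssh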

Start Cygwin:

Copying skeleton files.
These files are for the users to personalise their cygwin experience.

They will never be overwritten nor automatically updated.

`./.bashrc' -> `/home/pengxuan.lipx//.bashrc'
`./.bash_profile' -> `/home/pengxuan.lipx//.bash_profile'
`./.inputrc' -> `/home/pengxuan.lipx//.inputrc'
`./.profile' -> `/home/pengxuan.lipx//.profile'

pengxuan.lipx@ALIBABA-25725 ~
$ chmod +r /etc/group

pengxuan.lipx@ALIBABA-25725 ~
$ chmod +r /etc/passwd

pengxuan.lipx@ALIBABA-25725 ~
$ chmod +rwx /var

pengxuan.lipx@ALIBABA-25725 ~
$ ssh-host-config
*** Info: Generating /etc/ssh_host_key
*** Info: Generating /etc/ssh_host_rsa_key
*** Info: Generating /etc/ssh_host_dsa_key
*** Info: Generating /etc/ssh_host_ecdsa_key
*** Info: Creating default /etc/ssh_config file
*** Info: Creating default /etc/sshd_config file
*** Info: Privilege separation is set to yes by default since OpenSSH 3.3.
*** Info: However, this requires a non-privileged account called 'sshd'.
*** Info: For more info on privilege separation read /usr/share/doc/openssh/README.privsep.
*** Query: Should privilege separation be used? (yes/no) no
*** Info: Updating /etc/sshd_config file
*** Info: Added ssh to C:\WINDOWS\system32\drivers\etc\services

*** Warning: The following functions require administrator privileges!

*** Query: Do you want to install sshd as a service?
*** Query: (Say "no" if it is already installed as a service) (yes/no) yes
*** Query: Enter the value of CYGWIN for the daemon: [] ntsec

*** Info: The sshd service has been installed under the LocalSystem
*** Info: account (also known as SYSTEM). To start the service now, call
*** Info: `net start sshd' or `cygrunsrv -S sshd'.  Otherwise, it
*** Info: will start automatically after the next reboot.

*** Info: Host configuration finished. Have fun!

pengxuan.lipx@ALIBABA-25725 ~
$ net start sshd
The CYGWIN sshd service is starting.
The CYGWIN sshd service was started successfully.
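
To double-check from the Cygwin side, cygrunsrv (mentioned in the ssh-host-config output above) can query the service state:

$ cygrunsrv -Q sshd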

pengxuan.lipx@ALIBABA-25725 ~
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
Generating public/private dsa key pair.
Created directory '/home/pengxuan.lipx/.ssh'.
Your identification has been saved in /home/pengxuan.lipx/.ssh/id_dsa.
Your public key has been saved in /home/pengxuan.lipx/.ssh/id_dsa.pub.
The key fingerprint is:
85:de:88:32:51:ad:9b:8c:68:e2:da:c1:4e:5b:ee:3f pengxuan.lipx@ALIBABA-25725
The key's randomart image is:
+--[ DSA 1024]----+
|      ..         |
|     .  ..       |
|    .  .. .      |
|     ..o +       |
|   .oo.oS .      |
|..o .o+          |
|.o+ .            |
| = =  E          |
|o +.o...         |
+-----------------+

pengxuan.lipx@ALIBABA-25725 ~
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
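
If ssh localhost still prompts for a password after this, sshd's StrictModes check is a common culprit; tightening the permissions usually resolves it:

$ chmod 700 ~/.ssh
$ chmod 600 ~/.ssh/authorized_keys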

pengxuan.lipx@ALIBABA-25725 ~
$ ssh localhost
The authenticity of host 'localhost (127.0.0.1)' can't be established.
ECDSA key fingerprint is 18:e0:c4:75:cb:7a:bf:01:3b:29:b9:58:ca:a2:e4:16.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.

pengxuan.lipx@ALIBABA-25725 ~
$ ssh localhost
Last login: Thu Jul  7 15:45:34 2011 from 127.0.0.1

pengxuan.lipx@ALIBABA-25725 ~
$ who
pengxuan.lipx tty0         2011-07-07 15:45 (127.0.0.1)
pengxuan.lipx tty1         2011-07-07 15:46 (127.0.0.1)


--Download and install Hadoop:

http://www.apache.org/dyn/closer.cgi/hadoop/core/

Just extract the archive; no installer is needed.
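
For example, assuming the 0.20.203.0 tarball (the version used below) was saved to D:\ (the exact filename depends on the release file you download):

$ cd /cygdrive/d
$ tar -xzf hadoop-0.20.203.0.tar.gz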

--Configure Hadoop:

hadoop-env.sh

# The java implementation to use.  Required.
# export JAVA_HOME=/usr/lib/j2sdk1.5-sun
  export JAVA_HOME='D:\Program Files\Java\jdk1.6.0_10'
 

 
--Configure core-site.xml (note: the dfs.* and mapred.* properties below conventionally live in hdfs-site.xml and mapred-site.xml, though they take effect here too because core-site.xml is always loaded):
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
    <description>The name of the default file system. Either the literal string "local" or a host:port for DFS.</description>
  </property>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
    <description>The host and port that the MapReduce job tracker runs at. If "local", then jobs are run in-process as a single map and reduce task.</description>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/hadoop-${user.name}</value>
    <description>A base for other temporary directories.</description>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/hadoop/hadoop/filesystem/name</value>
    <description>Determines where on the local filesystem the DFS name node should store the name table. If this is a comma-delimited list of directories then the name table is replicated in all of the directories, for redundancy.</description>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/hadoop/hadoop/filesystem/data</value>
    <description>Determines where on the local filesystem a DFS data node should store its blocks. If this is a comma-delimited list of directories, then data will be stored in all named directories, typically on different devices. Directories that do not exist are ignored.</description>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
    <description>Default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified at create time.</description>
  </property>
</configuration>

--Configure hdfs-site.xml based on hdfs-default.xml.
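
A minimal hdfs-site.xml sketch, reusing the illustrative paths and replication factor from core-site.xml above:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/home/hadoop/hadoop/filesystem/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/home/hadoop/hadoop/filesystem/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>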

--Configure mapred-site.xml based on mapred-default.xml.
Change the mapred.job.tracker value (the property block below goes inside the <configuration> element, as in core-site.xml). Note that localhost:9999 here differs from the localhost:9001 set in core-site.xml above; pick one port and keep the two files consistent.
<property>
  <name>mapred.job.tracker</name>
  <value>localhost:9999</value>
  <description>The host and port that the MapReduce job tracker runs
  at.  If "local", then jobs are run in-process as a single map
  and reduce task.
  </description>
</property>

--Start Hadoop
pengxuan.lipx@ALIBABA-25725 /cygdrive/d/hadoop/bin
$ cd d:

pengxuan.lipx@ALIBABA-25725 /cygdrive/d
$ cd hadoop/bin

--Format the namenode (the "D:\Program: command not found" warning in the output below stems from the space in JAVA_HOME; see the FAQ at the end)
pengxuan.lipx@ALIBABA-25725 /cygdrive/d/hadoop/conf
$ source hadoop-env.sh

pengxuan.lipx@ALIBABA-25725 /cygdrive/d/hadoop/conf
$ hadoop namenode -format
\hadoop\bin/hadoop: line 297: D:\Program: command not found
11/07/07 17:02:38 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = ALIBABA-25725/10.16.48.33
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 0.20.203.0
STARTUP_MSG:   build = http://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20-security-203 -r 1099333; compiled by 'oom' on Wed May  4 07:57:50 PDT 2011
************************************************************/
11/07/07 17:02:39 INFO util.GSet: VM type       = 32-bit
11/07/07 17:02:39 INFO util.GSet: 2% max memory = 19.84625 MB
11/07/07 17:02:39 INFO util.GSet: capacity      = 2^22 = 4194304 entries
11/07/07 17:02:39 INFO util.GSet: recommended=4194304, actual=4194304
11/07/07 17:02:39 INFO namenode.FSNamesystem: fsOwner=SYSTEM
11/07/07 17:02:39 INFO namenode.FSNamesystem: supergroup=supergroup
11/07/07 17:02:39 INFO namenode.FSNamesystem: isPermissionEnabled=true
11/07/07 17:02:39 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
11/07/07 17:02:39 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
11/07/07 17:02:39 INFO namenode.NameNode: Caching file names occuring more than 10 times
11/07/07 17:02:40 INFO common.Storage: Image file of size 112 saved in 0 seconds.
11/07/07 17:02:40 INFO common.Storage: Storage directory \tmp\hadoop-SYSTEM\dfs\name has been successfully formatted.
11/07/07 17:02:40 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at ALIBABA-25725/10.16.48.33
************************************************************/

pengxuan.lipx@ALIBABA-25725 /cygdrive/d/hadoop/bin
$ ./start-all.sh
starting namenode, logging to /cygdrive/d/hadoop/bin/../logs/hadoop-pengxuan.lipx-namenode-ALIBABA-25725.out
/cygdrive/d/hadoop/bin/../bin/hadoop: line 297: D:\Program: command not found
localhost: starting datanode, logging to /cygdrive/d/hadoop/bin/../logs/hadoop-pengxuan.lipx-datanode-ALIBABA-25725.out
localhost: /cygdrive/d/hadoop/bin/../bin/hadoop: line 297: D:\Program: command not found
localhost: starting secondarynamenode, logging to /cygdrive/d/hadoop/bin/../logs/hadoop-pengxuan.lipx-secondarynamenode-ALIBABA-25725.out
starting jobtracker, logging to /cygdrive/d/hadoop/bin/../logs/hadoop-pengxuan.lipx-jobtracker-ALIBABA-25725.out
/cygdrive/d/hadoop/bin/../bin/hadoop: line 297: D:\Program: command not found
localhost: starting tasktracker, logging to /cygdrive/d/hadoop/bin/../logs/hadoop-pengxuan.lipx-tasktracker-ALIBABA-25725.out

pengxuan.lipx@ALIBABA-25725 /cygdrive/d/hadoop/bin
$ jps
5028 Jps
3256 JobTracker
5276 NameNode
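
Only JobTracker and NameNode show up in jps here. If DataNode or TaskTracker are missing, the logs that start-all.sh printed above are the first place to look (the glob below is assumed to match the .log file written next to each .out):

$ tail -n 50 logs/hadoop-*-datanode-*.log

Also, the local scratch directory used below has to exist before the test files can be written into it:

$ mkdir -p mytest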

pengxuan.lipx@ALIBABA-25725 /cygdrive/d/hadoop
$ echo "hello hadoop world" > mytest/test_file1.txt

pengxuan.lipx@ALIBABA-25725 /cygdrive/d/hadoop
$ echo "hello hadoop world" > mytest/test_file2.txt

pengxuan.lipx@ALIBABA-25725 /cygdrive/d/hadoop
$ bin/hadoop dfs -mkdir test-in

pengxuan.lipx@ALIBABA-25725 /cygdrive/d/hadoop
$ bin/hadoop dfs -copyFromLocal mytest/test*.txt test-in

pengxuan.lipx@ALIBABA-25725 /cygdrive/d/hadoop
$ hadoop dfs -ls test-in
Found 2 items
-rw-r--r--   1 pengxuan.lipx supergroup         19 2011-07-08 19:05 /user/pengxuan.lipx/test-in/test_file1.txt
-rw-r--r--   1 pengxuan.lipx supergroup         19 2011-07-08 19:05 /user/pengxuan.lipx/test-in/test_file2.txt

pengxuan.lipx@ALIBABA-25725 /cygdrive/d/hadoop
$ bin/hadoop jar hadoop-0.20.2-examples.jar wordcount test-in test-out
bin/hadoop: line 258: D:\Program: command not found
11/07/08 19:06:44 INFO input.FileInputFormat: Total input paths to process : 2
11/07/08 19:06:44 INFO mapred.JobClient: Running job: job_201107081905_0001
11/07/08 19:06:45 INFO mapred.JobClient:  map 0% reduce 0%
11/07/08 19:06:53 INFO mapred.JobClient:  map 100% reduce 0%
11/07/08 19:07:05 INFO mapred.JobClient:  map 100% reduce 100%
11/07/08 19:07:07 INFO mapred.JobClient: Job complete: job_201107081905_0001
11/07/08 19:07:07 INFO mapred.JobClient: Counters: 17
11/07/08 19:07:07 INFO mapred.JobClient:   Job Counters
11/07/08 19:07:07 INFO mapred.JobClient:     Launched reduce tasks=1
11/07/08 19:07:07 INFO mapred.JobClient:     Launched map tasks=2
11/07/08 19:07:07 INFO mapred.JobClient:     Data-local map tasks=2
11/07/08 19:07:07 INFO mapred.JobClient:   FileSystemCounters
11/07/08 19:07:07 INFO mapred.JobClient:     FILE_BYTES_READ=161
11/07/08 19:07:07 INFO mapred.JobClient:     HDFS_BYTES_READ=38
11/07/08 19:07:07 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=311
11/07/08 19:07:07 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=25
11/07/08 19:07:07 INFO mapred.JobClient:   Map-Reduce Framework
11/07/08 19:07:07 INFO mapred.JobClient:     Reduce input groups=3
11/07/08 19:07:07 INFO mapred.JobClient:     Combine output records=6
11/07/08 19:07:07 INFO mapred.JobClient:     Map input records=2
11/07/08 19:07:07 INFO mapred.JobClient:     Reduce shuffle bytes=86
11/07/08 19:07:07 INFO mapred.JobClient:     Reduce output records=3
11/07/08 19:07:07 INFO mapred.JobClient:     Spilled Records=12
11/07/08 19:07:07 INFO mapred.JobClient:     Map output bytes=62
11/07/08 19:07:07 INFO mapred.JobClient:     Combine input records=6
11/07/08 19:07:07 INFO mapred.JobClient:     Map output records=6
11/07/08 19:07:07 INFO mapred.JobClient:     Reduce input records=6

pengxuan.lipx@ALIBABA-25725 /cygdrive/d/hadoop
$ bin/hadoop dfs -ls test-out
Found 2 items
drwxr-xr-x   - pengxuan.lipx supergroup          0 2011-07-08 19:06 /user/pengxuan.lipx/test-out/_logs
-rw-r--r--   1 pengxuan.lipx supergroup         25 2011-07-08 19:06 /user/pengxuan.lipx/test-out/part-r-00000

pengxuan.lipx@ALIBABA-25725 /cygdrive/d/hadoop
$ bin/hadoop dfs -cat /user/pengxuan.lipx/test-out/part-r-00000
hadoop  2
hello   2
world   2
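
To copy the result out of HDFS to the local filesystem (an optional extra step; the local filename here is arbitrary):

$ bin/hadoop dfs -get test-out/part-r-00000 ./wordcount-result.txt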

--FAQ:

$ hadoop namenode -format
\hadoop\bin/hadoop: line 258: D:\Program: command not found
java.lang.NoClassDefFoundError: org/apache/hadoop/hdfs/server/namenode/NameNode
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hdfs.server.namenode.NameNode
        at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:252)
        at java.lang.ClassLoader.loadClassInternal(ClassLoader.java:320)
Could not find the main class: org.apache.hadoop.hdfs.server.namenode.NameNode.  Program will exit.
Exception in thread "main"

---------------
Cause: the JAVA_HOME set in hadoop-env.sh points at a path containing a space ("D:\Program Files\Java\jdk1.6.0_10"), which the bin/hadoop shell script cannot handle.
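
A common fix (a sketch, assuming the JDK location shown earlier): point JAVA_HOME at a space-free path, for example the Windows 8.3 short name. PROGRA~1 is the usual short form of "Program Files", but verify yours with "dir /x" in cmd.exe. Alternatively, copy the JDK to a directory without spaces.

# in conf/hadoop-env.sh
export JAVA_HOME=/cygdrive/d/PROGRA~1/Java/jdk1.6.0_10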
