Create a directory to hold the software packages.
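A minimal sketch (the /home/hadoop/app path is the one the later extraction steps use):
[hadoop@weekend110 ~]$ mkdir -p /home/hadoop/app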
1. Install the JDK
(1) Download the JDK archive
Password: kbem
(2) Upload the JDK to the Linux server over SFTP (press Alt+P to open an SFTP session).
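For example, in the SFTP session (a sketch; the archive name is taken from the directory listings later in these notes):
sftp> put jdk-7u65-linux-i586.tar.gz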
(3) Check that the JDK archive is now on the server.
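For example, assuming the upload landed in the hadoop user's home directory:
[hadoop@weekend110 ~]$ ls -l ~/jdk-7u65-linux-i586.tar.gz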
(4) Extract the JDK into the app directory.
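A typical command (target directory as created above):
[hadoop@weekend110 ~]$ tar -zxvf jdk-7u65-linux-i586.tar.gz -C /home/hadoop/app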
(5) Verify that the JDK can run.
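For example, invoke the bundled binary by its full path (the jdk1.7.0_65 directory name comes from the JAVA_HOME set below):
[hadoop@weekend110 ~]$ /home/hadoop/app/jdk1.7.0_65/bin/java -version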
(6) Configure global environment variables
Edit the configuration file and add the JDK entries; a sketch follows.
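A minimal sketch, assuming the same /etc/profile file and JAVA_HOME that step (11) of the Hadoop installation sets below:
[hadoop@weekend110 ~]$ sudo vi /etc/profile
export JAVA_HOME=/home/hadoop/app/jdk1.7.0_65
export PATH=$PATH:$JAVA_HOME/bin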
(7) Apply the configuration.
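As in step (12) of the Hadoop installation below:
[hadoop@weekend110 ~]$ source /etc/profile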
2. Install Hadoop
(1) Download the Hadoop archive
Password: ujvh
(2) Upload Hadoop to the Linux server over SFTP (Alt+P opens the SFTP session), as with the JDK.
(3) Check that the Hadoop archive is on the server.
(4) Extract Hadoop into the app directory.
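A typical command (the archive name appears in the directory listing in section 3 below):
[hadoop@weekend110 ~]$ tar -zxvf hadoop-2.4.1.tar.gz -C /home/hadoop/app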
(5) Edit the configuration file etc/hadoop/hadoop-env.sh
[hadoop@weekend110 hadoop]$ vi hadoop-env.sh
export JAVA_HOME=/home/hadoop/app/jdk1.7.0_65
(6) Edit the configuration file etc/hadoop/core-site.xml
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://weekend110:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/hadoop/app/hadoop-2.4.1/data/</value>
  </property>
</configuration>
(7) Edit the configuration file etc/hadoop/hdfs-site.xml (the replication factor is usually 3; this single-node setup uses 1)
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>
(8) Edit the configuration file etc/hadoop/mapred-site.xml.template
Rename the file first, or it will not be loaded:
mv mapred-site.xml.template mapred-site.xml
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
(9) Edit the configuration file etc/hadoop/yarn-site.xml
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>weekend110</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>
(10) Turn off the firewall
Check the firewall status:
[hadoop@weekend110 ~]$ sudo service iptables status
[sudo] password for hadoop:
Table: filter
Chain INPUT (policy ACCEPT)
num  target     prot opt source               destination
1    ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0           state RELATED,ESTABLISHED
2    ACCEPT     icmp --  0.0.0.0/0            0.0.0.0/0
3    ACCEPT     all  --  0.0.0.0/0            0.0.0.0/0
4    ACCEPT     tcp  --  0.0.0.0/0            0.0.0.0/0           state NEW tcp dpt:22
5    REJECT     all  --  0.0.0.0/0            0.0.0.0/0           reject-with icmp-host-prohibited

Chain FORWARD (policy ACCEPT)
num  target     prot opt source               destination
1    REJECT     all  --  0.0.0.0/0            0.0.0.0/0           reject-with icmp-host-prohibited

Chain OUTPUT (policy ACCEPT)
num  target     prot opt source               destination
Stop the firewall:
[hadoop@weekend110 ~]$ sudo service iptables stop
iptables: Flushing firewall rules:                 [  OK  ]
iptables: Setting chains to policy ACCEPT: filter  [  OK  ]
iptables: Unloading modules:                       [  OK  ]
List the runlevels at which iptables auto-starts (chkconfig has no "status" subcommand; use --list):
[hadoop@weekend110 ~]$ sudo chkconfig iptables --list
iptables   0:off  1:off  2:on   3:on   4:on   5:on   6:off
Disable iptables at boot:
[hadoop@weekend110 ~]$ sudo chkconfig iptables off
[sudo] password for hadoop:
[hadoop@weekend110 ~]$ sudo chkconfig iptables --list
iptables   0:off  1:off  2:off  3:off  4:off  5:off  6:off
(11) Configure environment variables
[hadoop@weekend110 /]$ sudo vi /etc/profile
export JAVA_HOME=/home/hadoop/app/jdk1.7.0_65
export HADOOP_HOME=/home/hadoop/app/hadoop-2.4.1
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
(12) Apply the configuration
[hadoop@weekend110 /]$ source /etc/profile
(13) Format HDFS (the NameNode)
[hadoop@weekend110 hadoop]$ hadoop namenode -format
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
16/07/13 21:42:36 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = weekend110/192.168.2.100
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.4.1
........................................................
16/07/13 21:44:13 INFO namenode.FSImage: Allocated new BlockPoolId: BP-1726093789-192.168.2.100-1468471453124
16/07/13 21:44:13 INFO common.Storage: Storage directory /home/hadoop/app/hadoop-2.4.1/data/dfs/name has been successfully formatted.
16/07/13 21:44:13 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
16/07/13 21:44:13 INFO util.ExitUtil: Exiting with status 0
16/07/13 21:44:13 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at weekend110/192.168.2.100
************************************************************/
(14) After formatting, directories such as the following are created under the Hadoop directory:
/home/hadoop/app/hadoop-2.4.1/data/dfs/name/current
3. Start Hadoop
(1) Start HDFS
[hadoop@weekend110 hadoop-2.4.1]$ start-dfs.sh
Starting namenodes on [weekend110]
hadoop@weekend110's password:
weekend110: starting namenode, logging to /home/hadoop/app/hadoop-2.4.1/logs/hadoop-hadoop-namenode-weekend110.out
hadoop@localhost's password:
localhost: starting datanode, logging to /home/hadoop/app/hadoop-2.4.1/logs/hadoop-hadoop-datanode-weekend110.out
Starting secondary namenodes [0.0.0.0]
hadoop@0.0.0.0's password:
0.0.0.0: starting secondarynamenode, logging to /home/hadoop/app/hadoop-2.4.1/logs/hadoop-hadoop-secondarynamenode-weekend110.out
Check the HDFS daemon processes with jps:
[hadoop@weekend110 hadoop-2.4.1]$ jps
28770 SecondaryNameNode
28864 Jps
28622 DataNode
28502 NameNode
Edit the configuration file that lists the hosts on which to start daemons:
[hadoop@weekend110 hadoop]$ vi slaves
Hostname(s) to start:
weekend110
(2) Start YARN
[hadoop@weekend110 hadoop]$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /home/hadoop/app/hadoop-2.4.1/logs/yarn-hadoop-resourcemanager-weekend110.out
hadoop@weekend110's password:
weekend110: starting nodemanager, logging to /home/hadoop/app/hadoop-2.4.1/logs/yarn-hadoop-nodemanager-weekend110.out
Check the YARN daemon processes with jps:
[hadoop@weekend110 hadoop]$ jps
28770 SecondaryNameNode
28960 ResourceManager
29344 Jps
29242 NodeManager
28622 DataNode
28502 NameNode
(3) Verify that HDFS is usable
http://192.168.2.100:50070/dfshealth.html#tab-overview
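If the web UI is unreachable, a command-line check of the same information (a standard HDFS command, not part of the original notes):
[hadoop@weekend110 ~]$ hdfs dfsadmin -report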
View the files in the HDFS root directory.
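For example (the bare path / resolves against the fs.defaultFS configured earlier):
[hadoop@weekend110 ~]$ hadoop fs -ls /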
Upload a file to HDFS:
[hadoop@weekend110 ~]$ hadoop fs -put jdk-7u65-linux-i586.tar.gz hdfs://weekend110:9000/
Fetch a file from HDFS back to the local host:
[hadoop@weekend110 ~]$ hadoop fs -get hdfs://weekend110:9000/jdk-7u65-linux-i586.tar.gz
[hadoop@weekend110 ~]$ ll
total 275636
drwxrwxr-x. 4 hadoop hadoop      4096 Jul 13 19:46 app
-rw-rw-r--. 1 hadoop hadoop 138656756 Jan 20 20:33 hadoop-2.4.1.tar.gz
-rw-r--r--. 1 hadoop hadoop 143588167 Jul 13 22:33 jdk-7u65-linux-i586.tar.gz
(4) Test MapReduce jobs
Use the bundled example job to estimate pi:
[hadoop@weekend110 mapreduce]$ hadoop jar hadoop-mapreduce-examples-2.4.1.jar pi 5 5
Use the bundled example job to count the occurrences of each word.
Create a text file:
[hadoop@weekend110 mapreduce]$ vi test.txt
heloo aa
helllo aa
heeej sdkf
adkfdj ke
sdfsdkf sdfsd
hello sd
Create directories in HDFS to hold the file:
[hadoop@weekend110 mapreduce]$ hadoop fs -mkdir /workcount
[hadoop@weekend110 mapreduce]$ hadoop fs -mkdir /workcount/input
Upload the file to HDFS:
[hadoop@weekend110 mapreduce]$ hadoop fs -put test.txt /workcount/input
Run the test (note: the example program is wordcount; the HDFS directories here just happen to be named /workcount):
[hadoop@weekend110 mapreduce]$ hadoop jar hadoop-mapreduce-examples-2.4.1.jar wordcount /workcount/input /workcount/output
List the files in the output directory:
[hadoop@weekend110 mapreduce]$ hadoop fs -ls /workcount/output
Found 2 items
-rw-r--r--   1 hadoop supergroup          0 2016-07-13 22:54 /workcount/output/_SUCCESS
-rw-r--r--   1 hadoop supergroup         82 2016-07-13 22:54 /workcount/output/part-r-00000
View the contents of the result file:
[hadoop@weekend110 mapreduce]$ hadoop fs -cat /workcount/output/part-r-00000
aa      2
adkfdj  1
heeej   1
helllo  1
hello   1
heloo   1
ke      1
sd      1
sdfsd   1
sdfsdkf 1
sdkf    1