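I finally got Ganglia installed on CentOS 6.4. Installing applications on Linux is hard and feels like a matter of luck: with a good installation guide it goes quickly, and without one it is a long and painful experience. I see two reasons:
1. I am not familiar with Linux: I do not know what many commands mean, and I do not understand how package dependencies fit together.
2. Authors of online articles have different backgrounds and write for their own level, so their articles hide all kinds of "taken for granted" traps.
My principles for organizing learning material:
1. Bundle the related resources as completely as possible.
2. Assume that everyone who receives the document is a beginner.
3. Record every abnormal situation encountered.
Some steps in the original article did not work for me, so I solved those problems myself.
Below is my modified version of that article, following my own installation process: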
Version information
CentOS version: 6.4 Final;
Ganglia version: ganglia-3.4.0, ganglia-web-3.5.4;
Web server: httpd-2.2.24;
PHP: php-5.3.18.
I. Installing the server
1. Install the dependencies
yum -y install apr-devel apr-util check-devel cairo-devel pango-devel libxml2-devel rpmbuild glib2-devel dbus-devel freetype-devel fontconfig-devel gcc-c++ expat-devel python-devel libXrender-devel perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker make rsync wget
yum groupinstall chinese-support # install Chinese language support
If the machine has no Internet connection, point yum at the installation disc (see Section VI) and use the following commands instead:
yum --disablerepo=\* --enablerepo=c6-media -y install apr-devel apr-util check-devel cairo-devel pango-devel libxml2-devel rpmbuild glib2-devel dbus-devel freetype-devel fontconfig-devel gcc-c++ expat-devel python-devel libXrender-devel perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker make rsync wget
yum --disablerepo=\* --enablerepo=c6-media groupinstall chinese-support
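Before moving on, it can help to confirm which of the dependencies actually got installed, since a long yum command line may skip a package it cannot find. The following is a minimal sketch, assuming rpm reports absent packages with its usual "package NAME is not installed" message; the helper name missing_packages is my own and not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: print the names of packages that rpm reports as absent.
# LC_ALL=C keeps rpm's message in English so the pattern below matches.
missing_packages() {
    LC_ALL=C rpm -q "$@" 2>/dev/null | awk '/is not installed/ { print $2 }'
}

# Example: check a few of the build dependencies from the yum command.
# missing_packages apr-devel check-devel libxml2-devel expat-devel
```

An empty result means rpm knows every package that was passed in.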
2. Install the Apache components
Install apr-1.4.6
wget http://apache.etoak.com//apr/apr-1.4.6.tar.gz
tar -xf apr-1.4.6.tar.gz && cd apr-1.4.6
./configure --prefix=/usr/local/apr && make && make install && cd ..
Install apr-util-1.5.2
wget http://apache.etoak.com//apr/apr-util-1.5.2.tar.gz
tar -xf apr-util-1.5.2.tar.gz && cd apr-util-1.5.2
./configure --prefix=/usr/local/apr-util --with-apr=/usr/local/apr && make && make install && cd ..
Install httpd-2.2.24
wget http://mirror.esocc.com/apache//httpd/httpd-2.2.24.tar.gz
tar -xf httpd-2.2.24.tar.gz && cd httpd-2.2.24
./configure --prefix=/usr/local/apache2 --enable-so --enable-mods-shared=most --with-included-apr --with-apr=/usr/local/apr --with-apr-util=/usr/local/apr-util && make && make install && cd ..
# Add httpd to the boot script and start it (optional)
echo '/usr/local/apache2/bin/apachectl start' >> /etc/rc.d/rc.local && /usr/local/apache2/bin/apachectl start
3. Install and configure PHP
Install
wget http://www.php.net/get/php-5.3.18.tar.gz/from/cn2.php.net/mirror
tar -xf php-5.3.18.tar.gz && cd php-5.3.18
./configure --prefix=/usr/local/php --with-apxs2=/usr/local/apache2/bin/apxs && make && make install && cd ..
Configure PHP in httpd
Edit the configuration file /usr/local/apache2/conf/httpd.conf
Add the following:
<FilesMatch "\.php$">
SetHandler application/x-httpd-php
</FilesMatch>
<FilesMatch "\.ph(p[2-6]?|tml)$">
SetHandler application/x-httpd-php
</FilesMatch>
<FilesMatch "\.phps$">
SetHandler application/x-httpd-php-source
</FilesMatch>
Change the following:
Add index.php to the DirectoryIndex line inside <IfModule dir_module>
4. Install libconfuse
wget http://pkgs.repoforge.org/libconfuse/libconfuse-2.6-2.el5.rf.x86_64.rpm
wget http://pkgs.repoforge.org/libconfuse/libconfuse-devel-2.6-2.el5.rf.x86_64.rpm
rpm -ivh libconfuse-*
5. Install pcre
wget ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-8.31.tar.gz
tar -xf pcre-8.31.tar.gz && cd pcre-8.31
./configure && make && make install && cd ..
echo '/usr/local/lib' > /etc/ld.so.conf.d/libpcre.conf && ldconfig -v
6. Install rrdtool
wget http://oss.oetiker.ch/rrdtool/pub/rrdtool-1.4.7.tar.gz
tar -xf rrdtool-1.4.7.tar.gz && cd rrdtool-1.4.7
./configure --prefix=/usr/local && make && make install && cd ..
echo '/usr/local/lib' > /etc/ld.so.conf.d/librrd.conf && ldconfig -v
7. Install and set up the Ganglia daemons
Install
wget http://sourceforge.net/projects/ganglia/files/ganglia%20monitoring%20core/3.4.0/ganglia-3.4.0.tar.gz/download
tar -xf ganglia-3.4.0.tar.gz && cd ganglia-3.4.0
./configure --prefix=/usr/local/ganglia --with-gmetad --with-librrd=/usr/local/lib --sysconfdir=/etc/ganglia && make && make install && cd ..
Run gmond and gmetad as services and start them at boot (optional)
Copy the init scripts and enable them at boot
cp ganglia-3.4.0/gmond/gmond.init /etc/rc.d/init.d/gmond
cp ganglia-3.4.0/gmetad/gmetad.init /etc/rc.d/init.d/gmetad
chkconfig --add gmond && chkconfig gmond on
chkconfig --add gmetad && chkconfig gmetad on
Adjust the service scripts
In /etc/rc.d/init.d/gmetad, set the GMETAD variable to: GMETAD=/usr/local/ganglia/sbin/gmetad
In /etc/rc.d/init.d/gmond, set the GMOND variable to: GMOND=/usr/local/ganglia/sbin/gmond
Set the storage location for the rrd files
mkdir -p /var/lib/ganglia/rrds
chown nobody:nobody /var/lib/ganglia/rrds
Generate the gmond configuration file and adjust it (optional)
/usr/local/ganglia/sbin/gmond -t | tee /etc/ganglia/gmond.conf
Edit the cluster section, for example: name = "Cluster". The default can be left unchanged.
Change the following settings; the commented-out lines are the defaults:
udp_send_channel {
#bind_hostname = yes
# Highly recommended, soon to be default.
# This option tells gmond to use a source address
# that resolves to the machine's hostname. Without
# this, the metrics may appear to come from any
# interface and the DNS names associated with
# those IPs will be used to create the RRDs.
/*mcast_join = 239.2.11.71*/
mcast_join = 192.168.1.5 # server IP
port = 8649
ttl = 1
}
/* You can specify as many udp_recv_channels as you like as well. */
udp_recv_channel {
/*mcast_join = 239.2.11.71*/
port = 8649
/*bind = 239.2.11.71*/
retry_bind = true
}
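The listing above puts the server's unicast address in mcast_join. The gmond configuration format also has a host key in udp_send_channel for sending to a single collector, and that may be worth trying if no metrics arrive. The following is a minimal sketch of generating such a block, offered as an alternative rather than as the setup used in this article; the function name print_unicast_channel is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print a unicast udp_send_channel block for gmond.conf.
# $1 = IP or host name of the collecting gmond, $2 = UDP port (default 8649).
print_unicast_channel() {
    host=$1
    port=${2:-8649}
    cat <<EOF
udp_send_channel {
  host = $host
  port = $port
}
EOF
}

# Example (192.168.1.5 is the server IP used in the listing above):
# print_unicast_channel 192.168.1.5
```

The output is meant to be pasted into /etc/ganglia/gmond.conf by hand in place of the existing udp_send_channel section, followed by a restart of gmond.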
Adjust the gmetad configuration (optional)
Edit the configuration file /etc/ganglia/gmetad.conf
Change the name and server in the data_source entry to your own. The default can be kept.
Uncomment gridname and set it to your own name, for example "cluster". The default can be kept.
8. Install and set up the Ganglia web front end
Download and unpack
wget http://sourceforge.net/projects/ganglia/files/ganglia-web/3.5.4/ganglia-web-3.5.4.tar.gz/download
tar -xf ganglia-web-3.5.4.tar.gz
cp -r ganglia-web-3.5.4 /usr/local/apache2/htdocs/ganglia
Edit the Makefile and install
cd /usr/local/apache2/htdocs/ganglia
Set the GDESTDIR and APACHE_USER parameters in the Makefile as follows, then run make install
GDESTDIR=/usr/local/apache2/htdocs/ganglia
APACHE_USER=daemon
make install
Edit the PHP configuration of the web front end
cp conf_default.php conf.php
Edit conf.php
If the rrd storage location was set as described above, you can skip $conf['gmetad_root'] and $conf['rrds']; otherwise set them to the matching location.
$conf['rrdtool'] = "/usr/local/bin/rrdtool";
$conf['external_location'] = "http://localhost/ganglia";
$conf['case_sensitive_hostnames'] = false;
9. Start Ganglia
Start httpd (or restart it if it is already running)
/usr/local/apache2/bin/apachectl start
Start gmetad
service gmetad start
Start gmond
service gmond start
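Once the services are up, one quick check is to read the XML that gmond serves over TCP and see which hosts it knows about. This is a minimal sketch, assuming the generated gmond.conf still contains its default tcp_accept_channel on port 8649 and that nc is installed; the helper name list_ganglia_hosts is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read gmond's XML dump on stdin and print one host name per line.
list_ganglia_hosts() {
    grep -o '<HOST NAME="[^"]*"' | sed 's/^<HOST NAME="//; s/"$//'
}

# Example, run on the server itself:
# nc localhost 8649 | list_ganglia_hosts
```

If the list stays empty after a minute or so, the send and receive channel settings in gmond.conf are the first thing to recheck.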
II. Installing the client
1. Install the dependencies
yum -y install apr-devel apr-util check-devel cairo-devel pango-devel libxml2-devel rpmbuild glib2-devel dbus-devel freetype-devel fontconfig-devel gcc-c++ expat-devel python-devel libXrender-devel perl-ExtUtils-CBuilder perl-ExtUtils-MakeMaker make rsync wget
2. Install libconfuse
rpm -ivh libconfuse-*
3. Install pcre
tar -xf pcre-8.31.tar.gz && cd pcre-8.31
./configure && make && make install && cd ..
echo '/usr/local/lib' > /etc/ld.so.conf.d/libpcre.conf && ldconfig -v
4. Install and configure Ganglia
Install
tar -xf ganglia-3.4.0.tar.gz && cd ganglia-3.4.0
./configure --prefix=/usr/local/ganglia --sysconfdir=/etc/ganglia && make && make install && cd ..
Configure
Copy the configuration from the server
scp 192.168.1.130:/etc/rc.d/init.d/gmond /etc/rc.d/init.d/gmond
scp 192.168.1.130:/etc/ganglia/gmond.conf /etc/ganglia
Register gmond as a service and start it at boot
chkconfig --add gmond && chkconfig gmond on
5. Start the Ganglia client
service gmond start
or: /etc/init.d/gmond start
Check that the gmond process started successfully:
ps -ef | grep gmond
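III. Special requirements
If your server has two network cards, with eth0 on a public address and eth1 on a LAN address, and you want the monitoring server and the monitored servers to talk over the LAN to reduce load on the public interface, you can use the following command:
ip route add 239.2.11.71 dev eth1
239.2.11.71 is Ganglia's default multicast channel, so this adds a route that sends it through eth1, the internal interface. You can change the 239.2.11.71 address in /etc/ganglia/gmond.conf.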
IV. Error analysis
1. apr error
Checking for apr
checking for apr-1-config... no
configure: error: apr-1-config binary not found in path
Solution
yum -y install apr-devel apr-util expat-devel
2. confuse error
Checking for confuse
checking for cfg_parse in -lconfuse... no
Trying harder including gettext
checking for cfg_parse in -lconfuse... no
Trying harder including iconv
checking for cfg_parse in -lconfuse... no
libconfuse not found
Solution
rpm -ivh libconfuse-*
3. expat error
Checking for expat
checking for XML_ParserCreate in -lexpat... no
libexpat not found
Solution
yum install expat-devel
4. pcre error
Checking for pcre
checking pcre/pcre.h usability... no
checking pcre/pcre.h presence... no
checking for pcre/pcre.h... no
checking pcre.h usability... no
checking pcre.h presence... no
checking for pcre.h... no
checking for pcre_compile in -lpcre... no
libpcre not found, specify --with-libpcre=no to build without PCRE support
Solution
Install pcre
5. Garbled characters
Font files are missing, because a minimal CentOS installation lacks some fonts. Copy them from another machine (or see the attachment) into /usr/share/fonts/, or run yum groupinstall chinese-support, then restart httpd.
6. Error shown on the Ganglia web page
It is not safe to rely on the system's timezone settings. You are *required* to use the date.timezone setting or the date_default_timezone_set() function.
Solution
Edit php.ini and set: date.timezone = PRC
Edit /usr/local/apache2/htdocs/ganglia/header.php and add on the second line: date_default_timezone_set("PRC");
A follow-up problem
The php.ini file cannot be found
Solution
Copy php.ini-production from the PHP source directory to /etc/php.ini
Alternatively, recompile and reinstall PHP with the build option --with-config-file-path=/usr/local/php/etc to set the configuration path explicitly, then copy php.ini-production to /usr/local/php/etc/php.ini
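To see which php.ini this build actually loads, the php command-line binary can report its configuration paths with --ini. The following is a minimal sketch that pulls the loaded file out of that report, assuming the CLI binary was installed under /usr/local/php/bin by the build above; the helper name loaded_php_ini is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: extract the "Loaded Configuration File" value
# from the output of `php --ini` supplied on stdin.
loaded_php_ini() {
    sed -n 's/^Loaded Configuration File:[[:space:]]*//p'
}

# Example:
# /usr/local/php/bin/php --ini | loaded_php_ini
```

A result of (none) means no php.ini was picked up, so the date.timezone setting has not taken effect yet.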
7. httpd fails to start
Error 1
httpd: apr_sockaddr_info_get() failed for ganglia
httpd: Could not reliably determine the server's fully qualified domain name, using 127.0.0.1 for ServerName
Cause
ServerName is not set in Apache's conf/httpd.conf, so Apache falls back to the machine's host name and first looks for a definition of that host in /etc/hosts.
Solution
Set ServerName in httpd.conf: ServerName localhost:80
Add your own host name to /etc/hosts: 127.0.0.1 server1
Error 2
httpd: Syntax error on line 140 of /usr/local/apache2/conf/httpd.conf: Cannot load modules/mod_dir.so into server: /usr/local/apache2/modules/mod_dir.so: undefined symbol: apr_array_clear
Cause
Something was missing when Apache was compiled: the APR library that httpd loads does not provide the apr_array_clear symbol.
Solution
Recompile and specify the option --with-included-apr
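8. Ganglia fails to start
[function.mkdir]: Permission denied
The user name daemon set in the Makefile is probably wrong. It must match the User and Group in httpd.conf, otherwise the web front end has no permission to read and write its files. After changing the Makefile, run make install again, or the error persists.
9. yum reports "Another app is currently holding the yum lock; waiting for it to exit"
The system's automatic update may be running, which leaves yum locked.
You can clear the lock by deleting yum's PID file (this removes the lock file only; it does not stop a yum process that is still running):
rm -f /var/run/yum.pid
After that yum can be used again.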
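Deleting the PID file while another yum is still working can leave two package operations running at once. A more cautious variant is sketched below, assuming the lock file contains nothing but the PID of the process that holds it; the function name clear_stale_yum_lock is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: remove a yum lock file only when the PID stored in it
# no longer belongs to a running process.
# $1 = path to the lock file (default /var/run/yum.pid).
clear_stale_yum_lock() {
    pidfile=${1:-/var/run/yum.pid}
    [ -f "$pidfile" ] || { echo "no lock file"; return 0; }
    pid=$(cat "$pidfile")
    # kill -0 sends no signal; it only tests whether the process exists.
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "yum is still running as PID $pid"
        return 1
    fi
    rm -f "$pidfile"
    echo "removed stale lock"
}

# Example, run as root:
# clear_stale_yum_lock
```

When the function reports that yum is still running, waiting for that process to finish is usually the safer choice.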
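V. Monitoring Hadoop
1. Configure the hadoop-metrics file
The Hadoop version is cdh3u3. Edit hadoop-metrics.properties in the conf directory and uncomment the Ganglia-related entries, paying attention to the lines that depend on the Ganglia version. For example: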
# Configuration of the "dfs" context for null
dfs.class=org.apache.hadoop.metrics.spi.NullContext
# Configuration of the "dfs" context for file
#dfs.class=org.apache.hadoop.metrics.file.FileContext
#dfs.period=10
#dfs.fileName=/tmp/dfsmetrics.log
# Configuration of the "dfs" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext
dfs.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
dfs.period=10
dfs.servers=192.168.1.130:8649
# Configuration of the "mapred" context for null
mapred.class=org.apache.hadoop.metrics.spi.NullContext
# Configuration of the "mapred" context for file
#mapred.class=org.apache.hadoop.metrics.file.FileContext
#mapred.period=10
#mapred.fileName=/tmp/mrmetrics.log
# Configuration of the "mapred" context for ganglia
# Pick one: Ganglia 3.0 (former) or Ganglia 3.1 (latter)
# mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext
mapred.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
mapred.period=10
mapred.servers=192.168.1.130:8649
# Configuration of the "jvm" context for null
#jvm.class=org.apache.hadoop.metrics.spi.NullContext
# Configuration of the "jvm" context for file
#jvm.class=org.apache.hadoop.metrics.file.FileContext
#jvm.period=10
#jvm.fileName=/tmp/jvmmetrics.log
# Configuration of the "jvm" context for ganglia
jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
jvm.period=10
jvm.servers=192.168.1.130:8649
# Configuration of the "ugi" context for null
ugi.class=org.apache.hadoop.metrics.spi.NullContext
# Configuration of the "fairscheduler" context for null
#fairscheduler.class=org.apache.hadoop.metrics.spi.NullContext
# Configuration of the "fairscheduler" context for file
#fairscheduler.class=org.apache.hadoop.metrics.file.FileContext
#fairscheduler.period=10
#fairscheduler.fileName=/tmp/fairschedulermetrics.log
# Configuration of the "fairscheduler" context for ganglia
fairscheduler.class=org.apache.hadoop.metrics.ganglia.GangliaContext31
fairscheduler.period=10
fairscheduler.servers=192.168.1.130:8649
#
The uncommented Ganglia entries are the modified parts; 192.168.1.130 is the IP address of the Ganglia server.
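With so many commented and uncommented variants in one file, it is easy to leave a context pointing at the wrong class. The following is a minimal sketch that lists only the active Ganglia-related lines of a metrics file so they can be reviewed at a glance; the function name active_ganglia_settings is hypothetical, and the path in the example is a placeholder.

```shell
#!/bin/sh
# Hypothetical helper: print the active (uncommented) Ganglia-related lines
# of a hadoop-metrics.properties file.
# $1 = path to the properties file.
active_ganglia_settings() {
    grep -v '^[[:space:]]*#' "$1" | grep -E 'GangliaContext|\.servers=|\.period='
}

# Example (adjust the path to your Hadoop conf directory):
# active_ganglia_settings conf/hadoop-metrics.properties
```

Each context that should report to Ganglia is expected to show a class, a period and a servers line.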
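2. Restart the Hadoop cluster
VI. Configuring a local yum repository
First mount the CentOS installation disc on the file system. Here we mount it at /mnt/CentOS as an example: run "mkdir /mnt/CentOS" to create the mount point, then run "mount /dev/cdrom /mnt/CentOS" to mount the disc there.
Go to /etc/yum.repos.d and back up the three files in it by running "cp CentOS-Base.repo CentOS-Base.repo.bak", "cp CentOS-Media.repo CentOS-Media.repo.bak" and "cp CentOS-Debuginfo.repo CentOS-Debuginfo.repo.bak". Once the backups exist, delete CentOS-Base.repo and then edit CentOS-Media.repo.
The file before the change: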
[c6-media]
name=CentOS-$releasever - Media
baseurl=file:///media/CentOS/
file:///media/cdrom/
file:///media/cdrecorder/
gpgcheck=1
enabled=0
gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-6
The file after the change:
[c6-media]
name=CentOS-$releasever - Media
baseurl=file:///mnt/CentOS/
file:///media/cdrom/
file:///media/cdrecorder/
…..
Note: to use the local repository in a yum command, add the following options to the command:
--disablerepo=\* --enablerepo=c6-media
For example:
yum --disablerepo=\* --enablerepo=c6-media [command]
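Typing the two options every time gets tedious on an offline machine. One way to shorten it is a small shell function, sketched here under the assumption that the repository id is c6-media as in the file above; the wrapper name yum_media is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: run any yum subcommand against the c6-media repository only.
yum_media() {
    yum --disablerepo='*' --enablerepo=c6-media "$@"
}

# Example:
# yum_media -y install wget
```

Quoting the asterisk has the same effect as the backslash in the commands above: it stops the shell from expanding it against file names in the current directory.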