gmetad configure

简介:
gmetad的用途是从gmond接收metric统计信息, 并将统计信息存入rrdtools数据文件.
或者将接收到的metric forward给外部系统, 如Graphite.
同时gmetad还支持在gmetad之间传输XML dump数据, 组成一个大的监控系统.
gmetad还提供interactivei请求, 例如gweb使用interactive向gmetad请求不能再rrd文件中描述的信息.
基本的架构图如下, gmetad从gmond获取metric信息, 一个cluster只需要从一个gmond主机获取即可(例如mute节点).
注意这里使用的是TCP连接, 而不是UDP, 
TCP端口对应gmond.conf中的tcp_accept_channel.port.
gmetad configure - 德哥@Digoal - PostgreSQL research

HA架构, gmetad支持在一个data_source中配置一个集群中的多个gmond节点, 顺序请求, 第一个配置的主机返回成功即可.
gmetad configure - 德哥@Digoal - PostgreSQL research

异构架构, 当一个gmetad不能收集所有cluster的信息时, 可以使用另外的gmetad来汇聚多个gmetad的信息.
即gmetad和gmetad之间互联.
需要配置的有gridname, 表示gmetad属于哪个grid, 多个gmetad组成了一个grid. 注意gridname和cluster name要区分开来,
结构是这样的 : 
grid下面是cluster(s), 
cluster下面是host(s), 
gmetad configure - 德哥@Digoal - PostgreSQL research

最后这张图表示使用rrdcached, 合并IO, 减少rrd文件的频繁IO请求.
gmetad configure - 德哥@Digoal - PostgreSQL research

默认的配置文件如下 : 
[root@db-172-16-3-221 etc]# cat gmetad.conf 
# This is an example of a Ganglia Meta Daemon configuration file
#                http://ganglia.sourceforge.net/
#
#
#-------------------------------------------------------------------------------
# Setting the debug_level to 1 will keep daemon in the forground and
# show only error messages. Setting this value higher than 1 will make 
# gmetad output debugging information and stay in the foreground.
# default: 0
# debug_level 10
#
#-------------------------------------------------------------------------------
# What to monitor. The most important section of this file. 
#
# The data_source tag specifies either a cluster or a grid to
# monitor. If we detect the source is a cluster, we will maintain a complete
# set of RRD databases for it, which can be used to create historical 
# graphs of the metrics. If the source is a grid (it comes from another gmetad),
# we will only maintain summary RRDs for it.
#
# Format: 
# data_source "my cluster" [polling interval] address1:port addreses2:port ...
# 
# The keyword 'data_source' must immediately be followed by a unique
# string which identifies the source, then an optional polling interval in 
# seconds. The source will be polled at this interval on average. 
# If the polling interval is omitted, 15sec is asssumed. 
#
# If you choose to set the polling interval to something other than the default,
# note that the web frontend determines a host as down if its TN value is less
# than 4 * TMAX (20sec by default).  Therefore, if you set the polling interval
# to something around or greater than 80sec, this will cause the frontend to
# incorrectly display hosts as down even though they are not.
#
# A list of machines which service the data source follows, in the 
# format ip:port, or name:port. If a port is not specified then 8649
# (the default gmond port) is assumed.
# default: There is no default value
#
# data_source "my cluster" 10 localhost  my.machine.edu:8649  1.2.3.5:8655
# data_source "my grid" 50 1.3.4.7:8655 grid.org:8651 grid-backup.org:8651
# data_source "another source" 1.3.4.7:8655  1.3.4.8

data_source "my cluster" localhost

rrd归档配置 , 含义参考man rrdcreate
#
# Round-Robin Archives
# You can specify custom Round-Robin archives here (defaults are listed below)
#
# Old Default RRA: Keep 1 hour of metrics at 15 second resolution. 1 day at 6 minute
# RRAs "RRA:AVERAGE:0.5:1:244" "RRA:AVERAGE:0.5:24:244" "RRA:AVERAGE:0.5:168:244" "RRA:AVERAGE:0.5:672:244" \
#      "RRA:AVERAGE:0.5:5760:374"
# New Default RRA
# Keep 5856 data points at 15 second resolution assuming 15 second (default) polling. That's 1 day
# Two weeks of data points at 1 minute resolution (average)
#RRAs "RRA:AVERAGE:0.5:1:5856" "RRA:AVERAGE:0.5:4:20160" "RRA:AVERAGE:0.5:40:52704"

当前节点的grid相关配置, 
如果scalable配置为on , 则表示从下游的gmetad收集过来的信息会进行汇聚.
如果scalable配置为off , 则表示从下游的gmetad收集过来的信息不进行汇聚(2.5.0的风格).
#
#-------------------------------------------------------------------------------
# Scalability mode. If on, we summarize over downstream grids, and respect
# authority tags. If off, we take on 2.5.0-era behavior: we do not wrap our output
# in <GRID></GRID> tags, we ignore all <GRID> tags we see, and always assume
# we are the "authority" on data source feeds. This approach does not scale to
# large groups of clusters, but is provided for backwards compatibility.
# default: on
# scalable off
# gridname , 必须和cluster name区分开来.
#-------------------------------------------------------------------------------
# The name of this Grid. All the data sources above will be wrapped in a GRID
# tag with this name.
# default: unspecified
# gridname "MyGrid"
#
#-------------------------------------------------------------------------------
# The authority URL for this grid. Used by other gmetads to locate graphs
# for our data sources. Generally points to a ganglia/
# website on this machine.
# default: "http://hostname/ganglia/",
#   where hostname is the name of this machine, as defined by gethostname().
# authority "http://mycluster.org/newprefix/"
# 配置受信任的上游主机, 不在这个列表的上游gmetad, 不能从这个gmetad获取信息.
#-------------------------------------------------------------------------------
# List of machines this gmetad will share XML with. Localhost
# is always trusted. 
# default: There is no default value
# trusted_hosts 127.0.0.1 169.229.50.165 my.gmetad.org
# 
#-------------------------------------------------------------------------------
# If you want any host which connects to the gmetad XML to receive
# data, then set this value to "on"
# default: off
# all_trusted on
#
#-------------------------------------------------------------------------------
# If you don't want gmetad to setuid then set this to off
# default: on
# setuid off
#
#-------------------------------------------------------------------------------
# User gmetad will setuid to (defaults to "nobody")
# default: "nobody"
# setuid_username "nobody"
#
#-------------------------------------------------------------------------------
# Umask to apply to created rrd files and grid directory structure
# default: 0 (files are public)
# umask 022

#  上游节点的gmetad, 通过这个端口访问下游节点的gmetad, 获取xml dump
#-------------------------------------------------------------------------------
# The port gmetad will answer requests for XML
# default: 8651
# xml_port 8651

# 交互式访问的端口, 例如可以通过telnet获取xml信息.
#-------------------------------------------------------------------------------
# The port gmetad will answer queries for XML. This facility allows
# simple subtree and summation views of the XML tree.
# default: 8652
# interactive_port 8652
#
#-------------------------------------------------------------------------------
# The number of threads answering XML requests
# default: 4
# server_threads 10
#
rrd数据文件目录.
#-------------------------------------------------------------------------------
# Where gmetad stores its round-robin databases
# default: "/var/lib/ganglia/rrds"
# rrd_rootdir "/some/other/place"
#
不汇聚数据的metric列表.
#-------------------------------------------------------------------------------
# List of metric prefixes this gmetad will not summarize at cluster or grid level.
# default: There is no default value
# unsummarized_metrics diskstat CPU
#
#-------------------------------------------------------------------------------
# In earlier versions of gmetad, hostnames were handled in a case
# sensitive manner
# If your hostname directories have been renamed to lower case,
# set this option to 0 to disable backward compatibility.
# From version 3.2, backwards compatibility will be disabled by default.
# default: 1   (for gmetad < 3.2)
# default: 0   (for gmetad >= 3.2)
case_sensitive_hostnames 0

/* gmetad收集到的信息, forward给graphite.
那么需要如下配置 */
#-------------------------------------------------------------------------------
# It is now possible to export all the metrics collected by gmetad directly to
# graphite by setting the following attributes. 
#
# The hostname or IP address of the Graphite server
# default: unspecified
# carbon_server "my.graphite.box"
#
# The port and protocol on which Graphite is listening
# default: 2003
# carbon_port 2003
#
# default: tcp
# carbon_protocol udp
#
# **Deprecated in favor of graphite_path** A prefix to prepend to the 
# metric names exported by gmetad. Graphite uses dot-
# separated paths to organize and refer to metrics. 
# default: unspecified
# graphite_prefix "datacenter1.gmetad"
#
# A user-definable graphite path. Graphite uses dot-
# separated paths to organize and refer to metrics. 
# For reverse compatibility graphite_prefix will be prepended to this
# path, but this behavior should be considered deprecated.
# This path may include 3 variables that will be replaced accordingly:
# %s -> source (cluster name)
# %h -> host (host name)
# %m -> metric (metric name)
# default: graphite_prefix.%s.%h.%m
# graphite_path "datacenter1.gmetad.%s.%h.%m

# Number of milliseconds gmetad will wait for a response from the graphite server 
# default: 500
# carbon_timeout 500

memcached相关配置.
#-------------------------------------------------------------------------------
# Memcached configuration (if it has been compiled in)
# Format documentation at http://docs.libmemcached.org/libmemcached_configuration.html
# default: ""
# memcached_parameters "--SERVER=127.0.0.1"
#


简单的配置如下 : 
创建rrd数据文件目录, 修改权限 : 
# mkdir -p /data01/rrd
# chown nobody /data01/rrd

配置gmetad.conf 
# vi /opt/ganglia-core-3.6.0/etc/gmetad.conf 
data_source "my cluster" 10 172.16.3.221 
RRAs "RRA:AVERAGE:0.5:1:5856" "RRA:AVERAGE:0.5:4:20160" "RRA:AVERAGE:0.5:40:52704" 
setuid on
setuid_username "nobody"
umask 022
xml_port 8651
interactive_port 8652
server_threads 10
rrd_rootdir "/data01/rrd"
case_sensitive_hostnames 0


启动gmetad
[root@db-172-16-3-221 etc]# gmetad -c /opt/ganglia-core-3.6.0/etc/gmetad.conf 
[root@db-172-16-3-221 etc]# netstat -anp|grep gmetad
tcp        0      0 0.0.0.0:8651                0.0.0.0:*                   LISTEN      4982/gmetad         
tcp        0      0 0.0.0.0:8652                0.0.0.0:*                   LISTEN      4982/gmetad  


产生的rrd文件如下 : 
注意产生的文件和XML里面解析出来的主机名有关 : 

[root@db-172-16-3-221 rrd]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
#::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
172.16.3.221 db-172-16-3-221.localdomain db-172-16-3-221
172.16.3.150 150.sky-mobi.com


我们查看一台gmond的XML信息如下 : 
[root@db-172-16-3-221 rrd]# telnet 172.16.3.221 8649
<GANGLIA_XML VERSION="3.6.0" SOURCE="gmond">
<CLUSTER NAME="test" LOCALTIME="1410403311" OWNER="digoal" LATLONG="111 122" URL="http://dba.sky-mobi.com">
<HOST NAME="db-172-16-3-221.localdomain" IP="172.16.3.221" TAGS="" REPORTED="1410403301" TN="9" TMAX="20" DMAX="86400" LOCATION="1,2,3" GMOND_STARTED="1410332246">


查看gmetad收集到的这台gmond的rrd信息 : 
目录结构包括summary , 集群名, 集群下面的主机名, summary : 

# cd /data01/rrd/
[root@db-172-16-3-221 rrd]# ll -R
.:
total 4
drwxr-xr-x 2 nobody nobody 4096 Sep 10 17:38 __SummaryInfo__
drwxr-xr-x 4 nobody nobody   62 Sep 10 17:37 test

./__SummaryInfo__:
total 35728
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 boottime.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 bytes_in.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 bytes_out.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 cpu_aidle.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 cpu_idle.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 cpu_nice.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 cpu_num.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 cpu_speed.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 cpu_steal.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 cpu_system.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 cpu_user.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 cpu_wio.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 disk_free.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 disk_total.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 load_fifteen.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 load_five.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 load_one.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 mem_buffers.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 mem_cached.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 mem_free.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 mem_shared.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 mem_total.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 part_max_used.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 pkts_in.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 pkts_out.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 proc_run.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 proc_total.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 swap_free.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 swap_total.rrd

./test:
total 8
drwxr-xr-x 2 nobody nobody 4096 Sep 10 17:37 db-172-16-3-221.localdomain
drwxr-xr-x 2 nobody nobody 4096 Sep 10 17:37 __SummaryInfo__

./test/db-172-16-3-221.localdomain:
total 17864
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 boottime.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 bytes_in.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 bytes_out.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 cpu_aidle.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 cpu_idle.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 cpu_nice.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 cpu_num.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 cpu_speed.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 cpu_steal.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 cpu_system.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 cpu_user.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 cpu_wio.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 disk_free.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 disk_total.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 load_fifteen.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 load_five.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 load_one.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 mem_buffers.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 mem_cached.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 mem_free.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 mem_shared.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 mem_total.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 part_max_used.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 pkts_in.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 pkts_out.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 proc_run.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 proc_total.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 swap_free.rrd
-rw-r--r-- 1 nobody nobody 630760 Sep 10 17:38 swap_total.rrd

./test/__SummaryInfo__:
total 35728
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 boottime.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 bytes_in.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 bytes_out.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 cpu_aidle.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 cpu_idle.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 cpu_nice.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 cpu_num.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 cpu_speed.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 cpu_steal.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 cpu_system.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 cpu_user.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 cpu_wio.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 disk_free.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 disk_total.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 load_fifteen.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 load_five.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 load_one.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 mem_buffers.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 mem_cached.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 mem_free.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 mem_shared.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 mem_total.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 part_max_used.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 pkts_in.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 pkts_out.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 proc_run.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 proc_total.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 swap_free.rrd
-rw-r--r-- 1 nobody nobody 1260992 Sep 10 17:38 swap_total.rrd


与gmetad交互, 同样可以使用telnet : 
8651端口获取这台gmetad的XML元数据.

[root@db-172-16-3-221 db-172-16-3-221.localdomain]# netstat -anp|grep gmeta
tcp        0      0 0.0.0.0:8651                0.0.0.0:*                   LISTEN      4982/gmetad         
tcp        0      0 0.0.0.0:8652                0.0.0.0:*                   LISTEN      4982/gmetad         
unix  2      [ ]         DGRAM                    39681  4982/gmetad         
[root@db-172-16-3-221 db-172-16-3-221.localdomain]# telnet 127.0.0.1 8651
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
<!DOCTYPE GANGLIA_XML [
   <!ELEMENT GANGLIA_XML (GRID|CLUSTER|HOST)*>
      <!ATTLIST GANGLIA_XML VERSION CDATA #REQUIRED>
      <!ATTLIST GANGLIA_XML SOURCE CDATA #REQUIRED>
   <!ELEMENT GRID (CLUSTER | GRID | HOSTS | METRICS)*>
      <!ATTLIST GRID NAME CDATA #REQUIRED>
      <!ATTLIST GRID AUTHORITY CDATA #REQUIRED>
      <!ATTLIST GRID LOCALTIME CDATA #IMPLIED>
   <!ELEMENT CLUSTER (HOST | HOSTS | METRICS)*>
      <!ATTLIST CLUSTER NAME CDATA #REQUIRED>
      <!ATTLIST CLUSTER OWNER CDATA #IMPLIED>
      <!ATTLIST CLUSTER LATLONG CDATA #IMPLIED>
      <!ATTLIST CLUSTER URL CDATA #IMPLIED>
      <!ATTLIST CLUSTER LOCALTIME CDATA #REQUIRED>
   <!ELEMENT HOST (METRIC)*>
      <!ATTLIST HOST NAME CDATA #REQUIRED>
      <!ATTLIST HOST IP CDATA #REQUIRED>
      <!ATTLIST HOST LOCATION CDATA #IMPLIED>
      <!ATTLIST HOST TAGS CDATA #IMPLIED>
      <!ATTLIST HOST REPORTED CDATA #REQUIRED>
      <!ATTLIST HOST TN CDATA #IMPLIED>
      <!ATTLIST HOST TMAX CDATA #IMPLIED>
      <!ATTLIST HOST DMAX CDATA #IMPLIED>
      <!ATTLIST HOST GMOND_STARTED CDATA #IMPLIED>
   <!ELEMENT METRIC (EXTRA_DATA*)>
      <!ATTLIST METRIC NAME CDATA #REQUIRED>
      <!ATTLIST METRIC VAL CDATA #REQUIRED>
      <!ATTLIST METRIC TYPE (string | int8 | uint8 | int16 | uint16 | int32 | uint32 | float | double | timestamp) #REQUIRED>
      <!ATTLIST METRIC UNITS CDATA #IMPLIED>
      <!ATTLIST METRIC TN CDATA #IMPLIED>
      <!ATTLIST METRIC TMAX CDATA #IMPLIED>
      <!ATTLIST METRIC DMAX CDATA #IMPLIED>
      <!ATTLIST METRIC SLOPE (zero | positive | negative | both | unspecified) #IMPLIED>
      <!ATTLIST METRIC SOURCE (gmond) 'gmond'>
   <!ELEMENT EXTRA_DATA (EXTRA_ELEMENT*)>
   <!ELEMENT EXTRA_ELEMENT EMPTY>
      <!ATTLIST EXTRA_ELEMENT NAME CDATA #REQUIRED>
      <!ATTLIST EXTRA_ELEMENT VAL CDATA #REQUIRED>
   <!ELEMENT HOSTS EMPTY>
      <!ATTLIST HOSTS UP CDATA #REQUIRED>
      <!ATTLIST HOSTS DOWN CDATA #REQUIRED>
      <!ATTLIST HOSTS SOURCE (gmond | gmetad) #REQUIRED>
   <!ELEMENT METRICS (EXTRA_DATA*)>
      <!ATTLIST METRICS NAME CDATA #REQUIRED>
      <!ATTLIST METRICS SUM CDATA #REQUIRED>
      <!ATTLIST METRICS NUM CDATA #REQUIRED>
      <!ATTLIST METRICS TYPE (string | int8 | uint8 | int16 | uint16 | int32 | uint32 | float | double | timestamp) #REQUIRED>
      <!ATTLIST METRICS UNITS CDATA #IMPLIED>
      <!ATTLIST METRICS SLOPE (zero | positive | negative | both | unspecified) #IMPLIED>
      <!ATTLIST METRICS SOURCE (gmond) 'gmond'>
]>
<GANGLIA_XML VERSION="3.6.0" SOURCE="gmetad">
<GRID NAME="unspecified" AUTHORITY="http://db-172-16-3-221/ganglia/" LOCALTIME="1410403531">
<CLUSTER NAME="test" LOCALTIME="1410403521" OWNER="digoal" LATLONG="111 122" URL="http://dba.sky-mobi.com">
<HOST NAME="db-172-16-3-221.localdomain" IP="172.16.3.221" REPORTED="1410403521" TN="10" TMAX="20" DMAX="86400" LOCATION="1,2,3" GMOND_STARTED="1410332246" TAGS="">
<METRIC NAME="disk_free" VAL="3209.276" TYPE="double" UNITS="GB" TN="181" TMAX="180" DMAX="0" SLOPE="both" SOURCE="gmond">
<EXTRA_DATA>
<EXTRA_ELEMENT NAME="GROUP" VAL="disk"/>
<EXTRA_ELEMENT NAME="DESC" VAL="Total free disk space"/>
<EXTRA_ELEMENT NAME="TITLE" VAL="Disk Space Available"/>
</EXTRA_DATA>
...
略


8652端口的查询语法, 交互式查询方法 : 
例如本例的集群名为test, 其中一台主机名为 db-172-16-3-221.localdomain
用法如下 : 

As mentioned previously, gmetad listens on TCP port
8652 (by default) for interactive queries. The interactive query functionality enables
client programs to get XML dumps of the state of only the portion of the Grid in which
they’re interested.
Interactive queries are performed via a text protocol (similar to SMTP or HTTP). Queries are hierarchal, and begin with a forward slash (/). For example, the following query
returns an XML dump of the entire grid state:
/
To narrow the query result, specify the name of a cluster:
/cluster1
To narrow the query result further, specify the name of a host in the cluster:
/cluster1/host1
Queries may be suffixed with a filter to modify the type of metric information returned
by the query (as of this writing, summaryis the only filter available). For example, you
can request only the summary metric data from cluster1like so:
/cluster1?filter=summary


例如导出一台主机的METADATA
[root@db-172-16-3-221 db-172-16-3-221.localdomain]# telnet 127.0.0.1 8652
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
/test/db-172-16-3-221.localdomain 
......
<GANGLIA_XML VERSION="3.6.0" SOURCE="gmetad">
<GRID NAME="unspecified" AUTHORITY="http://db-172-16-3-221/ganglia/" LOCALTIME="1410403721">
<CLUSTER NAME="test" LOCALTIME="1410403711" OWNER="digoal" LATLONG="111 122" URL="http://dba.sky-mobi.com">
<HOST NAME="db-172-16-3-221.localdomain" IP="172.16.3.221" REPORTED="1410403710" TN="0" TMAX="20" DMAX="86400" LOCATION="1,2,3" GMOND_STARTED="1410332246" TAGS="">
<METRIC NAME="disk_free" VAL="3209.276" TYPE="double" UNITS="GB" TN="0" TMAX="180" DMAX="0" SLOPE="both" SOURCE="gmond">
<EXTRA_DATA>
<EXTRA_ELEMENT NAME="GROUP" VAL="disk"/>
<EXTRA_ELEMENT NAME="DESC" VAL="Total free disk space"/>
<EXTRA_ELEMENT NAME="TITLE" VAL="Disk Space Available"/>
</EXTRA_DATA>
</METRIC>
......
略


使用rrdtool导出rrd的内容.
[root@db-172-16-3-221 db-172-16-3-221.localdomain]# rrdtool dump ./swap_free.rrd |less
<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE rrd SYSTEM "http://oss.oetiker.ch/rrdtool/rrdtool.dtd">
<!-- Round Robin Database Dump -->
<rrd>
        <version>0003</version>
        <step>10</step> <!-- Seconds -->
        <lastupdate>1410403482</lastupdate> <!-- 2014-09-11 10:44:42 CST -->

        <ds>
                <name> sum </name>
                <type> GAUGE </type>
                <minimal_heartbeat>80</minimal_heartbeat>
                <min>NaN</min>
                <max>NaN</max>

                <!-- PDP Status -->
                <last_ds>1048568</last_ds>
                <value>2.0971360000e+06</value>
                <unknown_sec> 0 </unknown_sec>
        </ds>

        <!-- Round Robin Archives -->
        <rra>
                <cf>AVERAGE</cf>
                <pdp_per_row>1</pdp_per_row> <!-- 10 seconds -->

                <params>
                <xff>5.0000000000e-01</xff>
                </params>
                <cdp_prep>
                        <ds>
                        <primary_value>1.0485680000e+06</primary_value>
                        <secondary_value>1.0485680000e+06</secondary_value>
                        <value>NaN</value>
                        <unknown_datapoints>0</unknown_datapoints>
                        </ds>
                </cdp_prep>
                <database>
                        <!-- 2014-09-10 18:28:50 CST / 1410344930 --> <row><v>1.0485680000e+06</v></row>
                        <!-- 2014-09-10 18:29:00 CST / 1410344940 --> <row><v>1.0485680000e+06</v></row>
                        <!-- 2014-09-10 18:29:10 CST / 1410344950 --> <row><v>1.0485680000e+06</v></row>
                        <!-- 2014-09-10 18:29:20 CST / 1410344960 --> <row><v>1.0485680000e+06</v></row>
                        <!-- 2014-09-10 18:29:30 CST / 1410344970 --> <row><v>1.0485680000e+06</v></row>
                        <!-- 2014-09-10 18:29:40 CST / 1410344980 --> <row><v>1.0485680000e+06</v></row>
                        <!-- 2014-09-10 18:29:50 CST / 1410344990 --> <row><v>1.0485680000e+06</v></row>
                        <!-- 2014-09-10 18:30:00 CST / 1410345000 --> <row><v>1.0485680000e+06</v></row>
....略


[参考]
4. man rrdcreate
5. <monitoring with Ganglia>
相关文章
|
8月前
|
XML 前端开发 JavaScript
|
JavaScript 前端开发
|
数据安全/隐私保护
|
数据安全/隐私保护