首页 > linux服务器集群监控如何实现

linux服务器集群监控如何实现

想实现一个简单的监控,通过web(php)可以查看到linux服务器集群的cpu,内存,硬盘,i/o等基本信息,不需要太详细的监控信息

不要一些开源软件,只想找个简单的例子,学习一下原理


nagios+ganglia(这样行吧)


推荐你关注一下/proc目录下的文件. 只要写几个小脚本隔一段时间读取一下其中的文件就可以明确的得到系统信息.

proc目录是linux内核API的文件形式接口, 当你使用文件方式读取时, 内核会返回实际的状况给你. 比如/proc/loadavg文件,它就是系统负载接口.

#直接查看时,文件大小为0
λ  ~/github/awesome_hdfs/ master*  ll /proc/loadavg 
-r--r--r-- 1 root root 0 1月  14 14:35 /proc/loadavg

#如果读取其中的内容, linux内核会通过这个文件接口给你输出具体的负载信息
λ  ~/github/awesome_hdfs/ master*  cat /proc/loadavg
3.88 3.96 3.98 7/981 5982

不同的文件有不同的数据格式, 这个你需要自己再搜索一下.

希望对你有帮助!


Linux上一切皆文件,系统信息可以直接通过读取系统运行时文件来获取.
strace -f -o strace.log vmstat && cat strace.log|egrep 'open|read'
可见vmstat先后打开并读取了下面文件的内容:

/proc/meminfo
/proc/stat
/proc/vmstat

你也可以用PHP来读取系统的这些文件来分析,或者更简单的方法是直接调用Linux内置的vmstat命令,然后分析它的输出,分析方法也很简单,直接preg_split/explode以空格分隔就能获取每一列的信息.

procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 1  0      0 855376 171960 3150452    0    0    14    29  100   22  7  4 89  0  0

man vmstat 可见

Procs
r: The number of processes waiting for run time.
b: The number of processes in uninterruptible sleep.
Memory
swpd: the amount of virtual memory used.
free: the amount of idle memory.
buff: the amount of memory used as buffers.
cache: the amount of memory used as cache.
inact: the amount of inactive memory. (-a option)
active: the amount of active memory. (-a option)
Swap
si: Amount of memory swapped in from disk (/s).
so: Amount of memory swapped to disk (/s).
IO
bi: Blocks received from a block device (blocks/s).
bo: Blocks sent to a block device (blocks/s).
System
in: The number of interrupts per second, including the clock.
cs: The number of context switches per second.
CPU
These are percentages of total CPU time.
us: Time spent running non-kernel code. (user time, including nice time)
sy: Time spent running kernel code. (system time)
id: Time spent idle. Prior to Linux 2.5.41, this includes IO-wait time.
wa: Time spent waiting for IO. Prior to Linux 2.5.41, included in idle.
st: Time stolen from a virtual machine. Prior to Linux 2.6.11, unknown.

假设你要监控的Linux服务器叫Agent,那你就在这台Agent上用PHP启动一个HTTP服务:

php -S 192.168.43.80:8181 -t /www /www/auth.php
/www/auth.php:
auth.php用于检测如果HTTP_USER_AGENT不为指定串,则拒绝访问.
<?php
$ua = '1a4a8ae8c53db5cb3c8badc2255587438e6b8db5';
if(isset($_SERVER['HTTP_USER_AGENT']) && $_SERVER['HTTP_USER_AGENT']===$ua) {
    return false;
} else {
    exit('Auth Failed');
}
/www/vmstat.php
<?php
echo exec('vmstat'); //返回命令输出的最后一行

假设你要收集监控信息的服务器叫Center,那你就在这台Center上用PHP向Agent发送请求获取Agent的运行时信息.

<?php
ini_set('user_agent', '1a4a8ae8c53db5cb3c8badc2255587438e6b8db5');
$vmstat = file_get_contents('http://192.168.43.80:8181/vmstat.php');
$vmstat = preg_split('/\s+/', $vmstat);
print_r($vmstat);
//输出
Array
(
    [0] => 
    [1] => 1
    [2] => 0
    [3] => 0
    [4] => 758952
    [5] => 173036
    [6] => 3155576
    [7] => 0
    [8] => 0
    [9] => 13
    [10] => 28
    [11] => 102
    [12] => 33
    [13] => 7
    [14] => 4
    [15] => 89
    [16] => 0
    [17] => 0
)
【热门文章】
【热门文章】