Monitoring RAID controllers and disks from Nagios with check_megaraid_sas (a plugin built on the MegaCli tool)
2011-08-12 15:30 http://hi.baidu.com/edeed/blog/item/cf82a7ef30277d0cfdfa3cf8.html
On servers whose RAID is built on an LSI MegaRAID card, the MegaCli tool provided by LSI can be used to monitor the RAID controller and its disks.
Note: DELL PERC5/6 (PowerEdge RAID Controller) cards are in fact LSI MegaRAID SAS controllers.
Download location for the latest MegaCli package:
http://www.lsi.com/Search/Pages/results.aspx?k=megacli&r=assettype%3D%22AQ1NaXNjZWxsYW5lb3VzCWFzc2V0dHlwZQEBXgEk%22%20os%3D%22AQVMaW51eAJvcwEBXgEk%22
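1. Prerequisites
1) Identify the server model
# dmidecode -s system-product-name (for newer dmidecode versions)
or
# dmidecode | grep "Product Name" (for older dmidecode versions)
Lenovo WQ R520 G7
2) Confirm that a MegaRAID card is in use
--HP ProLiant servers mostly use Smart Array controllers,
so this method does not apply.
--Lenovo WanQuan (万全) servers may show the following (some models may not work?)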
# dmesg | grep RAID
scsi0 : LSI SAS based MegaRAID driver
Vendor: LSI Model: MegaRAID 8300XLP Rev: 2.02
md: Autodetecting RAID arrays.
--IBM System x servers may show the following
# dmesg | grep RAID
scsi0 : LSI SAS based MegaRAID driver
Vendor: IBM Model: ServeRAID M5015 Rev: 2.0.
md: Autodetecting RAID arrays.
--Dell PowerEdge servers may show the following
# dmesg | grep RAID
scsi0 : LSI Logic SAS based MegaRAID driver
md: Autodetecting RAID arrays.
3) Check whether MegaCli is already installed
# rpm -qa | egrep 'Lib_Utils|MegaCli'
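The `grep RAID` used above also matches the generic `md: Autodetecting RAID arrays.` line, which says nothing about the controller. A minimal sketch of a stricter check follows; `has_megaraid` is a hypothetical helper name, and the sample text is the IBM output shown above.

```shell
#!/bin/sh
# Succeed only if the kernel log text on stdin mentions MegaRAID.
# has_megaraid is a hypothetical helper, not part of MegaCli or Nagios.
has_megaraid() {
    grep -q 'MegaRAID'
}

# Sample copied from the IBM example; on a live host use: dmesg | has_megaraid
sample='scsi0 : LSI SAS based MegaRAID driver
Vendor: IBM Model: ServeRAID M5015 Rev: 2.0.
md: Autodetecting RAID arrays.'

if printf '%s\n' "$sample" | has_megaraid; then
    echo "MegaRAID controller detected"
else
    echo "no MegaRAID controller found"
fi
```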
2. Install MegaCli
Installing the latest MegaCli is recommended, since it supports monitoring more SAS drive types.
# cd /tmp
# unzip 8.01.06_Linux_MegaCLI.zip (unpack the MegaCli archive)
Archive: 8.01.06_Linux_MegaCLI.zip
inflating: readme.txt
inflating: 8.01.06_Linux_MegaCLI.txt
extracting: MegaCliLin.zip
# unzip MegaCliLin.zip (unpack the nested MegaCliLin archive)
Archive: MegaCliLin.zip
inflating: Lib_Utils-1.00-08.noarch.rpm
replace readme.txt? [y]es, [n]o, [A]ll, [N]one, [r]ename: A
inflating: readme.txt
inflating: MegaCli-8.01.06-1.i386.rpm
MegaCli-8.01.06-1.i386.rpm is the package we need (the same package is used on both 32-bit and 64-bit systems). If the operating system is missing MegaCli's dependencies, install Lib_Utils-1.00-08.noarch.rpm first:
# rpm -ivh Lib_Utils-1.00-08.noarch.rpm
# rpm -Uvh MegaCli-8.01.06-1.i386.rpm
# rpm -ql MegaCli (list the files installed by the MegaCli package)
/opt/MegaRAID/MegaCli/MegaCli
/opt/MegaRAID/MegaCli/MegaCli64
On a 32-bit system use MegaCli; on a 64-bit system use MegaCli64.
# /opt/MegaRAID/MegaCli/MegaCli (run without arguments, this prints the error below)
or
# /opt/MegaRAID/MegaCli/MegaCli64 (run without arguments, this prints the error below)
Fatal error - Command Tool invoked with wrong parameters
Exit Code: 0x01
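3. Test MegaCli
# arch (determine the operating system architecture)
x86_64
The original file names mix upper and lower case with digits, and the path is long, so creating a symlink in /usr/bin is recommended:
# ln -sf /opt/MegaRAID/MegaCli/MegaCli /usr/bin/megacli (32-bit systems)
or
# ln -sf /opt/MegaRAID/MegaCli/MegaCli64 /usr/bin/megacli (64-bit systems)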
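The choice between the two binaries can be derived from the architecture string instead of being made by hand. This is a minimal sketch: `pick_megacli` is a hypothetical helper name, and treating every architecture other than x86_64 as 32-bit is an assumption that only suits the i386/x86_64 hosts discussed here. It prints the `ln` command rather than running it.

```shell
#!/bin/sh
# Map an architecture string (as printed by `arch` or `uname -m`) to the
# MegaCli binary installed by the RPM. pick_megacli is a hypothetical helper.
pick_megacli() {
    case "$1" in
        x86_64) echo /opt/MegaRAID/MegaCli/MegaCli64 ;;
        # Assumption: anything else is handled as a 32-bit x86 system.
        *)      echo /opt/MegaRAID/MegaCli/MegaCli ;;
    esac
}

target=$(pick_megacli "$(uname -m)")
# Only print the command; creating the symlink requires root.
echo "ln -sf $target /usr/bin/megacli"
```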
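The symlinked command can now be run directly:
# megacli -help (show command help)
# megacli -adpCount (show the number of adapters)
# megacli -LdGetNum -aALL (show the number of logical drives)
# megacli -LdInfo -LALL -aAll (show all logical drive information; example from an IBM x3650 server)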
Adapter 0 -- Virtual Drive Information:
Virtual Drive: 0 (Target Id: 0)
Name :
RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 1.086 TB
State : Optimal
Strip Size : 128 KB
Number Of Drives per span:4 //each span is a RAID1 group built from 4 physical drives
Span Depth : 2 //2 RAID1 groups are combined into RAID10
Default Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Current Cache Policy: WriteBack, ReadAheadNone, Direct, No Write Cache if Bad BBU
Access Policy : Read/Write
Disk Cache Policy : Disabled
Encryption Type : None
Default Power Savings Policy: Controller Defined
Current Power Savings Policy: None
Can spin up in 1 minute: Yes
LD has drives that support T10 power conditions: Yes
LD's IO profile supports MAX power savings with cached writes: Yes
Exit Code: 0x00
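The RAID level and drive count can be read off the three fields annotated above. The sketch below is one way to do that with awk, not necessarily how any existing plugin does it; `ld_summary` is a hypothetical helper name, and it assumes the input describes a single virtual drive.

```shell
#!/bin/sh
# Summarise one virtual drive from `megacli -LdInfo` output on stdin.
# ld_summary is a hypothetical helper; it assumes a single virtual drive.
ld_summary() {
    awk -F: '
        /^[ \t]*RAID Level/ {
            split($2, a, ",")              # a[1] is " Primary-1"
            sub(/.*Primary-/, "", a[1])    # keep only the primary level number
            level = a[1]
        }
        /^[ \t]*Number Of Drives per span/ { per = $2 + 0 }
        /^[ \t]*Span Depth/                { depth = $2 + 0 }
        END {
            # More than one span means the primary level is striped (RAID x0).
            if (depth > 1) level = level "0"
            printf "RAID-%s, %d drives\n", level, per * depth
        }'
}

# Sample lines copied from the IBM x3650 output.
sample='RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
Size : 1.086 TB
Number Of Drives per span:4
Span Depth : 2'

printf '%s\n' "$sample" | ld_summary   # prints: RAID-10, 8 drives
```

On a live host the same function would be fed from `megacli -LdInfo -LALL -aAll`.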
# megacli -PdList -aAll| more (show all physical drive information; example from an IBM x3650 server)
Adapter #0
Enclosure Device ID: 252
Slot Number: 0
Enclosure position: 0
Device Id: 8
Sequence Number: 2
Media Error Count: 0
Other Error Count: 0
Predictive Failure Count: 0
Last Predictive Failure Event Seq Number: 0
PD Type: SAS
Raw Size: 279.396 GB [0x22ecb25c Sectors]
Non Coerced Size: 278.896 GB [0x22dcb25c Sectors]
Coerced Size: 278.464 GB [0x22cee000 Sectors]
Firmware state: Online, Spun Up
SAS Address(0): 0x5000cca015512ae5
SAS Address(1): 0x0
Connected Port Number: 1(path0)
Inquiry Data: IBM-ESXSCBRCA300C3ETS0 NC610PFWEMUBECCXSA610
IBM FRU/CRU: 42D0638
FDE Capable: Not Capable
FDE Enable: Disable
Secured: Unsecured
Locked: Unlocked
Needs EKM Attention: No
Foreign State: None
Device Speed: 6.0Gb/s
Link Speed: 6.0Gb/s
Media Type: Hard Disk Device
Drive: Not Certified
Drive Temperature :38C (100.40 F)
...
# megacli -cfgdsply -aALL | more (show the RAID card model, RAID configuration, and disk information)
# megacli -FwTermLog -Dsply -aALL | more (show the RAID card log)
# megacli -AdpAllInfo -aALL | more (show detailed information on the RAID card's features)
4. Install check_megaraid_sas
check_megaraid_sas is a Nagios plugin, written in Perl, that gathers monitoring data by running MegaCli commands.
Download: http://www.techno-obscura.com/~delgado/code/check_megaraid_sas
# cd /tmp
# vi check_megaraid_sas
-------------------------------------------------------------------------
# Change line 35 as follows
use lib qw(/usr/local/nagios/libexec); # possible pathes to your Nagios plugins and utils.pm
# Change lines 52-53 as follows
my $megaclibin = '/usr/bin/megacli'; # the full path to your MegaCli binary
my $megacli = "$megaclibin"; # how we actually call MegaCli
-------------------------------------------------------------------------
# cp check_megaraid_sas /usr/local/nagios/libexec/check_megaraid_sas
# chmod 755 /usr/local/nagios/libexec/check_megaraid_sas
# /usr/local/nagios/libexec/check_megaraid_sas -h (show usage help)
Usage: /usr/local/nagios/libexec/check_megaraid_sas [-s number] [-m number] [-o number]
-s is how many hotspares are attached to the controller
-m is the number of media errors to ignore
-p is the predictive error count to ignore
-o is the number of other disk errors to ignore
5. Test check_megaraid_sas
# /usr/local/nagios/libexec/check_megaraid_sas
WARNING: 0:0:RAID-10:6 drives:1.225TB:Optimal Drives:6 (365 Errors)
If errors are reported, the following command shows which physical drives have them:
# megacli -PdList -aAll| egrep "Slot Number|Error Count|Failure Count"
Slot Number: 0
Media Error Count: 0
Other Error Count: 36
Predictive Failure Count: 0
Slot Number: 1
Media Error Count: 0
Other Error Count: 37
Predictive Failure Count: 0
Slot Number: 2
Media Error Count: 0
Other Error Count: 92
Predictive Failure Count: 0
Slot Number: 3
Media Error Count: 0
Other Error Count: 90
Predictive Failure Count: 0
Slot Number: 4
Media Error Count: 0
Other Error Count: 56
Predictive Failure Count: 0
Slot Number: 5
Media Error Count: 0
Other Error Count: 54
Predictive Failure Count: 0
If you have confirmed that these errors can be ignored, run:
# /usr/local/nagios/libexec/check_megaraid_sas -o 365
OK: 0:0:RAID-10:6 drives:1.225TB:Optimal Drives:6 (365 Errors)
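The value 365 is the sum of the six "Other Error Count" values listed above (36 + 37 + 92 + 90 + 56 + 54). A minimal sketch of that arithmetic and of the resulting status follows; `sum_other_errors` and `status_for` are hypothetical helper names, and the comparison simply mirrors the behaviour seen in the two runs above (WARNING with no threshold, OK with `-o 365`) rather than reproducing the plugin's code.

```shell
#!/bin/sh
# Sum the "Other Error Count" values in `megacli -PdList` output on stdin.
sum_other_errors() {
    awk -F': *' '/Other Error Count/ { total += $2 } END { print total + 0 }'
}

# Print OK when the total does not exceed the ignore threshold, else WARNING.
# $1 = total errors, $2 = number of errors to ignore.
status_for() {
    if [ "$1" -gt "$2" ]; then echo WARNING; else echo OK; fi
}

# Values copied from the six slots listed above.
sample='Other Error Count: 36
Other Error Count: 37
Other Error Count: 92
Other Error Count: 90
Other Error Count: 56
Other Error Count: 54'

total=$(printf '%s\n' "$sample" | sum_other_errors)
echo "total: $total"                          # prints: total: 365
echo "no threshold: $(status_for "$total" 0)" # prints: no threshold: WARNING
echo "-o 365: $(status_for "$total" 365)"     # prints: -o 365: OK
```

Because the threshold is an absolute count, any new error on any drive will push the total past 365 and raise the warning again.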
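Output format: judging from the example above, the colon-separated fields appear to be adapter, logical drive, RAID level, drive count, size, and state, followed by the number of physical drives and the total error count.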
What remains is defining the Nagios command and service, which is not covered in detail here.
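For orientation, a command and service definition for a locally checked host might look like the sketch below. Everything in it is an assumption rather than the author's configuration: the `generic-service` template, the `localhost` host name, and the service description are placeholders, and `$USER1$` is assumed to point at /usr/local/nagios/libexec as in a default resource.cfg.

```cfg
# Assumed example only; adjust names and thresholds to your own setup.
define command {
    command_name    check_megaraid_sas
    # $ARG1$ is the number of "other" disk errors to ignore (-o).
    command_line    $USER1$/check_megaraid_sas -o $ARG1$
}

define service {
    use                     generic-service   ; placeholder template name
    host_name               localhost         ; placeholder host
    service_description     MegaRAID
    check_command           check_megaraid_sas!365
}
```

MegaCli usually needs root privileges, so the account running the check may need sudo or an equivalent arrangement; monitoring a remote server would additionally require an agent such as NRPE on that host.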
--End--