I use Cacti to monitor a lot of Dell servers, primarily 1850s and 2850s but also the newer models of the same (1950s and 2950s). One itch that I’ve meant to scratch for a while is graphing some of the information available through the servers’ IPMI interface; specifically the servers’ various temperatures and fan speeds.
IPMI Details
There are patches available for the Linux kernel to allow the IPMI information to be read via the lm_sensors project but I chose to avoid this (at least for now) as I’d have to schedule downtime to reboot the servers for a new kernel. It’d also ruin their uptime – most of the servers (serving many thousands of users daily) have almost two years of uptime. (The kernels are monolithic.)
Instead, I went with the already compiled in Linux IPMI Driver (see kernel source: Documentation/IPMI.txt) which is available in the ‘Character Devices’ menu. I specifically needed the following options for the Dells:
- drivers/char/ipmi/ipmi_msghandler
- drivers/char/ipmi/ipmi_devintf
- drivers/char/ipmi/ipmi_si
In order to read information from the IPMI, you need the ipmitool utility which is available on most recent Linux distributions or from here.
Lastly, I needed to create a character special file to interface with the IPMI:
mknod /dev/ipmi0 c 254 0
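The major number 254 above is what my kernels assigned; it may differ on other systems. A minimal sketch for looking it up, assuming the ipmi_devintf driver registers itself under the name ipmidev in /proc/devices (the function name ipmi_major is just for illustration):

```shell
#!/bin/sh
# ipmi_major: read a /proc/devices-style listing on stdin and print the
# major number registered under the name "ipmidev" (assumed driver name).
ipmi_major() {
    awk '$2 == "ipmidev" { print $1; exit }'
}

# Usage sketch (run as root; commented out so the snippet has no side effects):
# major=$(ipmi_major < /proc/devices)
# [ -n "$major" ] && mknod /dev/ipmi0 c "$major" 0
```

If the function prints nothing, the device interface module is probably not loaded or compiled in.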
The sensor information was then available via:
# ipmitool sensor
Temp | 30.000 | degrees C | ok | na | na | na | 85.000 | 90.000 | na
Temp | 34.000 | degrees C | ok | na | na | na | 85.000 | 90.000 | na
Ambient Temp | 16.000 | degrees C | ok | na | 3.000 | 8.000 | 42.000 | 47.000 | na
...
Making IPMI Sensor Information Available via SNMP
I make the IPMI sensor information available over SNMP by adding the following to the snmpd.conf file:
# Monitor IPMI Temperature and Fan stats
exec .1.3.6.1.4.1.X.1000 ipmitemp /usr/local/sbin/ipmi-temp-stats
exec .1.3.6.1.4.1.X.1001 ipmifan /usr/local/sbin/ipmi-fan-stats
(Replace X above as appropriate.)
The scripts referenced are /usr/local/sbin/ipmi-temp-stats:
#! /bin/sh
PATH=/usr/bin:/bin
STATS=/tmp/ipmisensor-snmp
printf "%f\n" `cat $STATS | grep Temp | cut -s -d "|" -f 2`
And /usr/local/sbin/ipmi-fan-stats:
#! /bin/sh
PATH=/usr/bin:/bin
STATS=/tmp/ipmisensor-snmp
printf "%f\n" `cat $STATS | grep FAN | cut -s -d "|" -f 2`
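The two scripts differ only in the pattern they grep for, and printf will complain if a sensor reports a non-numeric reading such as "na". As a rough sketch of an alternative (not what I run in production), a single helper could do the filtering in awk; the function name sensor_values is hypothetical:

```shell
#!/bin/sh
# sensor_values PATTERN FILE
# Print the reading (second column) of every sensor whose name matches
# PATTERN, skipping rows whose reading is not numeric (for example "na").
sensor_values() {
    awk -F'|' -v pat="$1" '
        $1 ~ pat {
            val = $2
            gsub(/[ \t]/, "", val)      # trim padding around the reading
            if (val ~ /^-?[0-9]+(\.[0-9]+)?$/)
                printf "%f\n", val
        }' "$2"
}

# Usage sketch:
# sensor_values Temp /tmp/ipmisensor-snmp
# sensor_values FAN  /tmp/ipmisensor-snmp
```

Note that this matches the pattern against the sensor name only, whereas the grep in the original scripts matches anywhere on the line.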
The file they reference is generated every 5 minutes (the Cacti polling interval) via a cron entry in the file /etc/cron.d/ipmitool:
*/5 * * * * root /usr/bin/ipmitool sensor >/tmp/ipmisensor-snmp
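One caveat with a plain redirect is that snmpd could read the file while ipmitool is still writing it. A possible refinement, sketched here under the assumption that a small wrapper script is acceptable (the name refresh_cache is hypothetical), is to write to a temporary file and rename it into place:

```shell
#!/bin/sh
# refresh_cache FILE COMMAND [ARGS...]
# Run COMMAND, write its output to a temporary file next to FILE, and rename
# it into place only if the command succeeded, so readers never see a
# half-written file.
refresh_cache() {
    target=$1
    shift
    tmp=$(mktemp "${target}.XXXXXX") || return 1
    if "$@" > "$tmp"; then
        chmod 644 "$tmp"            # mktemp creates the file with mode 0600
        mv -f "$tmp" "$target"
    else
        rm -f "$tmp"
        return 1
    fi
}

# Usage sketch, from a wrapper script called by cron:
# refresh_cache /tmp/ipmisensor-snmp /usr/bin/ipmitool sensor
```

Because the rename happens on the same filesystem, the old contents stay readable until the new file is complete, and a failed ipmitool run leaves the previous data untouched.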
After restarting SNMP and allowing the cron job to execute at least once, you can test the results via:
# snmpwalk -c <community> -v <version> <ip/hostname> .1.3.6.1.4.1.X.1000
SNMPv2-SMI::enterprises.X.1000.1.1 = INTEGER: 1
SNMPv2-SMI::enterprises.X.1000.2.1 = STRING: "ipmitemp"
SNMPv2-SMI::enterprises.X.1000.3.1 = STRING: "/usr/local/sbin/ipmi-temp-stats"
SNMPv2-SMI::enterprises.X.1000.100.1 = INTEGER: 0
SNMPv2-SMI::enterprises.X.1000.101.1 = STRING: "37.000000"
SNMPv2-SMI::enterprises.X.1000.101.2 = STRING: "39.000000"
SNMPv2-SMI::enterprises.X.1000.101.3 = STRING: "23.000000"
SNMPv2-SMI::enterprises.X.1000.101.4 = STRING: "36.000000"
...
SNMPv2-SMI::enterprises.X.1000.102.1 = INTEGER: 0
SNMPv2-SMI::enterprises.X.1000.103.1 = ""
Graphing This Information in Cacti
Finally, I graph this information on Cacti (see end of post for examples).
I am making six templates available here which can be imported into Cacti (these were generated using version 0.8.6j) for graphing the above:
- Cacti graph template for Dell 1850 temperatures (see first image below);
- Cacti graph template for Dell 2850 temperatures (see second image below);
- Cacti graph template for Dell 1850 fan speeds (see third image below);
- Cacti graph template for Dell 2850 fan speeds (see fourth image below);
- Cacti host template for Dell 1850; and
- Cacti host template for Dell 2850.
The last two templates available are host templates for Dell 1850s and 2850s (I’m sure they’ll work fine with 1950s and 2950s also). These templates include:
- Host MIB – Logged in Users;
- Host MIB – Processes;
- IPMI Fan Speeds (Dell x850) (from above);
- IPMI Temperatures (Cel) (Dell x850) (from above);
- ucd/net – CPU Usage;
- ucd/net – Load Average;
- ucd/net – Memory Usage;
- SNMP – Get Mounted Partitions (data query); and
- SNMP – Interface Statistics (data query).
Example graphs are shown below; they’re not the cleanest given the amount of information they contain but they serve my purposes.
© 2007 Barry O’Donovan. All text is licensed under a Creative Commons Attribution 3.0 License. All scripts and Cacti templates are licensed under the MIT License.
Hi,
It is a great tutorial.
I tried to create the script and added to snmpd.conf.
When I run the snmpwalk, I got
snmpwalk -c public -v 2c 192.168.0.31 .1.3.6.1.4.1.2021.1000
UCD-SNMP-MIB::ucdavis.1000.1.1 = INTEGER: 1
UCD-SNMP-MIB::ucdavis.1000.2.1 = STRING: "ipmitemp"
UCD-SNMP-MIB::ucdavis.1000.3.1 = STRING: "/usr/local/sbin/ipmi-temp-stats"
UCD-SNMP-MIB::ucdavis.1000.100.1 = INTEGER: 1
UCD-SNMP-MIB::ucdavis.1000.101.1 = STRING: "/usr/local/sbin/ipmi-temp-stats: Exec format error"
UCD-SNMP-MIB::ucdavis.1000.102.1 = INTEGER: 0
UCD-SNMP-MIB::ucdavis.1000.103.1 = ""
Did u face it before?
Nico
Worked like a champ; dunno if it was eaten by WordPress but the "%fn" in the shell script should be "%f\n" (with a backslash before the n to force a newline).
You can even hook this up to Nagios to fire off warnings when the temp is out of bounds or if a fan dies with very little extra work.
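A rough sketch of what such a check could look like, assuming the standard Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The function name check_temp and the thresholds are illustrative only; 85 and 90 are taken from the Temp rows in the sample ipmitool output in the post:

```shell
#!/bin/sh
# check_temp READING WARN CRIT
# Print a Nagios-style status line and return the matching plugin exit code
# (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN).
check_temp() {
    awk -v t="$1" -v w="$2" -v c="$3" 'BEGIN {
        if (t !~ /^-?[0-9]+(\.[0-9]+)?$/) {
            print "UNKNOWN - no numeric reading"; exit 3
        }
        if (t + 0 >= c + 0) { printf "CRITICAL - %.1f C\n", t; exit 2 }
        if (t + 0 >= w + 0) { printf "WARNING - %.1f C\n", t; exit 1 }
        printf "OK - %.1f C\n", t; exit 0
    }'
}

# Usage sketch, reading one value from the cached sensor file:
# reading=$(grep '^Ambient Temp' /tmp/ipmisensor-snmp | cut -s -d "|" -f 2 | tr -d ' ')
# check_temp "$reading" 42 47
```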
Is there a way to make this work for a 2850 running Windows (using windows snmp.exe)?
Superbly useful ! Many thanks. Got this working on Sun hardware and can easily mod for software monitoring, i.e. heartbeat and managed resources.
Many thanks
Hi,
what if i get this after compiling ipmitool and creating /dev/ipmi0
ipmitool -vvv sensor
Querying SDR for sensor list
Using ipmi device 0
Could not enable event receiver: Invalid argument
Get Device ID command failed
Unable to open SDR for reading
I’ve just been lumbered with a client who has saved themselves some money by switching from DL360s to DL140s which means only IPMI and no SNMP agents.
A Windows IPMI-SNMP bridge like this would be a *really* handy thing.
Any idea?
# mknod /dev/ipmi0 c 254 0
# chmod 777 ipmi0
# ls -la |grep ipmi
crwxrwxrwx 1 root root 254, 0 2004-06-30 21:04 ipmi0
# ipmitool -I open lan set 1 ipaddr xxx.xxx.xxx.xxx
Could not open device at /dev/ipmi0 or /dev/ipmi/0: No such file or directory
Get Channel Info command failed
Channel 1 is not a LAN channel!
# whoami
root
What’s the output of:
# lsmod | grep ipmi
You should have the following:
ipmi_msghandler
ipmi_devintf
ipmi_si
– Barry
Always the same problem!!
# ipmitool user set name 1 enseleit
Could not open device at /dev/ipmi0 or /dev/ipmi/0: No such file or directory
Set User Name command failed (user 1, name enseleit)
# ipmitool user list
Could not open device at /dev/ipmi0 or /dev/ipmi/0: No such file or directory
Get User Access command failed (channel 14, user 1)
# lsmod | grep ipmi
ipmi_poweroff 24996 0
ipmi_watchdog 33472 0
ipmi_devintf 25736 0
ipmi_msghandler 65764 3 ipmi_poweroff,ipmi_watchdog,ipmi_devintf
Great tutorial! I’ve gotten to the point where snmpwalk shows me the temp and fan speed values but I have no idea how to make them work in cacti. I’ve imported the host, temp and fan templates but I have no idea how to hook them up to data sources, graphs etc
I’ve created a device based on the host template but that’s about it. Anyone that can give me a quick hand? I think cacti is awesome but way too convoluted when it comes to setting it up..
I agree with George above – this is a great tutorial! I am also stuck in a similar place – everything works on SNMP, but not in Cacti. I am using 1950s instead of 1850s, so I don’t know if that creates a problem or not. I was able to get some of the fan speeds graphed, but I cannot get the CPU temps to work. IPMI reports our CPUs by relative temperature, so the numbers are negative. Could this be throwing Cacti off?
Also, how does the OID setup work for the data templates? If I want to see a certain value, can I put in the specific OID for that value rather than the generic OID that gets ‘walked’ across?
ipmitool sensors takes a long while to complete …
So, I use
ipmitool sdr type "Temperature"
ipmi-temp-stats has
printf "%f\n" `cat /tmp/ipmisensor-snmp |grep -v Disabled |cut -s -d "|" -f5|sed 's/degrees C//'`
But net-snmp does not like
the exec line in snmpd.conf
I tried extend; that didn’t help either.
On my systems the keyword "exec" changed to "extend" in snmpd.conf. And as the kipmi0 process is CPU hungry I changed my scripts to use an SDR dump file. See the -S option in the ipmitool manpages.
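A minimal sketch of the SDR dump approach, assuming the `sdr dump` subcommand and the `-S` option behave as described in the ipmitool man page. The function name ipmi_sensor_cached and the cache path in the usage line are hypothetical:

```shell
#!/bin/sh
# ipmi_sensor_cached CACHE
# Build the SDR cache once with "ipmitool sdr dump", then reuse it through
# the -S option on every later sensor read.
ipmi_sensor_cached() {
    cache=$1
    if [ ! -s "$cache" ]; then
        ipmitool sdr dump "$cache" >/dev/null || return 1
    fi
    ipmitool -S "$cache" sensor
}

# Usage sketch (path is an assumption):
# ipmi_sensor_cached /var/cache/ipmi-sdr.cache > /tmp/ipmisensor-snmp
```

The cache only needs to be rebuilt when the hardware or firmware changes, so deleting the file by hand after such a change should be enough.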
Hello Barry,
Am writing an application that gets the IPMI data over the lan interface. Whatever that you have described here works only with Dell Servers because IPMI data is exposed thru SNMP by Dell Openmanage? I am not sure if the same OIDs work in other server machines. Please confirm.
Thanks,
Rajesh
So how do we figure this value for X on the MIB?