Docs/SysAdmin/Dell OpenManage

From Mandriva Community Wiki


Introduction

Some Dell servers, like the 1600SC, have older proprietary interfaces for monitoring system environmental information such as temperature, fan speed, and voltages. lm_sensors will not work on these boxes. Some of the newer systems have standard interfaces like IPMI. Dell makes available a package called Dell OpenManage Server Administrator (OMSA) that provides the necessary hardware instrumentation and monitoring tools. OMSA is supported only under RHEL and SUSE.

I've successfully installed and used OMSA under Mandriva 2007.1 Spring. I was able to use the RHEL5 RPMS under Mandriva. This article explains the steps for installing OMSA under Mandriva, how to get information about your system, and finally some basic SNMP alerting, including sending an e-mail when an event occurs.

Note: some Dell-specific utilities are included with the Mandriva distribution but aren't installed by default, so you may not be aware of them. These let you get your system's service tags and BIOS versions. There is also a utility that will allow you to update the BIOS of some systems while running under Mandriva Linux.


Prerequisites

  1. Dell OpenManage Server Administrator 5.2.0 or later. OMSA can be downloaded from support.dell.com as a .ISO file. This contains the necessary RPMS.
  2. Mandriva Dell driver RPM and smbios interface RPM. http://linux.dell.com/libsmbios/main
  3. C++ standard library version 5. Mandriva 2007.1 ships with version 6 libraries. The RPMS from Dell are linked against version 5. Mandriva provides version 5 libraries as a compatibility package.
  4. Optional SNMP packages for some basic monitoring.


Installation Steps

SMBIOS RPM install

Install the Dell utilities and drivers that are supplied by Mandriva as part of the distribution. Install the libsmbios1 and libsmbios-bin packages:

 urpmi libsmbios1 libsmbios-bin


You now should have some basic Dell utilities installed that will let you access the System Management Interface. You need to load some drivers if they haven't already been loaded. The drivers are provided as part of the kernel RPMS. The Dell OMSA start up scripts will load these drivers on the next boot or you can add them to your modules.

modprobe dcdbas
modprobe dell_rbu
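
To confirm that both drivers are actually loaded, the kernel's module list can be checked. The following is a minimal sketch: the function name module_loaded is made up for illustration, and it assumes the usual /proc/modules layout in which the module name is the first field on each line.

```shell
#!/bin/sh
# module_loaded NAME [FILE]
# Succeeds if NAME is the first field of some line in FILE
# (default: /proc/modules). Hypothetical helper, not part of
# any Dell or Mandriva tool.
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' \
        "${2:-/proc/modules}"
}

# Example use (left commented out so the file can be sourced safely):
# for m in dcdbas dell_rbu; do
#     module_loaded "$m" || echo "$m is not loaded"
# done
```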

You should now be able to use the basic utilities to get information from your system such as the Dell Service Tag and the level of firmware. On a number of Dell systems you also now have the ability to update the system firmware while running under Mandriva Linux.

# rpm -qil libsmbios-bin
Name        : libsmbios-bin                Relocations: (not relocatable)
Version     : 0.13.2                            Vendor: Mandriva
Release     : 1mdv2007.1                    Build Date: Thu 08 Mar 2007 05:50:03 AM EST
Install Date: Tue 19 Jun 2007 03:41:29 PM EDT      Build Host: n1.mandriva.com
Group       : System/Configuration/Hardware   Source RPM: libsmbios-0.13.2-1mdv2007.1.src.rpm
Size        : 127090                           License: GPL/Open Software License
Signature   : DSA/SHA1, Wed 28 Mar 2007 04:11:45 PM EDT, Key ID e7898ae070771ff3
Packager    : Frederic Crozat <fcrozat@mandriva.com>
URL         : http://linux.dell.com/libsmbios/main
Summary     : The "supported" sample binaries that use libsmbios
Description :
Libsmbios is a library and utilities that can be used by client programs
to get information from standard BIOS tables, such as the SMBIOS table.

This package contains some sample binaries that use libsmbios.
/usr/bin/assetTag
/usr/bin/dellBiosUpdate
/usr/bin/dellLcdBrightness
/usr/bin/getSystemId
/usr/bin/propertyTag
/usr/bin/serviceTag
/usr/bin/tokenCtl
/usr/bin/verifySmiPassword
/usr/bin/wakeupCtl


# getSystemId
Libsmbios:    0.13.2
System ID:    0x0135
Service Tag:  8Q4YXXX
Express Service Code: 18994NNNNNN
Product Name: PowerEdge 1600SC
BIOS Version: A12
Vendor:       Dell Computer Corporation
Is Dell:      1
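
Because getSystemId prints simple "Name: value" lines, a single field can be pulled out for use in inventory scripts. This is only a sketch based on the output format shown above; system_field is a hypothetical helper name, and other libsmbios versions may format their output differently.

```shell
#!/bin/sh
# system_field NAME
# Reads getSystemId-style "Name: value" lines on standard input and
# prints the value for the line whose name matches NAME exactly.
# Hypothetical helper; the format is taken from the sample output only.
system_field() {
    awk -F': *' -v key="$1" '$1 == key { print $2 }'
}

# Example use (commented out; requires getSystemId to be installed):
# getSystemId | system_field "Service Tag"
```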


Install the C++ standard library version 5

Dell OMSA is linked against an older library than the default version 6 that ships with Mandriva.

# urpmi libstdc++5

# rpm -qil libstdc++5
Name        : libstdc++5                   Relocations: (not relocatable)
Version     : 3.3.6                             Vendor: Mandriva
Release     : 3mdk                          Build Date: Wed 19 Oct 2005 11:40:11 AM EDT
Install Date: Sat 17 Nov 2007 07:15:04 PM EST      Build Host: n2.mandriva.com
Group       : System/Libraries              Source RPM: gcc3.3-3.3.6-3mdk.src.rpm
Size        : 765624                           License: GPL
Signature   : DSA/SHA1, Wed 28 Mar 2007 04:12:03 PM EDT, Key ID e7898ae070771ff3
Packager    : Mandriva Linux Team <http://www.mandrivaexpert.com/>
URL         : http://gcc.gnu.org/
Summary     : GNU C++ library
Description :
This package contains the GCC Standard C++ Library v3, an ongoing
project to implement the ISO/IEC 14882:1998 Standard C++ library.
/usr/lib/libstdc++.so.5
/usr/lib/libstdc++.so.5.0.7


Install the Dell OMSA RPMS

The install scripts provided by Dell won't work under Mandriva, so you need to install the RPMS by hand. Download the OMSA 5.2.x CD from Dell and mount it. Install the srvadmin packages that are in srvadmin/linux/RPMS/RHEL5. Not all of the packages are required: some are for optional hardware like the Dell Remote Access Controllers (three different versions) and Dell Storage. For completeness I tested that all of the packages could be installed, even though I didn't have those pieces of hardware, but on my systems I installed only the packages I needed. The list below is the minimal set I needed for monitoring and for running the Dell web service.


mount /media/cdrom
cd /media/cdrom/srvadmin/linux/RPMS/RHEL5
urpmi libstdc++5
urpmi libsmbios-bin
rpm -Uvh instsvc-drivers-5.2.0-460.i386.rpm
rpm -Uvh srvadmin-omilcore-5.2.0-460.i386.rpm
rpm -Uvh srvadmin-omauth-5.2.0-460.rhel5.i386.rpm
rpm -Uvh srvadmin-deng-5.2.0-460.i386.rpm
rpm -Uvh srvadmin-omacore-5.2.0-460.i386.rpm
rpm -Uvh srvadmin-ipmi-5.2.0-460.rhel5.i386.rpm
rpm -Uvh srvadmin-hapi-5.2.0-460.i386.rpm
rpm -Uvh srvadmin-isvc-5.2.0-460.i386.rpm
rpm -Uvh srvadmin-omhip-5.2.0-460.i386.rpm
rpm -Uvh srvadmin-racvnc-5.2.0-460.i386.rpm
rpm -Uvh srvadmin-jre-5.2.0-460.i386.rpm
rpm -Uvh srvadmin-iws-5.2.0-460.i386.rpm
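
Since the packages are installed by hand, it may help to confirm that every file is present on the mounted CD before starting. The helper below is a sketch with a made-up name (missing_rpms); it only checks that the files exist and says nothing about whether their dependencies are satisfied.

```shell
#!/bin/sh
# missing_rpms DIR FILE...
# Prints the name of each FILE that does not exist in DIR.
# Prints nothing if all files are present. Hypothetical helper.
missing_rpms() {
    dir=$1
    shift
    for pkg in "$@"; do
        [ -f "$dir/$pkg" ] || echo "$pkg"
    done
}

# Example use (commented out; paths are the ones used in this article):
# missing_rpms /media/cdrom/srvadmin/linux/RPMS/RHEL5 \
#     instsvc-drivers-5.2.0-460.i386.rpm \
#     srvadmin-omilcore-5.2.0-460.i386.rpm
```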


Start the Dell Processes and Test

Now you should be able to start the Dell processes and test them.

# srvadmin-services.sh start
Starting Systems Management Device Drivers:
Starting dcdbas:                                                [  OK  ]
Starting dell_rbu:                                              [  OK  ]
Starting Systems Management Data Engine:
Starting dsm_sa_datamgr32d:                                     [  OK  ]
Starting dsm_sa_eventmgr32d:                                    [  OK  ]
Starting dsm_sa_snmp32d:                                        [  OK  ]
Starting DSM SA Shared Services:                                [  OK  ]

# omreport chassis
Health

Main System Chassis

SEVERITY : COMPONENT
Ok       : Fans
Ok       : Intrusion
Ok       : Memory
Ok       : Processors
Ok       : Temperatures
Ok       : Voltages
Ok       : Hardware Log

For further help, type the command followed by -?
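
The health summary above lends itself to a simple scripted check, for example from cron. The following sketch assumes the "SEVERITY : COMPONENT" layout shown in the sample output; the function name failed_components is hypothetical.

```shell
#!/bin/sh
# failed_components
# Reads `omreport chassis`-style output on standard input and prints
# every component whose severity is anything other than "Ok",
# in the form "Component (Severity)". Hypothetical helper.
failed_components() {
    awk -F' *: *' '
        # Skip the "SEVERITY : COMPONENT" header line.
        $1 == "SEVERITY" { next }
        # Data lines have exactly two fields, e.g. "Ok       : Fans".
        NF == 2 && $1 != "Ok" { print $2 " (" $1 ")" }
    '
}

# Example use (commented out; requires OMSA to be running):
# omreport chassis | failed_components
```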



Checking your system status from the command line

You can use the omreport command to check the system's vital statistics from the command line. You have to tell it which subsystem to report on. Support for subsystems varies depending on the Dell server it is installed on. You can browse using -? to find out the available options. These can be used for scripting.



# omreport chassis temps
Temperature Probes Information

------------------------------------
Main System Chassis Temperatures: Ok
------------------------------------

Index                     : 0
Status                    : Ok
Probe Name                : Planar Temp
Reading                   : 27.0 C
Minimum Warning Threshold : 10.0 C
Maximum Warning Threshold : 45.0 C
Minimum Failure Threshold : 5.0 C
Maximum Failure Threshold : 55.0 C

Index                     : 1
Status                    : Ok
Probe Name                : CPU 1 Temp
Reading                   : 33.0 C
Minimum Warning Threshold : 10.0 C
Maximum Warning Threshold : 75.0 C
Minimum Failure Threshold : 5.0 C
Maximum Failure Threshold : 80.0 C


# omreport chassis fans
Fan Probes Information

----------------------------
Main System Chassis Fans: Ok
----------------------------

Index                     : 0
Status                    : Ok
Probe Name                : CPU 1 FAN RPM
Reading                   : 3245 RPM
Minimum Warning Threshold : 1960 RPM
Maximum Warning Threshold : 14535 RPM
Minimum Failure Threshold : 1680 RPM
Maximum Failure Threshold : 15000 RPM

Index                     : 1
Status                    : Ok
Probe Name                : Front FAN RPM
Reading                   : 1739 RPM
Minimum Warning Threshold : 1260 RPM
Maximum Warning Threshold : 14535 RPM
Minimum Failure Threshold : 1080 RPM
Maximum Failure Threshold : 15000 RPM

Index                     : 2
Status                    : Ok
Probe Name                : Rear FAN RPM
Reading                   : 1854 RPM
Minimum Warning Threshold : 1260 RPM
Maximum Warning Threshold : 14535 RPM
Minimum Failure Threshold : 1080 RPM
Maximum Failure Threshold : 15000 RPM
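
As an illustration of the scripting mentioned earlier, the probe listings can be reduced to one line per probe, which is convenient for logging or graphing. This sketch relies only on the "Probe Name" and "Reading" lines shown in the temperature and fan output above; probe_readings is a made-up name.

```shell
#!/bin/sh
# probe_readings
# Reads `omreport chassis temps` or `omreport chassis fans` style output
# on standard input and prints "probe name<TAB>reading" for each probe.
# Hypothetical helper based on the sample output in this article.
probe_readings() {
    awk -F' *: *' '
        $1 == "Probe Name" { name = $2 }
        $1 == "Reading"    { print name "\t" $2 }
    '
}

# Example use (commented out; requires OMSA to be running):
# omreport chassis fans | probe_readings
```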

# omreport system esmlog 
Embedded System Management (ESM) Log

Health : Ok

Embedded System Management Log contains...

Severity      : Ok
Date and Time : Mon Mar  3 21:53:10 2003
Description   : Log cleared

Severity      : Critical
Date and Time : Sat Nov 17 10:03:58 2007
Description   : Chassis Intrusion detected

Severity      : Ok
Date and Time : Sat Nov 17 10:09:39 2007
Description   : Chassis Intrusion returned to normal


Web Access

There is also a web interface, installed by the srvadmin-iws package. It uses local system usernames and passwords for authentication and can be found at https://localhost:1311/. Note: you must use https! The web interface is a convenient way of turning on some of the alerting and setting up the hardware watchdog timer, which can reboot or even power cycle the system if the OS is no longer running. Since it uses SSL it needs a certificate. The installation process generates a self-signed certificate, so your browser will show warnings. It's safe to ignore those warnings once you have confirmed that the certificate comes from your own system.

If you are running the Shorewall firewall on your server, you will need to allow access to your server on port 1311. Add the following to /etc/shorewall/rules and then reload Shorewall.

ACCEPT          net             $FW             tcp     1311   # dell omsa


SNMP Access

If you already have an SNMP configuration, OMSA will add a line to /etc/snmp/snmpd.conf that allows it to connect to the main SNMP agent on your box:

# Allow Systems Management Data Engine SNMP to connect to snmpd using SMUX
smuxpeer .1.3.6.1.4.1.674.10892.1


If you haven't already enabled SNMP trap reporting you will want to add the following lines to /etc/snmp/snmpd.conf. This assumes you have snmptrapd running on your local server. If you have a different central trap host, change the trap2sink to point to that. Also remember to set the appropriate SNMP community string, rather than the default "public".

trap2sink localhost public
authtrapenable 1

Sample snmptrapd configuration

Here's a sample snmptrapd configuration that will allow you to get e-mail notifications when events occur. If you are going to rely on SNMP trap reporting you will want a much more robust configuration. Note that this opens up snmptrapd to accept traps from anyone without authentication.

Add this to /etc/snmp/snmptrapd.conf

# send e-mail when we receive traps
traphandle default /usr/bin/traptoemail root@your.domain

# TODO: change this to something more secure
disableAuthorization yes

Enable syslogging of snmptraps. In /etc/sysconfig/snmptrapd add:

OPTIONS="-Lsd"

At this point it would be a good idea to restart snmptrapd, snmpd, and the Dell server admin pieces, in that order, so that the configuration changes are picked up.
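
The restart order might be scripted along the following lines. This is a sketch under several assumptions: that the init scripts are named snmptrapd and snmpd, and that srvadmin-services.sh accepts a restart argument (this article only shows it being run with start). Check these on your own system first; setting RUN=echo prints the commands instead of running them.

```shell
#!/bin/sh
# restart_monitoring
# Restarts the trap receiver, the SNMP agent, and the OMSA services in
# that order, stopping at the first failure.
# Assumptions: service names "snmptrapd" and "snmpd", and a "restart"
# argument to srvadmin-services.sh. Set RUN=echo for a dry run.
restart_monitoring() {
    run=${RUN:-}
    $run service snmptrapd restart &&
        $run service snmpd restart &&
        $run srvadmin-services.sh restart
}

# Example dry run (commented out):
# RUN=echo restart_monitoring
```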


Testing Alerts

To test that everything was working and that I would get alerts if something went wrong, I needed to trigger an event. A non-destructive way to do this is to remove the side chassis cover from the server. Through SNMP and traptoemail I received an e-mail when the cover was opened and another when it was closed. Additionally there were reports via syslog. Through the OMSA web interface I had enabled sending a broadcast message as well as a console message, so OMSA did a wall(1) to let anyone who was logged in know what was happening. If you do something to intentionally trigger an error on your own server, do so at your own risk.

Syslog messages received. Note that srvadmin logged one message, while the other was logged by snmptrapd.

Nov 17 20:03:58 srv Server Administrator: Instrumentation Service EventID: 1254  Chassis intrusion detected [...]
Nov 17 20:03:58 srv snmptrapd[5672]: ... Chassis intrusion detected ...  Previous state was: OK (Normal) ...
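
The Server Administrator syslog lines can also be filtered by script, for instance to list event IDs from a log file. The sketch below is based solely on the single sample line shown above, so the pattern is an assumption about the general format; omsa_events is a hypothetical name.

```shell
#!/bin/sh
# omsa_events
# Reads syslog lines on standard input and, for lines written by
# "Server Administrator", prints "EVENTID description".
# Lines from other programs (such as snmptrapd) are ignored.
# Hypothetical helper; the pattern is inferred from one sample line.
omsa_events() {
    sed -n 's/.*Server Administrator: .* EventID: \([0-9][0-9]*\) *\(.*\)$/\1 \2/p'
}

# Example use (commented out; the log file location is an assumption):
# omsa_events < /var/log/messages
```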

Sample e-mail text:

Host: localhost (UDP: [127.0.0.1]:32833)
            DISMAN-EVENT-MIB::sysUpTimeInstance  301853
                      SNMPv2-MIB::snmpTrapOID.0  SNMPv2-SMI::enterprises.674.10892.1.0.1254
SNMPv2-SMI::enterprises.674.10892.1.5000.10.1.0  "server.local"
SNMPv2-SMI::enterprises.674.10892.1.5000.10.2.0  SNMPv2-SMI::enterprises.674.10892.1.300.70.1.2.1.1
SNMPv2-SMI::enterprises.674.10892.1.5000.10.3.0  "Chassis intrusion detected 
                                         Sensor  location: Chassis Intrusion 
                                        Chassis  location: Main System Chassis 
                                       Previous  state was: OK (Normal) 
                                        Chassis  intrusion state: Open"
SNMPv2-SMI::enterprises.674.10892.1.5000.10.4.0  5
SNMPv2-SMI::enterprises.674.10892.1.5000.10.5.0  3
SNMPv2-SMI::enterprises.674.10892.1.5000.10.6.0  ""

For More Information
