Thursday, 28 August 2014

Nagios: NetApp Plugin check_netapp_sdk.pl

Written in Perl, using NetApp Manageability SDK


Functions
  • check-volume
  • check-lun
  • check-snapmirror
  • check-aggr
  • check-cluster
  • check-shelf
  • check-license

Dependencies:
- NetApp 7-Mode
- Nagios::Plugins (Perl)
- NetApp Manageability SDK (Perl)
- At line 24 adjust your lib path:
use lib "/usr/lib/perl5/site_perl/5.8.8/NetApp";
- Warning/Critical defaults to 85/95%


check-volume [-n VOLUME_NAME] - lists volumes; use -n to specify a volume
-> size in percent (-w/-c)
check-lun [-n LUN_NAME] - lists LUNs; use -n to specify a LUN
-> size in percent (-w/-c)
-> misalignment results in a warning
-> a LUN that is offline while mapped results in critical
check-snapmirror [-n SNAPMIRROR_NAME] - lists snapmirrors; use -n to specify a snapmirror
-> lag_time in seconds (-w/-c)
-> transfer error results in critical
check-aggr [-n AGGREGATE_NAME] - lists aggregates; use -n to specify an aggregate
-> size in percent (-w/-c)
-> mount state: warning on creating, mounting, unmounting, quiescing; ok on online, consistent, quiesced; critical for the rest
-> mirror state: warning on 'CP count check in progress'; ok on mirrored, unmirrored; critical for the rest
-> raid state: warning on resyncing, copying, growing, reconstruct; ok on normal, mirrored; critical for the rest
-> inconsistency results in critical
check-cluster - checks the cluster state
-> warning/critical on any state other than connected
-> warning on inactive hwassist (if available)
-> interconnect state
check-shelf
-> critical on failed power supply
-> critical on failed voltage sensor
-> critical on failed temperature sensor
-> temperature (values provided by NetApp; needed because sensor locations differ)
-> shelf state: warning on informational, non_critical; ok on normal; critical for the rest
check-license - checks licenses
-> expiry date (excludes demo, auto_enabled and non-expiring licenses)
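
The -w/-c handling described above can be sketched in a few lines of shell. This only illustrates the threshold logic and is not the plugin's actual Perl code: the function names threshold_state and used_pct are hypothetical, and treating a value equal to a threshold as already triggering that state is an assumption.

```shell
#!/bin/sh
# Hypothetical helpers, not part of check_netapp_sdk.pl.

# Maps an integer value to a Nagios state name.
# Assumption: a value equal to a threshold already triggers that state.
threshold_state() {
    value=$1; warn=$2; crit=$3
    if [ "$value" -ge "$crit" ]; then
        echo "CRITICAL"
    elif [ "$value" -ge "$warn" ]; then
        echo "WARNING"
    else
        echo "OK"
    fi
}

# Integer percentage of used vs. total bytes (rounded down).
used_pct() {
    echo $(( $1 * 100 / $2 ))
}

# A LUN with 44911216 of 52428800 bytes used, default thresholds 85/95:
threshold_state "$(used_pct 44911216 52428800)" 85 95

# A snapmirror lag of 885023 seconds against -w 43200 -c 86400:
threshold_state 885023 43200 86400
```

With these numbers, which are taken from the sample output, the first call prints WARNING and the second prints CRITICAL.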

Sample Output:


$ check_netapp_sdk.pl -H snapmirror1 -U $USER1$ -P $USER2$ -S -C check-snapmirror -w 43200 -c 86400
CRITICAL - 1 failed snapmirror found: C->netapp1->snapmirror1: nfs_ds2_snapmirror: Lag-time: 10.2 days Error: - | netapp1_snapmirror1_nfs_ds2_snapmirror_xfer_size=36609B;; netapp1_snapmirror1_nfs_ds2_snapmirror_lag_time=885023s;43200;86400

$ check_netapp_sdk.pl -H netapp1 -U $USER1$ -P $USER2$ -S -C check-cluster
OK - Cluster is fine! Partner netapp2 is connected

$ check_netapp_sdk.pl -H netapp1 -U $USER1$ -P $USER2$ -S -C check-lun -n /vol/lun_1_vol/lun_1
WARNING - 1 suspicious luns found: W->lun_1: 85.66% | lun_1_size_used=44911216B;44564480;47185920;;52428800 lun_1_size_pct=85.66%;85;9

$ check_netapp_sdk.pl -H netapp1 -U $USER1$ -P $USER2$ -S -C check-volume -n lun_1_vol
WARNING - 1 suspicious volumes found: W->lun_1_vol: 88.56% | lun_1_vol_size_used=52653680B;50537015;53509781;;59455312 lun_1_vol_size_pct=88.56%;85;9

$ check_netapp_sdk.pl -H netapp1 -U $USER1$ -P $USER2$ -S -C check-aggr
OK - 0 suspicious aggregate found | aggr_unmirrored_size_used=3161408134.68B;3274284218;3466889172;;3852099080 aggr_unmirrored_size_pct=82.07%;85;90 aggr0_size_used=10676546580.44B;11114697879;11768503637;;13076115152 aggr0_size_pct=81.65%;85;90

$ check_netapp_sdk.pl -H netapp1 -U $USER1$ -P $USER2$ -S -C check-license
OK - 0/48 expired licenses found

$ check_netapp_sdk.pl -H netapp1 -U $USER1$ -P $USER2$ -S -C check-version
OK - System-Name: netapp1 System-ID: 123456789 Model: FAS3240 Serial: 123456789 Version: NetApp Release 8.1.2P4 7-Mode: Fri Apr 26 19:57:25 PDT 2013
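
Everything after the | in these outputs is performance data in the usual label=value;warn;crit;min;max form. As a rough sketch, a single value can be pulled out of such a line in shell. perfdata_value is a hypothetical helper, and it assumes labels contain no spaces or regex metacharacters, as in the samples above.

```shell
#!/bin/sh
# Hypothetical helper, not part of the plugin.
# $1: one line of plugin output, $2: perfdata label
# Steps: drop the status text before the pipe, put each
# label=value;... item on its own line, then print the value
# (including its unit) recorded for the requested label.
perfdata_value() {
    printf '%s\n' "$1" |
        sed 's/^[^|]*| *//' |
        tr ' ' '\n' |
        sed -n "s/^$2=\([^;]*\).*/\1/p"
}

line='CRITICAL - 1 failed snapmirror found: C->netapp1->snapmirror1: nfs_ds2_snapmirror: Lag-time: 10.2 days Error: - | netapp1_snapmirror1_nfs_ds2_snapmirror_xfer_size=36609B;; netapp1_snapmirror1_nfs_ds2_snapmirror_lag_time=885023s;43200;86400'

perfdata_value "$line" netapp1_snapmirror1_nfs_ds2_snapmirror_lag_time
```

For the snapmirror sample this prints 885023s, the lag time that exceeded the critical threshold of 86400 seconds.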
 
 
 

Wednesday, 20 August 2014

Addendum for Ubuntu 14.04 Trusty to Active Directory User Login for Ubuntu/Debian Servers

Yesterday I upgraded the first system from 12.04 to 14.04 and found that the AD login (as described here: http://oskibbe.blogspot.de/2014/07/linux-active-directory-user-login-fur.html) no longer worked.


The server was still joined to the domain, and the winbind tools wbinfo -u / wbinfo -g still returned the correct users and groups.

In the auth log, however, I found these entries:

Aug 18 16:07:38 host login[1738]: pam_listfile(login:auth): Refused user oliver.skibbe for service login
Aug 18 16:07:40 host login[1738]: pam_unix(login:auth): check pass; user unknown
Aug 18 16:07:40 host login[1738]: pam_unix(login:auth): authentication failure; logname=LOGIN uid=0 euid=0 tty=/dev/tty1 ruser= rhost=
Aug 18 16:07:40 host login[1738]: pam_winbind(login:auth): getting password (0x00000388)
Aug 18 16:07:40 host login[1738]: pam_winbind(login:auth): pam_get_item returned a password
Aug 18 16:07:42 host login[1738]: FAILED LOGIN (1) on '/dev/tty1' FOR 'UNKNOWN', Authentication failure

Based on these entries and the fact that the users could still be queried, I was able to narrow the problem down to nsswitch: as of Ubuntu 14.04 Trusty, two additional packages are required for this part, libnss-winbind and libpam-winbind:

ii  libnss-winbind:amd64                2:4.1.6+dfsg-1ubuntu2.14.04.3 amd64        Samba nameservice integration plugins
ii  libpam-winbind:amd64                2:4.1.6+dfsg-1ubuntu2.14.04.3       amd64        Windows domain authentication integration plugin


After installing these packages, the AD login worked again immediately. It is a pity that winbind does not declare an explicit dependency on them.
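
On an upgraded system this can be checked from the shell. The snippet below is only a sketch: nss_uses_winbind is a hypothetical helper, and it assumes that your setup lists winbind in the passwd and group lines of /etc/nsswitch.conf. The user name format for getent depends on your winbind settings.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given nsswitch.conf-style file
# lists "winbind" as a source for the given database (e.g. passwd).
nss_uses_winbind() {
    grep -Eq "^$2:.*[[:space:]]winbind([[:space:]]|\$)" "$1" 2>/dev/null
}

# Install the two packages named above (as root), then verify:
#   apt-get install libnss-winbind libpam-winbind
#   getent passwd oliver.skibbe   # should resolve through winbind again

for db in passwd group; do
    if nss_uses_winbind /etc/nsswitch.conf "$db"; then
        echo "$db: winbind is configured"
    else
        echo "$db: winbind is missing"
    fi
done
```

If wbinfo -u lists the users but getent passwd does not resolve them, the NSS side is the likely culprit, which matches the symptom described above.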

Friday, 1 August 2014

My GitHub for monitoring plugins

Instead of using a local Subversion repository, I moved my self-written monitoring plugins to GitHub.

It's accessible through this link: https://github.com/riskersen/Monitoring


Feel free to contribute :-)