...A place for sharing IT monitoring knowledge

Wednesday 13 March 2013

OP5's check_esx3 times out when querying vSphere servers


I've been successfully using OP5's fantastic check_esx3 plugin (now known as check_vmware_api), which is compatible with Nagios Core, for years on different systems. On one of my latest installs I hit a strange problem: the plugin froze for minutes, and sometimes timed out, when requesting information from any server (vCenter or ESX).
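To put a number on a freeze like this, it can help to run the plugin by hand under a hard time limit. The following is only a minimal sketch, not something from the original troubleshooting: the helper name `run_with_limit` is made up for illustration, and it assumes GNU coreutils `timeout` is available (it exits with status 124 when the limit is reached).

```shell
#!/bin/sh
# Hypothetical helper: run a command with a hard time limit and report
# its exit status and how many seconds it took.
# Assumes GNU coreutils `timeout`, which exits with 124 when the limit is hit.
run_with_limit() {
    limit="$1"; shift
    start=$(date +%s)
    timeout "$limit" "$@"
    status=$?
    end=$(date +%s)
    echo "exit=$status elapsed=$((end - start))s"
    return "$status"
}

# Example (host, user, password and plugin path are placeholders):
#   run_with_limit 60 ./check_esx3.pl -H esx01.example.com -u monitor -p secret -l cpu
```

An exit status of 124 means the command was still running when the limit expired, which is what a hang of this kind would look like.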

While debugging the code I saw that the hang occurred when the vSphere API performed an HTTP POST to the server's SOAP web service to query the properties of an object. Since the vSphere API had already opened an SSL connection to the server successfully before that operation, I initially ruled out a problem in the Perl LWP libraries, which manage the SSL data flow from the monitoring system to the vSphere servers.

The monitoring system was using vSphere SDK for Perl 5.1 and the Perl LWP 6.05 module, both the latest available releases at the time of writing this post.

After many tests I solved the problem by downgrading Perl LWP from 6.05 to release 5.837. I'm not sure why it happened, and I haven't tested other Perl LWP 6.x releases to find a more recent working version. What matters is that it now works like a charm.
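For anyone who wants to check which LWP release their own monitoring host has, here is a minimal sketch. The function names are invented for this example, and the downgrade command in the comments assumes cpanminus (`cpanm`) is installed; it is not necessarily how the downgrade described above was done. Only 6.05 (hanging) and 5.837 (working) were actually tested, so the "6.x" label simply marks the untested series.

```shell
#!/bin/sh
# Print the installed LWP version, or nothing if LWP cannot be loaded.
installed_lwp_version() {
    perl -MLWP -e 'print "$LWP::VERSION\n"' 2>/dev/null
}

# Hypothetical helper: classify a version string by release series.
# 6.x is the series where the hang was seen (only 6.05 was tested);
# 5.x is the series that contains the working 5.837 release.
classify_lwp_version() {
    case "$1" in
        6.*) echo "6.x" ;;
        5.*) echo "5.x" ;;
        "")  echo "not-installed" ;;
        *)   echo "unknown" ;;
    esac
}

# Usage:
#   classify_lwp_version "$(installed_lwp_version)"
#
# Possible downgrade, assuming cpanminus is available:
#   cpanm LWP@5.837
```

Keeping the check separate from the install step makes it easy to confirm the version before and after any change.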


