Nagios check_ntp quits working in 2009 with Offset unknown
Posted by Admin • Friday, January 2. 2009 • Category: Code and Hacks
I've been happily using Nagios to monitor all my servers for quite some time, yet two days ago I suddenly started getting "Offset unknown" from my check_ntp check. Same from check_ntp_time. At first it was intermittent and the service was flapping (going in and out of the Unknown state). I messed around with ntp.conf, changed servers, and restarted ntpd, and then it stopped working across the board...
asterisk asterisk # /usr/nagios/libexec/check_ntp_time -H srv
NTP CRITICAL: Offset unknown|
The problem?
Apparently a leap second is inserted from time to time, and that's what was done on Dec 31 2008! Just one second, but enough to expose a bug in nagios-plugins-1.4.11.
Running with -v got me this:
# /usr/nagios/libexec/check_ntp_time -H server -v
sending request to peer 0
response from peer 0: offset 0.01255832968
sending request to peer 0
response from peer 0: offset 0.01255403762
sending request to peer 0
response from peer 0: offset 0.01255219412
sending request to peer 0
response from peer 0: offset 0.01255332731
discarding peer id 0: flags=1
overall average offset: 0
NTP CRITICAL: Offset unknown|
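The "flags=1" in the discard line most likely refers to the NTP leap indicator, the top two bits of the first byte of an NTP packet, where 1 announces a pending leap second. That reading is an assumption; the plugin output alone does not confirm it. As a rough sketch in Python (the function name decode_ntp_flags and the sample byte 0x64 are made up for illustration), the first header byte splits up like this:

```python
# Meaning of the 2-bit leap indicator (LI) field, per the NTP packet header layout.
LEAP_MEANING = {
    0: "no warning",
    1: "leap second pending: last minute of the day has 61 seconds",
    2: "leap second pending: last minute of the day has 59 seconds",
    3: "unknown (clock unsynchronized)",
}


def decode_ntp_flags(first_byte: int) -> dict:
    """Split the first byte of an NTP packet into LI, version and mode.

    Hypothetical helper for illustration; it is not part of nagios-plugins.
    """
    return {
        "leap": (first_byte >> 6) & 0x3,     # top 2 bits
        "version": (first_byte >> 3) & 0x7,  # next 3 bits
        "mode": first_byte & 0x7,            # low 3 bits
    }


# 0x64 = 0b01_100_100: LI=1, version 4, mode 4 (server reply)
fields = decode_ntp_flags(0x64)
print(fields, "->", LEAP_MEANING[fields["leap"]])
```

Under that assumption, a server announcing the Dec 31 leap second would answer with LI=1, and a check that accepts only LI=0 would throw away an otherwise healthy peer, which matches the valid offsets followed by "Offset unknown" seen here.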
The solution?
The bug was already fixed in 1.4.13, so just upgrade. In my case (Gentoo), that version is still ~x86 (unstable), but I'm upgrading anyway.
I installed 1.4.13, and now I get this:
# /usr/nagios/libexec/check_ntp -H asterisk
NTP OK: Offset -0.004825386102 secs|offset=-0.004825s;60.000000;120.000000;
Unfortunately I have nagios-plugins version 1.4.15 and I'm still encountering this problem:
watch01:~ # /usr/lib/nagios/plugins/check_ntp_time -H lb01 -w 1 -c 2 -v --version
check_ntp_time v1.4.15 (nagios-plugins 1.4.15)
watch01:~ # /usr/lib/nagios/plugins/check_ntp_time -H lb01 -w 1 -c 2 -v
sending request to peer 0
response from peer 0: offset 0.07509887218
sending request to peer 0
response from peer 0: offset 0.07508444786
sending request to peer 0
response from peer 0: offset 0.07499825954
sending request to peer 0
response from peer 0: offset 0.07510817051
discarding peer 0: stratum=0
overall average offset: 0
NTP CRITICAL: Offset unknown|
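In this second case the peer is dropped because of "stratum=0" rather than the leap flags, so it may be a different problem from the one fixed in 1.4.13. In an NTP packet the stratum is the second header byte, and RFC 5905 treats 0 as "unspecified or invalid". Why lb01 reports stratum 0 cannot be told from this output. A small Python sketch of the stratum ranges (the function name describe_stratum is hypothetical):

```python
def describe_stratum(stratum: int) -> str:
    """Classify the stratum byte of an NTP packet using the RFC 5905 ranges.

    Hypothetical helper for illustration; it is not part of nagios-plugins.
    """
    if not 0 <= stratum <= 255:
        raise ValueError("stratum is a single byte (0-255)")
    if stratum == 0:
        return "unspecified or invalid"
    if stratum == 1:
        return "primary server (reference clock attached)"
    if stratum <= 15:
        return "secondary server (synchronized via NTP)"
    if stratum == 16:
        return "unsynchronized"
    return "reserved"


for s in (0, 1, 3, 16):
    print(s, "->", describe_stratum(s))
```

A first thing to check would be whether ntpd on the queried host is itself synchronized to an upstream server.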