0003170: Collectd Crash abnormally - MantisBT
MantisBT - Endian Firewall
View Issue Details
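ID: 0003170
Project: Endian Firewall
Category: GUI
View Status: public
Date Submitted: 2010-10-05 10:16
Last Update: 2011-03-18 10:34
Reporter: ardit-endian
Assigned To: peter-endian
Priority: normal
Severity: feature
Reproducibility: always
Status: confirmed
Resolution: open
Version: 2.3.1
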
0003170: Collectd Crash abnormally
------------------------------------------
root@endian:~ # /etc/init.d/collectd start
Starting collectd: [ OK ]
root@endian:~ # /etc/init.d/collectd status
collectd (pid 26008) is running...
root@endian:~ # /etc/init.d/collectd status
collectd (pid 26083) is running...
root@endian:~ # /etc/init.d/collectd status
collectd (pid 26083) is running...
root@endian:~ # /etc/init.d/collectd status
collectd dead but pid file exists

This issue keeps coming back: for unknown reasons collectd crashes after about 5 seconds.
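
The "dead but pid file exists" status above means the pid file still names a process that no longer exists. A minimal sketch of that check, assuming a conventional pid file; the function name `pid_file_state` and the example path are hypothetical and not taken from the Endian init script:

```python
import os
from pathlib import Path


def pid_file_state(pid_file: str) -> str:
    """Classify a daemon roughly the way an init-script `status` does."""
    path = Path(pid_file)
    if not path.exists():
        return "stopped"
    try:
        pid = int(path.read_text().strip())
    except ValueError:
        # The pid file is present but does not contain a number.
        return "dead but pid file exists"
    if pid <= 0:
        return "dead but pid file exists"
    try:
        # Signal 0 only checks that the process exists; nothing is delivered.
        os.kill(pid, 0)
    except ProcessLookupError:
        return "dead but pid file exists"
    except PermissionError:
        # The process exists but belongs to another user.
        return "running"
    return "running"


# Example call (the path is an assumption):
# print(pid_file_state("/var/run/collectd.pid"))
```

Polling this once per second after a start would show how long the daemon survives before the crash.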

Here is the output of collectd -f
------------------------------------------
Dispatching value to all write plugins failed with status -1.
uc_update: Value too old: name = 8c989399-24d4-4214-8566-f7c7a224c9cf/ntpd/time_dispersion-196.43.1.14; value time = 1286273031; last cache update = 1286273031;
rrdtool plugin: (rc->last_value = 1286273031) >= (value_time = 1286273031)
Filter subsystem: Built-in target `write': Dispatching value to all write plugins failed with status -1.
uc_update: Value too old: name = 8c989399-24d4-4214-8566-f7c7a224c9cf/ntpd/delay-196.43.1.14; value time = 1286273031; last cache update = 1286273031;
rrdtool plugin: (rc->last_value = 1286273031) >= (value_time = 1286273031)
Filter subsystem: Built-in target `write': Dispatching value to all write plugins failed with status -1.
uc_update: Value too old: name = 8c989399-24d4-4214-8566-f7c7a224c9cf/ntpd/time_offset-196.43.1.14; value time = 1286273036; last cache update = 1286273036;
rrdtool plugin: (rc->last_value = 1286273036) >= (value_time = 1286273036)
Filter subsystem: Built-in target `write': Dispatching value to all write plugins failed with status -1.
uc_update: Value too old: name = 8c989399-24d4-4214-8566-f7c7a224c9cf/ntpd/time_dispersion-196.43.1.14; value time = 1286273036; last cache update = 1286273036;
rrdtool plugin: (rc->last_value = 1286273036) >= (value_time = 1286273036)
Filter subsystem: Built-in target `write': Dispatching value to all write plugins failed with status -1.
uc_update: Value too old: name = 8c989399-24d4-4214-8566-f7c7a224c9cf/ntpd/delay-196.43.1.14; value time = 1286273036; last cache update = 1286273036;
rrdtool plugin: (rc->last_value = 1286273036) >= (value_time = 1286273036)
Filter subsystem: Built-in target `write': Dispatching value to all write plugins failed with status -1.
Bus error.
-------------------------------
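
To see how stale the rejected values actually are, the `uc_update` messages in the output above can be parsed. This is only a sketch: the regular expression is written against the message format shown in this report, and the names `Rejected` and `parse_rejections` are made up for illustration.

```python
import re
from typing import List, NamedTuple

# Matches the "Value too old" message exactly as it appears in this report.
UC_UPDATE = re.compile(
    r"uc_update: Value too old: name = (?P<name>[^;]+); "
    r"value time = (?P<value>\d+); "
    r"last cache update = (?P<last>\d+);"
)


class Rejected(NamedTuple):
    name: str
    value_time: int
    last_update: int

    @property
    def lag(self) -> int:
        """Seconds by which the new value is older than the cached one."""
        return self.last_update - self.value_time


def parse_rejections(log: str) -> List[Rejected]:
    """Extract every rejected value from raw `collectd -f` output."""
    return [
        Rejected(m["name"], int(m["value"]), int(m["last"]))
        for m in UC_UPDATE.finditer(log)
    ]
```

For the output above, every rejection has a lag of 0: the value time equals the last cache update. That points to the same timestamp being submitted twice rather than to a clock that jumped backwards, although the log alone cannot confirm the cause.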

The program crashes after the "Bus error".
To work around this I removed the

/var/lib/collectd/rrd/8c989399-24d4-4214-8566-f7c7a224c9cf dir; after that the program no longer crashed (it still displays the errors).

The bug is reproducible on some machines only.
Tags: purple
Attached Files: collectd-logs (91,155) 2011-02-15 18:31
https://bugs.endian.com/file_download.php?file_id=611&type=bug
Issue History
Date Modified     Username        Field                 Change
2010-10-05 10:16  ardit-endian    New Issue
2010-10-05 10:18  ardit-endian    Note Added: 0004913
2010-10-05 10:20  luca-endian     Tag Attached: purple
2010-10-05 10:29  peter-endian    Note Added: 0004915
2010-10-05 10:29  peter-endian    Status                new => acknowledged
2010-10-05 14:45  ardit-endian    Note Added: 0004916
2011-02-15 16:22  ardit-endian    Note Added: 0005704
2011-02-15 16:22  ardit-endian    Customer Occurencies  => 2-3
2011-02-15 16:38  ra-endian       Status                acknowledged => new
2011-02-15 16:38  ra-endian       Assigned To           => lorenzo-endian
2011-02-15 18:31  lorenzo-endian  Note Added: 0005706
2011-02-15 18:31  lorenzo-endian  Status                new => feedback
2011-02-15 18:31  lorenzo-endian  File Added: collectd-logs
2011-02-16 08:39  ardit-endian    Note Added: 0005708
2011-02-16 08:51  ardit-endian    Note Added: 0005710
2011-03-08 17:18  lorenzo-endian  Note Added: 0005894
2011-03-09 09:10  ardit-endian    Note Added: 0005898
2011-03-09 09:11  ardit-endian    Note Edited: 0005898
2011-03-09 09:12  lorenzo-endian  Note Added: 0005899
2011-03-09 09:22  lorenzo-endian  Note Edited: 0005899
2011-03-18 10:34  lorenzo-endian  Assigned To           lorenzo-endian => peter-endian
2011-03-18 10:34  lorenzo-endian  Status                feedback => confirmed

Notes
(0004913)
ardit-endian   
2010-10-05 10:18   
Found this on collectd wiki
http://collectd.org/wiki/index.php/Target:Write
maybe it helps..
(0004915)
peter-endian   
2010-10-05 10:29   
hmm, this happens when the new value has an older timestamp than the last entries in the rrd databases.

this will happen for sure if the clock changes abruptly (manually or with ntpdate).
ntpd should not cause this, since it shifts the clock slowly by making it run faster or slower.

could this have happened?

of course it should not crash if this happens.
isn't monit restarting it automatically?

there's also a shell script:
/usr/local/bin/rrdfix.sh

which removes all rrd files if the last entry is newer than the current time.
it runs every 5 minutes; maybe it is not working correctly?
or perhaps that is the source of the problem, since it removes the rrd database while collectd writes to it (?)

can you please check that?
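
The check described above (last entry newer than the current time) can be sketched as follows. This is not the contents of rrdfix.sh, which is not shown in this report. The helper names are hypothetical; the only external call is the standard `rrdtool last <file>` command, which prints the timestamp of the most recent update.

```python
import subprocess
import time
from typing import Dict, List, Optional


def last_update(rrd_path: str) -> int:
    """Return the last-update UNIX timestamp reported by `rrdtool last`."""
    out = subprocess.run(
        ["rrdtool", "last", rrd_path],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return int(out.strip())


def future_dated(last_updates: Dict[str, int],
                 now: Optional[int] = None) -> List[str]:
    """List the files whose last update lies after `now`.

    `last_updates` maps a file path to its last-update timestamp,
    for example as collected with last_update() above.
    """
    now = int(time.time()) if now is None else now
    return sorted(path for path, ts in last_updates.items() if ts > now)
```

Only listing the affected files, rather than deleting them, makes it possible to compare the result with what the script actually removes while collectd is running.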
(0004916)
ardit-endian   
2010-10-05 14:45   
I'm sorry, I'm unable to run these tests :( because I no longer have access to that machine and I don't know in detail how collectd works.
(0005704)
ardit-endian   
2011-02-15 16:22   
Hi,

I had the same problem on another system; this one is 2.3, up to date, and has 4 CPUs.
Again collectd was unable to run. Before trying my fix above I ran rrdfix; it executed correctly and gave me back the console. I ran it several times to be sure.

After that I tried again to start collectd, but it exited (again with a bus error).
I had to remove all dirs under /var/lib/collectd/rrd to make it run "properly",
and when I say "properly" this is the output:

collectd -C /etc/collectd.conf -f
Found a configuration for the `iptables' plugin, but the plugin isn't loaded or didn't register a configuration callback.
Found a configuration for the `iptables' plugin, but the plugin isn't loaded or didn't register a configuration callback.
Found a configuration for the `iptables' plugin, but the plugin isn't loaded or didn't register a configuration callback.
Initialization complete, entering read-loop.
unixsock plugin: Removing stale socket file /var/run/collectd
read-function of plugin `openvpn' failed. Will suspend it for 10 seconds.
read-function of plugin `openvpn' failed. Will suspend it for 20 seconds.
read-function of plugin `openvpn' failed. Will suspend it for 40 seconds.
read-function of plugin `openvpn' failed. Will suspend it for 80 seconds.

The bus error was:

------------------------------------
root@EFW-VOB:/etc # collectd -C collectd.conf -f
Found a configuration for the `iptables' plugin, but the plugin isn't loaded or didn't register a configuration callback.
Found a configuration for the `iptables' plugin, but the plugin isn't loaded or didn't register a configuration callback.
Found a configuration for the `iptables' plugin, but the plugin isn't loaded or didn't register a configuration callback.
Initialization complete, entering read-loop.
unixsock plugin: Removing stale socket file /var/run/collectd
read-function of plugin `openvpn' failed. Will suspend it for 10 seconds.
Bus error
--------------------------------------

Now the graphs are generated, but 2 of the CPUs (this machine has 4 CPUs) stay at 100% (one of them always, the other less often). top shows ntop consuming the processor; after restarting ntop the CPU usage is OK again.
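
The suspension times reported for the failing `openvpn` read function (10, 20, 40, 80 seconds) double after each failure. A small sketch of that doubling pattern; the generator name and the upper bound `cap` are assumptions for illustration and are not taken from the collectd sources:

```python
from itertools import islice
from typing import Iterator


def suspend_intervals(initial: int = 10, cap: int = 86400) -> Iterator[int]:
    """Yield suspension times that double after every failed read.

    `cap` is a hypothetical upper bound; the report only shows the
    first four intervals.
    """
    interval = initial
    while True:
        yield min(interval, cap)
        interval = min(interval * 2, cap)


# First four intervals, matching the output in this note:
print(list(islice(suspend_intervals(), 4)))  # [10, 20, 40, 80]
```

So the repeated openvpn messages look like a plugin that keeps failing and is polled less and less often, which is separate from the bus error.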
(0005706)
lorenzo-endian   
2011-02-15 18:31   
Hello everybody,

Attached are the logs of a fully up-to-date 2.4 system. Running rrdfix.sh does not change anything.

Hope this helps....

BTW, on my installation the collectd process does not crash.

Lo
(0005708)
ardit-endian   
2011-02-16 08:39   
Hello,

it does not crash on my installation either :)

For some reason the way the RRDs are generated in /var/lib/collectd/rrd/ makes the daemon crash.
But anyway, this message is quite persistent:

"Found a configuration for the `iptables' plugin, but the plugin isn't loaded or didn't register a configuration callback.
Found a configuration for the `iptables' plugin, but the plugin isn't loaded or didn't register a configuration callback.
Found a configuration for the `iptables' plugin, but the plugin isn't loaded or didn't register a configuration callback.
filecount plugin: No directories have been configured.
Initialization of plugin `filecount' failed with status -1. Plugin will be unloaded.
tail plugin: File list is empty. Returning an error.
Initialization of plugin `tail' failed with status -1. Plugin will be unloaded.
Initialization complete, entering read-loop.
read-function of plugin `openvpn' failed. Will suspend it for 10 seconds.
unixsock plugin: Socket file /var/run/collectd is in use by another process.
read-function of plugin `openvpn' failed."

run it with collectd -C /etc/collectd.conf -f

Why the errors?
(0005710)
ardit-endian   
2011-02-16 08:51   
ah, as Peter said: "hmm, this happens when the new value has an older timestamp than the last entries in the rrd databases."

-------------------------------------------------------------
 failed: illegal attempt to update using time 1297845950 when last update time is 1297845989 (minimum one second step)
rrdtool plugin: rrd_update_r (/var/lib/collectd/rrd/d1087ad6-f595-4a70-8a12-a4ddadced983/disk-hda4/disk_time.rrd) failed: illegal attempt to update using time 1297845945 when last update time is 1297846044 (minimum one second step)
rrdtool plugin: rrd_update_r (/var/lib/collectd/rrd/d1087ad6-f595-4a70-8a12-a4ddadced983/disk-hda5/disk_octets.rrd) failed: illegal attempt to update using time 1297845945 when last update time is 1297845994 (minimum one second step)
rrdtool plugin: rrd_update_r (/var/lib/collectd/rrd/d1087ad6-f595-4a70-8a12-a4ddadced983/disk-hda5/disk_ops.rrd) failed: illegal attempt to update using time 1297845945 when last update time is 1297845999 (minimum one second step)
rrdtool plugin: rrd_update_r (/var/lib/collectd/rrd/d1087ad6-f595-4a70-8a12-a4ddadced983/disk-hda5/disk_merged.rrd) failed: illegal attempt to update using time 1297845945 when last update time is 1297845964 (minimum one second step)
rrdtool plugin: rrd_update_r (/var/lib/collectd/rrd/d1087ad6-f595-4a70-8a12-a4ddadced983/disk-hda6/disk_time.rrd) failed: illegal attempt to update using time 1297845945 when last update time is 1297845959 (minimum one second step)
rrdtool plugin: rrd_update_r (/var/lib/collectd/rrd/d1087ad6-f595-4a70-8a12-a4ddadced983/disk-hda1/disk_octets.rrd) failed: illegal attempt to update using time 1297845950 when last update time is 1297846029 (minimum one second step)
rrdtool plugin: rrd_update_r (/var/lib/collectd/rrd/d1087ad6-f595-4a70-8a12-a4ddadced983/disk-hda3/disk_octets.rrd) failed: illegal attempt to update using time 1297845950 when last update time is 1297845954 (minimum one second step)
rrdtool plugin: rrd_update_r (/var/lib/collectd/rrd/d1087ad6-f595-4a70-8a12-a4ddadced983/disk-hda4/disk_ops.rrd) failed: illegal attempt to update using time 1297845950 when last update time is 1297846034 (minimum one second step)
---------------------------------------------------------------
[this was on my testing machine]

Just let the daemon run in the foreground (-f) for a while on your system and you will be able to confirm this.

Thanks.
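
To quantify how far behind the rejected updates in the log above are, the `rrd_update_r` errors can be parsed. A minimal sketch, assuming the message format shown in this note; the function name `seconds_behind` is made up for illustration:

```python
import re
from typing import List, Tuple

# Matches the rrdtool plugin error exactly as it appears in this note.
RRD_UPDATE = re.compile(
    r"rrd_update_r \((?P<path>[^)]+)\) failed: illegal attempt to update "
    r"using time (?P<new>\d+) when last update time is (?P<last>\d+)"
)


def seconds_behind(log: str) -> List[Tuple[str, int]]:
    """Return (rrd file, seconds the new value lags the last update)."""
    return [
        (m["path"], int(m["last"]) - int(m["new"]))
        for m in RRD_UPDATE.finditer(log)
    ]
```

For the complete entries in the log above, the lag ranges from 4 to 99 seconds. The files already contain updates well ahead of the values collectd is trying to write.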
(0005894)
lorenzo-endian   
2011-03-08 17:18   
hey ardit,

on one hand, I can confirm the plugin problems (today I saw the same on my test system); on the other hand, my collectd never crashes :/

Now it is running, I will keep you updated.

Thanks a lot

Lo
(0005898)
ardit-endian   
2011-03-09 09:10   
(edited on: 2011-03-09 09:11)
Hi,

yes, the crash happened only on some systems [why should this happen?], but there are errors with the plugins and also with the RRD timestamps.

Do you confirm this?

(0005899)
lorenzo-endian   
2011-03-09 09:12   
(edited on: 2011-03-09 09:22)
Yes, those errors are confirmed!