A few days ago I completed an overhaul of my Linux server. The purpose of this rebuild was to freshen up the file systems and packages, and to fix a few issues caused by years of system bloat. The new version of OpenSuse Linux, combined with a clean installation on a new SSD, made for a very fast boot and a responsive system, considering the limitations of the motherboard.
The problems solved with the new installation were many. The system boots significantly faster and reboots cleanly. Sound issues were immediately solved, and several small issues with KDE were fixed. I am thrilled with the results of this work, and the old system continues to serve web pages and email very well indeed.
Today, I discovered that something was not quite right with the new server. I noticed that the volume of email had dropped. A quick look at the log files revealed that they were stale and not being updated. At first I worried that I had configured the drives incorrectly. The files and partitions looked fine, and another quick check verified that the file system was healthy and nowhere near full. I then checked the status of the syslog daemon and discovered, to my surprise, that it was not running. When I tried to start the daemon, it failed.
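The two quick checks described above (is the log still being written, and is the partition full) can be sketched in a few lines of Python. This is only an illustration, not what I ran at the time: the function names and the one-hour staleness threshold are my own assumptions, and the log path should be adjusted for your system.

```python
import os
import shutil
import time


def log_age_seconds(path, now=None):
    """Return how many seconds ago the file at `path` was last modified."""
    now = time.time() if now is None else now
    return now - os.path.getmtime(path)


def is_stale(path, max_age_seconds=3600, now=None):
    """True if the file has not been modified within `max_age_seconds`.

    The one-hour default is an arbitrary assumption; a busy mail server
    would normally write to its logs far more often than that.
    """
    return log_age_seconds(path, now) > max_age_seconds


def percent_used(path):
    """Return the percentage of space used on the file system holding `path`."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0
    return 100.0 * usage.used / usage.total


if __name__ == "__main__":
    log_path = "/var/log/messages"  # adjust for your distribution
    if os.path.exists(log_path):
        print(f"{log_path} stale: {is_stale(log_path)}")
        print(f"/var/log used: {percent_used('/var/log'):.1f}%")
```

A stale log on a partition with plenty of free space points away from the disk and toward the logging daemon itself, which is where I looked next.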
This is a monumental problem and a shock. Syslogd is a logging program that allows all the processes on the system or network to log messages. I have never seen syslogd fail in this manner. The only failures I have seen were due to huge log files on busy systems or full partitions. However, neither was possible on my server. The last message in /var/log/messages was a cryptic “rsyslogd exiting on signal 15”. In researching this message I found nothing useful.
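For what it is worth, signal 15 is SIGTERM, the ordinary "please shut down" request, so the message says that rsyslogd was asked to stop rather than that it crashed. It does not say who sent the signal. A small Python sketch for looking up a signal number follows; the helper name `signal_name` is my own.

```python
import signal


def signal_name(number):
    """Translate a signal number into its symbolic name on this platform."""
    try:
        return signal.Signals(number).name
    except ValueError:
        # The number is not a signal known to this platform.
        return f"unknown signal {number}"


if __name__ == "__main__":
    print(signal_name(15))
```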
I did some experiments by re-installing rsyslog, syslogd and syslog-ng and configuring each of them. None of the changes I made fixed the problem. Whenever I tried to start the daemon, I would get a "job failed" message, with no output in a log file and no useful debugging information of any sort. However, when I started the daemons directly by running the binary file, it worked!!! This is great: now I know that the binary files are correct, which essentially narrows this down to a configuration issue or a startup script problem. Further review in this arena again showed no issues.
Then, while searching Google for various phrases, I discovered a few posts which mentioned a problem involving syslog, systemd and cron. This evening I switched from systemd-sysvinit to sysvinit-init and rebooted the server. This fixed the problem. Amazing Suse!!!
I am thrilled the system is working again. I love the Suse Linux distribution and its configuration utility. This is a great version of Linux which has served me well for a long time. However, it is unacceptable to me that they would default their installation to a system which has a bug with the syslog daemon. This is arguably one of the most important daemons in the operating system, and the fact that it has a bug which causes it to fail after rotating log files is astounding.
OpenSuse, let’s try to do better next time.