Let me start by saying that in my opinion restarting any service automatically is a bad idea.

To partly answer your question: it is possible to build a primitive monitoring system on top of systemd. I have never used monit, but the number of available Nagios plugins (http://www.nagios.org/projects/nagiosplugins) is hard to beat. The example below therefore runs a Nagios plugin from a systemd service and sends an email whenever the check performed by the plugin fails.

The example below checks httpd:

  1. install the nagios plugin (and httpd to be monitored)

    su -c "yum -y install httpd"
    su -c "yum -y install nagios-plugins-http"
    
  2. configure systemd mailer:

    su -c "yum -y install sendmail-cf"
    

    Configure sendmail with e.g. SMART_HOST, see http://fedoraproject.org/wiki/Administration_Guide_Draft/Mail#Smart_Host

    All steps from here on are run as root.

    The mail script that systemd will call (idea taken from https://mailman.archlinux.org/pipermail/arch-general/2014-February/035037.html) reads:

    cat <<'EOF' > /root/mail
    #!/usr/bin/sh
    (echo "Subject: Failed Service: ${1} on $(hostname)"; /usr/bin/systemctl status "${1}.service") | /usr/sbin/sendmail email@domain.com
    EOF
    chmod go-rwx /root/mail
    

    /usr/bin/mail did not work for me. The above script is executed by the following systemd service (for usage of the systemd specifiers (@, %i, etc), see http://www.freedesktop.org/software/systemd/man/systemd.unit.html):

    cat <<'EOF' > /usr/lib/systemd/system/mail@.service
    [Unit]
    Description=Mailer
    [Service]
    Type=oneshot
    ExecStart=/usr/bin/sh /root/mail %i
    [Install]
    WantedBy=multi-user.target
    EOF
    
  3. configure the systemd timer - this periodically starts the systemd service that performs the actual Nagios check. The timer uses the systemd OnCalendar option (http://www.freedesktop.org/software/systemd/man/systemd.time.html) and is based on the ideas from https://wiki.archlinux.org/index.php/Systemd/cron_functionality#Starting_events_according_to_the_calendar and http://solpeth.wordpress.com/2013/12/27/using-systemd-as-a-cron-replacement/. With the expression below it fires at second 00 of every minute (haven't figured out how to set more complex time patterns yet - see https://bugzilla.redhat.com/show_bug.cgi?id=1074951):

    cat <<'EOF' > /usr/lib/systemd/system/min@.timer
    [Unit]
    Description=Run %i every minute
    [Timer]
    OnCalendar=*:*:00
    Unit=%i.service
    [Install]
    WantedBy=timers.target
    EOF
    
  4. configure the actual systemd service that performs the Nagios check_http check and triggers mail@check_http.service through OnFailure when the check fails:

    cat <<'EOF' > /usr/lib/systemd/system/check_http.service
    [Unit]
    Description=check_http
    OnFailure=mail@check_http.service
    [Service]
    Type=simple
    ExecStart=/usr/lib64/nagios/plugins/check_http -H localhost -p 80
    EOF
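
Before wiring the mailer into systemd, it can help to look at the message it would send. The following is only a sketch: `build_message` is a hypothetical helper that is not part of the script above, `uname -n` stands in for `hostname`, and the explicit empty line between the header and the body is my addition (mail headers are normally separated from the body by an empty line).

```shell
#!/bin/sh
# Sketch: build the notification text without sending it.
# build_message is a hypothetical helper, not part of the original /root/mail.
build_message() {
    service="$1"   # instance name, e.g. check_http
    body="$2"      # mail body (in real use: output of systemctl status)
    # Subject header, an empty separator line, then the body.
    printf 'Subject: Failed Service: %s on %s\n\n%s\n' \
        "$service" "$(uname -n)" "$body"
}

# Dry run: print the message instead of piping it to /usr/sbin/sendmail.
build_message check_http "example status output"
```

Once the text looks right, the same output could be piped to `/usr/sbin/sendmail email@domain.com` as in the script above.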
    

Now, how to use those? The main controller is the timer, so:

su -c "systemctl start min@check_http.timer"
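
The name `min@check_http.timer` works because of the template units defined above: systemd takes the text between `@` and the unit type suffix as the instance name and substitutes it for `%i`. The sketch below only imitates that substitution in plain shell to make it visible; `instance_of` is a hypothetical helper, and it ignores the escaping that systemd applies to instance names.

```shell
#!/bin/sh
# Sketch: imitate how the instance name (%i) is derived from a unit name.
# instance_of is a hypothetical helper, not a systemd tool.
instance_of() {
    unit="$1"
    rest="${unit#*@}"            # drop everything up to and including the first "@"
    printf '%s\n' "${rest%.*}"   # drop the trailing type suffix (.timer, .service)
}

instance_of min@check_http.timer      # instance used in Unit=%i.service
instance_of mail@check_http.service   # argument passed to /root/mail
```

So `min@check_http.timer` starts `check_http.service`, and `mail@check_http.service` runs `/root/mail check_http`.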

You should start receiving emails now, because httpd is not running yet and the check fails. Let's start Apache:

su -c "systemctl start httpd.service"

Are the error emails still coming? This is due to the way the Nagios check_http plugin behaves - the web server has to serve some content, so create a page:

su -c "touch /var/www/html/index.html"
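
Some background on why the mails keep coming, as far as I understand it: Nagios plugins report their result through the exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN), and systemd treats any non-zero exit of the service as a failure, which fires OnFailure. A freshly installed httpd without content most likely answers with a 4xx status, which check_http reports as a WARNING. The sketch below just tabulates this; `nagios_state` and `would_mail` are hypothetical helpers.

```shell
#!/bin/sh
# Sketch: map Nagios plugin exit codes to state names and show which of
# them would make check_http.service fail (and therefore send a mail).
# nagios_state and would_mail are hypothetical helpers.
nagios_state() {
    case "$1" in
        0) echo OK ;;
        1) echo WARNING ;;
        2) echo CRITICAL ;;
        *) echo UNKNOWN ;;
    esac
}

# Any non-zero exit code counts as a failure of the service.
would_mail() {
    [ "$1" -ne 0 ]
}

for code in 0 1 2 3; do
    if would_mail "$code"; then action="mail sent"; else action="no mail"; fi
    printf '%s %s: %s\n' "$code" "$(nagios_state "$code")" "$action"
done
```

If warnings should not trigger a mail, the `SuccessExitStatus=` option of the `[Service]` section might be worth a look; I have not tried it in this setup.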

To stop the check:

su -c "systemctl stop min@check_http.timer"

or, to have the check started automatically after every reboot:

su -c "systemctl enable min@check_http.timer"

You may notice that there are some other systemd services in the email (I have not figured out where they come from), and we are far from a usable monitoring system - an overview (GUI?) instead of sending possibly thousands of emails is needed.
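
One possible way to reduce the flood of emails, short of a real overview, is to remember when the last mail for a service went out and skip repeats inside a time window. This goes beyond the setup above and is only a sketch: `should_notify`, the stamp-file layout and the one-hour window are all assumptions, and `date +%s` is assumed to be available.

```shell
#!/bin/sh
# Sketch: send at most one mail per service within a given interval.
# should_notify SERVICE MIN_INTERVAL_SECONDS STATE_DIR
# Returns 0 and records the current time if no mail was recorded for
# SERVICE within the interval; returns 1 otherwise.
# All names here are hypothetical.
should_notify() {
    stamp="$3/$1.last"
    now=$(date +%s)
    last=0
    if [ -f "$stamp" ]; then
        last=$(cat "$stamp")
    fi
    if [ $((now - last)) -ge "$2" ]; then
        printf '%s\n' "$now" > "$stamp"
        return 0
    fi
    return 1
}

# Demo with a throwaway state directory.
state_dir=$(mktemp -d)
should_notify check_http 3600 "$state_dir" && echo "first failure: send mail"
should_notify check_http 3600 "$state_dir" || echo "repeat within an hour: skip"
rm -rf "$state_dir"
```

In the mail script, such a test could guard the sendmail pipeline, with the state directory pointing at a location that survives between runs.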
