Dead simple ssh login monitoring with Monit and Pushover

Following on from my earlier post on how to set up Dead simple CentOS server monitoring with Monit and Pushover, I recently added monitoring for ssh logins. I wanted to be able to see who is logging into my servers and be notified if anyone unauthorised gained access. If you already have a Monit and Pushover system set up, this just requires adding an extra monit .conf file.

Create the ssh logins monit .conf file with the following.

# nano /etc/monit.d/ssh_logins.conf

check file ssh_logins with path /var/log/secure
  # Ignore logins from whitelisted IP addresses
  ignore match "/var/www/ignore_ips.txt"
  if match "Accepted publickey" then exec "/usr/local/bin/pushover.sh"
  if match "Accepted password" then exec "/usr/local/bin/pushover.sh"

If you want to ignore logins from certain IP addresses (i.e. your own), create a text file with the list of IP addresses to be ignored (one per line).

# nano /var/www/ignore_ips.txt

123.123.134.123
122.121.121.121
...

Check that all the .conf files are correct.

# monit -t

If everything is fine then restart monitoring by reloading the new .conf files.

# monit reload

Now anytime someone logs in to the server you will be sent a notification. The only downside is that the notification takes around a minute to arrive, since it is only pushed once monit next checks the secure logfile. It is possible to get instant notification by using pam_exec, but that is another post.
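
Until that post appears, here is a minimal, untested sketch of the pam_exec approach (the script name ssh_login_notify.sh is just an example, and it reuses the same pushover.sh alert script). pam_exec passes the login details to the script via the PAM_USER, PAM_RHOST and PAM_TYPE environment variables. Add a line like this to the end of /etc/pam.d/sshd

session optional pam_exec.so /usr/local/bin/ssh_login_notify.sh

and create the (executable) notify script.

# nano /usr/local/bin/ssh_login_notify.sh

#!/bin/bash
# Only fire when a session is opened, not when it is closed
if [ "$PAM_TYPE" = "open_session" ]; then
    MONIT_HOST=$(hostname) \
    MONIT_SERVICE="ssh login" \
    MONIT_DESCRIPTION="$PAM_USER logged in from $PAM_RHOST" \
    /usr/local/bin/pushover.sh
fi
exit 0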

Dead simple CentOS server monitoring with Monit and Pushover

My company Nucleics has an array of servers distributed around the world to support our PeakTrace Basecaller. For historical reasons these servers are a mix of CentOS 6/7 VPS and physical servers supplied by three different companies. While the Auto PeakTrace RP application is designed to be robust in the face of server downtime, I wanted a dead simple monitoring service that would fix 99% of server problems automatically and only contact me if there was something really wrong. After looking around at all the paid services I settled on using a combination of Monit and Pushover.

Monit is an open source watchdog utility that can monitor other Linux services and automatically restart them if they crash or stop working. The great thing about monit is that you can set it up to fix things on its own. For example, if the server can be fixed by simply restarting apache then I want the monitoring service to just do this and only send me a message if something major has happened. I also wanted a service that would ping my phone, but where I could easily control it (i.e. turn it on/off, set away times, etc.).

Pushover looked ideal for doing this. For a one-off cost of $5 you can use the Pushover API to send up to 7,500 messages a month to any phone. It has lots of other nice features like quiet times and group notifications. It comes with a 7 day free trial so you have time to make sure everything is going to work with your system before paying.

The only issue with integrating monit and Pushover is that by default monit sends alert notices by email. Most of our servers don’t have the ability to send email (they are slimmed down and only run the services needed to support PeakTrace). Luckily, monit can also execute scripts, so I settled on the alternative approach of calling the Pushover API via an alert script that passes through exactly which server and service is having problems. This alert script is set to only be called if monit cannot fix the problem by restarting the service. After a bit of experimentation I got the whole system running rather nicely.

Here is the step-by-step guide. I did all this logged in as root, but if you don’t like to live on the edge just put sudo in front of every command.

Setting up Pushover

After registering an account at Pushover, and downloading the appropriate app for your phone (iOS or Android), you need to set up a new Pushover application on the Pushover website.

Click on Register an Application/Create an API Token. This will open the Create New Application/Plugin page.

  • Give the application a name (I called it Monit), but you can call it anything you like.
  • Choose “script” as the type.
  • Add a description (I called it Monit Server Monitoring).
  • Leave the url field blank.
  • If you want you can add an icon, but you don’t need to do this. It is nice, though, to have an icon when you get a message.
  • Press the Create Application button.

You need to record the new application API Token/Key as well as your Pushover User Key (you can find this on the main Pushover page when you are logged in). You will need both these keys for monit to be able to ping Pushover via the alert script.
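
If you want to check the keys before going any further, you can send yourself a test message straight from the command line (assuming curl is already installed; substitute your own API Token and User Key). This is the same API call the alert script will make later.

# curl -s --form-string "token=API Token" --form-string "user=User Key" --form-string "message=Pushover test" https://api.pushover.net/1/messages.json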

Install Monit

Install the EPEL package repository.

# yum install -y epel-release

Install monit and curl.

# yum install -y monit curl

Set monit to start on boot and start monit.

# chkconfig monit on && service monit start
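
On the CentOS 7 boxes the chkconfig and service commands are redirected to systemd, so the line above still works, but the native equivalents are:

# systemctl enable monit
# systemctl start monit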

You can edit the monit.conf file in /etc but the default values are fine. Take a look at the monit man page for more details about what you might want to change.

Create the Pushover Alert Script

You need to create the script that monit will call when it raises an alert.

# nano /usr/local/bin/pushover.sh

Paste the following text substituting your own API Token and User Keys before saving.

#!/bin/bash
/usr/bin/curl -s --form-string "token=API Token" \
  --form-string "user=User Key" \
  --form-string "message=[$MONIT_HOST] $MONIT_SERVICE - $MONIT_DESCRIPTION" \
  https://api.pushover.net/1/messages.json

Make the script executable.

# chmod 700 /usr/local/bin/pushover.sh

Test that the script works. If there are no issues the script will return without error and you will get a short message in the Pushover phone app almost immediately.

# /usr/local/bin/pushover.sh
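
When run by hand like this the $MONIT_* variables are empty, so the message body will be mostly blank. To make the test message look like a real alert you can set the variables yourself (these are the environment variables monit exports when it calls an exec script).

# MONIT_HOST=$(hostname) MONIT_SERVICE=test MONIT_DESCRIPTION="manual test" /usr/local/bin/pushover.sh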

Configure Monit

Once you have the pushover.sh alert script set up you need to create the service-specific monit .conf files. You can mix and match these to suit the services you are running on your server. The aim is to have monit restart the service if there are any issues and, only if this does not solve the problem, call the pushover.sh alert script. This way most servers will fix themselves and you only get contacted if something catastrophic has happened.

system

# nano /etc/monit.d/system.conf

check system $HOST
if loadavg (5min) > 4 then exec "/usr/local/bin/pushover.sh"
if loadavg (15min) > 2 then exec "/usr/local/bin/pushover.sh"
if memory usage > 80% for 4 cycles then exec "/usr/local/bin/pushover.sh"
if swap usage > 20% for 4 cycles then exec "/usr/local/bin/pushover.sh"
if cpu usage (user) > 90% for 4 cycles then exec "/usr/local/bin/pushover.sh"
if cpu usage (system) > 80% for 4 cycles then exec "/usr/local/bin/pushover.sh"
if cpu usage (wait) > 80% for 4 cycles then exec "/usr/local/bin/pushover.sh"
if cpu usage > 200% for 4 cycles then exec "/usr/local/bin/pushover.sh"

apache

# nano /etc/monit.d/apache.conf

check process httpd with pidfile /var/run/httpd/httpd.pid
start program = "/etc/init.d/httpd start" with timeout 60 seconds
stop program = "/etc/init.d/httpd stop"
if children > 250 then restart
if loadavg(5min) greater than 10 for 8 cycles then exec "/usr/local/bin/pushover.sh"
if failed port 80 for 2 cycles then restart
if 3 restarts within 5 cycles then exec "/usr/local/bin/pushover.sh"

sshd

# nano /etc/monit.d/sshd.conf

check process sshd with pidfile /var/run/sshd.pid
start program "/etc/init.d/sshd start"
stop program "/etc/init.d/sshd stop"
if failed port 22 protocol ssh then restart
if 5 restarts within 5 cycles then exec "/usr/local/bin/pushover.sh"

fail2ban

# nano /etc/monit.d/fail2ban.conf

check process fail2ban with pidfile /var/run/fail2ban/fail2ban.pid
start program "/etc/init.d/fail2ban start"
stop program "/etc/init.d/fail2ban stop"
if 5 restarts within 5 cycles then exec "/usr/local/bin/pushover.sh"

syslog

# nano /etc/monit.d/syslog.conf

check process rsyslog with pidfile /var/run/syslogd.pid
start program "/etc/init.d/rsyslog start"
stop program "/etc/init.d/rsyslog stop"
if 5 restarts within 5 cycles then exec "/usr/local/bin/pushover.sh"

crond

# nano /etc/monit.d/crond.conf

check process crond with pidfile /var/run/crond.pid
start program "/etc/init.d/crond start"
stop program "/etc/init.d/crond stop"
if 5 restarts within 5 cycles then exec "/usr/local/bin/pushover.sh"

mysql

# nano /etc/monit.d/mysql.conf

check process mysqld with pidfile /var/run/mysqld/mysqld.pid
start program = "/etc/init.d/mysqld start"
stop program = "/etc/init.d/mysqld stop"
if failed host 127.0.0.1 port 3306 then restart
if 5 restarts within 5 cycles then exec "/usr/local/bin/pushover.sh"

Check that all the .conf files are correct.

# monit -t

If everything is fine then start monitoring by loading the new .conf files.

# monit reload

Check the status of monit by using

# monit status

This should give you something like this depending on which services you are monitoring.

The Monit daemon 5.14 uptime: 3d 20h 17m

System 'rps.peaktraces.com'
 status Running
 monitoring status Monitored
 load average [0.00] [0.12] [0.11]
 cpu 0.2%us 0.1%sy 0.0%wa
 memory usage 106.6 MB [10.7%]
 swap usage 0 B [0.0%]
 data collected Tue, 19 Jul 2016 04:16:06

Process 'rsyslog'
 status Running
 monitoring status Monitored
 pid 1016
 parent pid 1
 uid 0
 effective uid 0
 gid 0
 uptime 4d 23h 33m
 children 0
 memory 3.4 MB
 memory total 3.4 MB
 memory percent 0.3%
 memory percent total 0.3%
 cpu percent 0.0%
 cpu percent total 0.0%
 data collected Tue, 19 Jul 2016 04:16:06

Process 'sshd'
 status Running
 monitoring status Monitored
 pid 1176
 parent pid 1
 uid 0
 effective uid 0
 gid 0
 uptime 4d 23h 33m
 children 4
 memory 1.2 MB
 memory total 20.7 MB
 memory percent 0.1%
 memory percent total 2.0%
 cpu percent 0.0%
 cpu percent total 0.0%
 port response time 0.006s to [localhost]:22 type TCP/IP protocol SSH
 data collected Tue, 19 Jul 2016 04:16:06

Process 'fail2ban'
 status Running
 monitoring status Monitored
 pid 1304
 parent pid 1
 uid 0
 effective uid 0
 gid 0
 uptime 4d 23h 33m
 children 0
 memory 30.2 MB
 memory total 30.2 MB
 memory percent 3.0%
 memory percent total 3.0%
 cpu percent 0.1%
 cpu percent total 0.1%
 data collected Tue, 19 Jul 2016 04:16:06

Process 'crond'
 status Running
 monitoring status Monitored
 pid 1291
 parent pid 1
 uid 0
 effective uid 0
 gid 0
 uptime 4d 23h 33m
 children 0
 memory 1.2 MB
 memory total 1.2 MB
 memory percent 0.1%
 memory percent total 0.1%
 cpu percent 0.0%
 cpu percent total 0.0%
 data collected Tue, 19 Jul 2016 04:16:06

Process 'httpd'
 status Running
 monitoring status Monitored
 pid 20963
 parent pid 1
 uid 0
 effective uid 0
 gid 0
 uptime 4h 5m
 children 2
 memory 7.7 MB
 memory total 19.0 MB
 memory percent 0.7%
 memory percent total 1.9%
 cpu percent 0.0%
 cpu percent total 0.0%
 data collected Tue, 19 Jul 2016 04:16:06

Suggestions

You may want to adjust the system.conf values if your server is under sustained high load, so as to scale back the Pushover triggers. Since you will know exactly which rule is triggering, this is quite easy to do.
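
For example, on a box that routinely runs hot you might loosen the load and CPU rules to something like this (the numbers are only illustrative; pick thresholds that sit just above your normal working load).

if loadavg (5min) > 8 then exec "/usr/local/bin/pushover.sh"
if loadavg (15min) > 6 then exec "/usr/local/bin/pushover.sh"
if cpu usage (user) > 95% for 10 cycles then exec "/usr/local/bin/pushover.sh"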

To create a monit .conf file for a new service you just need to make sure that you use the correct .pid file path for the service and that the start and stop command paths are correct. These can be a little non-obvious (look at syslog.conf for example). If you do make a mistake, monit -t and monit status will show you what is wrong.
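
As a starting point, a generic .conf for a new service looks like this (nginx is only used as an example here; swap in the correct pidfile and init script paths for whatever you are monitoring).

# nano /etc/monit.d/nginx.conf

check process nginx with pidfile /var/run/nginx.pid
start program = "/etc/init.d/nginx start"
stop program = "/etc/init.d/nginx stop"
if failed port 80 for 2 cycles then restart
if 5 restarts within 5 cycles then exec "/usr/local/bin/pushover.sh"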

Once you have all this in place then sit back, relax and let the servers take care of themselves (well we can all dream).

Edit July 2017. I have been using this system for over a year now and it has been working great. I have had no problem that monit has not fixed by itself just by restarting the service. About the only issue I have had is load spikes on the server caused by a runaway service that was not being monitored.

I have recently used the same approach to monitor for unauthorised logins, which I wrote up in Dead simple ssh login monitoring with Monit and Pushover.

CentOS 6: Adjusting the remote desktop resolution without a monitor being connected

If you remote desktop into a CentOS 6 system without a monitor connected (i.e. a headless server), the screen resolution defaults to a tiny 800×600. Unfortunately the upstream provider (Redhat) has declined to provide system-config-display for EL6 (see this bug report), making it impossible to set the screen resolution in CentOS 6 to something more useable, as you would if you had a monitor attached.

The workaround is to use xrandr in a script to change the screen size. Here are the steps involved.

1. Open a terminal window and type xrandr. This will list your various screen devices.

xrandr

You should get something like this

Screen 0: minimum 320 x 200, current 800 x 600, maximum 8192 x 8192
VGA-0 disconnected  (normal left inverted right x axis y axis)
HDMI-0 disconnected (normal left inverted right x axis y axis)
DP-0 disconnected (normal left inverted right x axis y axis)

2. Type cvt with the screen dimensions you want (e.g. 1920 x 1200).

cvt 1920 1200

This should output something like this. Copy everything after “Modeline”.

# 1920x1200 59.88 Hz (CVT 2.30MA) hsync: 74.56 kHz; pclk: 193.25 MHz
Modeline "1920x1200_60.00"  193.25  1920 2056 2256 2592  1200 1203 1209 1245 -hsync +vsync

3. Create a new shell script to call xrandr. This script will use xrandr to create a new mode, add the new mode to the screen device, and finally set the screen output to use this mode.

nano setscreen.sh

Paste the copied cvt output (everything after “Modeline”) after the xrandr --newmode line. Adjust the xrandr --addmode line to use the name of your device (e.g. VGA-0) and the name of the new mode (e.g. 1920x1200_60.00). Modify the xrandr --output line to use this new mode. Your script should look something like this.

#!/bin/bash
xrandr --newmode "1920x1200_60.00" 193.25 1920 2056 2256 2592 1200 1203 1209 1245 -hsync +vsync
xrandr --addmode VGA-0 1920x1200_60.00
xrandr --output VGA-0 --mode 1920x1200_60.00

Save the script (control-x).

4. Change the script’s owner and group to root and make it executable.

sudo chown root setscreen.sh
sudo chgrp root setscreen.sh
sudo chmod 755 setscreen.sh

5. Run the script to see if it works correctly.

sudo ./setscreen.sh

This should change the remote screen resolution to your desired size (e.g. 1920×1200). It can take a second or two to adjust.
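
You can confirm the change took effect by asking xrandr again; the first line of its output should now report your new size as the current resolution.

xrandr | head -1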

6. To avoid having to run this script manually every time you log into the server, copy it into the /etc/X11/xinit/xinitrc.d/ directory.

sudo cp setscreen.sh /etc/X11/xinit/xinitrc.d/

Now that was so easy Redhat.

Simple POSIX Semaphore Library for Windows

I had a recent need to control access of one of my programs to a hardware device – basically the hardware can only handle a certain number of simultaneous connections before it starts returning garbage. POSIX semaphores are a really elegant way of handling this as you just create a semaphore and then put a sem_wait call before every critical code section and a sem_post call at the end of the critical code. This way you can control the number of simultaneous calls of the critical code no matter how many parallel processes or threads are running.

The only problem with POSIX semaphores is they are not natively supported on Windows (Windows has an entirely different semaphore system). Since my program is cross-platform I wanted a simple Windows-compatible POSIX semaphore library I could just drop in on my Windows builds and avoid a whole series of #ifdef #else conditionals. I managed to find a full Windows POSIX thread library (libpthread), but it was way more complex than what I needed and it does not build on Windows XP (I unfortunately still need to support Windows XP/2003). I was able to use that library as a starting point for making a very simple POSIX semaphore library. My cut-down library provides a simple drop-in replacement for all the POSIX semaphore functions on Windows and just requires the inclusion of the header file in any project you might want to use it in. It is not identical to semaphores on Linux, but it does the job.

You can download the simple Windows semaphore library from my GitHub repository. I hope it is of help to someone else.

Tracking down global variables in C code

I recently had the fun job of making some complex legacy C code (200,000+ lines) thread safe. Of course there were globals spread through the code so finding them all turned into a massive easter egg hunt. The initial problem was how to track them all down. If you ever have this fun job here are the steps I used.

1. I searched for the word “global”. If the person who wrote the code took care then all the global variables will have been commented as such (ha ha). Perhaps surprisingly, this did manage to find around 50% of the globals – while not perfect, it was a good start. I also found a few more unlabelled globals by searching for “thread-safe” and “thread”, since some of the non-thread-safe code was actually commented as being not thread-safe (I wish it all had been).

2. I next opened the .map file (generated by Visual Studio in debug mode) and looked for any “common” class variables (these are found down towards the bottom of the .map file). This process managed to shake out another 40% of the globals.

3. The last 5% of globals I found by doing a manual search through the entire code base for the “static” keyword. I of course then had to trawl through several thousand static function declarations, but the last of the globals were in here too.

The most interesting thing about this painful exercise is how often people use globals when they are not really needed. 90% of the globals were being used only two or three times and were able to be made thread safe with very little code change. Of course the last 10% were used in more than 100 locations spread throughout the entire code base so the whole thread safe conversion exercise was not easy.

To keep the required code change to a minimum (I am very reluctant to make major structural changes to complex code even if I have written it) I solved the problem by packaging up the difficult globals into a thread-safe struct declared in the main entry function and passing a pointer to this struct into each of the functions that were using the globals. While not a trivial job, this approach avoided disturbing code I knew was working well and which would have taken months to re-write.

(104)Connection reset by peer: mod_fcgid: error reading data from FastCGI server

This error, (104)Connection reset by peer: mod_fcgid: error reading data from FastCGI server, started to appear in my apache error log whenever I tried to upload a file. Apache seemed to be working fine with static html pages; it was only when I tried to upload files that I had problems.

Searching for a solution suggested a whole series of possible causes, all revolving around permission problems. Since I had not changed anything on the server for a few weeks, and everything looked fine permission-wise, this did not look to be the cause. After checking everything I could think of, I noticed that the disk quota allocated by Virtualmin was only 1 GB and that I had used nearly all of it. I upped the disk quota to 10 GB and the problem was fixed. I hope this helps someone.
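
If you suspect the same thing, a quick way to check is to look at the standard Linux disk quotas that Virtualmin manages (replace username with the domain’s Unix user).

sudo repquota -a
sudo quota -s username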

setrlimit not raising a signal with SIGXCPU

I ran into an unusual Linux bug of late using RLIMIT_CPU to kill zombie processes that ran for longer than 15 seconds. The basic code I was using was:

struct rlimit rl;
memset(&rl, 0, sizeof(rl));
/* Set a CPU limit of 15 seconds. */
rl.rlim_cur = 15;
setrlimit(RLIMIT_CPU, &rl);
/* CPU Time exceeded */
signal(SIGXCPU, catchSignal);

This worked fine when I originally wrote it a few years ago, but I noticed of late that I was spawning a large number of zombie processes that were not being trapped by SIGXCPU. After tearing my hair out all day trying to work out why, I noticed that this was only occurring on my CentOS 6.5 development box (Linux version 2.6.32). It turns out that there was a bug in the implementation of setrlimit before 2.6.17 that led a RLIMIT_CPU limit of 0 to be wrongly treated as “no limit” (like RLIM_INFINITY). Since 2.6.17 it is treated as a limit of 1 second. Since I was setting rlim_max to 0 via the memset, I was now effectively setting rlim_max to less than rlim_cur.

The solution is really simple – just ensure that you set rlim_max (hard limit) as well as rlim_cur (soft limit).

#include <sys/resource.h>
#include <signal.h>

struct rlimit rl;
/* Set a CPU soft limit of 15 seconds and a hard limit of 20 seconds */
rl.rlim_cur = 15;
rl.rlim_max = 20;
setrlimit(RLIMIT_CPU, &rl);
/* CPU time exceeded */
signal(SIGXCPU, catchSignal);

pear.php.net is using a unsupported protocol – This should never happen.

I ran into the error “pear.php.net is using a unsupported protocol – This should never happen.” when I tried to upgrade my pecl packages. I have to say, errors like “this should never happen” shouldn’t happen, but since it obviously did, what is the cause and, more importantly, the solution? Apparently this error is caused because PHP 5.2.9 and 5.2.10 were broken and corrupted the .record folder. The solution is to upgrade pear first and then pecl.

pear upgrade
pecl upgrade

So simple :)

Letters to avoid in creating passwords for non-english keyboard layouts

I had an interesting issue arise where I sent a computer to a customer located in Germany. I had created a password for them, but they had problems logging in with it. It turns out that the password I had created contained the letters Z and Y. These two keys are swapped on German keyboards so when they plugged in their own German keyboard they ended up entering the wrong password as the computer was still set to the English keyboard layout.

I have looked into the most common keyboard layouts used in Europe (QWERTY, QWERTZ & AZERTY) and you can avoid key swapping problems like mine if you avoid using the letters Z, Y, A, Q, M & L in any password or user name.

Update. Here is a simple one-liner to generate a random alphanumeric password of 8 characters that avoids these letters. You can of course change the password length by changing the -c switch on the head call. This should work on any *nix based system (including MacOS X) that has openssl installed. One warning: the password will be visible at generation time to other local users on shared systems if they are watching for process list changes. This doesn’t matter too much to me as I generate the passwords on my Mac laptop, but it is something to keep in mind if you are on a shared system.

 openssl rand -base64 25 | tr -dc 'BCDEFGHIJKNOPRSTUVWXbcdefghijknoprstuvwx0123456789' | head -c8; echo ""

Install R on CentOS 5 x64 using yum

I recently had the fun of installing R on my development box. While you can install from source I wanted to be able to install using yum. R is not in the standard packages, but it is in the epel repository (Extra Packages for Enterprise Linux 5 – x86_64).

Steps for Installing R (a condensed version follows the list)
1. Make sure that you have epel in your yum repositories (use yum repolist to check). If not, add epel to your yum repos (see here for instructions on how to do this).
2. Install R and its dependencies using yum install R-core R-2*
3. Enter R (just type R) and update all the default packages using update.packages(). You will need to choose the nearest mirror to you.
4. Install the packages you need using install.packages("package_name", dependencies = TRUE)
5. Quit R using q()
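
For convenience, here is the same sequence condensed into a copy-and-paste sketch. package_name is a placeholder, and the repos argument is only there to skip the interactive mirror prompt (any CRAN mirror will do).

yum repolist | grep -i epel
yum install -y R-core 'R-2*'
R --no-save <<'EOF'
update.packages(ask = FALSE, repos = "http://cloud.r-project.org")
install.packages("package_name", dependencies = TRUE, repos = "http://cloud.r-project.org")
EOF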

Hope this saves someone a little time.

Update.
If you are behind a proxy server then use the following (changed to your own proxy setting) after starting R
 Sys.setenv(http_proxy="http://url_to_proxy.com:8080")
You can check it is set correctly with
 Sys.getenv("http_proxy")
then update with
 update.packages()