Archive for the ‘Script Day’ Category

Script Day: Automatically backup your EC2 instance using snapshots

I install the following script as a cron job on the Amazon AWS virtual machines I deploy, so that they back themselves up automatically. The script uses the EC2 management utilities that are normally available on “Amazon Linux” installations (and can be easily installed on other Linux distributions) to create EBS snapshots of the currently mounted root EBS volume 1.

  1. I don’t expect this script to work for instances that have an instance-stored root device, but I don’t expect to encounter these any more.

Script Day: find the oldest file in a directory structure

This snippet came in handy when I wrote a utility that “recycles” space on a logging partition: before log rotation archives the current log file, we move some old log files (depending on some archive freshness policy) to a remote storage that archives older files.

The problem is that the “old archive storage” also has limited disk space and I got fed up managing the archive by hand. The solution I came up with is to scan the hierarchy of log files in the storage (logs are stored hierarchically according to origin and type) and delete old files until I have enough room to move some newer files in. That way the “old archive storage” is always kept full, keeps as much back-log as possible, and does this automatically.

The piece of code that determines which files we want to delete works like this:

  1. Use find to list all the files in the directory structure
  2. Pipe it to perl and collect all the file names in a list
  3. Use perl’s sort operator to compare the modification times of the files in the list and print them in ascending order (i.e. oldest first)
  4. Use head to get just the first file

So it looks like this:

find /mnt/httpd_back/ -type f | perl -nle 'next unless -f; push @files, $_; END { foreach $file (sort { @a=stat($a); @b=stat($b); $a[9] <=> $b[9] } @files) { print $file; }}' | head -n1

Note: normally we use head to get some initial output and terminate the pipeline early, before it does more costly work – when head has enough data it exits, the pipe closes, and the upstream process receives SIGPIPE on its next write, which usually terminates it. In this case – and in all other cases involving sort – the upstream process has to buffer all the data in memory before it can output anything, so that it can sort everything; head here is just a filter that picks the line I want and does not save me any of the work. I could just as easily have done the same thing inside the perl script itself by replacing print $file; with print $file; last; which has the same effect as using head, because in both cases only the first file name is printed. Deciding which way to go is mostly a matter of readability, and I prefer my original version because it’s easier to read for non-perl specialists.

I can then just remove that file, see if I have enough room to move in the newer log file, and if not, repeat the process.
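The delete-until-there-is-room loop described above could be sketched roughly like this. It is not the script from the post: it swaps the perl sort for GNU find's -printf plus sort -n, and the function names (oldest_file, free_kb, make_room) are made up for the example. It also assumes file names without newlines.

```shell
# Print the path of the oldest regular file under directory $1.
# Assumes GNU find (for -printf) and file names without newlines.
oldest_file() {
    find "$1" -type f -printf '%T@ %p\n' | sort -n | head -n1 | cut -d' ' -f2-
}

# Free space, in KiB, on the file system that holds directory $1.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Delete the oldest files under $1 until at least $2 KiB are free,
# or fail when there is nothing left to delete.
make_room() {
    dir=$1 needed_kb=$2
    while [ "$(free_kb "$dir")" -lt "$needed_kb" ]; do
        victim=$(oldest_file "$dir")
        [ -n "$victim" ] || return 1
        rm -f -- "$victim" || return 1
    done
}
```

Something like make_room /mnt/httpd_back 102400 would then try to free about 100 MiB before the next file is moved in.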

This would work well, I believe, but it may be inefficient if I find a bunch of small files and I want to copy in a large file. So what I did next is to take advantage of the fact that all the log files I have are named using the following simple format:


and that allows me to easily find all the log files that record the same day and eliminate them at the same time. Subsequent moving of additional files will likely succeed because I cleared out all the log files of an entire day. If not, I can always go and clear up another day’s worth of logs.

Script Day: automatically locate the next valid transaction in MySQL binlog

Sometimes MySQL replication breaks due to some corruption in the binary log files 1. When your binary log files are corrupted, the only option (other than trying to rebuild a database of hundreds of gigabytes) is to try to skip over the corrupted region and get the slave to pick up from where the transactions are valid.

Locating the correct position in the binary log from which the server can carry on is difficult, but the mysqlbinlog utility makes it easier: it can read a binary log file starting at a given offset, so you can use its --start-position option to try candidate positions in the file and see which of them lets you read from it 2.
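As a rough sketch of that probing idea (not the script from the post), the loop below walks forward one byte at a time from a known-bad offset and stops at the first offset mysqlbinlog accepts. It assumes mysqlbinlog exits with a non-zero status when it cannot parse an event at the given offset; the function names are invented for the example. Each probe reads the rest of the file, so this is slow on large logs, and the result is best treated as a candidate to inspect by hand.

```shell
# Probe one offset: succeeds if mysqlbinlog can read file $1 from offset $2.
# Assumption: mysqlbinlog returns non-zero when the offset is not a valid event start.
probe_position() {
    mysqlbinlog --start-position="$2" "$1" >/dev/null 2>&1
}

# Print the first offset in the range $2..$3 of binlog $1 that probes cleanly.
next_valid_position() {
    binlog=$1 pos=$2 limit=$3
    while [ "$pos" -le "$limit" ]; do
        if probe_position "$binlog" "$pos"; then
            echo "$pos"
            return 0
        fi
        pos=$((pos + 1))
    done
    return 1
}
```

Calling next_valid_position with the binlog file name, the offset where replication stopped, and an upper bound would print the first readable offset, if there is one in that range.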


  1. I have yet to find a good explanation for why it happens or how to prevent it.
  2. Transactions in the binary log can have any size, so they can start and end at any point.

Script day: output the tail of a log based on time

As system administrators we often want to list the last few lines from a log file in order to track problems and see system reports. The UNIX command tail is very useful for that purpose and lets you display an arbitrary number of lines from the bottom of any file.

But often this is not really what you want – an administrator might want to see what happened in the last X minutes, and the common practice is to run tail with a guessed number of lines, see if you get what you want, and if it’s not enough, increase the number and try again.

Here’s another approach that works well if the log file you want to trace has time stamps for its lines.
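The script itself is not part of this excerpt. One way such a time-based tail could look, assuming every line starts with a "YYYY-MM-DD HH:MM:SS" time stamp (so plain string comparison orders lines chronologically) and that GNU date is available; both function names are made up:

```shell
# Print the lines of log file $2 whose time stamp is at or after $1.
# Assumes each line starts with "YYYY-MM-DD HH:MM:SS".
lines_since() {
    awk -v cutoff="$1" 'substr($0, 1, 19) >= cutoff' "$2"
}

# Print the lines of log file $2 logged during the last $1 minutes.
# Uses GNU date's -d option to compute the cutoff.
last_minutes() {
    lines_since "$(date -d "$1 minutes ago" '+%Y-%m-%d %H:%M:%S')" "$2"
}
```

Logs with a different time stamp layout (syslog or Apache style, for example) would need the time stamp converted to a sortable form first.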

Script day: grep in jar (or zip) files

Here is another script I wrote for work that I thought would be interesting enough to share:

Say you want to check which JAR files (or ZIP files for that matter, as Java ARchive files are just ZIP files with a different extension) contain files that contain some text. grep is the obvious answer, but how do you grep files inside JARs?


Script day – Shutting down multiple servers at once

A system administrator in my company recently approached me with a problem – how to shut down multiple Linux servers at the same time from a central location. Apparently people in the MS-Windows world use all kinds of applications for this, like the Remote Shutdown Tool from Microsoft (though I don’t understand how they handle authentication – this tool doesn’t seem to require any, so it appears that any person with network access can shut down any computer).

Anyway, apparently searching the web for “Linux remote shutdown” yields no useful results (or so I’ve been told), but frankly – when you have standard UN*X tools at your fingertips, a remote shutdown tool is simply typing ssh root@server shutdown -r now at your local console (-r reboots the machine; use -h to halt it instead). But still, for people who want a “tool” – read on.
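The tool itself is not part of this excerpt, but the ssh one-liner extends naturally to many hosts. The sketch below runs it against every host in parallel and waits for all of them. It assumes key-based root login is already set up; SSH and REMOTE_CMD are hypothetical knobs added for the example, and the command defaults to shutdown -h now.

```shell
# SSH and REMOTE_CMD are made-up settings for this sketch; override as needed.
SSH=${SSH:-ssh}
REMOTE_CMD=${REMOTE_CMD:-shutdown -h now}

# Run the remote command on every host given as an argument, in parallel.
shutdown_all() {
    for host in "$@"; do
        # BatchMode makes ssh fail instead of prompting for a password.
        $SSH -o BatchMode=yes -o ConnectTimeout=5 "root@$host" "$REMOTE_CMD" &
    done
    wait
}
```

A call such as shutdown_all web1 web2 db1 (host names invented) would then take all three machines down at roughly the same time.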


Script day – randomly rotate GNOME desktop backgrounds

I kind of collect desktop wallpapers – I have a lot of those, several thousand 1. It is a bit ridiculous, as I mostly use maximized windows all the time, so if not for the fact that at work I live on the console and have a transparent terminal, I would rarely see my desktop wallpaper.

That being said, with a wallpaper collection you want software to manage it and cycle your desktop through the wallpapers. KDE has this function built in – just go to configure your wallpaper, select a directory of wallpapers, choose whether you want to cycle through the images sequentially or randomly, set the delay and you’re done.

Not so in GNOME – simplicity for simplicity’s sake.
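The post's script is not included in this excerpt. As a sketch only, on a current GNOME desktop (GNOME 3 or later, where gsettings and the org.gnome.desktop.background schema are available) random rotation can be approximated with two small functions run from cron or a sleep loop. The function names are invented, and GNU shuf is assumed.

```shell
# Pick one image at random from directory $1 (GNU shuf assumed).
random_wallpaper() {
    find "$1" -type f \( -name '*.jpg' -o -name '*.png' \) | shuf -n1
}

# Point the GNOME desktop background at image file $1 (absolute path).
# Assumes a GNOME 3+ session where gsettings is available.
set_wallpaper() {
    gsettings set org.gnome.desktop.background picture-uri "file://$1"
}
```

For example, set_wallpaper "$(random_wallpaper "$HOME/Wallpapers")" changes the background once; the directory name is just a placeholder.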

  1. mostly anime and video game fan-made as well as promotional walls, a lot of hobby photographs – mine and other people’s, and a few more professionally made pieces of art

Script day – read configuration files

This is not really a script – more of a snippet. I don’t have a lot of spare time these days, so I can justify posting a snippet and calling it “script day” 😉 .

A lot of unix configuration files use the # sign to mark comments, and a lot of software comes with very well documented files – i.e. files with lots of comments. So much so that if you just want a quick glimpse at the configuration that is active (not commented out), it’s very difficult to wade through all the documentation.

Here’s a simple grep that will filter out all the junk and leave you with just the active configuration settings:

egrep -v '^(#|\s*$)' <config file>

and on the standard output you’d get only lines that are not commented out or empty.

Do note that some configuration files can also use ; as a comment character, but modifying the grep to support this is trivial.
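One possible version of that modification, wrapped in a function (the name active_config is made up). Unlike the one-liner above, this variant also drops comment lines that are indented, and it uses the POSIX [[:space:]] class instead of \s, which is a GNU extension:

```shell
# Print only the active settings of config file $1:
# skip blank lines and lines whose first non-blank character is # or ;
active_config() {
    grep -Ev '^[[:space:]]*([#;]|$)' "$1"
}
```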

Script day – find Java jar files that contain a Java class

From time to time I need to work with a Java program or library that requires some import which I’m not familiar with. It’s often very easy to just copy the fully qualified class name and search for it on Google, which usually helps identify the product that contains this class.

But if you know that you have this class on your system somewhere, and you are just not sure which jar file you need to add to your project for it to compile – this script will come in handy:


Script day – simple log graphing tool

I wrote similar versions of this script over the years to analyze all kinds of logs, but here’s one for posterity:

This script is useful if you have a log for which you want to analyze load over time – transactions per second or whatnot (the version below does this for Apache httpd logs, but it can be easily modified to analyze anything). For Apache (and most other HTTP servers) there are many readily available log analysis software packages that do a much better job than what one can do in a simple script, but you might not have such software pre-configured, or it can’t filter what you need, or you just want to analyze something else – in which case this script will come in handy.

The script receives time stamped log events – each event on a line – and collects the temporal information for each line. Then it will dump a simple vertical graph (i.e. time is on the Y axis) of load over time in the resolution that you want. Its output looks something like this:

Oct 30 14:40:00 2007 |#############                                    | 3.8 x/sec
Oct 30 14:50:00 2007 |##########################################       | 6.3 x/sec
Oct 30 15:00:00 2007 |###########################################      | 6.5 x/sec
Oct 30 15:10:00 2007 |#############################################    | 6.6 x/sec
Oct 30 15:20:00 2007 |###############################                  | 5.4 x/sec
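The script is not reproduced in this excerpt. The bucketing and bar drawing behind output like the above can be sketched in awk as follows. To keep it short, the sketch assumes its input is one epoch time stamp per line (turning an Apache log line into an epoch value is left out), it prints the bucket start as an epoch number rather than a formatted date, and it scales the bars to the busiest bucket. The function name graph_load is made up.

```shell
# Read one epoch time stamp per line on stdin and print a load graph.
# $1 is the bucket size in seconds, $2 the bar width in characters.
graph_load() {
    awk -v bucket="${1:-600}" -v width="${2:-50}" '
        {
            b = int($1 / bucket) * bucket      # start of the bucket for this event
            count[b]++
            if (count[b] > max) max = count[b]
            if (NR == 1 || b < first) first = b
            if (NR == 1 || b > last) last = b
        }
        END {
            if (NR == 0) exit
            for (b = first; b <= last; b += bucket) {
                filled = int(count[b] * width / max)
                bar = ""
                for (i = 0; i < width; i++) bar = bar (i < filled ? "#" : " ")
                printf "%d |%s| %.1f x/sec\n", b, bar, count[b] / bucket
            }
        }'
}
```

Piping a list of time stamps through graph_load 600 50 gives ten-minute buckets and a 50-character bar, similar in shape to the sample above.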