Caius Theory

Now with even more cowbell…

Compiling SmartOS for AMD processors

There's a few community-provided patches for SmartOS that enable KVM on AMD processors amongst other things, and given the HP Microserver has an AMD processor, that's quite useful for turning it into a better lab server. The main list of so called "eait" builds was hiccuping when I tried to download the latest, and all I could find was a 20140812T062241Z image here.

The source code for the eait builds is maintained at, and you can see the patches applied on top of the normal SmartOS master by going to

So here's how to use SmartOS to compile a more up to date AMD-friendly Smartos!

  1. Grab the latest multiarch SmartOS image (which has to be used, or the compile will fail.) The latest at the time of writing was 4aec529c-55f9-11e3-868e-a37707fcbe86, so that's what I'll use.

     imgadm import 4aec529c-55f9-11e3-868e-a37707fcbe86
  2. Spin up a zone for us to build in (the Building SmartOS on SmartOS page has extra info about this):

     echo '{
       "alias": "platform-builder",
       "brand": "joyent",
       "dataset_uuid": "4aec529c-55f9-11e3-868e-a37707fcbe86",
       "max_physical_memory": 32768,
       "quota": 0,
       "tmpfs": 8192,
       "fs_allowed": "ufs,pcfs,tmpfs",
       "maintain_resolvers": true,
       "resolvers": [
       "nics": [
           "nic_tag": "admin",
           "ip": "dhcp",
           "primary": true
       "internal_metadata": {
         "root_pw": "password",
         "admin_pw": "password"
     }' | vmadm create
  3. Login to the created zone:

     zlogin <uuid from `vmadm create` output>
  4. Update the image to the latest packages, etc:

     pkgin -y update && pkgin -y full-upgrade
  5. Install a few images we'll need to compile & package SmartOS:

     pkgin install scmgit cdrtools pbzip2
  6. Grab the source code of the fork containing the patches we want, from arekinath/smartos-live

     git clone
     cd smartos-live
  7. Optional: Edit src/Makefile.defs and change PARALLEL = -j$(MAX_JOBS) to PARALLEL = -j8 to do less at once. (Microserver only has a dual core CPU!)

  8. Copy the configure definition into the right place and start configuration:

     cp {sample.,}configure.smartos

    (You'll probably get asked to accept the java license during configuration, so keep half an eye on it)

  9. Once configure has completed (which doesn't take too long, 15 minutes or so), start building:

     gmake world && gmake live
  10. Once the build is successfully finished, time to package an iso & usb image:

    export LC_ALL=C

Hey presto, you've a freshly built AMD-friendly SmartOS build to flash to a USB key / put on your netboot server and boot your Microserver from!


Add to iCloud Reading List programmatically

One piece of a larger puzzle I'm trying to solve currently, was how to add a given URL to my Apple "Reading List" that is stored in iCloud and synced across all my OS X and iOS devices. More specifically, I wanted to add URLs to the list from my mac running Mavericks (10.9). I had a quick look at the Cocoa APIs and couldn't see anything in OS X to do this. (iOS has an API to do it from Cocoa-land it seems though.)

I figured was the key to getting this done on OS X, given it has the ability itself to add the current page to the reading list, either via a keyboard command, a menu item, or a button in the address bar. One quick mental leap later, and I was wondering if the engineers at Apple had been nice enough to expose that via Applescript for me to take advantage of.

One quick stop in "Script" later, and I had the Applescript dictionary open for Lo and behold, there is rather handily an Applescript command called "add reading list item", which does exactly what I want. It has a few different options you can call it with, depending on whether you want Safari to go populate the title & preview text, or if you want to specify it yourself at save-time.

As I want to be able to call this from multiple runtimes, I've chosen to save it as an executable, which leans on osascript to run the actual Applescript. And here it is:

#!/usr/bin/env osascript

on run argv
    if (count of argv) > 0
        tell app "Safari" to add reading list item (item 1 of argv as text)
    end if
end run

Save it as whatever you want (eg. add_to_reading_list), make it executable (chmod +x add_to_reading_list), and then run it with the URL you want saving as the first argument.

$ add_to_reading_list ""
$ add_to_reading_list ""
# … etc …

(Adding support for specifying preview text and title is left as an exercise for the reader!)

Have fun reading later!

Why I love DATA

In a ruby script, there's a keyword __END__ that for a long time I thought just marked anything after it as a comment. So I used to use it to store snippets and notes about the script that weren't really needed inline. Then one day I stumbled across the DATA constant, and wondered what flaming magic it was.

DATA is in fact an IO object, that you can read from (or anything else you'd do with an IO object). It contains all the content after __END__ in that ruby script file*. (It only exists when the file contains __END__, and for the first file ruby invokes though. See footnote for more details.)

How can we use this, and why indeed do I love this fickle constant? I mostly use it for quick scripts where I need to process text data, rather than piping to STDIN.

Given a list of URLs that I want to open in my web browser and look at, I could do the following for instance: do |url|
  `open "#{url}"`


which upon running (on a mac) would open all the URLs listed in DATA in my default web browser. (For bonus points, use Launchy for cross-platform compatibility.) Really handy & quick/simple when you've got 500+ URLs to open at once to go through. (I once had a job that required me to do this daily. Fun.)

Or given a bunch of CSV data that you just want one column for, you could reach for cut or awk in the terminal, but ruby has a really good CSV library which I trust and know how to use already. Why not just use that & DATA to pull out the field you want?

require "csv"

CSV.parse(DATA, headers: true).each do |row|
  puts row["kName"]

1,Google UK,
2,"Yahoo, UK",
# >> Google UK
# >> Yahoo, UK

I find when the data I want to munge is already in my clipboard, and I can run ruby scripts directly from text editors without having to save a file, it saves having to write the data out to a file, have ruby read it back in, etc just to do something with the data. I can just write the script reading from DATA, paste the data in and run it. Which also lets me run it iteratively and build up a slightly more complex script that I don't want to keep. Then do what I need with the output and close the file without saving it.

* technically DATA is an IO handler to read __FILE__, which has been wound forward to the start of the first line after __END__ in the file. And it only exists for the first ruby file to be invoked by the interpreter.

cat > tmp/data.rb <<RUBY

ruby tmp/data.rb
# => "data.rb\n"

cat > tmp/data-require.rb <<RUBY
require "./tmp/data"

ruby tmp/data-require.rb
# => /Users/caius/tmp/data.rb:1:in `<top (required)>': uninitialized constant DATA (NameError)

And because it's a file handle pointing at the current file, you can rewind it and read the entire ruby script into itself…

$ ruby tmp/readself.rb 

something goes here

Geolocation in nginx

Sometimes you need to have a rough idea of where your website visitor is located. There's many ways to geolocate them, but if you just want to go to country level then MaxMind have free geo databases available to help you. When we needed to do this quickly on-the-fly at EmberAds, we came up with the trifle gem, which supports ipv4 and ipv6 lookups.

Recently I was searching for something else to do with nginx and ran across a mailing list thread about using the maxmind database with nginx's HTTP Geo module and do the lookup directly in nginx itself. Finally got a chance to sit down and work out the logistics of doing this. I've done this on an ubuntu 12.04 box, with the expected config file layouts that come with ubuntu.

Run the following on your server (as someone with write access to nginx config files):

# Generate the text file for nginx to import
perl <(curl -s \
< <(zip=$(tempfile) && \
curl -so $zip \
&& unzip -p $zip) > /etc/nginx/nginx_ip_country.txt

# Tell nginx to work out the IP country and store in variable
echo 'geo $IP_COUNTRY {
  default --;
  include /etc/nginx/nginx_ip_country.txt;
}' > /etc/nginx/conf.d/ip_country.conf

Now go find the http block for the vhost you want to have the header passed to, and assuming it's passenger, add the following:

# http {
  # server_name;
  passenger_set_cgi_param HTTP_X_IP_COUNTRY $IP_COUNTRY;
# }

(If you don't use passenger, look at the docs for proxy_pass_header or fastcgi_pass_header to see which you'll require for your setup.)

Reload nginx, and behold, request.env["HTTP_X_IP_COUNTRY"] (assuming a rack app running under ruby) will be a two letter country code, or "--".

Unfortunately this is IPv4 only currently, there's a thread on the nginx mailing list from November 2012 saying IPv6 support should be coming on the v1.3 branch of nginx, but with no known ETA. So currently for IPv6 support, take a look at EmberAds' trifle gem instead.

Install capybara-webkit gem on Ubuntu

Dear future Caius searching for this issue,

The apt package you need to install to use the capybara-webkit rubygem on ubuntu (tested on 10.04 and 11.10) is libqt4-dev. That is, to gem install capybara-webkit, you need to run aptitude install libqt4-dev.

Yours helpfully,
Past Caius

Use Readline With Default Ruby on OS X

OS X Lion comes with ruby 1.8.7-p249 installed, although it's compiled against libedit rather than libreadline. Whilst libedit is a mostly-compatible replacement for libreadline, I find there's a couple of settings I'm used to that don't work in libedit. (Like history-beginning-search-backward.)

Luckily you can grab the source of ruby and compile just the readline extension, and move it into the right place for it to just work. Here's what's been working for me:

# Install readline using homebrew
brew install readline

# Download the ruby source and check out 1.8.7-p249
mkdir ~/tmp && cd ~/tmp
git clone git://
cd ruby
git checkout v1_8_7_249
cd ext/readline
ruby extconf.rb --with-readline-dir=$(brew --prefix readline) --disable-libedit

Now you should have readline.bundle in the current directory, and it should be compiled against your homebrew-installed readline library, rather than libedit that comes with the system. We can quickly double-check that by using otool to check what the binary is linked against.

$ otool -L readline.bundle
    /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/lib/libruby.1.dylib (compatibility version 1.8.0, current version 1.8.7)
    /usr/local/Cellar/readline/6.2.2/lib/libreadline.6.2.dylib (compatibility version 6.0.0, current version 6.2.0)
    /usr/lib/libncurses.5.4.dylib (compatibility version 5.4.0, current version 5.4.0)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 159.1.0)

And in the output you should see a line listing "libreadline", and no lines listing "libedit". Which that shows, we've compiled it properly then. Now the bundle is built we need to move it into the right place so it's loaded when ruby is invoked.

# Back up the original bundle, just in cases
sudo mv "$RL_PATH/readline.bundle" "$RL_PATH/readline.bundle.libedit"
sudo mv readline.bundle "$RL_PATH/readline.bundle"

And that's it. You've got a proper compiled-against-readline installed ruby 1.8.7-p249 on 10.7 now.

One gotcha I ran into was needing to pass the same arguments to rvm when installing any other version of 1.8.7 on the same machine. Simple enough, just need to remember to do it though.

CC=gcc-4.2 rvm install 1.8.7-p357 -C --with-readline-dir=$(brew --prefix readline) --disable-libedit

Install GCC-4.2.1 (Apple build 5666.3) with Xcode 4.2

As of Xcode 4.2 Apple have stopped bundling GCC with it, shipping only the (mostly) compatible llvm-gcc binary instead. The suggested fix is to install GCC using the osx-gcc-installer project. However, I wanted to build and install it from source, which apple provides at

You should already have installed Xcode 4.2 from the app store, then basically the following steps are to grab the tarball from the 4.1 developer tools source, unpack & compile it, then install it into the right places.

Update 2016-07-03: I'd suggest just using homebrew to install this these days:

brew install homebrew/dupes/apple-gcc42


# Grab and unpack the tarball
mkdir ~/tmp && cd ~/tmp
curl -O
tar zxf gcc-5666.3.tar.gz
cd gcc-5666.3

# Setup some stuff it requires
mkdir -p build/obj build/dst build/sym
# And then build it. You should go make a cup of tea or five whilst this runs.
gnumake install RC_OS=macos RC_ARCHS='i386 x86_64' TARGETS='i386 x86_64' \
  SRCROOT=`pwd` OBJROOT=`pwd`/build/obj DSTROOT=`pwd`/build/dst \

# And finally install it
sudo ditto build/dst /

And now you should have gcc-4.2 in your $PATH, available to build all the things that llvm-gcc fails to compile.

at(1) on OS X

I recently came across the at(1) command, and wondered why it wasn't executing jobs I gave it on my machine. Had a poke around the man pages, and discovered in atrun(8) that by default launchd(8) has the atrun entry disabled.

To enable it (and have at jobs fire) you simply need to run the following command once:

sudo launchctl load -w /System/Library/LaunchDaemons/

Personally I've taken to using this to sleep my machine after a custom amount of time, mainly because my alarm clock/sleep timer of choice (Awaken) can't handle playing Spotify for x minutes and then sleeping the machine. The following command puts the machine to sleep, which (quite effectively) silences spotify.

echo "osascript -e 'tell app \"Finder\" to sleep'" | at 1:00am

See the at(1) manpage for how to specify the time, but as I'm only ever scheduling it on the same day (usually 20 minutes or so in advance), just passing the time works fine.

Read standard input using Objective-C

On a couple of occasions now I've wanted to read from STDIN into an Objective-C command line tool, and both times I've had to hunt quite a bit to find the answer because nothing shows up in google for the search terms I used. "Objective-c read from stdin" and "objc read stdin" both turn up results ranging from using NSInputStream to dropping some C++ in there.

The answer is quite simple really, just use NSFileHandle. More specifically +[NSFileHandle fileHandleWithStandardInput]. You can then read all data currently in STDIN, monitor it for new data and anything else you can do with a normal NSFileHandle.

And here's some example code, reads all data from STDIN and stores it into an NSString:

NSFileHandle *input = [NSFileHandle fileHandleWithStandardInput];
NSData *inputData = [NSData dataWithData:[input readDataToEndOfFile]];
NSString *inputString = [[NSString alloc]
  initWithData:inputData encoding:NSUTF8StringEncoding];

I'm using this in GarbageCollected apps, memory management without GC is left as an exercise to the user.

Ignore .gitignore in Git

Recently I ran into an issue where I was working on a project which had files I wanted git to ignore, but I didn't want to commit a .gitignore file into the project. In case you don't know, any files matching a pattern in .gitignore in a git repository are ignored by git. (Unless the file(s) have already been committed, then they need removing from git before they are ignored.)

Initially I figured I could just throw the patterns I needed excluded into my global ~/.gitignore, but quickly realised that I needed files matching these patterns to show up in other git repos, so going the global route really wasn't an option. After some thought I wondered if you could make git ignore .gitignore, whilst still getting it to ignore files matching the other patterns in the .gitignore.

Lets create a new empty repo to test this crazy idea in:

$ mkdir foo
$ cd foo
$ git init
Initialized empty Git repository in /Volumes/Brutus/Users/caius/foo/.git/

And create a couple of files for us to play with:

$ touch bar
$ touch baz

Ignore one of the files so we can check other matches are still ignored later on:

$ echo "baz" >> .gitignore
$ git status
# On branch master
# Initial commit
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#       .gitignore
#       bar
nothing added to commit but untracked files present (use "git add" to track)

Ok so far, but we can still see .gitignore in git, so now for the crazy shindig, ignore the ignore file:

$ echo ".gitignore" >> .gitignore 

Lets see if it worked, or if we can still see our .gitignore:

$ git status
# On branch master
# Initial commit
# Untracked files:
#   (use "git add <file>..." to include in what will be committed)
#       bar
nothing added to commit but untracked files present (use "git add" to track)

And lets just double-check that .gitignore and baz still exist on the filesystem:

$ ls -a
.  ..  .git  .gitignore  bar  baz

Fantastic! Turns out adding ".gitignore" to .gitignore works perfectly. The file is still parsed by git to ignore everything else too, so it does exactly what I needed in this instance.