Rails: Signing out from devices

In an app I’m working on, I wanted users to be able to sign out from any device they are signed in on, by invalidating logins. There’s a gem called authie that does this so you may want to check it out; here I’ll show a very simple implementation I went with which works well enough for me. The goal is to:

  • create a login whenever a user signs in, with IP address, user agent and a unique device ID;
  • at each request, check whether a login exists for the given user/device ID combination and if it doesn’t, force sign in;
  • update the login at each authenticated request just in case the IP address (thus the location) changes while a session is active (optional);
  • delete the login when the user signs out from the device;
  • list all the active logins in the user’s account page with browser/OS info, IP address, and approximate location (city & country);
  • allow the user to delete any of those logins to sign out from the respective device.

I like doing authentication from scratch (see this Railscast), so that’s what I am using here; if you use something like Devise instead, it won’t be very different.

The first thing we need for this simple implementation is to generate a Login model:

rails g model Login user:belongs_to ip_address user_agent device_id:index

The Login model will be basically empty as it will only do persistence (on the other side, User gets the matching has_many :logins association):

class Login < ApplicationRecord
  belongs_to :user
end

Then in the create action of my SessionsController I have something like this:

  def create
    @sign_in_form = SignInForm.new

    if user = @sign_in_form.submit(params[:sign_in_form])
      device_id = SecureRandom.uuid

      if params[:sign_in_form][:remember_me]
        cookies.permanent[:auth_token] = user.auth_token
        cookies.permanent[:device_id]  = device_id
      else
        cookies[:auth_token] = user.auth_token
        cookies[:device_id]  = device_id
      end

      user.logins.create!(ip_address: request.remote_ip,
                          user_agent: request.user_agent,
                          device_id: device_id)

      redirect_to ...
    else
      redirect_to sign_in_path, alert: "Invalid email or password."
    end
  end

So each time a user successfully signs in from a device we create a login with a unique device ID.

In the ApplicationController, I have:

  def current_user
    @current_user ||= begin
      if cookies[:auth_token].present? and cookies[:device_id].present?
        if user = User.find_by(auth_token: cookies[:auth_token])
          if login = user.logins.find_by(device_id: cookies[:device_id])
            # optional
            login.update!(ip_address: request.remote_ip, user_agent: request.user_agent, updated_at: Time.now.utc)
            user
          end
        end
      end
    end
  end
  helper_method :current_user

  def authenticate
    redirect_to sign_in_path unless current_user
  end

I didn’t bother prettifying the current_user method here, but you may want to. So, in order to assume the user is successfully authenticated for the request, we expect:

  • both the auth_token and device_id cookies to be present;
  • the auth_token to be associated with an existing user;
  • a login to exist for the user with the device_id stored in the cookies;

otherwise we redirect the user to the sign in page.
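The prettified version hinted at above might look something like this (just a sketch of one possible refactoring; the helper name user_for_current_device is made up, and the behaviour is the same as before):

```ruby
def current_user
  @current_user ||= user_for_current_device
end

private

# Returns the user only when both cookies are present, the auth token
# matches an existing user, and that user has a login for this device ID.
def user_for_current_device
  return unless cookies[:auth_token].present? && cookies[:device_id].present?
  user = User.find_by(auth_token: cookies[:auth_token])
  return unless user
  login = user.logins.find_by(device_id: cookies[:device_id])
  return unless login
  # Optional: keep IP address and user agent fresh on each request.
  login.update!(ip_address: request.remote_ip, user_agent: request.user_agent)
  user
end
```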

Finally, in the SessionsController I have a destroy action which deletes both the login and the cookies from the browser:

  def destroy
    login = current_user.logins.find_by(device_id: cookies[:device_id])
    login.destroy if login
    cookies.delete(:auth_token)
    cookies.delete(:device_id)
    redirect_to sign_in_path, notice: "Successfully signed out."
  end

Remember to add a route for the destroy action, e.g.:

resources :logins, only: [:destroy]

Next, we want to list the active logins for the user in their account page so that they can sign out from any of those devices. So that the user can easily tell logins apart I am using:

  • the device_detector gem to identify browser and operating system;
  • the Maxmind GeoIP2 API with the geoip2 gem to geolocate IP addresses so we can display the approximate location for each login. This is just one of many ways you can geolocate IP addresses; I am using Maxmind for other things too so using the Maxmind API works fine for me but you may want to use a different service or a local database (for performance). Also see the geocoder gem for another option.

In the LoginsHelper I have:

module LoginsHelper
  def device_description(user_agent)
    device = DeviceDetector.new(user_agent)
    "#{ device.name } #{ device.full_version } on #{ device.os_name } #{ device.os_full_version }"
  end

  def device_location(ip_address)
    if ip = Ip.find_by(address: ip_address)
      "#{ ip.city }, #{ ip.country }"
    else
      location = Geoip2.city(ip_address)
      if location.error
        Ip.create!(address: ip_address, city: "Unknown", country: "Unknown")
        "Unknown"
      else
        Ip.create!(address: ip_address, city: location.city.names[:en],
                   country: location.country.names[:en])
        "#{ location.city.names[:en] }, #{ location.country.names[:en] }"
      end
    end
  end
end

I am leaving these methods in the helper but you may want to move them into a class or something. device_description, as you can see, shows the browser/OS info; for example, for my Chrome on Gentoo it shows Chrome 52.0.2743.116 on GNU/Linux. device_location shows city and country, like Espoo, Finland, if the IP address is in the Maxmind database. If the IP address is invalid, or it is something like 127.0.0.1 or a private IP address, the Maxmind API will return an error, so we’ll just show “Unknown” instead.

This is just an example: you may want to avoid the API call altogether (if using an API) when the IP is a private address. Another optimisation could be performing the geolocation asynchronously with a background job when the user signs in, instead of while rendering the view.

You can also see another model here, Ip. This is a simple way to cache IP addresses with their locations so we don’t have to make the same API request twice for a given IP address. So next we need to generate this model:

rails g model Ip address:index country city

Again, I am showing here an example, you may want to move the geolocation logic to the Ip model or to a separate class, up to you.
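For instance, the caching and geolocation logic could move into the Ip model as a class method (a sketch; Ip.locate is a hypothetical name, and I’m assuming the same geoip2 API used above):

```ruby
class Ip < ApplicationRecord
  # Returns the cached record for this address, geolocating it (and
  # caching the result) on the first lookup only.
  def self.locate(address)
    find_or_create_by!(address: address) do |ip|
      location = Geoip2.city(address)
      if location.error
        ip.city = ip.country = "Unknown"
      else
        ip.city    = location.city.names[:en]
        ip.country = location.country.names[:en]
      end
    end
  end
end
```

The helper’s device_location would then reduce to formatting the city and country of Ip.locate(ip_address).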

We can now add something like the following to the user’s account page:

<h2>Active sessions</h2>
These are the devices currently signed in to your account:
<table id="logins">
<thead>
<tr>
<th>Device</th>
<th>IP Address</th>
<th>Approximate location</th>
<th>Most recent activity</th>
<th></th>
</tr>
</thead>
<tbody>
  <%= render @logins %>
</tbody>
</table>

where @logins is assigned in the controller:

@logins = current_user.logins.order(updated_at: :desc)

The _login.html.erb partial contains:

<tr id="<%= dom_id(login) %>" class="login">
<td><%= device_description(login.user_agent) %></td>
<td><%= login.ip_address %></td>
<td><%= device_location(login.ip_address) %></td>
<td><%= time_ago_in_words(login.updated_at) %></td>
<td>
  <% if login.device_id == cookies[:device_id] %>
    (Current session)
  <% else %>
    <%= link_to "<i class='fa fa-remove'></i>".html_safe, login_path(login), method: :delete, remote: true, title: "Sign out", data: { confirm: "Are you sure you want to sign out from this device?" } %>
  <% end %>
</td>
</tr>

Besides browser/OS/IP/location we also show an X button to sign out from devices unless it’s the current session. It looks like this:

[Screenshot: the active sessions table, 2016-10-19]

Finally, a little CoffeeScript view, rendered in response to the remote delete, to remove the row when clicking on the X:

$("#login_<%= @login.id %>").hide ->
    $(@).remove()

and the destroy action:

class LoginsController < ApplicationController
  def destroy
    @login = current_user.logins.find(params[:id])
    @login.destroy
  end
end

That’s it! Now if the user removes any of the logins from the list, the respective device will be signed out.

Jenkins CI with Rails projects

I’ve had to set up a Jenkins server for Rails projects today, so I thought I’d write a post about it. Hopefully it will save someone time. I’ll assume here that you already know what Jenkins and CI are, and prefer setting up your own CI solution rather than using a commercial CI service. The instructions below are for setting up Jenkins on an Ubuntu server, so the dependencies may differ if you use another Linux distribution.

Dependencies

For starters, you need to install some dependencies in order to configure a fully functional Jenkins server for RSpec/Cucumber testing with MySQL, and Firefox or Phantomjs for testing features with a headless browser. You can install all these dependencies as follows – these dependencies also include everything you need to correctly install various gems required in most projects:

sudo apt-get install build-essential git-core curl wget openssl libssl-dev libopenssl-ruby libmysqlclient-dev ruby-dev mysql-client libmysql-ruby xvfb firefox libsqlite3-dev libxslt-dev libxml2-dev libicu48

Once these dependencies are installed, if you use Selenium with your Cucumber features you will have Firefox ready for use as a headless browser thanks to xvfb, which simulates a display. When xvfb is installed, the headless browser should already work with Jenkins with the project configuration I will show later. If that’s not the case, you may need to write an init.d script so that xvfb can run as a service. Here’s the content of such a script (/etc/init.d/xvfb):

#!/bin/sh

XVFB=/usr/bin/Xvfb
XVFBARGS=":1 -screen 0 1024x768x24 -ac +extension GLX +render -noreset"
PIDFILE=/var/run/xvfb.pid
case "$1" in
start)
echo -n "Starting virtual X frame buffer: Xvfb"
start-stop-daemon --start --quiet --pidfile $PIDFILE --make-pidfile --background --exec $XVFB -- $XVFBARGS
echo "."
;;
stop)
echo -n "Stopping virtual X frame buffer: Xvfb"
start-stop-daemon --stop --quiet --pidfile $PIDFILE
echo "."
;;
restart)
$0 stop
$0 start
;;
*)
echo "Usage: /etc/init.d/xvfb {start|stop|restart}"
exit 1
esac

exit 0

Of course you’ll need to make this file executable and then start the service:

chmod +x /etc/init.d/xvfb
/etc/init.d/xvfb start

In this example xvfb is configured to make the virtual display :1 available; to make sure any app requiring a display ‘finds’ it, you need to set the DISPLAY environment variable in your shell rc/profile file:

export DISPLAY=:1

If instead of Selenium/Firefox you are using Phantomjs as a headless browser with your Cucumber features, you need to install Phantomjs first. At the moment of this writing the latest release of Ubuntu is 13.04, which ships by default with an old version of Phantomjs; Cucumber/Capybara will complain that this version is too old, so you need to install a newer version (e.g. 1.9) from source:

cd /usr/local/src
wget https://phantomjs.googlecode.com/files/phantomjs-1.9.0-linux-x86_64.tar.bz2
tar xjf phantomjs-1.9.0-linux-x86_64.tar.bz2
ln -s /usr/local/src/phantomjs-1.9.0-linux-x86_64/bin/phantomjs /usr/bin/phantomjs

Now if you run phantomjs --version it should return 1.9.0.

Jenkins

Once the dependencies are sorted out, it’s time to install Jenkins. It’s easy to do by following the instructions you can also find on Jenkins’ website. I’ll add them here too for convenience:

wget -q -O - http://pkg.jenkins-ci.org/debian/jenkins-ci.org.key | sudo apt-key add -
sudo sh -c 'echo deb http://pkg.jenkins-ci.org/debian binary/ > /etc/apt/sources.list.d/jenkins.list'
sudo apt-get update
sudo apt-get install jenkins

Jenkins’ UI should now be available on port 8080 (optionally you may want to configure a web server such as Nginx as a frontend to Jenkins). The first thing I recommend doing through the UI is to enable security, otherwise anyone will have access to projects etc. You can secure Jenkins in many ways, but for the sake of simplicity I will suggest here the simplest one, which is based on authentication with username and password. So go to Manage Jenkins > Configure Global Security, and check Enable security. Still on the same page, select Jenkins’ own user database under Security Realm and leave Allow users to sign up enabled for now.

Once that’s done, follow the Sign up link in the top right corner of the page and sign up, creating a new user. Then, back on the Configure Global Security page, select Matrix-based security under Authorisation and add all permissions to the user you have just registered. Then disable Allow users to sign up, unless you do want other people to be able to sign up rather than manually creating new users as needed.

Then log out and log in again just to make sure everything still works OK. If you have problems after these steps and can no longer access Jenkins, you can reset the security settings and try again.

Job configuration

I’ll assume here you are configuring Jenkins for a Rails project and that you use Git as SCM. Jenkins doesn’t support Git out of the box unfortunately, but you can easily fix this by installing the plugins GIT Plugin and GIT Client Plugin. You can install plugins under Manage Jenkins > Manage plugins > Available, where you can search for those plugins and select to install them (and, I recommend, to restart Jenkins after the plugins are installed so that the changes are effective immediately).

The next step is to create and configure a job. Head to the main page, and then follow New Job; give the job a name and choose the type of job you want to create. In most cases you want to choose Build a free-style software project. You will be taken to the configuration page for the job. Under Source code management, choose Git and enter in Repository URL the URL… of your app’s repository. Before doing this though, make sure you can pull the code on the server by configuring SSH access and anything else needed; basically, do a test pull manually from the terminal and ensure it works. Under Branches to build enter one or more branches that you want Jenkins to test against, e.g. */development.

Next, it is very likely that you want Jenkins to build the job automatically each time code is pushed to any of the branches the job is ‘watching’. There are a few ways to do so, listed under Build triggers on the job configuration page. The two methods I use are Trigger builds remotely with an authentication token, and Poll SCM; in the first case, you’ll need to enter a token and then add a hook to the Git repository so that the trigger is automatically activated when new code is pushed. For example, in Bitbucket, you can do this on the Hooks page of the administration area of the repository; the hook to add is of type Jenkins and the format is:

http://USER:TOKEN@JENKINS_URL:8080/

The second method involves enabling Poll SCM in the job configuration page but without a schedule; then you’d add a POST hook with format:

http://JENKINS_URL:8080/git/notifyCommit?url=REPO_URL

In this case you may want to restrict these POST requests with a firewall or similar. Either way, Jenkins will be notified whenever code is pushed and a build will be triggered.

Next, add an Execute shell build step under Build, and paste the following:

. /var/lib/jenkins/.bash_profile
rbenv global 1.9.3-p484
rbenv rehash
bundle install
cp config/database.yml.example config/database.yml
mkdir -p tmp/cache
RAILS_ENV=test bundle exec rake db:migrate db:seed
RAILS_ENV=test bundle exec rspec spec
DISPLAY=localhost:0.0 xvfb-run -a bundle exec cucumber features

Please note that I am assuming here that you have installed Ruby under the user jenkins (which is created automatically when installing Jenkins) with rbenv. If you have installed Ruby in a different way, you will have to adapt the build step accordingly. You may anyway have to make changes depending on your project, but the build step as suggested above should work with most projects.
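For reference, the config/database.yml.example copied in the build step could look something like this (the database name and credentials are assumptions about your setup):

```yaml
test:
  adapter: mysql2
  encoding: utf8
  database: myapp_test
  username: jenkins
  password:
  host: localhost
```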

The last piece of configuration left is email notifications, which you can customise as you like. Remember though to set Jenkins’ own email address under Configure system > Jenkins location.

That’s it – you can now test Jenkins by manually running a build or by pushing some code. Hope it helps.

Multi tenancy with Devise and ActiveRecord’s default scope

Multi tenancy with default scope

Multi tenancy in a Rails application can be achieved in various ways, but my favourite one is using ActiveRecord’s default scope as it’s easy and provides good security. Essentially, the core of this technique is to define a default scope on all the resources owned by a tenant, or account. For example, say you have a tenant model named Account which has many users. The User model could define a default scope as follows:

class User < ActiveRecord::Base
  # ...
  belongs_to :account
  default_scope { where(account_id: Account.current_id) }
  # ...
end
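The snippet assumes an Account.current_id accessor; in Ryan’s screencast this is a thread-local attribute, roughly as follows (shown here without the ActiveRecord inheritance so the sketch is self-contained):

```ruby
class Account
  # In the app this would be `class Account < ActiveRecord::Base`.
  # Storing the id in Thread.current keeps concurrent requests
  # from seeing each other's tenant.
  def self.current_id=(id)
    Thread.current[:account_id] = id
  end

  def self.current_id
    Thread.current[:account_id]
  end
end
```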

Do see this screencast by Ryan Bates for more details on this model of multi tenancy.

The problem with this technique is that it often gets in the way of authentication solutions like Devise, which happens to be one of the most popular ones. One common way of implementing multi tenancy with Devise is using subdomains, as suggested in Ryan’s screencast; this works well because it’s easy to determine the tenant/account by just looking up the subdomain, regardless of whether the user is signed in or not. There are cases though when you don’t want or can’t use subdomains; for example, an application that enables vanity URLs with subdomains only for paid users while using standard authentication for non-paid users. In such a scenario your application needs to implement multi tenancy both with and without subdomains.

So if you need to use the typical Devise authentication while also implementing multi tenancy with the default scope to isolate the data belonging to each account, this combination won’t work out of the box. The reason is that the user must already be signed in for Devise’s current_user to be defined, and with it (through the association) the current account:

class ApplicationController < ActionController::Base
  # ...

  before_filter :authenticate_user!
  around_filter :scope_current_tenant

  private

  # ...

  def scope_current_tenant
    Account.current_id = current_user.account.id if signed_in?
    yield
  ensure
    Account.current_id = nil
  end
end

If the user is not signed in, Account.current_id cannot be set, therefore the default scope on the User model will add a condition, to all queries concerning users, that account_id must be nil. For example, when the user is attempting to sign in, a query like the following will be generated to find the user:

SELECT `users`.* FROM `users` WHERE `users`.`account_id` IS NULL AND `users`.`email` = 'email@example.com' LIMIT 1

As you can see it looks for a user with account_id not set. However, it is likely that in a multi tenancy application each user belongs to an account, therefore such a query will return no results. This means that the user cannot be found, and the authentication with Devise will fail even though a user with the given email address actually exists and the password is correct. This isn’t the only problem when using Devise together with default scope for multi tenancy without subdomains. Each Devise feature is affected:

  • authentication: the first problem you won’t miss when enabling default scope in an application that uses Devise for the authentication, is simply that you won’t be able to sign in. This is because the user cannot be found for the reasons explained earlier;
  • persistent sessions: once you get the basic authentication working, you will soon notice that the session is not persisted across pages. That is, once signed in you will need to sign in again when you change page in your application. Here the default scope gets in the way when retrieving the user using the session data;
  • password recovery: there are two problems caused by the default scope in the password recovery process. First, as usual, the user cannot be found when supplying a valid email address; second, after following the link in the email and reaching the ‘change my password’ form, that form is displayed again and again upon submission and the user never actually manages to set the new password. Some investigation when I was trying to fix this showed the reason: since the user cannot be found in that second step of the process (because of the default scope, of course), the token is considered invalid and the password recovery form is rendered again with a validation error;
  • resending the confirmation email: this is quite similar to password recovery; first, the user cannot be found when requesting that the confirmation instructions be sent again; second, the token is considered invalid and the confirmation form is displayed again and again when reaching it by clicking the link in the email.

In order for Devise to find the user in all these cases, it is necessary that it ignore the default scope. This way the query like the one I showed earlier won’t include the condition that the account_id must be nil, and therefore the user can be found. But how to ignore the default scope? As Ryan suggests in his screencast, it’s as simple as calling unscoped before a where clause. unscoped also accepts a block, so that anything executed within the given block will ignore the default scope.

So in order to get the broken features working, it is necessary to override some methods that Devise uses to extend the User model, so that these methods use unscoped. I’ll save you some time with researching and just add here the content of a mixin that I use for this purpose:

module DeviseOverrides
  def find_for_authentication(conditions)
    unscoped { super(conditions) }
  end

  def serialize_from_session(key, salt)
    unscoped { super(key, salt) }
  end

  def send_reset_password_instructions(attributes={})
    unscoped { super(attributes) }
  end

  def reset_password_by_token(attributes={})
    unscoped { super(attributes) }
  end

  def find_recoverable_or_initialize_with_errors(required_attributes, attributes, error=:invalid)
    unscoped { super(required_attributes, attributes, error) }
  end

  def send_confirmation_instructions(attributes={})
    unscoped { super(attributes) }
  end

  def confirm_by_token(confirmation_token)
    unscoped { super(confirmation_token) }
  end
end

See the use of unscoped. Then, simply extend the User model with this mixin (which I keep in the lib directory of the app):

class User < ActiveRecord::Base
  # ...
  extend DeviseOverrides
  # ...
end

That’s it. You should now have Devise working just fine with the default scope for multi tenancy in your Rails application, without subdomains. While I was investigating these issues I was wondering: would it be a good idea to update Devise’s code so as to ensure it always uses unscoped by default? In my opinion this wouldn’t affect the existing behaviour and would make this way of doing multi tenancy easier, without having to override any code. What do you think? If you know of a quicker, easier way of achieving the same result, do let me know!

Top level methods in Ruby

How top level methods work in Ruby

There are many quirks in the Ruby language which, IMO, lead to funny behaviours. Take top level methods, for example; that is, methods defined outside of a class or module. There is something weird about them that makes me wonder about the reasons behind certain design choices. In particular, one thing that I find weird is that Ruby top level methods become private instance methods on all objects.

The reason is that main (the name of the top level context) is an instance of Object:

p self.class
=> Object

So for top level methods to be available on main they are defined as private instance methods on Object, as we can also see if we run:

p Object.private_instance_methods.size
# => 70

def title
  "Mr"
end

p Object.private_instance_methods.size
# => 71

This in turn means these methods are basically attached to every Ruby object due to inheritance – that’s how Ruby implements global functions. I am not 100% sure of the reasons behind this behaviour (I might have an idea, read on), but it really looks weird to me. For example, say that we have a top level method called title:

def title
  "Mr" # just an example
end

That method will become a private instance method on Object:

Object.private_instance_methods(false).include? :title
=> true

Having a class Person:

class Person
  # ...
end

We cannot then call title on any instance of Person because, although the method is available due to inheritance, it is private:

Person.new.title
=> :in `<main>': private method `title' called for #<Person:0x007fba5abbad70> (NoMethodError)

Unless, of course… we use send:

p Person.new.send :title
=> "Mr"

And, because classes are also objects in Ruby, top level methods can also be called on all classes, as if they were class methods…

p Person.send :title
=> "Mr"

D’oh.

This may seem innocuous, but it results in ‘polluting’ all Ruby objects, and it certainly happens often. For example, take Cucumber step definitions. I have often seen methods defined directly in the files containing step definitions without being encapsulated into modules or classes; those methods are basically top level methods, and as such they are attached to all Ruby objects. In the case of Cucumber, it’s easy to avoid this by creating modules and including them at runtime with World(ModuleName). The problem is that when you add top level methods to a step definitions file (in the Cucumber example) you don’t necessarily intend to be able to call those methods from anywhere.
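The contrast is easy to see in a few lines of plain Ruby (the helper names are made up; in Cucumber you would pass the module to World(...) rather than include it in a class):

```ruby
# A top level method becomes a private instance method on Object,
# so it leaks into every object in the program:
def leaky_helper
  "I am everywhere"
end

# A method encapsulated in a module exists only where it is mixed in:
module StepHelpers
  def scoped_helper
    "I am only here"
  end
end

class SomeWorld
  include StepHelpers
end

p Object.private_instance_methods(false).include?(:leaky_helper) # => true
p SomeWorld.new.scoped_helper                                    # => "I am only here"
p "a string".respond_to?(:scoped_helper)                         # => false
```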

So… why aren’t top level methods simply defined as singleton methods on main, instead of being instance methods on Object? Methods defined as top level methods would then ideally result in a NoMethodError when called on any other object, but that’s not the case.

One possible reason behind this design choice is that this way you can avoid referring to the main object when calling methods like puts or require, for example. So we can just say puts and require from everywhere instead of something like Main.puts, Main.require. But wouldn’t it be better to explicitly call a method on main rather than polluting all the other objects just for this?
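Incidentally, you can opt into exactly that behaviour today by defining the method on main’s singleton with def self.; the method then works at the top level without leaking into Object:

```ruby
# Defined on main's singleton rather than as a private method on Object:
def self.title
  "Mr"
end

p title                                                   # => "Mr" (callable at the top level)
p Object.private_instance_methods(false).include?(:title) # => false
p Object.new.respond_to?(:title)                          # => false (no leak)
```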

If my assumption is right – that this design choice is explained by the convenience of calling methods like puts and require without having to refer to main – is this behaviour a feature? (Open question for the readers).

It is also interesting that when you try the same code in IRB (at least with Ruby 1.9.3) the behaviour is different: top level methods become public instance methods on all objects, the opposite of what happens when running the code with the Ruby interpreter:

irb(main):001:0> def title
irb(main):002:1> "Mr"
irb(main):003:1> end
=> nil
irb(main):004:0> Object.private_instance_methods(false).include? :title
=> false
irb(main):005:0> Object.public_instance_methods(false).include? :title
=> true

Does anyone know the reason for this?

Bitwise operations in Ruby, and an example application to testing with Rspec

I often need to test that something should happen or should be possible depending on a variable number of conditions that are linked together in some arbitrary way. In one application, for example, I have a model called License, and one of the requirements is that it should be possible to ‘activate’ an existing license only if its status is either :new or :suspended. Without going into too much detail about the domain specific to this application, it is clear in this example that a license should be activable only if the following two conditions are met at the same time:

  • the license should exist, that is – in typical Rails terms – the license should be persisted, and
  • the status of the license should be either :new or :suspended

So I have an instance method on the License model that looks like this:

class License < ActiveRecord::Base
  # ...

  def activable?
    persisted? and (status_new? or status_suspended?)
  end

  # ...
end

persisted? is a method available on all ActiveRecord models, while status_new? and status_suspended? are methods dynamically generated by a module that, included in a model, allows you to manage any attribute as an ‘enum’ field that can only contain one of a list of possible values (so, in the example, License can have any of the statuses [:new, :active, :suspended, :revoked, :expired, :transferred]).

How would you go about testing that this method behaves as expected? One obvious way could be by taking into account all the possible combinations of true/false values that the three variables above (persisted?, status_new?, status_suspended?) could have. persisted? is a framework thing (and as such we won’t test it) and the methods dynamically generated concerning the license status are unit-tested separately, plus there are integration tests to ensure everything is working together properly. So it is safe, in this case, to just stub all the methods with the purpose of testing when a license can be activated, in isolation.

So we could have something like this:

describe "License" do
  subject(:license) { build(:license) }
  # ...

  it "can only be activated if persisted, and new or suspended" do
    license.stub(:persisted?).and_return(false)
    license.stub(:status_new?).and_return(false)
    license.stub(:status_suspended?).and_return(false)

    license.should_not be_activable

    # ... some other combinations

    license.stub(:persisted?).and_return(true)
    license.stub(:status_new?).and_return(true)
    license.stub(:status_suspended?).and_return(false)

    license.should be_activable

    # ... some other combinations
  end

  # ...
end

One other way some would achieve the same thing is:

describe "License" do
  subject(:license) { build(:license) }
  # ...

  it "can only be activated if persisted, and new or suspended" do
    # license isn't persisted yet
    [:new, :suspended].each { |status| license.status = status; license.should_not be_activable }
    (License.statuses - [:new, :suspended]).each { |status| license.status = status; license.should_not be_activable }

    license.save!
    # license is persisted
    [:new, :suspended].each { |status| license.status = status; license.should be_activable }
    (License.statuses - [:new, :suspended]).each { |status| license.status = status; license.should_not be_activable }
  end

  # ...
end

Either way, the bottom line is that we’ll likely have to somehow loop through all the possible combinations of boolean values that the three variables persisted?, status_new?, status_suspended? can have, and then determine for each of these combinations what the expected behaviour is (in the example: whether the license can be activated or not). While both examples (as in “Rspec examples”) would work, both suffer from quite a bit of duplication and reduced readability, in that it is not immediately clear from just looking at the code what the relationship between the conditions is. In our case, we have an “A and (B or C)” kind of relation that is instantly clear only if you read the title of the Rspec examples.

Another way of testing the same thing, which I prefer, is as follows:

describe "License" do
  subject(:license) { build(:license) }
  # ...

  it "can only be activated if persisted, and new or suspended" do
    boolean_combinations(3).each do |persisted, _new, suspended|
      license.stub(:persisted?).and_return(persisted)
      license.stub(:status_new?).and_return(_new)
      license.stub(:status_suspended?).and_return(suspended)

      if persisted and (_new or suspended)
        license.should be_activable
      else
        license.should_not be_activable
      end
    end
  end

  # ...
end

As you can guess, thanks to the boolean_combinations method (more on that later) we’re still looping through all the possible combinations but with no duplication, and the example is more readable. The advantage is that from looking at just the code, you can understand right away what the relationship is between what I’m calling “the variables” and how the various conditions are linked together. In particular, the code

...
if persisted and (_new or suspended)
  license.should be_activable
else
  license.should_not be_activable
end
...

clearly says when a license should be activable. It’s also good from a “driving code by tests/specs” standpoint, in that it suggests the “code we wish we had”. Before going ahead, one note on having multiple expectations in the same example: I usually prefer to keep one expectation per example, as is often recommended for clarity and simplicity. In cases like the above, however, all the expectations in the single example define the behaviour being tested only together: none of them, taken individually, would define any behaviour of the subject of the test, and the alternative would be several almost identical specs, perhaps with some nested contexts (depending on the combinations of conditions being tested), which would in my opinion be overkill in such cases. This is particularly true when more than three variables are involved in the conditions being tested.

So, back to the boolean_combinations method: how do we generate all the possible combinations of boolean values, given any number of variables? If you’ve studied some electronics in particular, the answer should be pretty obvious. What we need is the same kind of truth table often used to figure out how to simplify, or reduce, some logical operations (for example with Karnaugh maps or similar methods). Such a table looks like the following (for 3 variables, as in our example) and should be pretty familiar:

a | b | c
---+---+---
F | F | F
F | F | T
F | T | F
F | T | T
T | F | F
T | F | T
T | T | F
T | T | T

One possible ‘Rubyish’ way to generate this table is with the combination method available on Ruby arrays. It expects the number of items you want in each combination, and produces all the possible combinations of the given array’s items, e.g.

1.9.3p194 :038 > ['a', 'b', 'c'].combination(2).to_a
=> [["a", "b"], ["a", "c"], ["b", "c"]]
1.9.3p194 :039 >

In our example however we have a slightly different case, since we want combinations of 3 items each, but each of the items can have either true or false value. If we tried with

1.9.3p194 :040 > [true, false].combination(3).to_a
=> []

we would be out of luck, because the array doesn’t have enough items to match the number of items required for each combination, so the result is an empty array. We can work around this by ‘extending’ the original array, for example by multiplying it by the number of items we want in each combination:

1.9.3p194 :041 > ([true, false]*3).combination(3).to_a
=> [[true, false, true], [true, false, false], [true, false, true], [true, false, false], [true, true, false], [true, true, true], [true, true, false], [true, false, true], [true, false, false], [true, true, false], [false, true, false], [false, true, true], [false, true, false], [false, false, true], [false, false, false], [false, true, false], [true, false, true], [true, false, false], [true, true, false], [false, true, false]]

The good thing is that this way we do get all combinations we are looking for; the bad thing is that we get lots of duplicates – the more items per combination, the more duplicates. So we’d have to use uniq on the result to remove those duplicates:

1.9.3p194 :044 > ([true, false]*3).combination(3).to_a.uniq
=> [[true, false, true], [true, false, false], [true, true, false], [true, true, true], [false, true, false], [false, true, true], [false, false, true], [false, false, false]]

This does indeed produce all the combinations we are after, but it’s not terribly efficient, although it is short and simple.
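Wrapped in a small helper (a sketch; the method name here is my own), the combination-plus-uniq approach looks like this:

```ruby
# Generate all true/false combinations for n variables by repeating
# the [true, false] array and discarding duplicate combinations.
def boolean_combinations_via_uniq(n)
  ([true, false] * n).combination(n).to_a.uniq
end

boolean_combinations_via_uniq(3).size # => 8
```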

Using bitwise logic

Another way to generate all the boolean combinations, in a perhaps less ‘Rubyish’ but more efficient way, is to use a bitwise operation.

First, we need to know in advance how many possible combinations we have with the given number of variables, since this will be required in the algorithm we’ll see shortly. One way to figure out the number of combinations (or range) is with the shift-left operator (not to be confused with similar-looking operators on types other than integers):

number_of_combinations = 1 << n

As the name of this operator might suggest to some, what it does is shift each bit of the binary representation of the number to the left by n positions. The number of positions is simply the number of elements we want in each combination (that is also the number of variables we want to produce the boolean table for). So, in our example, we have:

number_of_combinations = 1 << 3

The binary representation of 1 is 0001 (using some leading zeros for clarity), so if we shift each bit by 3 positions to the left, we get 1000, which is the binary representation of the number 8. Similarly, 1 << 2 means shifting each bit of 0001 by 2 positions to the left, so we get 0100, which is the binary representation of the number 4. You’d quickly guess that the operation 1 << n is equivalent to the operation 2**n (the number two raised to the nth power), so the formula usually used to calculate the number of combinations is instead:

number_of_combinations = 2**number_of_items

as it’s a bit easier to remember and understand. In our case we have 2**3 = 8 combinations. The table we’ve seen earlier is equivalent to the following table:

a | b | c
---+---+---
0 | 0 | 0
0 | 0 | 1
0 | 1 | 0
0 | 1 | 1
1 | 0 | 0
1 | 0 | 1
1 | 1 | 0
1 | 1 | 1

Or also, using the variables in our example:

 persisted | status_new | status_suspended | combination index (I)
-----------+------------+------------------+----------------------
     0     |     0      |        0         |          0
     0     |     0      |        1         |          1
     0     |     1      |        0         |          2
     0     |     1      |        1         |          3
     1     |     0      |        0         |          4
     1     |     0      |        1         |          5
     1     |     1      |        0         |          6
     1     |     1      |        1         |          7

Interestingly, each combination of 1s and 0s on each line of the table is basically the binary representation of the index I of the combination. So 4 = 100, 6 = 110, and so on.

So, if we look at this table we can see that for each combination with index I, each variable will have value true or false depending on whether their corresponding bit in the binary representation of I is set or not. For example, in the combination with index 7, all the three variables will have value true since their corresponding bits in the binary representation of 7, which is 111, are all set. Similarly, in the combination with index 5, persisted and status_suspended will have value true since their bits in the binary representation of 5 (101) are set, while status_new will be false because its bit in the binary representation of 5 isn’t set.

We can say the same thing this way too: given a combination with index I, and an element of the original array with index J, the element J will have value true in the combination I only if the Jth bit (from the right) of the binary representation of I is set.

Given a number m, how do we check if the bit at position n of the binary representation of m is set? The “canonical” way to do this is to perform a binary and operation between m and the number that has only the bit at position n set. A binary and is simply an and operation performed bit by bit. So, for example, if we wanted to check whether the 3rd bit of the binary representation of 16 (which is 10000) is set, we would do:

16 = 10000 &
 4 = 00100
     -----
     00000 = 0 = 3rd bit not set

since 00100 (or 4) is the binary number having only the 3rd bit set. So the operation is equivalent to 16 & 4. In this example, the result of the bit-by-bit and operation is 0, meaning that the 3rd bit of 10000 (16) isn’t set, and in fact it is not. As another example, let’s now check whether the 5th bit of 27 is set. The binary representation of 27 is 11011, while the binary number having only the 5th bit set is 10000, or 16. So we have:

27 = 11011 &
16 = 10000
     -----
     10000

The result is 10000, or 16, and the “rule” is that, given:

  • m, a number;
  • n, the position of one bit in the binary representation of m;
  • o, the number that has only the nth bit set;

the nth bit of the binary representation of m is set if the operation m & o yields a result that differs from zero.
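As a sketch, this rule can be captured in a small helper (bit positions counted from 1 at the right, as in the examples above; the method name is mine):

```ruby
# Returns true if the nth bit (counting from 1 at the right)
# of m's binary representation is set.
def bit_set?(m, n)
  o = 1 << (n - 1) # the number with only the nth bit set
  (m & o) != 0
end

bit_set?(16, 3) # => false: 16 = 10000, its 3rd bit is 0
bit_set?(27, 5) # => true:  27 = 11011, its 5th bit is 1
```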

Back to the table above, let’s check, for example, why persisted has value true in the combination with index 5 (“m“). persisted, in the table, corresponds to the third bit (“n“) of the binary representation of 5, so we need to find out whether the 3rd bit from the right in 101 (= 5) is set; the number with only the 3rd bit set (“o“) is 100, or 4, so the operation we need is (using numbers directly in decimal notation):

1.9.3p194 :048 > 5 & 4
=> 4

The result is != 0, meaning that the 3rd bit of 5 is indeed set and, in turn, that for that combination, persisted has value true.

Applying all of the above to write a first version of an algorithm to generate all the boolean combinations we wanted in first place, we get:

def boolean_combinations(number_of_elements)
  number_of_combinations = 2 ** number_of_elements
  combinations = []

  (0...number_of_combinations).each do |combination_index|
    combination = Array.new(number_of_elements)

    (0...number_of_elements).each do |element_index|
      combination[element_index] = (combination_index & 2**element_index) != 0
    end

    combinations << combination.reverse
  end

  combinations
end

This indeed produces all the combinations we’re after:

1.9.3p194 :106 > boolean_combinations(3).each {|c| p c}; nil
[false, false, false]
[false, false, true]
[false, true, false]
[false, true, true]
[true, false, false]
[true, false, true]
[true, true, false]
[true, true, true]
=> nil

We can simplify this code a bit. Firstly, with the bit reference operator fix[n] -> 0, 1 available on Fixnum objects, we can read the nth bit of the binary representation of a given number directly, and then return true or false depending on whether that bit is set or not. So the code

...
combination[element_index] = (combination_index & 2**element_index) != 0
...

is equivalent to

...
combination[element_index] = combination_index[element_index] == 1
...
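A quick check of that equivalence (note that the bit reference operator indexes bits from 0, starting at the right, which is exactly what element_index needs):

```ruby
5.to_s(2) # => "101": the binary representation of 5
5[0]      # => 1, the rightmost bit
5[1]      # => 0
5[2]      # => 1
```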

Secondly, we can slightly simplify the two nested loops with map:

def boolean_combinations(number_of_elements)
  (0...2**number_of_elements).map do |i|
    (0...number_of_elements).map { |j| i[j] == 1 }
  end
end

I’m not sure about the performance difference (we are talking about tiny arrays here), but I would perhaps prefer the other, more readable version. So I keep a file in spec/support containing the boolean_combinations method, and use it in some Rspec examples as shown earlier:

describe "License" do
  subject(:license) { build(:license) }
  ...
  it "can only be activated if persisted, and new or suspended" do
    boolean_combinations(3).each do |persisted, _new, suspended|
      license.stub(:persisted?).and_return(persisted)
      license.stub(:status_new?).and_return(_new)
      license.stub(:status_suspended?).and_return(suspended)

      if persisted and (_new or suspended)
        license.should be_activable
      else
        license.should_not be_activable
      end
    end
  end
  ...
end

There are various other operations that can be done with bitwise logic, including some tricks involving databases that I’ll perhaps show in some other post.

Update 16/08/2012: reader Luca Belmondo sent me a comment pointing out that the built-in Array#repeated_permutation does exactly the same thing.

1.9.3p194 :004 > [true, false].repeated_permutation(3).to_a
=> [[true, true, true], [true, true, false], [true, false, true], [true, false, false], [false, true, true], [false, true, false], [false, false, true], [false, false, false]]

I hadn’t noticed this method before, nor Array#repeated_combination – it looks like both were introduced in Ruby 1.9 – so this is a fresh reminder that it’s always better to check what’s already available before reinventing the wheel. Thanks Luca!
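A quick sanity check that the hand-rolled method and the built-in one produce the same set of combinations (order aside):

```ruby
def boolean_combinations(number_of_elements)
  (0...2**number_of_elements).map do |i|
    (0...number_of_elements).map { |j| i[j] == 1 }
  end
end

built_in = [true, false].repeated_permutation(3).to_a

# Same 8 combinations, just in a different order:
boolean_combinations(3).sort_by(&:to_s) == built_in.sort_by(&:to_s) # => true
```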

How custom RSpec matchers for blocks work

Custom matchers and blocks

Having used RSpec as my favourite testing framework for a while now, one of the aspects I like most about it is its very useful built-in matchers. They let you test, with a friendly and readable syntax (at least compared to other testing frameworks), expectations about what happens when a given block of code is executed, rather than simply comparing objects. Examples of such matchers are change and raise_error.

If you use RSpec, you are likely familiar with these matchers and likely know that you can use them either with lambdas,

it "does not update the last connection time" do
  lambda { do_something }.should_not change(client, :last_connected_at)
end
...
it "throws an exception" do
  lambda { do_something }.should raise_error
end

or, preferably, with a nicer syntax thanks to expect:

it "does not update the last connection time" do
  expect { do_something }.to_not change(client, :last_connected_at)
end
...
it "throws an exception" do
  expect { do_something }.to raise_error
end

But what if you want to test some custom expectations when a block is executed? It’s actually pretty easy, thanks to RSpec’s support for custom matchers.

Say, for example, that you have an app ensuring that “incidents” are logged into some database table, whenever some events of particular relevance occur. Such logging could be done with a simple model named, say, Incident, so that whenever some particular condition is met or event occurs, a new instance of Incident of a given type is created in the database having as “parent” object the subject of the event that is being logged and, optionally, some other data.

It would be possible to test this kind of functionality by just checking if a new incident of the expected type and with the expected properties is actually logged when a particular event occurs, in a way similar to the following:

describe User do
  let(:user) { subject }

  ...
  describe "#lock!" do
    it "locks the user" do
      user.lock!
      user.should be_locked
    end
    ...
    it "logs the incident" do
      user.lock!
      user.should have(1).incidents
      incident = user.incidents.first
      incident.name.should == :locked
      # any other expectations on the incident....
    end
  end
  ...
end

The example above is a very simple one, just to give an idea; you may already notice, though, that this test isn’t complete: it does not actually verify that the expected incident exists because of the code we are testing, and it wrongly assumes that the incident we are looking for is the first/only incident logged for the test user. What if the incident is correct but, for whatever reason, already existed before the code under test was executed? Or what if the incident we are looking for isn’t the only one logged for the user (nor the first/last one), because other events in the object’s life also log incidents?

Of course, these are very simple problems that can easily be solved with a little more code and one or two more expectations to make the test complete; we could, for example, use the change matcher (as shown earlier) to test that a new incident of the expected type is created when we execute our block of code, and then simply test expectations on that incident’s properties. But if we are testing this logging functionality for more than a few events, we might end up with a lot of nasty duplication. After all, the code required would be basically the same for all the events, with only the subject of the event, and the event name, changing each time.

A possible solution to remove the duplication and hide away the logic required to test that an incident is created when expected would be a simple helper. That would work, but an even nicer solution is a custom matcher like the one shown in the code snippet below:

module Chronicler
  module Matchers
    class ChronicleIncident
      def initialize(target, event_name)
        @target, @event_name = target, event_name
      end

      def matches?(block_to_test)
        before = incidents_count
        block_to_test.call
        incidents_count == before + 1
      end

      def failure_message_for_should
        "the block should have chronicled the '#{ @event_name }' incident for the #{ @target.class.name }##{ @target.object_id }, but it didn't"
      end

      def failure_message_for_should_not
        "the block should not have chronicled the '#{ @event_name }' incident for the #{ @target.class.name }##{ @target.object_id }, but it did"
      end

      private

      def incidents_count
        @target.incidents.where(name: @event_name).count
      end
    end

    def chronicle_incident(target, event_name)
      Matchers::ChronicleIncident.new(target, event_name)
    end
  end
end

And, to enable the new matcher in RSpec (in spec_helper.rb):

...
RSpec.configure do |config|
  ...
  config.include Chronicler::Matchers
  ...
end
...

So we now have a simple matcher for the Chronicler feature, that will execute some block (block_to_test.call) and then return true or false depending on whether the condition defined by the matcher is met or not. The condition being, of course, that an incident of a particular type with the expected data is created when the block is executed. We also have some friendly error messages that RSpec will display if the expectation defined by the matcher fails its verification. You can also notice that there’s a handy helper, chronicle_incident, that accepts as arguments the target object we want to log the incident for (assuming Incident is used with a polymorphic association to multiple models), and the name of the event. In a way, our matcher looks like an extended version of the change matcher since it too tests that an object has somehow changed due to the execution of our block, but it also tests more explicit expectations on those changes, in one go.

We can now use the matcher as follows:

describe User do
  let(:user) { subject }

  ...
  describe "#lock!" do
    ...
    it "chronicles the incident" do
      expect { user.lock! }.to chronicle_incident(user, :locked)
    end
    ...
  end
  ...
end

The new syntax is nicer, more readable, more expressive. At the same time, we have removed a lot of potential duplication if we are going to test this event logging a lot, and we’ve hidden away the steps and expectations necessary to test that the expected incident is created. But how does this little “magic” work behind the scenes?

How does it work?

It’s actually pretty simple. If you open the rspec-expectations gem (one of the several gems that together make up the RSpec testing framework), you’ll find the answer right away by looking a little into the code. Firstly, expect is nothing more than a simple macro: given some aliases for the methods should and should_not (defined elsewhere) – so that we can use the syntax expect {…}.to/to_not – it “attaches” these to/to_not methods to the blocks we pass to it in our tests, by extending those blocks with them.

module RSpec
  module Matchers
    module BlockAliases
      alias_method :to,     :should
      alias_method :to_not, :should_not
      alias_method :not_to, :should_not
    end

    # Extends the submitted block with aliases to and to_not
    # for should and should_not.
    #
    # @example
    #   expect { this_block }.to change{this.expression}.from(old_value).to(new_value)
    #   expect { this_block }.to raise_error
    def expect(&block)
      block.extend BlockAliases
    end
  end
end
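The same mechanism can be reproduced in isolation with plain Ruby objects (the module and method names below are made up for illustration):

```ruby
# A module whose methods get grafted onto one proc instance,
# mirroring how expect grafts to/to_not onto the given block.
module Shouter
  def shout
    call.upcase # `call` runs the proc this module was grafted onto
  end
end

def capture(&block)
  block.extend(Shouter) # extend returns the proc itself
end

capture { "hello" }.shout # => "HELLO"
```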

But… you might wonder: isn’t it true that only objects can be extended? And what about should/should_not, what exactly are they? As you can quickly find out by looking into the code, they are defined as Kernel methods, so they are available to all Ruby objects (besides BasicObject, of course, which does not include the Kernel module):

module Kernel
  def should(matcher=nil, message=nil, &block)
    RSpec::Expectations::PositiveExpectationHandler.handle_matcher(self, matcher, message, &block)
  end

  def should_not(matcher=nil, message=nil, &block)
    RSpec::Expectations::NegativeExpectationHandler.handle_matcher(self, matcher, message, &block)
  end
end

Uhm… all objects? You might remember that in Ruby almost everything is an object; well, apart from normal methods, for example. But the block we pass to expect is treated as an object, and therefore the should/should_not methods (and their aliases to/to_not) are available to it. But why is our block treated as an object? If you look again at expect‘s signature,

def expect(&block)
  block.extend BlockAliases
end

you can see that the param block is defined with the unary ampersand: this is equivalent to constructs like…

lambda { ... }

and…

proc { ... }

and…

Proc.new { ... }

in that, like the above three, &block also returns a proc (“converts” the block into a proc) for deferred evaluation: the proc is simply “stored” to be executed later by our matcher with call. You can easily find where this happens by looking a little further into RSpec’s code.
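The deferred evaluation is easy to verify with a few lines of plain Ruby (the method name below is made up):

```ruby
# A method that just stores the block it receives, without running it.
def deferred(&block)
  block # nothing has been executed yet; block is now a Proc object
end

ran = false
stored = deferred { ran = true; 42 }

ran          # => false: the block hasn't run yet
stored.class # => Proc
stored.call  # => 42
ran          # => true: call finally executed the block
```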

Update 26/02/2012: (thanks Matijs) it is actually not possible to simply pass a “normal” method as an argument to expect, if that method was previously defined elsewhere with the usual def..end syntax. The reason lies in how a normal method has to be passed as an argument in order to be treated as a block. You can’t, for example, just pass your method as you would a normal argument:

def block_to_test
  # do something
end

it "...." do
  expect(block_to_test).to ...
end

The above would fail with

ArgumentError:
wrong number of arguments (1 for 0)

for the expect method, since your method would be treated here like any normal, non-block argument, while expect‘s signature accepts only a block and no normal parameters. To make sure your method is treated as a block, you have to use the unary ampersand when passing it as an argument to expect:

def block_to_test
  # do something
end

it "...." do
  expect(&block_to_test).to ...
end

But… this would fail too! To understand why, let me first recall how the unary ampersand operator works. When you use this operator on a block, it converts the block to a proc. However, what happens if you use the same operator on a proc? Simple: it converts the proc back to a block.

Back to our example: when we pass our block to expect as “&block”, the block is converted into a proc by the ampersand operator. However, expect also uses the same operator in its signature, so the proc it receives is basically converted back into a block before expect can even use it. The deferred evaluation is gone, and the code, rather than being stored as a proc, actually gets executed.

The result is that expect will not see a proc but … the return value of the block once executed. Let’s see an example:

def block_to_test
  5463
end

it "...." do
  expect(&block_to_test).to ...
end

So in this example our method returns a number. If we try to run the test, it will fail with:

TypeError:
wrong argument type Fixnum (expected Proc)

This is because, due to the double use of the unary ampersand, the block_to_test method gets executed and expect only sees its return value (5463), which is a number/Fixnum, not a proc.
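The failure is easy to reproduce outside RSpec (block_to_test is the method from the post; wants_a_block is a made-up stand-in for expect):

```ruby
def block_to_test
  5463
end

def wants_a_block(&block)
  block
end

begin
  # block_to_test is called first; the ampersand then tries to turn
  # its return value (a plain integer) into a Proc, and fails:
  wants_a_block(&block_to_test)
rescue TypeError => e
  e.class # => TypeError ("wrong argument type ... (expected Proc)")
end
```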

Now, back to the main topic. Assuming it is (hopefully) clear that expect treats the given block as a proc, and why, it is important to note that a proc is an object too! Therefore, expect can extend it with the methods to/to_not/should/should_not, so these methods are available for calling on the proc and are executed in its context. When we then call, for example, the to method on our block/proc, the proc isn’t actually executed right away. Our matcher’s helper is nothing more than an argument to this to method (or, depending on the case, to_not, or the original should/should_not), and therefore it gets executed first.

Let’s add parentheses, as this might make things easier to follow:

expect { user.lock! }.to( chronicle_incident(user, :locked) )

With the parentheses, it is clearer that chronicle_incident needs to be executed and evaluated first, so that its return value can be passed as argument to the to method. chronicle_incident, as you can see from the custom matcher’s code, returns a new instance of our matcher, Matchers::ChronicleIncident, and this instance is what gets passed as argument to the to method.

What happens next is pretty simple too. Let’s look again at how, for example, should is defined as a Kernel method (remember again that to is just an alias for should):

def should(matcher=nil, message=nil, &block)
  RSpec::Expectations::PositiveExpectationHandler.handle_matcher(self, matcher, message, &block)
end

Let’s put aside for a moment that to is an alias, and imagine that this method is defined directly with the name to, since it might help a little to see what’s going on here.

def to(matcher=nil)
  RSpec::Expectations::PositiveExpectationHandler.handle_matcher(self, matcher)
end

For the sake of simplicity, I have removed the other params (the custom, optional, message RSpec would display, if given, and the optional block – see how the change matcher works in detail for examples of when a block might be passed as argument to it).

So when the code

expect { user.lock! }.to( chronicle_incident(user, :locked) )

is evaluated, to is executed and the instance of Matchers::ChronicleIncident passed as argument to it (and returned by chronicle_incident) is also passed to RSpec::Expectations::PositiveExpectationHandler.handle_matcher as argument, together with self.

This is a crucial bit where some might get confused (I hope it’s been easy enough to follow so far!): what is self here? self, in the scope of the to method, is the object that received the to “message” (in Ruby terminology).

And what is that object? Surprise! It’s the block we passed to expect in first place, remember? That is, the block we want to test.

Next, let’s have a look at how RSpec::Expectations::PositiveExpectationHandler.handle_matcher is defined (again, I have removed the optional params and a few more things just to simplify):

def self.handle_matcher(actual, matcher)
  ...
  match = matcher.matches?(actual)
  return match if match

  message ||= matcher.respond_to?(:failure_message_for_should) ?
              matcher.failure_message_for_should :
              matcher.failure_message

  if matcher.respond_to?(:diffable?) && matcher.diffable?
    ::RSpec::Expectations.fail_with message, matcher.expected, matcher.actual
  else
    ::RSpec::Expectations.fail_with message
  end
end

This is the final bit that unveils the mystery. See what’s happening here? actual is, as you can easily spot, our original block that we wanted to test, so let’s rename it to block_to_test to make things, again, easier to follow:

def self.handle_matcher(block_to_test, matcher)
  ...
  match = matcher.matches?(block_to_test)
  ...
end

Remember that block_to_test is a proc, and as such it hasn’t been executed yet. But now it will, when the matches? method of our custom matcher is executed. Let’s look at it again:

def matches?(block_to_test)
  before = incidents_count
  block_to_test.call
  incidents_count == before + 1
end

Bingo! Now the original block to test will be executed thanks to call, and because we are controlling when this happens in our custom matcher, we can do whatever we want before and after it, and test whatever expectations we want.
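The whole chain can be mimicked with plain Ruby, no RSpec required (all the names below are invented for illustration):

```ruby
# A tiny matcher following the same matches? protocol as
# ChronicleIncident, exercised with plain procs instead of RSpec.
class AddsOneItem
  def initialize(array)
    @array = array
  end

  def matches?(block_to_test)
    before = @array.size
    block_to_test.call         # the deferred block runs only here
    @array.size == before + 1  # did it add exactly one element?
  end
end

items   = []
matcher = AddsOneItem.new(items)

matcher.matches?(proc { items << :incident }) # => true
matcher.matches?(proc { })                    # => false
```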

In conclusion, I hope the explanation didn’t suck too much 🙂 and that you found it interesting to look at how a bit of RSpec, and some Ruby magic work behind the scenes.

I find RSpec’s custom matchers very useful and recommend using them a lot: among other things, they often help reduce duplication and give our tests a nicer syntax for improved readability.

Would like to add something to the topic or have any tips? Please share them in the comments or get in touch 🙂

Rails 3.1 and installing Ruby 1.9.2-p290 with the ‘fast require’ patch, readline, iconv

When Rails 3 was released, many users noticed that Ruby 1.9.2 and ruby-head (basically 1.9.3) seemed to be awfully slow at loading Rails 3 apps compared to Ruby 1.8.7. Then, back in May, Xavier Shay posted an article to his blog with some interesting findings: he noticed that the way later versions of Ruby require files, and keep track of the files that have already been loaded, isn’t very efficient, causing Rails 3 apps (which require thousands of files at startup) to load a lot more slowly. He also released an awesome patch for Ruby 1.9.3 that did seem to improve the loading times and make them more comparable to those of Ruby 1.8.7; other patches were then released for Ruby 1.9.2 as well (I suggest you read his post for more details; you can also find more information if you Google for ‘Ruby 1.9 fast require patch’).

Over the past few months, I have been sticking to a patched version of Ruby 1.9.2-p180, without upgrading to the latest stable revision (p290), since I couldn’t find a version of the ‘fast require’ patch that would work correctly with that revision as well. Yesterday, however, I started working on a new Rails 3.1 app, and it seemed that the p180 revision wasn’t working well with the latest release of Rails. I am not sure whether this had something to do with the particular configuration of my system (I tried reinstalling RVM and Ruby a few times, though), but I would get warnings and errors, mainly related to Sass, like

...(cut)...
... warning: already initialized constant ROOT_DIR
... warning: already initialized constant RUBY_VERSION
... warning: already initialized constant RUBY_ENGINE
... warning: already initialized constant ENCODINGS_TO_CHECK
... warning: already initialized constant CHARSET_REGEXPS
... warning: already initialized constant PARENT
...(cut)...

and

...
script/rails:6:in `require'
script/rails:6:in `<main>'
error scss [not found]

when, for example, bootstrapping the administration of some model with a quick and dirty scaffolding.

I didn’t want to spend much time investigating this, so also out of curiosity I tried to install an updated version of Ruby to see if it would help – and in fact it did, so I was lucky. After trying a few patches for the ‘fast require’, I eventually found one that actually worked with p290, so here’s how you’d use it to install and patch the latest stable revision of Ruby 1.9.2, with RVM:

curl https://raw.github.com/gist/1008945/4edd1e1dcc1f0db52d4816843a9d1e6b60661122/ruby-1.9.2p290.patch > /tmp/192.patch
rvm uninstall 1.9.2 && rvm cleanup all && rvm fetch 1.9.2
rvm install 1.9.2 --patch /tmp/192.patch

In my case, I also needed to install some dependencies such as iconv and readline, since I like using bond to achieve bash-like auto completion in my IRB/Rails consoles.

At first, I tried to install the relevant packages with RVM:

rvm pkg install iconv
rvm pkg install readline
rvm uninstall 1.9.2 && rvm cleanup all && rvm fetch 1.9.2
rvm install 1.9.2 --patch /tmp/192.patch --with-readline-dir=$rvm_usr_path --with-iconv-dir=$rvm_usr_path

but this would blow up when also applying the ‘fast require’ patch:

...
ruby-1.9.2-p290 - #fetching
ruby-1.9.2-p290 - #extracted to /Users/vito/.rvm/src/ruby-1.9.2-p290 (already extracted)
Applying patch '/tmp/192.patch' (located at //tmp/192.patch)
ERROR: Error running 'patch -F25 -p1 -f <"//tmp/192.patch"', please read /Users/vito/.rvm/log/ruby-1.9.2-p290/patch.apply.192.patch.log
...
ERROR: There has been an error while running configure. Halting the installation.

I believe this was because RVM also needs to apply a second patch to Ruby in order to install readline, when readline is being installed with RVM. I am not sure how to apply more than one patch at the same time (unless I am missing something, I don’t think this is possible?); I also got weird errors related to iconv (required, for example, by the json gem) when trying to start an app or execute a rake task:

/Users/vito/.rvm/gems/ree-1.8.7-2011.03/gems/json-1.5.1/lib/json/common.rb:98: warning: already initialized constant NaN
/Users/vito/.rvm/gems/ree-1.8.7-2011.03/gems/json-1.5.1/lib/json/common.rb:100: warning: already initialized constant Infinity
/Users/vito/.rvm/gems/ree-1.8.7-2011.03/gems/json-1.5.1/lib/json/common.rb:102: warning: already initialized constant MinusInfinity
/Users/vito/.rvm/gems/ree-1.8.7-2011.03/gems/json-1.5.1/lib/json/common.rb:121: warning: already initialized constant UnparserError
rake aborted!
no such file to load -- iconv
/Users/vito/.rvm/gems/ree-1.8.7-2011.03/gems/json-1.5.1/lib/json/common.rb:358:in `require'
/Users/vito/.rvm/gems/ree-1.8.7-2011.03/gems/json-1.5.1/lib/json/common.rb:358
/Users/vito/.rvm/gems/ree-1.8.7-2011.03/gems/json-1.5.1/lib/json.rb:1:in `require'
/Users/vito/.rvm/gems/ree-1.8.7-2011.03/gems/json-1.5.1/lib/json.rb:1
...(cut)...

Ruby 1.9.x already includes an equivalent of the iconv gem by default, so I am not sure what triggered this error. I even tried to install the iconv gem anyway, without much success (‘failed to build native extensions’, for some reason).

I found a workaround by installing both readline and iconv with Homebrew instead:

brew install readline
brew install libiconv
brew link libiconv

rvm cleanup all && rvm fetch 1.9.2
rvm install 1.9.2 --patch /tmp/192.patch --with-readline-dir=/usr/local/Cellar/readline/6.2.1 --with-iconv-dir=/usr/local/Cellar/libiconv/1.14

Note that I also had to run brew link libiconv due to some other error. This setup is now working as expected, and I can definitely see the speedier startup with my Rails 3 apps, while the aforementioned dependencies also work. Once you have installed the patch, it’s simple to compare the startup time before and after. Here’s an example with one application I’m working on:

before

time rails runner 'a=1'

real 0m21.427s
user 0m17.585s
sys 0m2.004s

after

time rails runner 'a=1'

real 0m14.725s
user 0m12.721s
sys 0m1.996s

As you can see, in this case the patch cut several seconds from the startup time, although it also depends a lot on the application and on how large it is.

The shorter startup time can make a nice difference, especially when you are running tests (unless you’re using something like spork, that is), so if you haven’t patched your Ruby install yet, for whatever reason, I’d definitely recommend it.