
Introduction to Ruby code optimization (part 2/2)

Written by Nicolas Zermati | September 27, 2012

Today I finished my introduction to Ruby code optimization. In the first part I talked about the tools used to find bottlenecks; here I'll try to give you advice, examples and practical techniques to make your code faster.

Ruby code

Caching

One of the simplest optimizations is to avoid doing the same job multiple times.

In a situation like this:

class Model
  ...
  def computed_field
    # ... some_computation ...
  end
  ...
end

you should use memoization and get something like:

class Model
  ...
  def computed_field
    @computed_field ||= # ... some computation ...
  end
  ...
end

Be careful with nil and false values, which do not play well with the previous Ruby implementation.
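
If the computation can legitimately return nil or false, a common workaround is to test whether the instance variable has been set with defined? instead of relying on ||=. A minimal sketch, where expensive_computation is a hypothetical placeholder for the real work:

class Model
  def computed_field
    # defined? is truthy once the ivar has been assigned, even to nil or false
    return @computed_field if defined?(@computed_field)
    @computed_field = expensive_computation
  end

  private

  # hypothetical placeholder for the real, costly work
  def expensive_computation
    nil # may legitimately return nil or false
  end
end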

ActiveSupport provides the Memoizable mixin to make this operation easier:

class Model
  extend ActiveSupport::Memoizable
  ...
  def computed_field
    # ... some_computation ...
  end

  memoize :computed_field
  ...
end

The Ruby version is simpler than the ActiveSupport one. Note that the Rails team deprecated Memoizable.

Memoization can be used in many situations, and it can be refined to take the method's parameters into account:

class Model
  ...
  def method(param)
    @method_cache        ||= {}
    @method_cache[param] ||= # ... some computation using 'param' ...
  end
  ...
end

It is a good idea to refine the previous technique by using a Hash with a limited size, to avoid garbage collection and memory issues.
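
A minimal sketch of such a bounded cache, where the 1_000 entry limit and the "clear everything when full" eviction policy are arbitrary assumptions and expensive_computation stands in for the real work:

class Model
  MAX_CACHE_SIZE = 1_000 # arbitrary limit, tune it to your workload

  def method(param)
    # crude eviction: throw the whole cache away once it gets too big
    @method_cache = {} if @method_cache.nil? || @method_cache.size >= MAX_CACHE_SIZE
    @method_cache[param] ||= expensive_computation(param)
  end

  private

  # stands in for the real, costly work
  def expensive_computation(param)
    param.to_s * 2
  end
end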

Compute at load time

Another way to compute things only once is to do it at load time. I've already seen code like this:

...
STATUSES = %i{state1 state2 state3 ...}

def self.valid_statuses
  STATUSES + STATUSES.map(&:to_s)
end
...

An obvious optimization is to compute the valid_statuses at load time:

...
STATUSES       = %i{state1 state2 state3 ...}
VALID_STATUSES = STATUSES + STATUSES.map(&:to_s)
...

In this example, the memoization technique could also be efficient.
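
For instance, a memoized class method would also compute the list only once, on its first use (the status values below are placeholders):

class Model
  STATUSES = [:state1, :state2, :state3] # placeholder values

  def self.valid_statuses
    @valid_statuses ||= STATUSES + STATUSES.map(&:to_s)
  end
end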

Use the appropriate data structure

In a previous post about Boggle solving (fr), I showed that algorithms and data structures can be the best optimization. In most situations, a program is slow because of them.

The Ruby Array API is nice, maybe too nice, because sometimes I forget there are other kinds of collections. Sets are overlooked: they provide excellent lookup times, ensure the uniqueness of their elements and can have better memory consumption than arrays.

Here is a simple benchmark that demonstrates the tradeoff between insertion and lookup speed:

require 'benchmark'
require 'set'

n       = 10000
random  = Random.new(1234)
range   = 0..1000
element = 1001

for c in [10, 100, 1000, 2000, 10000]
  set    = Set.new
  array  = Array.new

  puts "With #{c} elements and #{n} lookup"
  Benchmark.bm(15) do |x|
    x.report('insertion set')   { c.times do ; set << random.rand(range)   ; end }
    x.report('insertion array') { c.times do ; array << random.rand(range) ; end }
    x.report('lookup set')      { n.times do ; set.include?(element)       ; end }
    x.report('lookup array')    { n.times do ; array.include?(element)     ; end }
  end
  puts "With set size = #{set.size} and array size = #{array.size}"
  puts ""
end

If you run those tests on your machine you will see a small difference between insertion into sets and arrays, where arrays are faster. You will also see a big difference between lookups in sets and arrays, where sets are a lot faster.

Depending on the context, you should seriously consider using sets instead of arrays. For some operations like Array#uniq, efficient structures are used internally: the array is converted into a hash, the operation is applied and the result is converted back to an array.

Ruby is not JavaScript: use Struct instead of structured hashes to represent your data. Structs have faster access than hashes and give you customizable classes.

# Can be extended, replaced by another class...
Customer = Struct.new(:name, :address, :zip)

# faster access and it is Ruby
Customer.new('nicolas', '123 main street', '12345')

# slower access and it is JavaScript-style
{name: 'nicolas', address: '123 main street', zip: '12345'}
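
As the first comment above suggests, a Struct can also be extended with behavior. A small hypothetical example, passing a block to Struct.new:

Customer = Struct.new(:name, :address, :zip) do
  def label
    "#{name} <#{address}, #{zip}>"
  end
end

customer = Customer.new('nicolas', '123 main street', '12345')
customer.name  # => "nicolas"
customer.label # => "nicolas <123 main street, 12345>"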

This is not only about sets or Struct. You can solve many problems efficiently by creating a new data structure, or by using an existing one that really fits your algorithm.

Conditional optimization

Have you ever seen code like this?

def choose(string)
  result_action = nil
  options.each_pair do |pattern, action|
    if pattern =~ string
      result_action = action
      break
    end
  end
  result_action
end

It could be a simplification of a web server router, where string is the requested URL, options are the compiled routes and result_action is the action to execute to build the response (I actually used Sinatra's Base#route! method to write this example).

Imagine that you've got hundreds of options. For each request, this choose function will try each option until it finds a matching pattern. Now imagine that the most frequent matching patterns are carefully placed at the top of the options list: that is better than having them at the end of the list, isn't it? Thus, always try to put the most frequent case first.

My example may seem to be more of an algorithmic one, but the idea applies to if/elsif/else and case/when statements too, as sketched below.
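
A trivial sketch of the same idea with if/elsif, assuming most requests are reads; request is assumed to be a Rack::Request and the handler methods are hypothetical:

def dispatch(request)
  if request.get?        # most frequent case first
    handle_read(request)
  elsif request.post?
    handle_write(request)
  else
    handle_other(request)
  end
end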

Use the right iterators

reduce is my favorite iterator (Enumerable#reduce documentation); it does almost everything I want on collections. I know I use reduce too much. There are many specialized forms of reduce in the Enumerable API: map, max, group_by, select, etc. Those functions are more efficient than their reduce equivalents:

collection.map(&:field)                                  # faster
collection.map { |elem| elem.field }                     # slower
collection.reduce([]) { |acc, elem| acc << elem.field }  # slowest

Combining those functions is even faster than the reduce equivalent:

collection.map(&:email).group_by { |email| /^.*@(.*)$/ =~ email ; $1 } # faster
collection.reduce({}) do |acc, elem|                                   # slower
  email = elem.email
  /^.*@(.*)$/ =~ email
  (acc[$1] ||= []) << email
  acc
end

But chaining many of those iterator methods and building intermediate collections will eventually make the reduce version faster. When in doubt, benchmark it.

Another tip is to avoid writing this:

collection.map(&:obj).map(&:attr)

Prefer this version:

collection.map{ |elem| elem.obj.attr }

Benchmark everything

There is more than one way to do it in Ruby. Look at these examples:

  # Using an external block
  def m ; yield ; end
  def m ; Proc.new.call ; end
  def m(&block) ; block.call ; end

  # Testing that some 'obj' is not nil
  !!obj
  !obj.nil?
  obj.present? # w/ ActiveSupport

  # String formatting
  "Hello #{name}, how are you?"
  'Hello ' << name << ', how are you?'
  'Hello ' + name + ', how are you?'

Those things are sneaky but they can improve the performance of some pieces of code. The benefits are small but real:

                      user     system      total        real
yield             0.090000   0.000000   0.090000 (  0.087007)
Proc.new          0.470000   0.000000   0.470000 (  0.472521)
&block            0.400000   0.000000   0.400000 (  0.401309)
                                user     system      total        real
string interpolation (#{})  0.280000   0.000000   0.280000 (  0.278142)
string concatenation ( << ) 0.300000   0.000000   0.300000 (  0.307941)
string addition      ( + )  0.340000   0.000000   0.340000 (  0.345992)

There are many Ruby implementations and many versions of each of them. When you create a benchmark, keep it! When your Ruby implementation or its version changes, re-run your benchmark suite to quickly find out which way is faster.
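
As a reference point, here is a minimal sketch of the kind of script that produces the block-calling numbers above; the iteration count is arbitrary and the figures will obviously differ on your machine:

require 'benchmark'

N = 1_000_000 # arbitrary iteration count

def with_yield       ; yield         ; end
def with_proc_new    ; Proc.new.call ; end # Proc.new without a block was removed in Ruby 3
def with_block(&blk) ; blk.call      ; end

Benchmark.bm(10) do |x|
  x.report('yield')    { N.times { with_yield    {} } }
  x.report('Proc.new') { N.times { with_proc_new {} } }
  x.report('&block')   { N.times { with_block    {} } }
end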

Database

In web applications there are many database requests. Those requests, and the way their results are mapped to Ruby objects, must be monitored carefully. Ruby offers logging mechanisms (Logger), database systems offer logging too (PostgreSQL, MySQL, MongoDB…), and your ORM should have a logging system as well. Use those logs to detect slow, repetitive or useless requests.
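
For example, assuming an ActiveRecord-based application, pointing the logger at STDOUT during an investigation shows every generated query:

require 'logger'

ActiveRecord::Base.logger = Logger.new(STDOUT)
ActiveRecord::Base.logger.level = Logger::DEBUG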

Once you have found the culprit, you've got some choices to make…

Plain old SQL or JavaScript

Sometimes, the classic ORM behavior isn't good enough to do the job efficiently. In this situation you can use the communication layer below your ORM: SQL, JavaScript or any native interface to the database system you're using.

MongoDB uses JavaScript to execute map/reduce operations, and ORMs such as Mongoid expose this map/reduce interface. This article about map/reduce in Ruby describes this situation.

In previous posts on this blog we talked about Sequel (fr) and ARel (fr), which help to write more complex SQL than Rails's ORM, ActiveRecord. When raw SQL queries are secure and encapsulated for optimization purposes, I don't see why I should not use them, do you?

Indexes

SQL databases and some NoSQL databases such as MongoDB can increase their performance by defining well-chosen indexes.

In a recent project I forgot to run my index creation rake task. The MongoDB server was using 100% of the CPU during computation and the application was very slow: an asynchronous job that usually took a few seconds was now taking a few minutes. On the development server, mongod used only 1/2% of the CPU.

The most common asynchronous job was doing multi-criteria searches on a collection many times. After creating indexes on the right fields, everything was fine again.
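
For example, with a Mongoid 3-style model (the class and field names below are made up), the indexes can be declared in the model and created with the dedicated rake task:

class Search
  include Mongoid::Document

  field :status
  field :category

  # compound index covering the multi-criteria search
  index({ status: 1, category: 1 }, { background: true })
end

# then, on each environment:
#   rake db:mongoid:create_indexes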

Refactoring the schema

That's a rough path, but in some situations it is the best way to get improvements. There are many tools that help with relational database migrations (ActiveRecord::Migration, Sequel.migration…). For schemaless databases it's transparent (but rollbacks and data migrations need extra attention).
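
As a sketch, a hypothetical ActiveRecord migration that denormalizes a frequently joined value and indexes it could look like this:

class DenormalizeCustomerName < ActiveRecord::Migration
  def up
    add_column :orders, :customer_name, :string
    add_index  :orders, :customer_name
  end

  def down
    remove_index  :orders, :customer_name
    remove_column :orders, :customer_name
  end
end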

C code with RubyInline

Even well-tuned Ruby code can be slow. JRuby or Rubinius will eventually change this, but in the meantime you need the best possible performance. And, no surprise, we have to write C code to get it.

There are a lot of nice articles on Google about RubyInline, and there is the RubyInline tutorial on GitHub (see example1.rb then example2.rb).
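
To give an idea of what RubyInline looks like, here is a sketch adapted from its classic factorial example; the FastMath class name is mine:

require 'inline'

class FastMath
  inline do |builder|
    # the C function below becomes a regular Ruby method on FastMath
    builder.c "
      long factorial(int max) {
        int i = max, result = 1;
        while (i >= 2) { result *= i--; }
        return result;
      }"
  end
end

FastMath.new.factorial(10) # => 3628800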

The profiling strategy (as described on the official website) is:

  1. Always keep a log of your progress and changes.
  2. Run code with ‘time’ and large dataset.
  3. Run code with ‘-rprofile’ and smaller dataset, large enough to get good #s.
  4. Examine profile output and translate 1 bottleneck to C.
  5. Run new code with ‘time’ and large dataset. Repeat 2-3 if unsatisfied.
  6. Run final code with ‘time’ and compare to the first run.

We've seen how to profile Ruby code in the previous part of this article. It's now time to use it.

Conclusion

We've seen a few simple optimization techniques here. The general idea behind them is to be critical about how fast your code will execute. Share your tricks in the article's comments.

The Synbioz Team.