Make Rails Associations Faster by Optimizing Named Blocks and String Callbacks
In our previous articles we described how Rails spends much of its time garbage collecting, and that a significant speedup can be achieved by memory profiling and fixing memory allocation hotspots. In this article, we'll describe a couple more such hotspots, dealing with named block parameters and associations, and provide the patches.
Named Block Parameters Considered Harmful (for Performance)
We already wrote that passing a block to a method of an ActiveRecord::Associations::HasManyAssociation instance and its friends chews up memory. For example, a single call to association.select { |record| record.new_record? } can allocate up to 10K of memory depending on the association size. A brief look at the associations source reveals that Rails itself has similar code in many places.
Each association is a proxy to the actual array of associated object(s). method_missing seems like a good way to implement the proxy pattern in Ruby, and indeed that's what Rails does. The proxy contains an array of associated objects and forwards every method it doesn't define itself to that array. If we simplify the Rails code, we'll see something like this:
class Association
  def method_missing(method, *args, &block)
    @array.send(method, *args, &block)
  end
end
At first, we couldn't understand why this would be slow, but after some digging we got it. Each named &block parameter requires extra processing. Ruby creates a Proc object that represents the block passed and attaches a Binding object with the local execution context to that Proc. In an empty Ruby script without any variables defined, a binding will be around 400 bytes. In an actual Rails application, bindings may grow up to 10K in size. Now imagine you're doing something with an AR object and its association in a loop 100 times. Bah! 1 megabyte of memory is gone.
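To get a feel for the difference on your own interpreter, here is a rough timing sketch. The helper names with_named_block and with_yield are made up for illustration, and the numbers depend heavily on the Ruby version; newer interpreters may show a much smaller gap than the one we describe here.

```ruby
require 'benchmark'

# Hypothetical helper: the named parameter makes Ruby materialize a Proc.
def with_named_block(&block)
  block.call(1)
end

# Hypothetical helper: the anonymous block is only ever yielded to.
def with_yield
  yield 1
end

iterations = 100_000

named_time = Benchmark.realtime do
  iterations.times { with_named_block { |x| x + 1 } }
end

yield_time = Benchmark.realtime do
  iterations.times { with_yield { |x| x + 1 } }
end

puts format("named &block: %.4f sec", named_time)
puts format("yield:        %.4f sec", yield_time)
```

Both helpers return the same value; only the way the block reaches them differs.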
Each Ruby block is a closure, and it captures its complete environment at the time of creation. Ola Bini has a great article on this. So is all hope lost? No -- it turns out that MRI has different implementations for named and anonymous block parameters. When calling a function that takes an anonymous block, it simply stores a reference to the caller's stack frame. It's OK to do that since the callee is guaranteed to exit before the caller's stack frame is popped. When calling a function that takes a named block, MRI assumes that this block may be long-lived and clones the environment right there. So anonymous block parameters are much more efficient than named block parameters. Also see the related discussion on Ruby Forum.
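The "captures its complete environment" part is easy to observe. In the sketch below (capture and build_block are hypothetical names, and Binding#local_variable_get assumes Ruby 2.1 or later), the block never mentions big_local, yet the Proc created for the named parameter can still reach it through its binding:

```ruby
# Hypothetical helper: the named parameter forces Ruby to hand us a Proc.
def capture(&block)
  block
end

def build_block
  big_local = "x" * 10_000 # a local the block below never references
  capture { }
end

blk = build_block

# build_block has already returned, but its locals are still reachable,
# which means they are kept alive for as long as the Proc is.
puts blk.binding.local_variable_defined?(:big_local)
puts blk.binding.local_variable_get(:big_local).size
```

This is why a long-lived Proc can pin far more memory than the block's own code suggests.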
The optimization to Rails Association is simple - just pass a new block and yield the old one inside:
class Association
  def method_missing(method, *args)
    @array.send(method, *args) { |*block_args| yield(*block_args) if block_given? }
  end
end
This not only saves memory, but runs faster. I've benchmarked that on Acunote copying 120 objects (each with 6 associations) using ActiveRecord.
With named block parameters:
Benchmark Copy 120
memory: 97527K total in 1698240 allocations, GC calls: 13, GC time: 977 msec
time: 3.25 ± 0.05
With yields:
Benchmark Copy 120
memory: 92670K total in 1636677 allocations, GC calls: 12, GC time: 901 msec
time: 3.15 ± 0.05
As a result, 5 megabytes of memory and 100 msec are saved for good.
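One caveat with the simplified method_missing above: it hands a block to the target array even when the caller passed none, and methods such as sort behave differently once they receive a block. Here is a minimal sketch that forwards a block only when one was actually given. The Association class is the same simplified stand-in rather than the real Rails source, and respond_to_missing? is our addition for newer Rubies.

```ruby
# Simplified stand-in for a Rails association proxy; not the actual Rails code.
class Association
  def initialize(array)
    @array = array
  end

  def method_missing(method, *args)
    if block_given?
      # Anonymous block: re-yield to the caller's block, no named &block.
      @array.send(method, *args) { |*block_args| yield(*block_args) }
    else
      # No block from the caller, so pass none to the target.
      @array.send(method, *args)
    end
  end

  # Keeps respond_to? consistent with what method_missing forwards.
  def respond_to_missing?(method, include_private = false)
    @array.respond_to?(method, include_private) || super
  end
end

assoc = Association.new([3, 1, 2])
p assoc.sort                    # block-less call uses the default comparison
p assoc.sort { |a, b| b <=> a } # caller's block is honored
p assoc.reject { |n| n > 1 }
```

The extra branch costs one block_given? check per call and keeps block-less calls behaving exactly as they would on the underlying array.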
That's Cool! Where's The Patch?
- Patch for Rails 1.2
- Patch accepted and committed into trunk and 2.0-stable branch by Rick Olson
- Fix your code and remove &block's where you can
String Callbacks Considered Harmful (for Performance)
This one is even more interesting. Rails allows for string callbacks in before_save, after_save, before_destroy and so on in ActiveRecord models. Each such callback is a string that is evaluated in the context of AR object. Let me cite Rails callbacks.rb source here:
...
def callback(method)
  notify(method)
  callbacks_for(method).each do |callback|
    result = case callback
      when Symbol
        self.send(callback)
      when String
        eval(callback, binding)
      when Proc, Method
        callback.call(self)
      else
...
You see, to evaluate the string we need to get the binding. And as we all remember from our named block parameter discussion, the binding takes memory. Even when you don't use string callbacks yourself, Rails associations automatically create them for you.
For example, has_many will define 4 string callbacks. You'll get before_save, after_create and after_update to ensure that new associated records are saved when their parent record is saved; and you'll also get one for before_destroy that destroys dependent objects or nullifies their foreign keys.
Rewriting string callbacks into symbol callbacks gives a tangible performance boost. I made that change and benchmarked Acunote again.
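To make the difference between the two callback forms concrete, here is a small sketch loosely modelled on the callbacks.rb excerpt above. Record, touch_log and run_callback are hypothetical names, and the else branch is our own assumption, not what Rails does.

```ruby
# Hypothetical model; only the Symbol/String/Proc dispatch mirrors the excerpt.
class Record
  attr_reader :log

  def initialize
    @log = []
  end

  def touch_log
    @log << :touched
  end

  def run_callback(callback)
    case callback
    when Symbol
      send(callback)          # plain method dispatch, no Binding involved
    when String
      eval(callback, binding) # has to build a Binding on every call
    when Proc, Method
      callback.call(self)
    else
      # Assumption for this sketch only.
      raise ArgumentError, "unsupported callback: #{callback.inspect}"
    end
  end
end

record = Record.new
record.run_callback("touch_log") # string form
record.run_callback(:touch_log)  # symbol form, same effect
p record.log
```

Both calls end up invoking the same method; the symbol form just gets there without evaluating a string against a freshly created binding.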
With string callbacks in associations:
Benchmark Copy 120
memory: 92670K total in 1636677 allocations, GC calls: 12, GC time: 901 msec
time: 3.15 ± 0.05
With symbol callbacks in associations:
Benchmark Copy 120
memory: 39108K total in 944764 allocations, GC calls: 6, GC time: 479 msec
time: 2.45 ± 0.05
Whoa! Rewriting string callbacks to symbol callbacks saved 52 megabytes and gave 0.7 sec speedup. Nice!
That's Cool! Where's The Patch?
- Patch for Rails 1.2
- Patch for Rails 2 (trunk)
- Patch accepted and committed into trunk by Rick Olson
- Fix your code and stop using string callbacks
I think there is a typo in 1.2 patch for String callbacks for belongs_to_counter_cache_after_destroy_for context: it deals with before_destroy and turns into after_destroy.
Posted by: Valdas | February 13, 2008 at 12:46 PM
Fixed. Thanks Valdas!
Posted by: Alexander Dymo | February 13, 2008 at 01:09 PM
Dude -- you rock. The whole "memory is the problem, not CPU" light bulb is really a breakthrough for speeding stuff up, it's disappointing it's taken so long for someone to come along and realize this, but glad you have!
Posted by: Greg | February 13, 2008 at 02:13 PM
You have a problem in your first patch... in the associations/has_many_through_association.rb file you forgot to modify the call to super.
Posted by: Lucas Carlson | February 13, 2008 at 03:02 PM
Yep, I've missed that in 1.2 patch (2.0 one should be fine). I've uploaded updated 1.2 patch. Thanks, Lucas.
Posted by: Alexander Dymo | February 14, 2008 at 12:02 AM
Your work is awsome guy ;)
Continue like this !
Thanks a lots
Posted by: Kwi | February 14, 2008 at 01:35 AM
You guys are kind of people Rails community waited for several years. Thanks a lot.
Posted by: Michael Klishin | February 14, 2008 at 10:58 PM
I wonder how slow ERB is. It uses bindings for each method like form_for or capture. I don't have any idea how to optimize it. Maybe markaby is much faster due to its simplicity. I gotta do some benchmarks.
Posted by: Oleg Andreev | February 15, 2008 at 01:40 AM
Thanks for all the info. I am now up and running with all the tools.
Oleg, check out erubis at http://www.kuwata-lab.com/erubis. It is a faster implementation of ERB.
After applying the no_block_args_in_associations patch to rails 1.2.3, I got an "ArgumentError: comparison of Ownership with Ownership failed" which was caused by,
owner.ownerships.sort
where owner has many ownerships. Adding to_a between ownerships and sort fixed the problem. Did anyone else run into this?
Posted by: Paul Kmiec | February 17, 2008 at 08:17 PM
I think I know what's wrong. Patched implementations of method_missing in associations add extra block for called function even when you initially didn't pass the block.
For example, call to
owner.ownerships.sort
will be translated by method_missing into (very roughly)
owner.ownerships.target.send(:sort) { |*block_args| yield(*block_args) if block_given? }
This is wrong because sort uses the block to actually compare objects and our automatically added block obviously couldn't do that. Hence the "comparison failed" message.
I'll need to fix the patch to not pass blocks when there's no block to the original function call. Fix is coming soon...
Posted by: Alexander Dymo | February 18, 2008 at 10:05 AM
Ok, new "no named block arguments" patch for Rails trunk is ready:
http://dev.rubyonrails.org/ticket/11109
Patch for 1.2 is coming soon...
Posted by: | February 19, 2008 at 10:15 AM
I made the changes you described to Rails 1.2. Everything seems to work. Thanks!
Posted by: Paul Kmiec | February 19, 2008 at 03:31 PM
Updated patch for Rails 1.2 is at its place too:
http://pluron.typepad.com/pluron/patches/no_block_args_in_associations.patch
Posted by: Alexander Dymo | February 20, 2008 at 09:41 AM
Alexander, have you considered making these into a plugin for Rails 1.2 and 2.0? It would be much easier to just update a plugin every time you make a blog post regarding Rails performance than to hack the patch into a monkeypatch.
Posted by: Dave Myron | March 18, 2008 at 12:32 PM
Thanks for looking into this stuff. I just now realized how expensive named parameters are:
>> Benchmark.measure {10000000.times{go {}}}.real
=> 3.4910409450531
Calling a function with a named block:
>> def go 3; end
>> Benchmark.measure {10000000.times{go {}}}.real
=> 25.6680719852448
Calling a function with a named block which named block is then .call’ed
=> 37.7056579589844
versus yielding:
>> Benchmark.measure {10000000.times{yields_once{}}}.real
=> 5.47482490539551
Also with so much memory saved it will save on the garbage collecting. Rock on.
-R
Posted by: roger matching ties | July 29, 2008 at 10:46 AM
Now if we can just figure out how to get AR creation time low we'll be set. Maybe hacking the mysql adapter to create the objects for us? :)
-R
Posted by: rogerdpack | August 11, 2008 at 01:31 PM