has_many :codes

Why you should think twice before using awesome_print (and similar) in production

Published  

In the previous posts I’ve shown several examples on how to prettify Ruby objects in a console, or in an application’s logs, with the purpose of making this information more readable, more user friendly, thus more useful, especially when debugging; I’ve also suggested how, by overriding an object’s inspect method, it is also possible to obtain information about the object that is not only displayed in a prettier, colourful way, but that is also more complete than what default “printers” usually return; I suggest your read those posts first – unless you have already – to get some context.

In particular, we’ve also seen how easy it is to do all this by using gems such as awesome_print, rather than having to write custom inspect methods.

At this point, whether you already knew awesome_print (and similar) or not, you may be tempted to use it also to improve your application logs in a production environment – unless you are already doing so!; normal production logs, in fact, can be pretty useless or at least difficult to read when investigating unknown problems. So using awesome_print and similar methods also in production, as shown, may sound like a good idea.

However, if you have already read the previous posts, you may also remember that I mentioned that you should be careful as to when and how you use awesome_print. The advice is particularly valid for awesome_print, but it may also apply to other Ruby gems that work in a similar way. The reason lies behind the way these gems -in most cases- access Ruby objects and their properties so that they can print the relevant information.

This may surprise some, but for example awesome_print can be a real performance killer if misused in a production environment! Let’s have a look at an example with a super simple Sinatra application -a single file, test.rb- to illustrate the issue:

require "rubygems"
require "active_record"
require "sinatra"
require "sinatra/activerecord"

class PageStatus < ActiveRecord::Base
  establish_connection :host => "localhost",
    :adapter => "mysql",
    :user => "root",
    :database => "blog_engine_development"
end

get "/test" do
  p PageStatus.all
  PageStatus.count.to_s
end

So here we have a very simple Sinatra application with an ActiveRecord model, and a single route that first logs all the records to stdout (through the default inspect method thanks to the shorthand p) – just to simulate some real debugging information being logged by an application – and then returns the records count. Let’s check quickly that it works, by starting the Sinatra application, then issuing a request with wget, to ensure that the output in stdout is what we expect (the array of records):

# starting the Sinatra application
510 $ ruby test.rb
== Sinatra/1.1.2 has taken the stage on 4567 for development with backup from Thin
>> Thin web server (v1.2.7 codename No Hup)
>> Maximum connections set to 1024
>> Listening on 0.0.0.0:4567, CTRL+C to stop

# issuing a request with wget - see the records count returned
511 $ wget -qO- http://0.0.0.0:4567/test
4

# back to the application output
== Sinatra/1.1.2 has taken the stage on 4567 for development with backup from Thin
>> Thin web server (v1.2.7 codename No Hup)
>> Maximum connections set to 1024
>> Listening on 0.0.0.0:4567, CTRL+C to stop
D, [2011-01-23T22:00:20.973396 #21308] DEBUG -- : PageStatus Load (0.3ms) SELECT `page_statuses`.* FROM `page_statuses`
[#<PageStatus id: 1, name: "Draft">, #<PageStatus id: 2, name: "Published">, #<PageStatus id: 3, name: "Scheduled">, #<PageStatus id: 4, name: "Deleted">]
D, [2011-01-23T22:00:20.976202 #21308] DEBUG -- : SQL (0.2ms) SELECT COUNT(*) FROM `page_statuses`
127.0.0.1 - - [23/Jan/2011 22:00:20] "GET /test HTTP/1.0" 200 1 0.0786

So, as expected, we can see the output generated by p. Here we are actually not storing this debugging information in a log file, but bear with me for a second, you’ll see it doesn’t really matter. Next, let’s see for example with httperf how many requests per second can be processed. For the purpose of this test, we don’t really need precise figures nor more scientifically or mathematically accurate tests; a rough indication of the application’s performance will be more than enough:

604 $ httperf --server 0.0.0.0 --port 4567 --num-conns 1000 --num-calls=10 --uri /test
httperf --client=0/1 --server=0.0.0.0 --port=4567 --uri=/test --send-buffer=4096 --recv-buffer=16384 --num-conns=1000 --num-calls=10
httperf: warning: open file limit > FD_SETSIZE; limiting max. # of open files to FD_SETSIZE
Maximum connect burst length: 1

Total: connections 1000 requests 10000 replies 10000 test-duration 23.017 s

Connection rate: 43.4 conn/s (23.0 ms/conn, <=1 concurrent connections)
Connection time [ms]: min 16.6 avg 23.0 max 154.7 median 19.5 stddev 9.8
Connection time [ms]: connect 0.3
Connection length [replies/conn]: 10.000

Request rate: 434.5 req/s (2.3 ms/req)
Request size [B]: 64.0

Reply rate [replies/s]: min 426.8 avg 431.9 max 443.2 stddev 7.6 (4 samples)
Reply time [ms]: response 2.3 transfer 0.0
Reply size [B]: header 137.0 content 1.0 footer 0.0 (total 138.0)
Reply status: 1xx=0 2xx=10000 3xx=0 4xx=0 5xx=0

CPU time [s]: user 4.50 system 18.45 (user 19.5% system 80.1% total 99.7%)
Net I/O: 85.7 KB/s (0.7*10^6 bps)

Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0

I got around 434 req/sec. Update: As suggested by reader nicholaides, I have also included a test without logging at all for comparison, so to see what sort of impact normal logging has on performance vs no logging at all. So I have simply removed the call to the p method in the test above, leaving everything else as it was:

...
get "/test" do
  PageStatus.all
  PageStatus.count.to_s
end

The result:

504 $ httperf --client=0/1 --server=0.0.0.0 --port=4567 --uri=/test --send-buffer=4096 --recv-buffer=16384 --num-conns=1000 --num-calls=10
httperf --client=0/1 --server=0.0.0.0 --port=4567 --uri=/test --send-buffer=4096 --recv-buffer=16384 --num-conns=1000 --num-calls=10
httperf: warning: open file limit > FD_SETSIZE; limiting max. # of open files to FD_SETSIZE
Maximum connect burst length: 1

Total: connections 1000 requests 10000 replies 10000 test-duration 19.966 s

Connection rate: 50.1 conn/s (20.0 ms/conn, <=1 concurrent connections)
Connection time [ms]: min 14.3 avg 20.0 max 59.9 median 16.5 stddev 8.6
Connection time [ms]: connect 0.2
Connection length [replies/conn]: 10.000

Request rate: 500.9 req/s (2.0 ms/req)
Request size [B]: 64.0

Reply rate [replies/s]: min 493.8 avg 505.2 max 519.4 stddev 13.0 (3 samples)
Reply time [ms]: response 2.0 transfer 0.0
Reply size [B]: header 137.0 content 1.0 footer 0.0 (total 138.0)
Reply status: 1xx=0 2xx=10000 3xx=0 4xx=0 5xx=0

CPU time [s]: user 3.90 system 16.02 (user 19.5% system 80.2% total 99.8%)
Net I/O: 98.8 KB/s (0.8*10^6 bps)

Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0

So this tiny sample application performs about 15% better than without logging. Again, this is just an example to give an idea, figures may vary greatly depending on the application and too many other factors, but the point here is that an application without logging is faster than with logging – which should be expected behaviour anyway.

Next, let’s change the code slightly so to use awesome_print instead of the usual inspect method (notice the call to ap instead of p):

require "rubygems"
require "active_record"
require "sinatra"
require "sinatra/activerecord"
require "awesome_print"

class PageStatus < ActiveRecord::Base
  establish_connection :host => "localhost",
  :adapter => "mysql",
  :user => "root",
  :database => "blog_engine_development"
end

get "/test" do
  ap PageStatus.all
  PageStatus.count.to_s
end

With wget, we can verify that, as expected, the debugging information we are sending to stdout is now a lot prettier and more readable thanks to awesome_print:

537 $ ruby test.rb
== Sinatra/1.1.2 has taken the stage on 4567 for development with backup from Thin
>> Thin web server (v1.2.7 codename No Hup)
>> Maximum connections set to 1024
>> Listening on 0.0.0.0:4567, CTRL+C to stop
D, [2011-01-23T22:11:14.943581 #21337] DEBUG -- : PageStatus Load (0.2ms) SELECT `page_statuses`.* FROM `page_statuses`
[
[0] #<PageStatus:0x000001023fb850> {
:id => 1,
:name => "Draft"
},
[1] #<PageStatus:0x000001023fb710> {
:id => 2,
:name => "Published"
},
[2] #<PageStatus:0x000001023fb5d0> {
:id => 3,
:name => "Scheduled"
},
[3] #<PageStatus:0x000001023fb490> {
:id => 4,
:name => "Deleted"
}
]
D, [2011-01-23T22:11:14.958228 #21337] DEBUG -- : SQL (0.3ms) SELECT COUNT(*) FROM `page_statuses`
127.0.0.1 - - [23/Jan/2011 22:11:14] "GET /test HTTP/1.0" 200 1 0.0965

But what about performance? Let’s repeat the same httperf test with the awesome_print gem enabled:

601 $ httperf --server 0.0.0.0 --port 4567 --num-conns 1000 --num-calls=10 --uri /test
httperf --client=0/1 --server=0.0.0.0 --port=4567 --uri=/test --send-buffer=4096 --recv-buffer=16384 --num-conns=1000 --num-calls=10
httperf: warning: open file limit > FD_SETSIZE; limiting max. # of open files to FD_SETSIZE
Maximum connect burst length: 1

Total: connections 1000 requests 10000 replies 10000 test-duration 31.787 s

Connection rate: 31.5 conn/s (31.8 ms/conn, <=1 concurrent connections)
Connection time [ms]: min 23.0 avg 31.8 max 158.3 median 27.5 stddev 11.4
Connection time [ms]: connect 0.3
Connection length [replies/conn]: 10.000

Request rate: 314.6 req/s (3.2 ms/req)
Request size [B]: 64.0

Reply rate [replies/s]: min 301.6 avg 314.4 max 327.4 stddev 9.9 (6 samples)
Reply time [ms]: response 3.2 transfer 0.0
Reply size [B]: header 137.0 content 1.0 footer 0.0 (total 138.0)
Reply status: 1xx=0 2xx=10000 3xx=0 4xx=0 5xx=0

CPU time [s]: user 6.18 system 25.52 (user 19.4% system 80.3% total 99.7%)
Net I/O: 62.1 KB/s (0.5*10^6 bps)

Errors: total 0 client-timo 0 socket-timo 0 connrefused 0 connreset 0
Errors: fd-unavail 0 addrunavail 0 ftab-full 0 other 0

Wow, the application’s performance dropped by almost a good 30%, by just using awesome_print once vs p! And the difference is -37% if we compare with no logging. Think of a real world application, where you may want to log many objects and data, and imagine what sort of impact on performance it would have if you used awesome_print to prettify your logs!

Same test, but with a small change in the code so that we use awesome_print twice in the same route:

…
get "/test" do
  p PageStatus.all
  p PageStatus.first
  PageStatus.count.to_s
end
...

With inspect (p), tests show figures around 350 req/sec, while I got values around 250 req/sec with awesome_print – still around 30% performance; then again, with 3 calls to p (I’ve simply added “p PageStatus.last“), I got around 270 req/sec, and around 190 req/sec with ap. Yet again, the difference is around 30%.

What do these figures show? Essentially two things:

  • however prettier application logs can be, it’s not a good idea to use awesome_print (or similar) in a production environment as it clearly affects performance in a significant way;
  • regardless of which method(s) your application uses to log debugging information, the more information your application has to manipulate and prettify before logging it with separate p/ap (etc) calls, the slower the application itself becomes. Also, as suggested that a test I added later to the post, even a minimum amount of logging does impact the performance of an application vs no logging.

The second point should be something obvious somehow, but from my experience with other developers I am led to think that most people tend to overlook how much heavy logging can hurt performance, and I also think most people only think of performance drawbacks in association with logging to file, because of the added I/O overhead. In truth, as shown in these simple tests, performance is affected by logging regardless of whether information is stored in a file or not.

I usually prefer having as much debugging information as possible in my application logs, especially for new, large and important applications for which even a small unexpected or unidentified problem can produce significant loss in revenue for my client or employer, if it is not fixed right away. I recently built (in Sinatra) a pretty large and complex API (by Sinatra standards) which is responsible for the core operations of my former employer’s business in the online lead generation market; these operations – in particular: cleansing, validation and delivery of data – are made of several processes and elements; without a good amount of logging, it would have been very hard, in some cases, to figure out the source of a problem.

To address performance penalties while being able to collect a good amount of debugging information, I started work on a component that will collect this information in a different way to what other loggers do, with a lower impact on the performance. However it is at a very early stage (know of something similar and ready to use? please let me know!), so as a general rule following these simple tests, my advice is obviously to keep logging limited to a reasonable amount, and to always keep the previous considerations regarding performance in mind when setting up applications in production, especially if you still wish to prettify your logs with awesome_print or similar methods.

© Vito Botta