Why you should think twice before using awesome_print (and similar) in production

awesome_print-ruby-gem-performance-killer

In the previous two posts, I’ve shown several examples on how to prettify Ruby objects in a console, or in an application’s logs, with the purpose of making this information more readable, more user friendly, thus more useful, especially when debugging; I’ve also suggested how, by overriding an object’s inspect method, it is also possible to obtain information about the object that is not only displayed in a prettier, colourful way, but that is also more complete than what default “printers” usually return; I suggest your read those posts first – unless you have already – to get some context.

In particular, we’ve also seen how easy it is to do all this by using gems such as awesome_print, rather than having to write custom inspect methods.

At this point, whether you already knew awesome_print (and similar) or not, you may be tempted to use it also to improve your application logs in a production environment – unless you are already doing so!; normal production logs, in fact, can be pretty useless or at least difficult to read when investigating unknown problems. So using awesome_print and similar methods also in production, as shown, may sound like a good idea.

However, if you have already read the previous posts, you may also remember that I mentioned that you should be careful as to when and how you use awesome_print. The advice is particularly valid for awesome_print, but it may also apply to other Ruby gems that work in a similar way. The reason lies behind the way these gems -in most cases- access Ruby objects and their properties so that they can print the relevant information.

This may surprise some, but for example awesome_print can be a real performance killer if misused in a production environment! Let’s have a look at an example with a super simple Sinatra application -a single file, test.rb- to illustrate the issue:

So here we have a very simple Sinatra application with an ActiveRecord model, and a single route that first logs all the records to stdout (through the default inspect method thanks to the shorthand p) – just to simulate some real debugging information being logged by an application – and then returns the records count. Let’s check quickly that it works, by starting the Sinatra application, then issuing a request with wget, to ensure that the output in stdout is what we expect (the array of records):

So, as expected, we can see the output generated by p. Here we are actually not storing this debugging information in a log file, but bear with me for a second, you’ll see it doesn’t really matter. Next, let’s see for example with httperf how many requests per second can be processed. For the purpose of this test, we don’t really need precise figures nor more scientifically or mathematically accurate tests; a rough indication of the application’s performance will be more than enough:

I got around 434 req/sec. Update: As suggested by reader nicholaides, I have also included a test without logging at all for comparison, so to see what sort of impact normal logging has on performance vs no logging at all. So I have simply removed the call to the p method in the test above, leaving everything else as it was:

The result:

So this tiny sample application performs about 15% better than without logging. Again, this is just an example to give an idea, figures may vary greatly depending on the application and too many other factors, but the point here is that an application without logging is faster than with logging – which should be expected behaviour anyway.

Next, let’s change the code slightly so to use awesome_print instead of the usual inspect method (notice the call to ap instead of p):

With wget, we can verify that, as expected, the debugging information we are sending to stdout is now a lot prettier and more readable thanks to awesome_print:

But what about performance? Let’s repeat the same httperf test with the awesome_print gem enabled:

Wow, the application’s performance dropped by almost a good 30%, by just using awesome_print once vs p! And the difference is -37% if we compare with no logging. Think of a real world application, where you may want to log many objects and data, and imagine what sort of impact on performance it would have if you used awesome_print to prettify your logs!

Same test, but with a small change in the code so that we use awesome_print twice in the same route:

With inspect (p), tests show figures around 350 req/sec, while I got values around 250 req/sec with awesome_print - still around 30% performance; then again, with 3 calls to p (I’ve simply added “p PageStatus.last“), I got around 270 req/sec, and around 190 req/sec with ap. Yet again, the difference is around 30%.

What do these figures show? Essentially two things:

a) however prettier application logs can be, it’s not a good idea to use awesome_print (or similar) in a production environment as it clearly affects performance in a significant way;

b) regardless of which method(s) your application uses to log debugging information, the more information your application has to manipulate and prettify before logging it with separate p/ap (etc) calls, the slower the application itself becomes. Also, as suggested that a test I added later to the post, even a minimum amount of logging does impact the performance of an application vs no logging.

The second point should be something obvious somehow, but from my experience with other developers I am led to think that most people tend to overlook how much heavy logging can hurt performance, and I also think most people only think of performance drawbacks in association with logging to file, because of the added I/O overhead. In truth, as shown in these simple tests, performance is affected by logging regardless of whether information is stored in a file or not.

I usually prefer having as much debugging information as possible in my application logs, especially for new, large and important applications for which even a small unexpected or unidentified problem can produce significant loss in revenue for my client or employer, if it is not fixed right away. I recently built (in Sinatra) a pretty large and complex API (by Sinatra standards) which is responsible for the core operations of my former employer’s business in the online lead generation market; these operations – in particular: cleansing, validation and delivery of data – are made of several processes and elements; without a good amount of logging, it would have been very hard, in some cases, to figure out the source of a problem.

To address performance penalties while being able to collect a good amount of debugging information, I started work on a component that will collect this information in a different way to what other loggers do, with a lower impact on the performance. However it is at a very early stage (know of something similar and ready to use? please let me know!), so as a general rule following these simple tests, my advice is obviously to keep logging limited to a reasonable amount, and to always keep the previous considerations regarding performance in mind when setting up applications in production, especially if you still wish to prettify your logs with awesome_print or similar methods.




Have your say!

Please see my comment policy if this is your first time here or if you have any questions regarding comments.