2007 / April 16th/ On scaling, performance, and realism
By nature I’m a front-end developer first. Programming is a hobby that I think I’m pretty good at, but the stuff I have most experience with is front-end development. Think about that word for a bit: front-end. That’s the side that people view, and the part that the clients see. We’re the last people to touch any website going out, and the people ultimately responsible for everything, because it’s the part you can see and interact with. People don’t interact with databases; they interact with computers running web browsers that read HTML, CSS, Flash, images and other plugins to display the information the user interprets.
Recently there’s been a whole hullabaloo about Rails (Alex, David, and even Mark), with ‘scaling,’ ‘performance’ and egos flying rampant. Hopefully I’ll present an alternate view that helps you form your own opinion on the subject instead of jumping to conclusions.
Performance
What is performance in a web application? Is it how many requests per second a web server can handle? Is it the number of concurrent users supported? Is it the maximum, the minimum, the average, or the 98th percentile? 95th? 90th? What the hell are we talking about?
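To make the question concrete, here is a minimal sketch in Python that summarizes one set of response times several ways. The sample data is made up for illustration (95 quick responses and 5 very slow ones), and `summarize` is a hypothetical helper, not anything from a real benchmarking tool. The percentile uses the simple nearest-rank method; other definitions exist and give slightly different answers.

```python
import statistics


def summarize(latencies_ms):
    """Return several different summaries of the same response-time sample."""
    ordered = sorted(latencies_ms)
    n = len(ordered)

    def percentile(p):
        # Nearest-rank percentile: the smallest sample with at least
        # p percent of all samples at or below it (integer ceiling).
        rank = max(1, (p * n + 99) // 100)
        return ordered[rank - 1]

    return {
        "min": ordered[0],
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "p90": percentile(90),
        "p95": percentile(95),
        "p98": percentile(98),
        "max": ordered[-1],
    }


# Hypothetical sample, in milliseconds: most requests are quick, a few are awful.
sample = [100] * 95 + [5000] * 5
summary = summarize(sample)

for name, value in summary.items():
    print(f"{name:>6}: {value} ms")
```

With this sample the median and the 95th percentile both say 100 ms, the mean says 345 ms, and the 98th percentile says 5000 ms. Same site, same requests, four very different "performance" numbers.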
Zed Shaw (mastermind behind Mongrel, among other things) wrote a rant about statistics a while ago that I love to refer to. The gist of it is: most people talking about performance don’t know what the hell they’re saying. They’ll spout numbers, publish graphs, and make broad statements without knowing what any of the numbers mean. It’s a good start to learning what the numbers mean. Once you understand that, you can move on to the next step: what the hell are we measuring?
A great analogy to this is talking about cars. What does it mean when you say “how fast is your car?” Well, there’s 1/4 mile time, 1/8 mile time, track times, xcross times, 1/4 mile speeds, 1/8 mile speeds, top speeds, 0-60 times, 0-100 times, 40-60 times, etc. You starting to get my point? Saying “my car is faster than your car” has little meaning without reference.
Scaling
Scaling is a word that programmers should be forbidden from using. I’ve heard this word misused more often than people spell ‘their’ wrong. Scaling is usually described as the capability of an application to handle an increase in load.
Hmm. But what is load? Is it pageviews? Application hits? Bandwidth? Remember, you can increase load by either forcing more interaction with your website or by realizing an increase in use of your website. This means that you may double the ‘load’ your application sees by rolling out a v2.0, or you may double the ‘load’ on your application by doubling the number of users using it. You may also increase load on your CPU, on your memory usage, or your disk IO, your network throughput, etc, etc.
Let’s go back to our car analogy. You can mod a car to make it go faster in any number of ways. You can change the track you’re racing on, change the fuel you’re running, change the power device (engine), change the weight of the car, or change the frame’s rigidity, or even change the speed of the ground moving beneath the car. There is no one-size-fits-all when it comes to scaling. You never know if you’re going to need to take a 90° turn at 50, or if you’re going to need to go 250mph. The two don’t come hand-in-hand.
Caching
Why does no one talk of caching here? David is the only one who mentions it, but few other people put caching and performance in the same bucket. To me, this is the ultimate act of ignorance. Depending on the levels of caching you use (I’m speaking platform/language agnostic), you can increase overall requests/second by factors of thousands.
Let’s say a page takes 5 minutes to generate. But it’s only generated once every month through a cron job. Subsequent cached hits take a fraction of a second to generate. What’s the point in measuring the initial generation time? It’s not the time the user will be seeing, so it’s not a metric worth measuring.
Caching is a big fucking deal. Really big. Don’t ever forget that.
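Here is a rough sketch of the "generate once, serve from cache" scenario in Python. Everything in it is an assumption for illustration: `PageCache` and `slow_report` are hypothetical names, the 0.2 second sleep stands in for the five-minute generation, and the explicit `refresh()` call stands in for the monthly cron job. A real setup would more likely use something like memcached or files on disk.

```python
import time


class PageCache:
    """Serve a stored copy of a page; regenerate it only when asked to."""

    def __init__(self, generate):
        self._generate = generate  # the slow function that builds the page
        self._page = None
        self.generations = 0       # how many times the slow path has run

    def refresh(self):
        # In the scenario above, the monthly cron job would call this.
        self._page = self._generate()
        self.generations += 1

    def get(self):
        # Fall back to generating on demand if nothing is cached yet.
        if self._page is None:
            self.refresh()
        return self._page


def slow_report():
    time.sleep(0.2)  # stand-in for the expensive five-minute generation
    return "<html>monthly report</html>"


cache = PageCache(slow_report)
cache.refresh()  # the expensive step, paid once

start = time.perf_counter()
pages = [cache.get() for _ in range(1000)]  # what visitors actually hit
elapsed = time.perf_counter() - start

print(f"generated {cache.generations} time(s), "
      f"served {len(pages)} cached hits in {elapsed:.4f}s")
```

The slow path runs once, and every request after that just returns the stored copy. Benchmarking `slow_report` tells you almost nothing about what a visitor experiences.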
What do we care about?
Let’s think about this from a different perspective. Instead of thinking about requests per second, throughput and memory usage, let’s think about the only metric that matters at all: what does the person using the site think?
In my experience, people generally categorize performance as one of three options:
- Fast
- Moderate (nothing to report)
- Slow
- Unbearable
What do these equal in requests per second? How many concurrent database connections does it require to produce a ‘fast’ website? Crap. We don’t have a way to measure that. And what about CSS and JavaScript load times? Flash? These all factor into the perception of performance, which is the real metric we need to measure.
But we can’t measure it
That’s the simple truth: we can’t possibly measure the single metric that matters for performance. So what’s the use making broad statements like claiming Rails has complete disregard for performance? What performance? Why make statements like PHP is faster than Django? Faster how?
Why is the programming community so hell-bent on attacking others’ technologies using metrics that they don’t understand and that have no relevance to the end-user?
I drive a Volkswagen GTI. I know its measly little 200ish hp turbocharged engine is nothing compared to, say, a 911 Turbo on the track. But hell if my friends and I don’t think it’s fast. And isn’t that what matters?
But I thought I heard someone once say that Hondas are faster than Volkswagens, in fact, I think they had a graph to prove it…
14 Comments

Warpspire is the place that web professional Kyle Neath writes about the web. 


April 17th | #
Great post, you’ve nailed it.
April 17th | #
You have nailed it indeed.
April 17th | #
I’d take it even further - as someone who digs through access logs quite often to generate some kind of reports out of them - what do these numbers really mean? Without additional data from some other source they are quite worthless….
April 17th | #
Kyle, it seems that you went from one extreme to the other. Granted, statistics are often misused, taken out of context, and incorrectly applied, but that doesn’t mean there is no place for them. There are well defined processes for measuring performance (yes, even web apps).
Controlling for variables like server configurations, switches, link speeds, etc. and then benchmarking each application can give us a great comparison of different platforms. One example of this setup is the specweb benchmark suite. You can give it a log file to replay the same traffic pattern, run it on the same hardware, and voila, you have hard performance metrics. Bottom line is, these things do matter.
As for the Rails debate, I think everybody realizes that Ruby is not the ‘fastest’ language out there, but what it loses in speed it gains in development time. Choosing a platform is always about tradeoffs, and Alex’s and DHH’s war is nothing but an overblown clash of viewpoints.
April 17th | #
Ilya:
I really dislike the stereotype that “Ruby is slow, but Rails saves time in development.” It’s a bad viewpoint leftover from the “scaffolding era” of Rails.
My point is that even if you are controlling those variables, many times those are the very variables that ultimately affect end-user performance. Such minor variations I’m referring to are things like processor speed, installed memory, and number of application servers.
In the real world, each server is different. What does it matter if a Ruby app is 25% slower than a Python app in idealized conditions if the Ruby app is running on a machine with double the cores and double the clockspeed as the Python app’s server?
Throwing around statistics based on static conditions is of little to no use since the end-result is based on dynamic conditions and subjective opinions.
As long as it is possible and reasonable to make your application serve pages in an acceptable time (subjective), then the rest is irrelevant.
Does it matter if your car has more torque if the wheels are spinning? Races are won on pavement, not on dynos.
April 18th | #
just use common sense. if you’ve developed in another language before, then you can tell immediately that your Rails app runs slower. you don’t have to do a benchmark.
April 18th | #
When it comes to measuring performance, it is always easy to do some statistics. But what I have found over the long time that I have been programming is:
if it is fast, think about the logic behind and most often you find another way to get things done.
For example, do you need a database or just a fast lookup table? Do you need an immediate response, or is it enough that the user thinks it is immediate? Can you do the hard work in the background, or is it necessary to do everything once the data is entered? And the same goes on and on.
April 19th | #
[...] to this warspire post about the Twitter-Rails-Scaling broo-ha-ha here since the warspire site throws a 500 error when I [...]
April 19th | #
“In my experience, people generally categorize performance as one of three options:
Fast
Moderate (nothing to report)
Slow
Unbearable “
The number of options you state seems to not scale well…
;-)
April 19th | #
Caching should be one of the first things you think about when you’re launching a site like Twitter.
MySpace learned this the hard way…
April 19th | #
DrStankus: That was quite intended. I’ve never heard someone categorize something as nothing to report. It’s a null value that belongs in the set, but does not count towards the number of options ;)
April 20 | #
I totally agree that caching is somehow forgotten by many developers. I can’t count the number of times I had to explain it and how to do it. Many people just refuse to get it. And more than that, you need to know what to cache. For that you do need statistics, or measurements.
As for performance perception, this is also very true. I have an application that showed a 10x improvement in user perceived performance, and went from unbearable to fast in your last scale, just from compressing and packing JS and CSS files, and loading them from multiple servers.
Guy
April 24th | #
Sure, caching is very important. But can you truly cache dynamic pages? If so, doesn’t it depend on the architecture of the application itself?
That said, scaling starts from the application design itself.
It’s all in the architects. It’s less so in programming languages. But choosing the right language and framework to develop the right application can make a big difference in development time which in turn affect scalability/performance to a certain extent.
Nothing is perfect, but how close you obtain perfection is what makes the difference between one or the other.
April 26th | #
[...] Ruby and Rails scaling and performance [...]