
2008 / March 17th / Scaling is for nerds

One of the most interesting bits of information from SXSW for me was a small conversation I had while walking to dinner with Jakob Heuser from Gaia Online. We were discussing frameworks and scaling, and more or less his opinion was that frameworks don’t scale. Doesn’t matter if it’s Rails, Django, or Symfony: they don’t scale in a general sense. I agree with him completely, but let me qualify that a bit first.

The moment that struck me in particular came when someone was badgering him about how big a site has to get before you run into problems scaling Rails. His response was something along the lines of: “No offense, but you won’t ever build a site popular enough to worry about it.” And it’s damn true.

Who cares if your framework can scale if it’s never going to be big enough to worry about? The truth is that almost all web ventures never become big enough to even need a separate database server, let alone a farm of memcached servers, file servers, and enough computing power to plan a trip to Orion. That doesn’t mean they won’t become profitable. You can be extremely profitable with relatively low traffic.
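To put rough numbers on that, here is a back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the visitor count, the pages per visitor, and the peak multiplier are made-up figures, not measurements from any real site, and the function names are hypothetical.

```python
# Back-of-the-envelope traffic math. Every number here is an assumption
# chosen for illustration, not a measurement from a real site.

SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_page_views_per_second(page_views_per_day: float) -> float:
    """Average load if traffic were spread evenly across the day."""
    return page_views_per_day / SECONDS_PER_DAY


def peak_page_views_per_second(page_views_per_day: float,
                               peak_factor: float = 10.0) -> float:
    """Rough peak load; peak_factor is a guessed multiplier over the average."""
    return average_page_views_per_second(page_views_per_day) * peak_factor


if __name__ == "__main__":
    # Hypothetical small site: 1,000 visitors a day viewing 5 pages each.
    small_site = 1_000 * 5
    print(f"average: {average_page_views_per_second(small_site):.3f} page views/s")
    print(f"peak:    {peak_page_views_per_second(small_site):.3f} page views/s")
```

Under those assumptions the site stays below one page view per second even at the guessed peak. Each page view can fan out into several requests, so treat this as a floor on the real request rate, but it is still nowhere near the territory where a single server runs out of room.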

So don’t worry about scaling — scaling is for nerds. By the time you hit pain points, you can bring in someone who really knows what they’re doing. Most importantly, by the time you hit pain points, you should be profitable enough to not worry about bringing in someone who knows what they’re doing.

And no, just because your site goes down for an hour while on the home page of digg.com doesn’t mean that you need to worry about scaling.

16 Comments


  1. kaiser

    March 17th | #

    agreed, but there’s something to be said about not thinking of scaling. you don’t need all the gear at the start, but you should have a path to grow with instead of burning cash when the time comes and hacking a fix together.

    or maybe it’s just me…at least nowadays i can’t fully separate infrastructure needs from actual development. doesn’t make sense to not have those considerations when developing for a medium like this, if only for curiosity’s sake.

  2. Steve

    Wow! What a painfully pessimistic and shortsighted view of what you do, and the people you do it for! I guess I should thank guys like you, since I’m one of those experts who has to come in and rewrite entire web-based applications so that they can scale.

    This is how the ‘Big Ball of Mud Design Pattern’ propagates (if you’re unfamiliar with it try Google), and possibly why web professionals are so woefully under-paid and under-valued. I hope your potential clients read this blog post first.

    I guess I’m just too old, ’cause I was always taught that any job worth doing is worth doing right.

  3. Andy Matthews

    March 18th | #

    Subscribing.

  4. Kyle

    March 18th | #

    kaiser: I think there’s always value in keeping scaling in mind, but I suppose I was more pushing for not bringing it to the forefront. I can’t tell you the number of people I know who launch a site with 3-4 servers and end up handling 1,000 uniques a day; meanwhile they easily could have launched with 1 server and focused on the product.

    Steve: Doing it right also means… well, doing it right. It means not spending thousands of hours working to optimize parts of your code that will never become bottlenecks, or worse yet — spending thousands of hours building out a fully scalable architecture for a product that will be scrapped two months down the road. Programming is about solving problems — solving the right problems, not just blindly solving every problem you could possibly encounter.

  5. Fellow Web Developer

    March 18th | #

    I think Steve’s comment articulates the point incredibly well. If you are a good web developer, then you are one who actively tries to prevent issues before they occur. The words in this article convey a completely different mindset than what a true engineer should have. It’s completely pessimistic, and seems to be written with the desire to simply have a differing opinion, and not one which actually attempts to deliver any sort of relevant/important meaning.

    The reason for the need to have a great attention to detail in your developments is not so that you can spend thousands of hours working on optimizing meaningless objects, etc. You can easily take care of that with unit testing and profiling. It is so that when your product does become successful, your program will be so secure that there is much less risk of someone or something ruining you, and hacking into your customer data, etc.

    “I guess I’m just too old, ’cause I was always taught that any job worth doing is worth doing right.”

    Can that statement be any more true?

  6. Kyle

    March 18th | #

    My post does not anywhere say anything about not optimizing code (this is a vastly different task), or anything about not paying attention to detail. Feel free to continue to argue points I did not make — that’s fine by me :)

    I’d highly suggest you guys read the first paragraph of my article again, and realize at what severity, and what context I’m referring to scaling. Or, as I said — please continue to argue for the sake of arguing.

  7. Andy Beeching

    March 19th | #

    I reckon the truth is somewhere in-between. I can appreciate how this article could be construed as pessimistic and dismissive of optimisation and scaling best practices in general, but on the other hand it doesn’t explicitly advocate being a lazy developer and not taking any of these issues into account at all.

    I would have thought if you’re writing code that is abstracted and loosely coupled with design patterns/OOP, and unit tested, along with other software engineering techniques, then, as Kyle mentions, when and if you hit scaling issues, introducing extra servers, slave db’s, daemons and all the rest of it should be easier than if you don’t use any of the aforementioned methods. I agree with his point that hundreds or even thousands of man hours up front to solve non-existent scaling problems are a mismanagement of precious development time… time that could have been put into the product at hand. This is obviously dependent on the type of project you’re working on, expected traffic, and media you might be serving (i.e. video might require the use of a CDN or dedicated servers).

    All in all mature frameworks most likely have the capacity to scale easily to a certain point, and good developers will mitigate some of the issues by using good software development practices. If your site really is large enough to require custom scaling solutions, that’s when specialists can come in and do their thing.

  8. Kyle

    March 19th | #

    Andy,

    Thanks for putting it as I had intended :) Indeed, as my first paragraph mentions, I’m referring to the scaling in reference to the points beyond “normal” scaling — the points at which frameworks start to fail, and even good development practices can cause problems.

    These kinds of scaling problems are often argued about on the web, but rarely encountered. For example, people are quick to blame Rails for Twitter’s scaling problems, but neglect to realize that Twitter’s scaling problems are not a common problem. Worrying about the problems Twitter has encountered during initial development would have been foolish. Had they spent the time worrying about it, they may have run into the same problem Pownce is having now: too much pre-planning, not enough releasing.

  9. Andy Beeching

    March 19th | #

    Kyle,

    Good point! Extreme scaling (disregarding the digg-effect) is the domain of only a few thousand sites on the web (ok I’m estimating but it can’t be a very large percentage). There was a lot of chat earlier this year about the scaling capabilities of RoR, Django, and other new frameworks, and about the possibilities of Amazon’s cloud computing service being a potential solution.

    For any new site it’s a question of balance I suppose when deciding where the dev resources are allocated (product vs optimisation/maintainability/scaling). I’m not clued-up enough on Pownce to know if it is a success or not, but it sounds like even Kevin Rose’s reputation isn’t pulling in the punters… yet. On one hand you have to show enough product to get a buzz going (say 37s and Getting Real), but if it takes off then you need to make sure the site won’t fall over!

    Ultimately it might just come down to money, how many devs you can afford (and what quality), and how much hardware you can harness.

  10. Michael

    March 20th | #

    Scalability’s a funny thing. I’d rather worry about everything but frameworks. Seriously, just be happy you have the pleasure of working with a framework if you do. I agree that you shouldn’t really have to worry about immense scalability for most sites out there.

    One of the senior devs in my company was gathering some stats the other day and he noticed we’re averaging about 2.25 million page views a day on our multiple listing app. Something like 8+ million hits a day. Scalability with this app does have a lot to do with the framework (which we’re seriously lacking), but there are also more practical things like simply setting up a proper robots.txt because search engines alone in one day would kill our servers.

    I don’t think most people have a clue about real scalability issues, so I tend to agree with you in that you just don’t have to worry about it because no site that you or I make is going to get this big. Period. You’re never gonna need three photo servers, a couple database servers, load balancers, and so forth. You’re never gonna need to worry if that one text file you call on every page is going to build up the bandwidth and be a performance issue, or whether you need to start caching crap left and right. Frameworks are good but generally won’t help or hinder the real issues.

  11. David

    April 11th | #

    There is a balance.

    I remember when I first started on my PHP Framework (CX) that I read something along the lines of “if you had anything worth saying, you wouldn’t be building a framework to blog on - you would be blogging!” And like your point above, you just need to get out and do it.

    But, a “job worth doing is worth doing right”. Big companies didn’t make it without some planning.

  12. Gabriel Kent

    April 18th | #

    hmm…

    Seriously… it’s actually pretty easy to scale now, in some cases within a utility model.

    Just got back from beers with the guys from 3tera… so falling upon this now… is interesting.

    3tera provides a superior virtual infrastructure IDE + core services & management… while ec2/appengine provide a functionally scaled down albeit more utility-ish service.

    Most think only of scaling ‘up’ while gracefully scaling ‘down’ is probably more important, because as you point out, most sites won’t need the upper capacity of a given container. Using as little as possible and being charged only for that use… is scale… too.

    bah… scale is for everyone now.

    I’ve been pushing them on the fact that even though their service is so superior, the cost of entry into their realm is too prohibitive for the OSS world. I think if some love and support were there, I bet some initial subsidization could occur to bring down the initial costs for OSS related projects. 3tera seems very open to this. For disclosure, I am a customer and a friend of the co… I would just really like to see general grid capacity behind their tech.

    …this is like ‘real names’ stuff :D

    enjoy!

    (;||<

  13. Randall

    April 24th | #

    Just to add to what Gabriel said, the good thing in all this is that worrying about scaling a site is becoming less of a concern, due to the explosion of Cloud Computing. It’s still a little more expensive than hosting a site at dreamhost, but it is quite comparable to a managed server.

    At Qrimp, http://www.qrimp.com, we use Mosso to scale the infrastructure so we can concentrate on developing the software. Mosso abstracts the hardware layer into a multipurpose infrastructure and Qrimp abstracts the software into a multipurpose platform. The benefits are compounded.

    These simplifications of layers are going to continue to feed Moore’s law. Moore focused on hardware computing power, but his ideas can be extrapolated to solving problems with technology in general. Not only is the hardware getting faster, but it’s getting easier to put it to use, because building the software is getting easier. It’s exponential growth in both directions.

  14. Kyle

    April 24th | #

    Gabriel & Randall: I see cloud computing services like EC2 and Mosso as one part of solving the scaling issue, but using them by no means solves all your problems. Processing power is only one very small aspect of scaling bottlenecks, and you never know where yours is going to hit.

    But again, almost all websites won’t even need a second server so thinking about issues like this right off the bat is careless, IMO.

  15. Jakob Heuser

    May 7th | #

    Hey,

    Good to find you post SXSW, it was really good to talk to you and I definitely remember this conversation. To a lot of the people who are being very critical, it’s important to remember that the comment “you’re never going to be big enough to worry about it” is true for a majority of companies. That doesn’t mean write bad code, and it doesn’t mean pick a framework that will be difficult to divorce yourself from later. It does mean when you are trying to demonstrate business value, you probably won’t be at a size where scaling is a concern and you should be focusing on the “demonstrating value” part.

    Once you’ve demonstrated enough value, and once you’ve enough traffic that scaling will be a concern, you’ll probably (1) know where your pain points are and (2) have gained enough experience to begin tackling them. Otherwise, you will be taking your best guesses at where your scaling won’t work and end up with the Premature Optimization Anti-Pattern.

    The original quote “frameworks don’t scale, architectures do” is attributed to Cal Henderson (though Blaine was the one to first mention it at our panel).

  16. MaxTheITpro

    May 9th | #

    I wonder how this scaling thingy plays out with a web hosting outfit like Media Temple and its grid hosting infrastructure. Could their type of service offering take some of the stress off coders having to ponder the scaling question? I’m just curious. Has anyone here run their web applications on such a hardware architecture? Great discussion though.
