2006 / August 2nd / Owning up

Anyone who’s ever owned a services business knows that eventually you’re going to screw up. Bad. Soon you’re going to have to explain this to your customer, and it isn’t going to go well. Many businesses take the “it’s not our fault” route and try to blame someone or something else. At first, it might seem like a good idea, but is it worth it? Sometimes it’s better to just own up to your mistakes.

Making excuses

Not more than 6 months ago, I was a semi-happy-okay-probably-disgruntled TextDrive customer. At first, I really wanted to love the company. It seemed like they were doing everything right and for all the right reasons. Part of my hosting even went to support open-source projects (how awesome is that?). I was even on the “bleeding edge,” running my Rails apps through lighttpd. I thought life couldn’t get much better.

That was, until my site started going down. A lot.

The truth is that when I finally left TextDrive, my site was down more often than it was up. TextDrive was nice enough to provide me with an RSS feed of server outages and updates. Unfortunately, it’s never a good day when you wake up to check Bloglines and notice half of your unread items are from server outages.

The truth is TextDrive screwed up. But in my eyes, the problem was not that they were screwing up. It was that every time they screwed up, it was someone else’s fault. At first it was bad hardware. Then a bad datacenter. Then processes running amok. Then they even blamed other users (for using too many resources).

Let’s take a quote from a recent forum post about some other outages:

What I do need you to understand is that recent outages have not been the result of negligence or careless sysadmin, nor have they been the cost of life on the bleeding edge, nor are they caused by customers treating shared servers like sandboxes, nor are they directly down to a single platform, framework or application. The outages have been caused by a combination of issues, albeit largely to do with the way email processing and webmail behave on FreeBSD 5.

Ahh, so it’s FreeBSD’s fault.

All these measures and more are underway, and all of them will make our lives better. It’s our job now: it’s all we do. If we’re a bit behind in answering support tickets, if we can get a little testy on this forum from salt in wounds, please understand the effort that is going into restoring and maintaining stability.

So, since servers are constantly going down, the customer needs to deal with the staff being bitchy and slow?

These kinds of responses are why I left TextDrive. It was a continual blame train with no end in sight. I know, I know. They’re working on it. I fully understand the immense amount of work that the staff puts into these problems. I understand that the life of a sysadmin is far from glamorous. But I still don’t see that as a reason to not own up to your own mistakes. If it’s not TextDrive’s fault — where am I, the customer, when the shit hits the fan? Blaming an open source project? That’s not the greatest position to be in.

Responses

As you peruse the responses to the TextDrive announcement, you notice a couple of things. First off, there aren’t that many. Fewer than twenty, and this issue happened back in June. This is because TextDrive does not publicize these letters. Rather, they’re hidden in forum posts. Furthermore, if you read through the responses, while most of them are positive, a lot of them have the customers offering suggestions. This is because of the way TxD approached the problem and blamed other people.

Then you get into the nasty stuff:

In the past week alone, Barclay has gone down, what, about 6 times? Contrast that with my previous hosting companies: Pair Networks, where I had my site from 1999 to 2003: zero unscheduled downtime, and one incident of scheduled downtime, when they relocated their data center. JohnCompanies, 2003-2004: unscheduled downtime once, scheduled downtime 4 or 5 times, but always short and during the night.

I know TxD provides a very high level of service, and I wouldn’t be surprised to hear that it is getting harder to maintain Web servers for various reasons. But I don’t know all that much about how to run Web servers–that’s why I pay you guys to do it. I think I’m a pretty patient and understanding guy, but I need dependable service.

So, any idea how much longer this is going to go on?

That last sentence sounds like it’s coming from someone who’s just had too much. They’re sick and tired and just want a solution: no more excuses. Then there are the people who admit that choosing TextDrive was a mistake:

It’s the end users that get fed up, not geeks. I host a hotel’s web site with TextDrive. When it or the email for that domain goes down, I get a call on the cell phone from a disgruntled hotel manager. I have a day job, so this is very uncool. And it’s been regular lately.

I didn’t want to admit it, because as dasil003 says, everything else about it is great, but I think it was a mistake to bring a site like this which demands reliable uptime onto TextDrive. These problems are supposedly being addressed, but I’m at the point where I’m thinking it’s too little, too late.

There are also some arguments and more excuses from the staff. Even though I wasn’t a part of TxD when this particular problem occurred, I can see why people were getting fed up. The end result is a few slightly less pissed off customers, and a few really pissed off customers who are angry with themselves for choosing TxD. Ouch.

Owning up

On the other hand, there’s always the option of simply owning up to your mistakes. Even if they aren’t your mistakes.

This is the path that DreamHost chose when their building nearly caught fire, which led to something that looked like Armageddon for the blogosphere as DreamHost, (mt), and MySpace all went down this past week.

The other thing to keep in mind is that this was sort of the straw that broke the camel’s back. Many DH customers had been suffering from a series of semi-unrelated issues that all resulted in periodic downtime, sluggishness and internal services going down. As a result, DreamHost published this article on their blog: Anatomy of an ongoing disaster. It’s one of the best business letters I’ve ever read. If there was ever a way to restore my confidence in DreamHost, this was definitely it.

Let’s take a look at the first two paragraphs on the page:

As I’m sure most of you already know, we’ve had nothing but troubles, large troubles, for pretty much the last three weeks. A lot of these troubles were our fault, a couple of them were at least ostensibly beyond our control, and they all compounded each other.

Here I’ll try and go into as much detail as possible about what happened, why, and the steps we’re taking to stop this sort of thing from ever happening again. I can’t excuse what happened, just apologize and hopefully elucidate.

Because the letter opens with a no-excuses mentality, I immediately feel like DreamHost not only understands the problem, but is taking full responsibility for it. The letter goes on to give more technical details and to explain why so many problems happened in a row, always reinforcing the fact that this was their fault, and not the hardware’s or the software’s.

Apparently they’re not supposed to be able to support this, or at least it’s a bad idea. So, this was entirely our fault.

Even problems that were clearly not their fault, they own up to:

We’re also going to be buying our own UPSes, since we have learned we can’t trust our data center OR our building to do it. We’ll start by putting the core routers on them, then our internal databases and servers, then our file servers, and finally the hundreds of customer mail, web, and database servers.

The final paragraphs are the best part (in my opinion):

I also want to say for the record that none of these problems in my opinion stemmed from “overselling”. Rather, I’d say it’s the result of bad luck. And incompetence on our (and the building’s) part.

I don’t know if we’ll be able to change our luck, but hopefully we’ve at least learned something and will be able to become a tiny bit less incompetent in the future.

I hope you’ll all stay with us to find out.

Those last statements are what really give me confidence in their staff. By owning up to their own mistakes, DreamHost puts themselves in the position of completely understanding the problem. Not only that, but since they’re the ones at fault, they can assure the customers that they can fix the problems.

Responses

In contrast to TxD’s responses, the DreamHost responses are a lot like a stadium of people applauding Steve Jobs as he introduces the latest & greatest. With well over 200 comments, it’s clear that they’ve made this letter public and advertised it on all possible channels (they even sent a link to it in the monthly newsletter).

P.s. And THANKS for the explanation – knowledge is power, and I feel a bit more powerful today :)

Statements like that are some of the best possible results of a letter explaining problems. The customer ends up feeling better about themselves.

I’m not a DreamHost customer, but I just wanted to come here and leave a comment to say I admire the way you’re owning up to the problems that have been occurring recently. Laying everything out in the open for everyone to see is the way business should be done!

They’re even getting compliments from people who aren’t customers! Now of course, that’s not to say that there aren’t those who are less than thrilled with the post:

Perhaps is Josh spent more time leading this company away from disaster and less time on the stupid newsletter’s rifed with sophmoric humor, then maybe this company has a chance to survive. Maybe now is the time to sell out.

But then again, can you really expect to please everyone? The point here is that the vast majority of the 200+ comments are entirely positive.

9 Comments

  1. franky

    August 2nd

    I loved the comparison with Steve Jobs, because little by little DH indeed reaches almost the same effect with their blog. Didn’t DH also manage it to get cheered for overselling? Besides that their argument for overselling all were valid, but until that day I had never seen any major hoster publicly admit (well yeah) and argument why they oversell.

  2. Dan

    August 3rd

    I have no personal experiences with either companies, and will probably never have the use for it… When doing business like this – Its all about attitude…

    I wont judge one company over the other… But respect to dreamhost for the blogpost, Anatomy of a(n ongoing) Disaster.. – Im impressed, this would be the one thing that would make me choose dreamhost in favour of any other hosting company its class…

  3. [...] Kyle Neath, creator of this blog’s WordPress theme, has a great post about recent uptime woes at TextDrive and Dreamhost and the differences in the way they handled telling their customers. [...]

  4. Aaron

    August 3rd

    I felt the same way a couple months ago about TxD. I was one of the first to sign up and ended up on Barclay, been there up until 2 months ago when I decided to venture out to another host. That other host costed more and provided a much lower level of service than I was used to at TxD, which is why I decided to re-instate my account. This time I popped for the higher-end business account, but even their business servers are a little on the slow side.

    This is why I am waiting for them to lower the price point on their new container hosting. Garunteed resources, no noisy neighbors to screw my site over, and I’m in control of pretty much everything on my account. Hopefully the cheaper containers come sooner rather than later, but TxD’s service is the primary reason I am still with them.

  5. Nick

    August 3rd

    I’m a Dreamhost customer and have been for over a year and a half now. This recent outage was pretty long and pretty annoying, but I read their blogpost the same way you did.

    I was satisified by the explanation, pacified by the fact that they admitted mistakes, and happy to continue hosting with them.

  6. Michael Egan

    August 5th

    I’ve also been on Dreamhost for just over a year. I’ve had my share of minor problems but even in the middle of these serious outages I was impressed with both the response time and response quality the tech support team gave me.

    Even with their transparency I’m not sure if I would trust them with a client whose business needs 100% up time – but for all my personal and hobbyist needs I’m definitely happy.

  7. Scott Mackenzie

    August 8th

    I tried out TextDrive and gave up on it within a week. This isn’t a ‘bag TextDrive ‘ comment, but ever since day 1 it didn’t feel like I’d really signed up for something worthwhile. I too was under the impression it was ‘the best’ Rails host and that everything would be super efficient and easy to get going… this wasn’t the really the case. I’m not sysadmin savvy and therefore found it a little hard to get going.

    So I signed up for Dreamhost and haven’t looked back. I you compare what you get and how much it costs, it’s an easy winner.

  8. kaiser

    February 4th

    i know exactly how you feel. i’m looking into a new host now actually. the last 4 weeks have been so shitty with service. sigh.

    i wish textdrive lived up to its hype. (i wish rails did to, but that’s another story.)

  9. [...] been following somewhat the trail of fury following TextDrive and Dreamhost around and I can’t help feeling two things: sadness for my friends who have [...]
