Outage post-mortem

Posted by Akhil Gupta on January 12, 2014

On Friday evening our service went down during scheduled maintenance. The service was back up and running about three hours later, with core service fully restored by 4:40 PM PT on Sunday.

For the past couple of days, we’ve been working around the clock to restore full access as soon as possible. Though we’ve shared some brief updates along the way, we owe you a detailed explanation of what happened and what we’ve learned.


What happened?

We use thousands of databases to run Dropbox. Each database has one master and two replica machines for redundancy. In addition, we perform full and incremental data backups and store them in a separate environment.

On Friday at 5:30 PM PT, we had planned maintenance scheduled to upgrade the OS on some of our machines. During this process, the upgrade script checks that there is no active data on a machine before installing the new OS.

A subtle bug in the script caused the command to reinstall a small number of active machines. Unfortunately, some master-replica pairs were impacted, which resulted in the site going down.

Your files were never at risk during the outage. These databases do not contain file data. We use them to provide some of our features (for example, photo album sharing, camera uploads, and some API features).

To restore service as fast as possible, we performed the recovery from our backups. We were able to restore most functionality within 3 hours, but the large size of some of our databases slowed recovery, and it took until 4:40 PM PT today for core service to fully return.


What did we learn?

Distributed state verification

Over the past few years our infrastructure has grown rapidly to support hundreds of millions of users. We routinely upgrade and repurpose our machines. When doing so, we run scripts that remotely verify the production state of each machine. In this case, a bug in the script caused the upgrade to run on a handful of machines serving production traffic.

We’ve since added a layer of checks that requires machines to locally verify their state before executing incoming commands. This enables machines that self-identify as running critical processes to refuse potentially destructive operations.
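
As a rough illustration of the idea of local verification (not Dropbox's actual implementation), the sketch below shows a machine re-checking its own state before honoring a destructive command. The command names, the `LocalState` fields, and the `RefusedCommand` exception are all hypothetical, chosen only for this example.

```python
from dataclasses import dataclass


@dataclass
class LocalState:
    """What a machine knows about itself (hypothetical fields)."""
    serving_production_traffic: bool
    critical_processes: tuple  # e.g. ("mysqld",); names are illustrative


# Hypothetical set of commands treated as destructive.
DESTRUCTIVE_COMMANDS = {"reinstall_os", "wipe_disk"}


class RefusedCommand(Exception):
    """Raised when a machine declines to run an incoming command."""


def execute(command: str, state: LocalState) -> str:
    """Run a command only if the machine's own view of itself allows it.

    The remote caller may believe the machine is idle. This check does not
    trust that belief and verifies locally before acting.
    """
    if command in DESTRUCTIVE_COMMANDS:
        if state.serving_production_traffic or state.critical_processes:
            raise RefusedCommand(
                f"{command!r} refused: machine self-identifies as active"
            )
    # Placeholder for the real work; here we only report what would run.
    return f"executed {command}"
```

The point of the design is that a bug in the remote script can no longer be the single point of failure: an active machine says no even when it is told to reinstall.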

Faster disaster recovery

When running infrastructure at large scale, the standard practice of running multiple replicas provides redundancy. However, should those replicas fail, the only option is to restore from backup. The standard tool used to recover MySQL data from backups is slow when dealing with large data sets.

To speed up our recovery, we developed a tool that parallelizes the replay of binary logs. This enables much faster recovery from large MySQL backups. We plan to open source this tool so others can benefit from what we’ve learned.
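
The internals of the tool are not described here. As a general sketch of how log replay can be parallelized safely, assume every logged transaction touches exactly one entity (for example, one user) and that there are no cross-entity transactions. Under that assumption, each entity can be pinned to one worker so that its entries are still applied in their original order. The function names and the in-memory `log` format below are hypothetical; a real tool would read MySQL binary logs and apply them over database connections.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def partition_by_entity(log, num_workers):
    """Split an ordered log into per-worker queues, keeping per-entity order.

    `log` is a list of (entity_id, statement) pairs in commit order.
    """
    queues = [[] for _ in range(num_workers)]
    for entity_id, statement in log:
        # A stable hash sends every entry for one entity to the same worker.
        worker = zlib.crc32(str(entity_id).encode()) % num_workers
        queues[worker].append((entity_id, statement))
    return queues


def replay(queue, apply):
    """Apply one worker's entries sequentially, in their original order."""
    for entity_id, statement in queue:
        apply(entity_id, statement)


def parallel_replay(log, apply, num_workers=4):
    """Replay the log with one thread per queue."""
    queues = partition_by_entity(log, num_workers)
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # list() waits for all workers and re-raises any worker exception.
        list(pool.map(lambda q: replay(q, apply), queues))
```

Ordering is preserved per entity rather than across the whole log, which is only correct when the single-entity assumption actually holds for the data set.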

We know you rely on Dropbox to get things done, and we’re very sorry for the disruption. We wanted to share these technical details to shed some light on what we’re doing in response. Thanks for your patience and support.

Akhil
Head of Infrastructure

  • Eastrich

    Thank you so much Akhil.
    Now we understand what went on in the last 48+ hours.
    Our thanks to you and all the DB team.

  • Kamus Hadenes

    If you had posted this on the same day the issue happened instead of just saying “we’re working on it” without any decent explanation, you could have saved at least one customer (me). Don’t get me wrong, I always loved Dropbox and the language it is developed in (Python), and I was really excited when I saw that Guido was on your team, but that disrespect toward us customers was so outrageous I had no other choice but to quit. You guys did and do wonderful work and I’m sure you’re all very talented, but this lack of communication forced me to leave, regretfully. Keep up the good work and communicate better with your customers. Maybe I’ll come back one day :)

    • akerl

      Really? I’d much rather they do what they did: confirm that they’re aware of the issue quickly, and get back to the real task of fixing it. The whole reason post-mortems exist is because writing out a detailed explanation mid-crisis wastes valuable time and often requires information that isn’t fully collected until the dust has settled.

      • Kamus Hadenes

        “Our routine maintenance script had a little bug that happened to bring some servers down during an OS upgrade. Only some databases were affected, all files are safe. We’ll detail everything after we fix this, stay tuned”. Took me less than a minute to write, and they knew what the problem was in the very first hours.

        • Santiago Rojas

          Except that not all customers would understand the meaning of those words, you know? “Script”, “OS upgrade”, databases of what? All that is just more confusing for normal people.

      • Armin

        I don’t fully agree. Clearly getting things running again has priority, but good customer care would have been an approximate forecast of expected outage time. A message like “be patient, we are gradually recovering” sounds good, but was a waste of valuable time of the customers, who kept trying if they were amongst the lucky few. And some of them grew really angry. If you are losing all of your customers, the whole work recovering the service is obsolete. There are always two sides of the coin.

    • Logic

      It’s better they wait until the end, when they know what’s wrong and how to fix it, than speculating and releasing bits of information. You need to step off your pedestal. The only one being disrespectful is you. They don’t owe you anything beyond the service they offer outlined in the service agreement. You are the worst kind of customer.

      • Kamus Hadenes

        When exactly was I disrespectful? I praised their good (astonishing actually) work on the software itself and showed my opinion on their communication strategy (or lack thereof). And yes, if I’m paying or not, they DO own me and everyone else a explanation at least to keep moods cool. “We have an issue” or “we’re working on it” is nothing more than stating the obvious, and the mindless repetition of those phrases spent more of their time than a simple, one or two line explanation of the problem. I really don’t see your point here.

        • Logic

          Your “praise” is backhanded. “You guys are so great, but you didn’t do this to my unreasonable expectations so I’m leaving, but if you want me back, change this.” And no, they don’t owe* (not own) you any explanation. In fact, all they need to say is “There was an incident and it has been corrected.”, that is unless you own part of the business or have a contract specifying otherwise. You view yourself as self-important. In fact, you view yourself more important then everyone else. Your ignorant in the ways of business and your arrogant in your importance.

          Even if they knew what’s wrong, going public saying “Yeah, we know what’s wrong, but guess what, it’s not fixed yet” is no better PR then “It’s broke, please stand by”. The moment you start leaking bits of information as to whats wrong and the progress of it, you open the flood gates. All of a sudden anyone with a twitter or forum account thinks they are a CE and are owed a detailed, personal explanation of what’s wrong, when it’ll be fixed, etc. The better route is, inform people you know there’s a problem and notify them when it’s fixed. Over communicating with people (the “customers”) who can’t help the issue, is a pointless venture.

          • Joseph

            You’re, not your. If you are going to fix someone typo, at least do it right.

          • Chris Sherlock

            Shouldn’t that be “someone’s”?

          • Anthony

            “Someone’s.” It’s possessive. If you’re going to be pedantic, do it right.

    • Defenestrator

      As someone who’s worked in a similar sort of environment, posting something like this on the same day just isn’t possible. The people working on it are too busy trying to fix the problem to stop and do a write-up, and usually there’s not enough information available yet. Posting something sooner would have meant taking longer to bring service back up, and probably would have been full of speculation that may have to be retracted later. I know it’s frustrating to not know from the outside, but getting a “working on it” at the time and a next-day writeup is really quite good.

    • Andy

      I agree with Defenestrator here. They *could* have stopped and written up something like this, but it would take time. Time that could be used to fix the problem. Not only is it not practical, and sometimes not possible, but it would cause problems. Would you rather a half hour is spent writing up a detailed explanation like this every time there’s a status update, or would you rather they write brief updates and take the 4 hours to fix the issue like they did. If they’d have spent each of these 8 previous updates writing an explanation like this, those 4 hours would have not been spent fixing the problem, and it would have been fixed 4 hours later.

  • Neives

    There’s no problem, you guys are doing an excellent job and i’m a really huge fan and user :-D
    You could give users more free space as a gift for this little problem :-D :-D LOL

  • Alan

    I’m glad to get a clear update. However, I still can’t sign up for Dropbox for Business. I suspect that you turned that off and have forgotten to turn it on. Please let me sign up for a free trial of DropBox for Business!

  • http://digitizormedia.com/ Debjit Saha

    Okay so this was apparently a hoax then? The hacking incident reported by TC – http://techcrunch.com/2014/01/10/dropbox-offline-hacker-group-claims-credit/

    • Andy

      Yes it was. The hacker group just took credit for this.

  • kent

    The main communication mistake, I believe, was not posting the information regarding the disruption in a more conspicuous place–I only discovered this tech blog by accident while trying to submit a support ticket. A good solution may have been to send an email to users. I spent a good hour in a panic that my files were not being shared with coworkers during a time when many people were relying on me to disseminate files via DropBox. Surely, your PR department was not assisting with the fix and could have crafted an email to send.

  • http://www.mrgeek.me/ Ali Gajani

    Makes sense!

  • Stephen Washburn

    Good clear update. Also in my opinion as the issue took longer to resolve, you clearly got better at providing updates here on the tech blog and started pointing people to the post via twitter. I would’ve liked to see those steps happening sooner and more obviously – maybe a link to the tech blog for updates as part of the static status.dropbox.com maintenance page – but credit where credit is due you got better throughout the issue.

  • FrankenPC .

    I sincerely hope this trend of providing accurate and honest postmortems continues and spreads to consumer titans like Target (COUGH!). Bravo Dropbox.

  • Sue

    Still my photos taken on Friday are not in my files? I need them when will they arrive please ?

  • Kevin

    Put. The apology. In the first. Paragraph.

    • Chris Sherlock

      Too much text for you to read?

      • Phil Boswell

        Is it just me or are rather a lot of people confused about the distinction between “apology” and “post-mortem”?

    • Max Yankov

      Apart from the tone of that comment, it’s really good advice.

  • Sue

    It says the photos tab is turned off ??

  • http://www.miguy.com.au Michael Guy

    Does not appear to be fixed or resolved yet when i open up the photos tab,

    so the apology is quite moot.

    appreciated, but not helpful. there is no notice of this on the dropbox.com site itself nor a link to the explanatory article that the site is down.

  • callaghan001

    Great post and thanks for sharing the in-depth update. From time to time this stuff can happen and learning experiences like this one will only serve to make the product better for the future.

    Credit to you and the team for the quick response and quick restore – go enjoy a few well deserved days off now :)

  • Daniel

    Dear Dropbox team,

    I am your business customer. Like most businesses around, my business email is on my mobile phone. For your information and surprise, USA is not the center of this world and waking up my child because of an idiot message at 4am (my local time) in the night is really not necessary just to tell me Dropbox is down and you are working on it “As you may be aware, Dropbox experienced a site outage last night that occurred during internal maintenance”.

    Thank you and have a great day.

    • Daniele Sluijters

      There’s a feature called Quiet Hours on most mobile devices. By all means feel free to enable it, put your device on vibrate and/or whitelist only certain people who may wake you up during the night.

    • Henrik

      Mute your phone and stop whining!

    • Dan Walker

      Actually, it’s good that Dropbox are open and transparent about these kinds of issues and keep the user base up to date as best they can.

      The issue you have mentioned is your own. You need to learn to better set up your email client or your mobile phone. You can use mailbox rules and sleep modes on your phone for these kinds of local issues.

    • Dan

      Yea dude. You fail.

    • PROFESSOR OPINION

      You make a great point. Maybe they should just turn off Dropbox at 4am your time since I’m sure your business will never need it at 4am.

    • billycravens

      I receive emails all hours, and if you’re a normal business owner, you get such a large number of email on your phone that having a sound play with every email would be pointless. Even so, if the noise is so loud that it wakes your child, either you have an incredibly loud phone, or your house is quieter than a library. I’d recommend modifying your living arrangement or turning off email notifications, either all the time (my preference) or during the evening at least.

    • andy

      Here’s a novel idea: Put your phone on silent when you don’t want it making noise.

    • Sergey Filatov

      Have you tried DND mode on iOS or some apps for that on Android? This is not a problem, all modern smartphones have a “Sleep Mode” functionality. You can just set the times when you usually go to sleep and when you usually wake up.

    • adrian

      As a businessman you should treat such emails as very important.

  • Dan Walker

    Thanks for the update, it’s refreshing to see a company as big as Dropbox admit its problems and share its plans to remedy them in public.

  • silvia bianco

    Great! But I had big problems with the payment of my upgrade. It was charged to my card every time I tried, but I want only one upgrade. I am writing for your kind assistance; please let me know something.

  • RayZ

    So tape backups saved the day. :P

    • co_alpine

      not sure they said “tape” did they? nasty word that is, tape ;)

  • iambowen

    When doing so, we run scripts that remotely verify the production state of each machine.

    Do you use the nagios check api to get the server/DBMS status to decide which one should run the upgrade script?

  • Joshua

    My computer still cannot connect to the server. It does not synchronize. I tried on the home and school network.

  • http://na-dache.blogspot.com Matthew Carroll

    Dropbox,
    Thanks so much for your service, and working so hard to restore things after the outage. I forgot how much I depend on you guys until all of a sudden you weren’t there. Thanks for the explanation, and keep up the good work.

  • Freda

    Still not working for me. Yes, I can access it and download one file at a time into my app “forscore” but when I try to access Dropbox from that app I get an error message. In order to load the 300 plus files I need to be able to access Dropbox via the app. It worked great before it crashed. VERY frustrating.

  • freda

    Finally working. Yeah! Just in time for me to load my files for my gig today. Whew!

  • Charlie

    Still not able to use the photo tab and cannot generate links to send pictures to our clients. When will the service be fully operational?

  • Andrew

    While any outage is inconvenient, I appreciate the rundown of what happened. Far better than most service companies…

  • Santi Gutiérrez Díez

    Good job!

  • Ulf Benjaminsson

    In this post mortem there’s not even a mention about the most severe failure – the two days of information hiding, massively complicating the lives of paying customers as thousands of them kept troubleshooting in the dark.

    Even my teeny tiny electrical company knows how to do this shit. They send me an SMS *every time* there’s a power failure that might affect me, describing the problem and estimating when it will be fixed.

    I don’t expect SMSs – you can pick a more scaleable channel – but hiding the information on some sub-blog and among inane twitter spam – for days – is decidedly *not* how to do it.

    This was amateur hour. And not a single valid lesson learned. Boo.

    • Joe

      Agreed, keeping users in the dark is not a way to win business or influence customers.

  • azazeo

    Friday + evening + OS upgrade… Ok.

  • Zach

    Although it was quite an inconvenience that Dropbox was inaccessible during this time, and that (at least for me) the desktop version of Dropbox was not syncing on my machine, you guys handled this extremely well. Posting updates, restoring everything and getting it all up and running quickly. Plus, the added features to your infrastructure are great. Thank you guys at Dropbox for doing such a wonderful job.

    • DTW

      I think it is nice that so many people are positive about this and it reflects the good will that Dropbox has justifiably earned. But, the reality is that what is being sold by Dropbox and purchased by the customer is access to their data 24/7/365 with an uptime of 99.9%, which allows roughly 8.8 hours of downtime per year. To be down for two days is really major.
      And what should people expect other than that they will work to get it back up and running quickly? If they were down for much longer, there wouldn’t be much point in using the service. What they are selling is priced at a premium and they command the premium because of their reliability and ease of use. It is all about trust. I understand what happened and I am impressed by how they got it put back together, but isn’t that why I am here in the first place? That, without being down for two days is what I bought.

  • Mark

    Still not syncing properly! I can download fine but can’t upload, file just endlessly syncs…

    • adrian

      same here

  • DTW

    I understand about having reasonable expectations and that things can go wrong sometimes. But if this were an ISP outage, it would have been a major outage and well outside the uptime parameters that reasonable people could expect of a system that they rely upon.
    I did not receive an email notification, which should have been sent to all users as soon as possible. So, I wasted a lot of time trying to figure out why the major folder reorganization that I had just done wasn’t syncing. And when I went to Dropbox, the information about the outage wasn’t at all obvious. There was a banner directing me to tech.dropbox.com – you think it could have been a hot link?
    I think there were updates, and they could/should have been more frequent. I understand that they were putting all resources into recovery – I don’t really care that much about the “why.” I care that I wasted a bunch of time and lost a whole day and that I had to find out the hard way. I also care that it seems that Dropbox is priced at a premium to most other services, that I am paying for a Pro subscription, and they seem more focused on their issue, and what they learned and not on the obvious fact that a credit should be issued for service not rendered and that my lost day has a value which combined with their insufficient communication yields a loss far greater than two days of credit for my loss of access to their service. This is why some people don’t trust the cloud.

  • dluciv

    There were some words about DDOS. Did it really happen?

  • Bluesipp

    Over 800 links to my Blog still not working. Error 404. Is this a result of Friday?

    • http://andrewchamp.com/ Andrew Champ

      Dropbox should not be used as a web hosting solution for your files. Do not use it as such. I would recommend using your server.

      • Bluesipp

        Thanks for the reply. Afraid I don’t understand the term web hosting solution.
        The files I keep in my Dropbox are in the Public Folder which I believe is specifically intended for sharing via published links. Are you writing as a Dropbox representative?

        • Andy Booth

          If you don’t understand the term “web hosting solution”, frankly you shouldn’t be complaining about your blog being offline. Do the research, do it right. WordPress is not the way forward.

          • Bluesipp

            I wonder what satisfaction folk get from unhelpful remarks such as this?

        • Karthikeya

          What he means by web hosting solution is a place where you store the file and let others download. Dropbox is good, but it’s always more reliable to store the file you’re hosting on your own server, so that it does not rely on one user account at some other hosting service.

          • Bluesipp

            How very kind Karthikeya, thank you . Happily I have a separate back-up (around 3,000 files)
            Even better news is that Dropbox have sorted me out and all works perfectly again.
            Thanks

  • http://codecondo.com/ Alex

    Glad it wasn’t the 3l33t h4ck3rs messing about.

  • Benedict Eduard de Pio

    i thought it was hacking. :) thanks Dropbox tech for this info

  • Marcelo Altmann

    I would like you to share some more information about the backup tool you have built to replay MySQL binary logs. My biggest question is how you deal with transactions that need to be applied in some particular order.

    • ardillon

      The short answer is that you cannot do anything to speed up the application of binary logs. However, you can make your data set amenable to cheating. If you have “transactional entities” in the same database and explicitly do NOT support cross-entity transactions, you can play back the binlog in parallel. You need discipline and a layer of code above MySQL to do this safely. Say “user” is your entity. In every transaction for a user, add a comment “/* user: */”. When you want to replay fast, create n connections, and hash each user id against a connection. The integrity is now preserved by entity, rather than the database as a whole. It’s simple. It’s fast. It’s in Vitess. There it is called “keyspace_id”.

  • DH

    I appreciate the information in the postmortem, and I certainly recognize that, by definition, a postmortem cannot be produced until after the system outage has been resolved.

    My frustration with this situation is primarily related to how long it took for Dropbox to begin posting somewhat regular updates on this blog and on twitter. The tweet from @Dropbox on Friday evening that said “The service is back up” may have had a grain of truth in it, but was quite unhelpful knowing that all mobile and desktop apps were still not functioning, at the least. Just a few more words at that point to say, “The web site is back up. Mobile and desktop apps will not be functional until the API services are back up. We don’t yet have an ETA for completion of this recovery” would have been all I wanted.

    I feel for the staff that spent their weekend consumed in the effort to recover from what sounds like a small, but very costly mistake in a script. I’m not angry, but I am very wary of depending on Dropbox in the future.

    • http://www.henkimaa.com/ yksin

      THANKS FOR FIXING THE PROBLEM, THANKS FOR THE POST MORTEM, PLEASE CONTINUE IMPROVING COMMUNICATIONS WITH YOUR USERS

      Thank you Dropbox for the hard work you put in over the weekend to fix this problem & restore full service to most users. (I’m sorry that some users still seem to be having problems, & hope their issues can be fixed quickly too.)

      I want to reiterate what others have said, & what I’ve said elsewhere, about the OTHER problem Dropbox had over the weekend, which was in communicating accurately & forthrightly to its users about the extent of the problems. The first informative update did not appear on this blog until yesterday 1:59 PM (Pacific time). Until then, I along with numerous other users were left very much in the dark about there being a continuing systemwide problem on Dropbox’s end. This was because Dropbox had communicated on Friday night that service was restored — even though numerous users continued to be unable to sync. Please take a look at all the tickets opened over the weekend that would not have been opened had you been more forthcoming from the start. Think of the amount of time wasted by rank & file Dropbox staff responding to those tickets because upper levels of management failed in its decisionmaking about how to communicate to users.

      Truly, Dropbox, the technical issue was horrendous, but until that Sunday 1:59 update, your poor communications compounded the frustration, & seems also to have lost you some of your customer base. People who update Twitter & Facebook feeds for a company are rarely the company’s big decisionmakers & leaders; but I hope they will get the message that it is just as important to clearly communicate to customers with REAL information about the extent of a problem, & what is being done to rectify it, as it is to do the actual fixing of the problem.

      Mark V. on the user forums at https://forums.dropbox.com/topic.php?id=110222&page=36… stated it well: “I think the biggest concern I have, besides the obvious of data security and can’t access my files, is the total lack of communication. Not even on the home page does it reference an issue. People spend hours trying to trouble shoot what they think is on their end, installing and uninstalling only to find out it’s not their fault.” Exactly. Please take steps with the communication end, as well as the technical end, to make sure nothing like this happens again. Dropbox management personnel should get training in how to maintain informative transparency & candor with their customers. You’ll have a lot fewer people jumping ship if you do.

  • Grey Hodge

    As an IT professional, I really appreciate this report on what went wrong. It makes Dropbox that much easier to trust even when things go wrong. As noted in other posts, please just be more clear when communicating DURING the outage. We don’t need to slow down recovery with constant updates, just making the updates clearer and more accurate. Other than that, I’m still very happy with Dropbox.

  • Steve

    This doesn’t explain how the script impacted master slave pairs or how you lost data and had to recover from backup? Would love a more detailed description here? Do you use sync or async replication between master/slave? It seems like you should be able to lose a primary MySQL machine at anytime and not have data loss. If you aren’t built for that, you are going to have bigger problems down the road.

    • grao

      Agreed. What caused the slaves to fail?

  • Jean-Luc

    Fully agree with the previous comments. I’ve followed the progress over the weekend with some … anxiety and regretted a real lack of information. Saying ‘nothing new’ or ‘work in progress’ has a very different meaning than nothing said.
    But thanks for this postmortem. I’m still enjoying Dropbox and remain confident.

  • big cloits

    I run a little online business, and I broadcast service bulletins on multiple channels, right where people are going to find it. My customers cannot miss the bulletin — if they go looking at all, they’ll find it. This is not difficult to do.

    Dropbox, your bulletins for this incident were lame: they were not prominent, clear, or frequent enough. There are many ways you could have made it much easier for your customers to discover that there was a service failure, and what its status was. And so I wasted the better part of an hour troubleshooting pointlessly and anxiously before finally stumbling on this blog (which I easily could have continued to miss, shudder). That’ll be about $250 for my time, please.

  • http://www.henkimaa.com/ yksin

    THANKS FOR FIXING THE PROBLEM, THANKS FOR THE POST MORTEM, PLEASE CONTINUE IMPROVING COMMUNICATIONS WITH YOUR USERS

    Thank you Dropbox for the hard work you put in over the weekend to fix this problem & restore full service to most users. (I’m sorry that some users still seem to be having problems, & hope their issues can be fixed quickly too.)

    I want to reiterate what others have said, & what I’ve said elsewhere, about the OTHER problem Dropbox had over the weekend, which was in communicating accurately & forthrightly to its users about the extent of the problems. The first informative update did not appear on this blog until yesterday 1:59 PM (Pacific time). Until then, I and numerous other users were left very much in the dark about there being a continuing systemwide problem on Dropbox’s end. This was because Dropbox had communicated on Friday night that service was restored — even though numerous users continued to be unable to sync. Please take a look at all the tickets opened over the weekend that would not have been opened had you been more forthcoming from the start. Think of the amount of time wasted by rank & file Dropbox staff responding to those tickets because upper levels of management failed in its decisionmaking about how to communicate to users.

    Truly, Dropbox, the technical issue was horrendous, but until that Sunday 1:59 update, your poor communications compounded the frustration, & seems also to have lost you some of your customer base. People who update Twitter & Facebook feeds for a company are rarely the company’s big decisionmakers & leaders; but I hope those leaders will get the message that it is just as important to clearly communicate to customers with REAL information about the extent of a problem, & what is being done to rectify it, as it is to do the actual fixing of the problem.

    Mark V. on the user forums at https://forums.dropbox.com/topic.php?id=110222&page=36… stated it well: “I think the biggest concern I have, besides the obvious of data security and can’t access my files, is the total lack of communication. Not even on the home page does it reference an issue. People spend hours trying to trouble shoot what they think is on their end, installing and uninstalling only to find out it’s not their fault.” Exactly. Please take steps with the communication end, as well as the technical end, to make sure nothing like this happens again. Dropbox management personnel should get training in how to maintain informative transparency & candor with their customers. You’ll have a lot fewer people jumping ship if you do.

    Thanks to the improvements in communication beginning with that 1:59 update, I have chosen NOT to jump ship. Please continue to improve.

    • http://www.henkimaa.com/ yksin

      P.S. I personally pay for two separate Dropbox Pro accounts.

    • rad r.

      +1.
      If perception = reality, then please better manage perception by way of transparent communication.

  • Lori Winston

    Gee, if only you had more pool tables, menu choices and funky furniture in your office, it would have inspired the kind of innovative thinking that could have prevented this problem!

  • joemomma

    This is why you use box instead of Dropbox!

    • sonyxperiageek

      But their maximum upload file size limit is so shit!

      • sonyxperiageek

        To John below me (whose comment is still not yet approved): That’s the amount of cloud storage you can get. I was talking about the maximum upload size for each ‘individual’ file.

  • TVV

    I’m still not able to get to the http://www.dropbox.com website.

  • silvia bianco

    I’m still waiting for an answer…

  • thescrybbler

    The official statement says no data was lost in the outage, but I definitely lost a file update. I checked the relevant file on the Dropbox site and the file history says the last update was on 1/9 but in fact I manually uploaded it to Dropbox on 1/10 at approximately 17:45:21 -0800 PST when I fired off my last email and left the office. Check your data, people; you may have lost updates too.

  • Guest

    It’s still

  • TheNiteOwl

    I’m still missing incredibly important folders. Is there a way to make contact with Dropbox to find out what has happened to them and how to get them back?

    • TheNiteOwl

      I discovered that someone who had been sharing the folders deleted them, coincident with the Dropbox incident. I was able to restore them all following the directions I found in the help files.

  • oni

    The mobile app (PDF-expert) on the iPad is still not syncing with Dropbox…

  • gary

    My Android systems (phone & tablet) are both showing data from about a month ago!

  • Brett

    Still not able to create folders or sync with my Android device. I have clients waiting, and there is no word on the tech blog.

    • name

      You have no clients.

  • Andrew Mortimer

    So, Dropbox, what you’re saying is that no one tested the upgrade script and just *fingers crossed* ran it?

    • http://www.danesparza.net Dan Esparza

      I don’t hear that in their explanation. I hear they came across an edge case they didn’t anticipate, and they’re trying to be transparent while they learn from it.

      • Andrew Mortimer

        Shouldn’t you be testing your infrastructure, in multiple environments? No?

  • silvia bianco

    still waiting…

  • Jerry

    When will this be resolved?

  • Guest

    “The problem is solved” must mean something different to me, because when the time to upload 11 songs is listed as DAYS (29 of them, to be exact) then something is not right.

  • Linda

    I have just been trying to sync Dropbox with my partner and we cannot upload shared files. Clearly this problem is not solved. It is most alarming as we rely on Dropbox for our business.

  • tokyoguru

    As of now, “The Photos tab is currently turned off” still, and I cannot easily access the photos which I moved. And recent iCloud iWorks documents remain inaccessible in folders not files despite months of “We’re fixing it” messages. Well worth the money – hahaha!

    • anon

      Dude, it went down 2 days ago…

  • christian gibson

    Even though this explanation of the hiccup is limited (‘some of our machines’ – what percentage? what was the subtle bug? etc.) it is refreshing – or even unique – that Dropbox tells us this. Compare this to the news published by banks when an internet problem occurs or by airlines when a flight is delayed. It would be great if other providers got the message. Thumbs up for Dropbox!

  • Steve Weinrich

    Akhil,

    Thank you for your openness and frankness. The tech details are cool.

    My reply, though, is not to you, but to your peers in Customer Service and their boss.

    The lack of a prominent announcement on the home page as soon as the outage occurred clearly identifies the personality and business acumen of both the manager/director of customer service and the VP to which that person reports. Crisis events like this bring out the best and worst in people; stress someone and they will show their true colors.

    From my experience, there are two ways this went down in those early hours Saturday morning:

    1. Customer Service, seeing the situation for what it was – despite the infrastructure team’s optimistic claim that they were “minutes” away from full service recovery – said they planned to post a prominent announcement on the home page to let their customers know the problems they were experiencing were DropBox’s problem, not theirs. But the VP said “No, that will cause us to look bad” and the plan was scrapped.

    – or –

    2. Customer Service believed the infrastructure team when they said: “It’ll be up in just a few minutes…” (no offense intended, Akhil) and therefore didn’t do anything in response, hoping it would all just blow over. Worse, the VP didn’t step in and educate Customer Service about the fact that the infrastructure team can say whatever they want, but customers are experiencing the problem NOW, so the announcement needs to be made NOW.

    Either way… shame on the VP.

    Honestly, I don’t know how it actually went down, nor do I have any special knowledge or insight into the DropBox organization. Perhaps their org chart is public knowledge and I could have used names instead of titles. Who knows. Either way, I suspect there are some DropBox employees reading this, saying: “Dude! Was he in the building?! How does he know?”

    Coming from a large organization that delivers real-time services to countless customers, I’ve seen these exact same personalities come out in similar crises, and I am certain they were at play in the DropBox management team this weekend.

    My concern is not the outage: It’s tech. It happens. Deal with it.

    My concern *is* the people in management that didn’t know to put the customer first. Instead, they put themselves first, under the guise of protecting the company’s image.

    Lesson Learned: Worry about your customer first; tell them what THEY need to know, and the rest will take care of itself.

    -Steve
    Paying DropBox User

    • Yoyo

      I can’t agree more with you, Steve! During emergencies, communications with customers are critical.

  • Chris

    We are a small business that has ALL of our files on Dropbox, and functionality still has not been restored for us.

    • John
    • Ehh Derp

      Ehh, that is just plain dumb. With external HD so cheap, why would you not have a copy on that as well? Free service in the cloud and you think that that is an excellent idea to keep all your business in that basket. What if you lose internet connectivity? You would be equally as stuck. Duh…

      • NoHeDidnt_YesHeDid

        You are assuming the users only access their files via the web. With all of their files on their local machines, the only loss when/if DB falters is the ability to sync for that period – the files remain intact and updated locally, awaiting sync.

  • Anton Chevantosky

    As of Thursday Jan 16, 10:30 am, I still cannot sync my computers.

  • Josh Berkus

    FWIW, our DR stuff is somewhat better than MySQL’s in Postgres-land. You could consider a change of database platform.

    • Harold Thétiot

      Yeah, same thought: use PostgreSQL instead of MySQL, and Slony to replicate.

      • Josh Berkus

        Not sure I’d use Slony at this time; binary replication is more reliable and easier to manage.

  • http://motionthings.net/ motionthings

    So. When dropbox had a hiccup this weekend your app used about 850MB of traffic on my mobile phone. I pay about a dollar for every MB, so that comes to about $850 on my phone bill. Where do I send the invoice for that? simon@maanen.no

  • WIndowsPhone

    Where is the Dropbox app for Windows Phone 8?

    • Vikash Bajaj

      You should have thought of that before buying a Windows Phone.

    • Marcelo

      Please, Dropbox Team, we need an official Dropbox App for Windows Phone 8 !!!!

  • Peter

    Cool

  • Jorge Guimarães

    Who can help me, please? I updated Dropbox to version 2.6.2 and it doesn’t run anymore. It shows a message to contact the Dropbox team and send this message:

    pid: 6284
    appdata: u’C:\Users\Jorge\AppData\Roaming\Dropbox’
    real_path=u’C:\Users\Jorge\AppData\Roaming\Dropbox’
    mode=040777 uid=0 gid=0
    parent mode=040777 uid=0 gid=0
    dropbox_path: u’C:\Users\Jorge\Dropbox’
    real_path=u’C:\Users\Jorge\Dropbox’
    not found
    parent mode=040777 uid=0 gid=0
    HOME: None
    TMP: C:\Users\Jorge\AppData\Local\Temp
    TEMP: C:\Users\Jorge\AppData\Local\Temp
    tempdir: u’c:\users\jorge\appdata\local\temp’
    real_path=u’c:\users\jorge\appdata\local\temp’
    mode=040777 uid=0 gid=0
    parent mode=040777 uid=0 gid=0
    Traceback (most recent call last):
    File “dropboxclientmain.pyc”, line 1758, in main_startup
    File “dropboxclientmain.pyc”, line 1019, in run
    File “dropboxclientmain.pyc”, line 406, in startup_low
    File “dropboxclientmultiaccountinstance_database.pyc”, line 364, in __init__
    File “dropboxclientmultiaccountinstance_database.pyc”, line 65, in __init__
    File “dropboxsqlite3_helpers.pyc”, line 527, in __init__
    File “dropboxsqlite3_helpers.pyc”, line 496, in rebuild
    File “dropboxsqlite3_helpers.pyc”, line 256, in conn
    File “dropboxsqlite3_helpers.pyc”, line 236, in _create_conn
    File “dropboxsqlite3_helpers.pyc”, line 140, in connect
    File “dropbox_sqlite_ext__init__.pyc”, line 67, in load_extension
    OperationalError: The specified module could not be found.
    Thanks.

    • Jorge Guimarães

      It’s all okay now and again. How? Back to Dropbox 2.4.11.

  • gary

    As of Jan. 22 my Dropbox is still not working. I am migrating to Copy.

  • leigh

    All of my dropbox files are gone.
    Is there a way to call someone for help?

  • MadeByChipmunk

    30th of January and still my dropbox sync is endlessssss!

  • scwebbie

    My desktop Dropbox is still continuously syncing and won’t quit. I’m about ready to boot it off my computer. Is there a fix?