Surprise! The cloud can fail!
Even if you did not read the news, you may have noticed multiple services from different companies failing at the same time: Foursquare, Reddit, Hootsuite, Quora and several hundred other companies went down or suffered problems following the outage of Amazon Elastic Compute Cloud (EC2), a cloud computing platform used by a large number of services.
The promise of system independence that Amazon offered as a guarantee of redundancy and stability was broken: several systems in geographically separate locations failed at once, apparently because of an uncontrolled backup procedure that made countless copies of itself, in a cascade effect that rapidly consumed all available space and gave rise to what has already been called "cloudgate" or the "cloudpocalypse". Something that, indeed, should never have occurred, and which raises all kinds of doubts about the maturity and development of cloud computing as a whole.
Or does it? In fact, is it in any way different from a power plant going down? Or the failure of a drinking water supply station? If we know anything about technology, it is that it cannot be expected never to fail, and that what we must do is take appropriate measures so that when it fails (not "if it fails", because failure is something that reaches the category of metaphysical certainty), the effects of the failure are as small as possible. The power fails at my house often enough that years ago I decided to buy a modest uninterruptible power supply (UPS) for home use, and I know that such failures are perfectly common in many people's lives, not just in Spain but in other countries where I have lived. When it fails, it is a major nuisance in your daily life, if not a small catastrophe that causes problems of all kinds. And if you call the company, they apologize and basically tell you that it is a fault and they cannot do anything about it, that things fail from time to time. And we are talking about services such as electricity or water, which have been with us for many, many years, which we trust fully and on which we build many aspects of our lives, around a reliability that we take for granted.
Okay, the failure should not have occurred. As we have said on other occasions, the cloud is only as good, or as bad, as your providers are. There is no "cloud", there are companies that provide services in it. Companies in which you must establish certain levels of trust, with risks to be estimated and weighed, avoiding both one extreme (being systematically unprotected) and the other (investing more than the risk actually justifies). Both too little and too much protection cause problems, which can range from service interruption and loss of reputation to excess cost. Technology, surprise, can fail. If the possibility of that failure is critical for your company, reduce it, preferably by using different providers. A service like this blog you are reading has several immediate alert systems and several alternative procedures in case of an outage at my hosting provider, Acens, and even so, despite receiving support protocols similar to those of Acens clients whose services are infinitely more critical than mine, a daily backup is also made to Amazon. And if all else fails... it hardly matters to me, because the service provided by this page is anything but critical. The possible impact of my blog being down for a full day is practically nil, because the next day my readers will surely still be there: every day I stake much more on what may happen inside my head and, as a consequence, come out of my keyboard, than on what may happen inside my server.
The important thing is to treat an outage like this, which happened at a time of low impact (during the holiday period and on one of the lowest-traffic days of the year), as something to learn from. For Amazon, the lesson is that failures, within reason, can happen, shit happens, but that other fundamental elements such as communication must not fail. For those who have truly critical processes, with a significant impact on transactions that translates directly into economic value, the lesson is that the risk must be covered to the extent that at least part of the possible damage can be mitigated, and that this analysis is not a back-of-the-napkin calculation done once when the service was set up, but a dynamic analysis based on the different options available, the evolution of their cost, that of our operating volume, etc. A risk, cost and benefit analysis that cannot be neglected.
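To make that arithmetic a little more concrete, here is a minimal sketch in Python of the kind of comparison I have in mind. Everything in it is an assumption made for illustration: the `Mitigation` class, the function names and the idea of summarizing a protective measure as "a yearly cost plus the fraction of downtime it avoids" are hypothetical simplifications, not anyone's actual method, and the real inputs would have to be re-estimated as costs and volumes change.

```python
from dataclasses import dataclass


@dataclass
class Mitigation:
    """A hypothetical protective measure (second provider, extra backups...)."""
    name: str
    annual_cost: float         # what the measure costs per year
    downtime_reduction: float  # fraction of expected downtime it avoids (0 to 1)


def expected_annual_loss(outage_hours_per_year: float, loss_per_hour: float) -> float:
    """Expected yearly damage with no mitigation: hours down times loss per hour."""
    return outage_hours_per_year * loss_per_hour


def net_benefit(m: Mitigation, outage_hours_per_year: float, loss_per_hour: float) -> float:
    """Damage avoided by the measure minus what the measure costs."""
    avoided = expected_annual_loss(outage_hours_per_year, loss_per_hour) * m.downtime_reduction
    return avoided - m.annual_cost


def worthwhile(options, outage_hours_per_year: float, loss_per_hour: float):
    """Keep only measures that pay for themselves, best net benefit first."""
    scored = [(net_benefit(m, outage_hours_per_year, loss_per_hour), m) for m in options]
    return [m for score, m in sorted(scored, key=lambda pair: pair[0], reverse=True) if score > 0]
```

With these made-up rules, a measure is over-investment when its net benefit is negative, and being "systematically unprotected" corresponds to ignoring measures whose net benefit is clearly positive. Rerunning the comparison whenever prices or traffic change is what turns it from a napkin count into a dynamic analysis.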
AWS has allowed us to scale a complex system quickly, and extremely cost effectively. At any given point in time, we have 12 database servers, 45 app servers, six static servers and six analytics servers up and running. Our systems auto-scale when traffic or processing requirements spike, and auto-shrink when not needed in order to conserve dollars. In the ten months since we launched the public beta of our free, self-serve gamification platform we have handled over one billion API calls. Without AWS, that simply would not have been possible with our small team and limited budget. Many others have realized similar benefits from the cloud, and AWS has quickly become a critical part of the startup ecosystem. Indeed, Amazon Web Services (AWS) went down. No system is one hundred percent fault-free, and there are many lessons to be learned from all of this. But without Amazon, many things would simply be impossible. It is simply a balance of cost versus benefit.
Keith Smith, CEO of BigDoor, one of the companies affected by the Amazon Web Services (AWS) outage
For Amazon, the failure is going to mean a significant loss. Many things can fail, but what should not fail is the essence of what you promised your customers (completely independent systems) and your communication with them. Cloud computing is in its infancy, and we will see failures like yesterday's on numerous occasions. But just as tangible as those failures are its advantages in terms of scalability, flexibility, cost, performance, efficiency and many others, to the point of becoming fundamental advantages that, for many companies, make the difference between existing and not existing: the lowering of entry barriers that makes possible many things that otherwise would not be. Which does not mean that, like everything else, it will not fail from time to time.