<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>SysAdmin Talk &#187; Disaster Recovery</title>
	<atom:link href="http://sysadmin-talk.org/category/disaster-recovery/feed/" rel="self" type="application/rss+xml" />
	<link>http://sysadmin-talk.org</link>
	<description>Practical advice from front-line SysAdmins</description>
	<lastBuildDate>Fri, 05 Aug 2011 16:59:32 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>7 reasons why High Availability will help you fail in even more spectacular ways than ever!</title>
		<link>http://sysadmin-talk.org/2010/10/7-reasons-why-high-availability-will-help-you-fail-in-even-more-spectacular-ways-than-ever/</link>
		<comments>http://sysadmin-talk.org/2010/10/7-reasons-why-high-availability-will-help-you-fail-in-even-more-spectacular-ways-than-ever/#comments</comments>
		<pubDate>Thu, 07 Oct 2010 11:36:18 +0000</pubDate>
		<dc:creator>Wesley David</dc:creator>
				<category><![CDATA[Disaster Recovery]]></category>
		<category><![CDATA[High Availability]]></category>
		<category><![CDATA[IT Professional]]></category>
		<category><![CDATA[SysAdmin]]></category>
		<category><![CDATA[Practical Advice]]></category>

		<guid isPermaLink="false">http://sysadmin-talk.org/?p=494</guid>
		<description><![CDATA[High availability solutions do not magically guarantee the safety and availability of your systems even if they&#8217;re working flawlessly. That n+1 failover cluster you spent all that money to build? It could just be an impressively expensive disaster waiting to happen. This article will not talk about the obvious tenets of a successful high availability [...]]]></description>
			<content:encoded><![CDATA[<p>High availability solutions do not magically guarantee the safety and availability of your systems even if they&#8217;re working flawlessly. That n+1 failover cluster you spent all that money to build? It could just be an impressively expensive disaster waiting to happen.</p>
<p>This article will not talk about the obvious tenets of a successful high availability deployment. I&#8217;m going to assume everyone reading this has taken &#8220;HA 101&#8221;. <span id="more-494"></span></p>
<p>Before you even think about purchasing a high availability system of some sort, you need to understand a few important points. It may come as a surprise to some people just how many points of consideration in a HA system are not strictly technical in nature. Here are seven things to keep in mind when designing a high availability solution.</p>
<p>Oh, but before you read any further, skip down to point four. Yes, read the fourth fact first. Come on back when you&#8217;re done with it.</p>
<p>Are you back? Great! Carry on.</p>
<p><strong>First,</strong> your high availability system will fail you if you&#8217;re protecting the wrong things. Before <em>any </em>technical decisions are made, you need to understand the business&#8217;s processes. Get input from the business leaders and see what daily tasks take place in the business that are critical. There are probably a few applications and processes out there that you either don&#8217;t know anything about or have no idea how crucial they are. A recent exploration into one of my client&#8217;s online payment gateway systems exposed a tangled web of credit card slips, multiple payment gateways and unnecessary systems. It exposed several areas that were previously unknown to me and some things that I hadn&#8217;t realized were as important as they were. I&#8217;ve worked for them for over two years and didn&#8217;t know any of that!</p>
<p>Get with the leaders of the business units and understand what is truly important to the operation of the business. It&#8217;s possible that you will end up spending a lot of your company&#8217;s money on a failover system that protects services that weren&#8217;t as important as you first thought, or were less important than other systems you weren&#8217;t aware of. Knowing <em>what </em>should be made highly available is more important than knowing <em>how </em>to make something highly available.</p>
<p><strong>Second,</strong> a high availability system will tumble down like the proverbial house of cards if ownership is monolithic and imperial. You should assign &#8220;ownership&#8221; of the service that is made highly available in two ways. First, one person or group should be a technical owner (i.e., an IT person or team) that knows the underpinnings of what makes the service tick. Next, a second person or group needs to be the <em>business </em>owner of the service, knowing what purpose that service has and how important it is. If ownership isn&#8217;t clearly defined and comes into dispute, resolving a problem with a HA system becomes much harder.</p>
<p>An IT person may want to perform maintenance at a certain time that isn&#8217;t the best option, but only someone that uses that service would know what window of time would be best for the service. A service owner may demand an upgrade at a time when technical windows are not optimal. HA is reduced to an expensive buzzword if there is no communication between the service users and the service stewards. It should probably be noted here that the most important part of a HA system is the service&#8217;s availability, not the hardware&#8217;s availability.</p>
<p>Really, this step isn&#8217;t much different from the steps that should be taken with a non-HA system. However, when a system is highly available, people tend to take unnecessary risks because &#8220;Hey! It&#8217;ll fail over!&#8221; Having a few extra eyes on it can prevent problems born from presumption.</p>
<p><strong>Third,</strong> your HA system will fail if the SysAdmin(s) caring for it are slobs. You need disciplined IT habits. If you administer your high availability systems with all the alacrity of a drunken unicyclist, you will be introduced to cold, hard pavement in a most unpleasant manner. Or at least, the unemployment line. You need to follow change control processes, update software and hardware in approved manners and have rollback procedures tested and at the ready. Of course, you can&#8217;t forget the SysAdmin trifecta either: documentation, documentation and documentation. Ignoring these habits will ensure that your high availability system will become a &#8220;wistfully available&#8221; system.</p>
<p><strong>Fourth,</strong> your HA systems will make you gnash your teeth in agony if they&#8217;re not properly secured. I would venture to say that most IT people do not consider security to be a crucial part of highly available systems. In fact, I am cursed by my own words because I placed it fourth in the list. To prove a point, I&#8217;ll leave it here.</p>
<p>If you have a HA system that isn&#8217;t patched, you will probably still have a HA system. However, you will also have made a hacker very happy for the highly available spam cannon that you provided to him after he rooted your box. Or maybe it was compromised by a more rudimentary virus that just destroyed data and you now have highly available space heaters that look suspiciously like servers. Possibly worst of all, a targeted attack could allow an intruder to steal information any time they wanted with the exception of 5 minutes and 15 seconds per year (the downtime implied with a 99.999% uptime SLA). Keep your stuff secured and patched.</p>
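<p>For reference, the downtime figure above can be derived with a little arithmetic. The following is a minimal Python sketch, not part of any vendor tool; the helper names (<code>allowed_downtime_seconds</code>, <code>format_downtime</code>) are made up for illustration, and it assumes a 365-day year.</p>

```python
# Hypothetical helpers: yearly downtime permitted by an uptime SLA.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumption: a 365-day year


def allowed_downtime_seconds(uptime_percent):
    """Return how many seconds per year a service may be down under the SLA."""
    return SECONDS_PER_YEAR * (100 - uptime_percent) / 100


def format_downtime(seconds):
    """Render a duration as minutes and seconds, rounded to the nearest second."""
    minutes, remainder = divmod(round(seconds), 60)
    return f"{minutes} min {remainder} s"


for sla in (99.9, 99.99, 99.999):
    print(sla, format_downtime(allowed_downtime_seconds(sla)))
```

<p>Under those assumptions, 99.999% works out to 5 min 15 s per year, which matches the figure above. Dropping a single nine to 99.99% allows a little over 52 minutes.</p>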
<p><strong>Fifth,</strong> highly available systems will fail in breathtaking ways if you have tunnel vision. Yes, your fancy NAS installation is geoclustered and all of your offices are replicating to an offsite datacenter. High-fives for everyone! However, it relies on LDAP authentication, and while you have two LDAP servers replicating with each other, they&#8217;re located in only one of the company&#8217;s server rooms. And there&#8217;s only one WAN link to some of the offices. And one of the geoclustered NAS nodes is in an office&#8217;s server room running on just a single circuit. And the switch closet it connects to before the line leaves the building has no lock on the door. And I think I just made you cry.</p>
<p>When designing HA solutions, don&#8217;t get caught up in the minutiae of an individual implementation of some form of HA. Remember, <a href="http://thenubbyadmin.com/2010/06/16/epic-uptime-bragging-rights-or-epic-fail/">HA is about <em>service </em>availability, not <em>system </em>availability</a>. It doesn&#8217;t matter if the servers have had uninterrupted run-time if access to the vital service that the servers are running has been interrupted because of an unreliable network connection, unavailable authentication methods or any number of other problems.</p>
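<p>One rough way to put numbers on the tunnel-vision problem: if a service needs every component in a chain to be up at once, and the components fail independently, its availability is the product of theirs. The Python sketch below assumes exactly that. The component names echo the example above, but the availability figures are invented purely for illustration.</p>

```python
import math


def serial_availability(component_availabilities):
    """Availability of a service that needs every listed component up at once.

    Assumption: components fail independently of each other.
    """
    return math.prod(component_availabilities)


# Hypothetical figures, not measurements from any real environment.
chain = {
    "geoclustered NAS": 0.99999,
    "LDAP authentication": 0.999,
    "single WAN link": 0.995,
}

service = serial_availability(chain.values())
print(f"Service availability: {service:.4%}")
```

<p>With these made-up figures the service lands at about 99.4%, worse than its weakest link, no matter how many nines the NAS cluster boasts.</p>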
<p><strong>Sixth,</strong> high availability does not change the fact that your backups are only as good as your restores. You still need verified backup and disaster recovery plans. Just because your fancy schmancy HA product is working like a charm and you&#8217;ve ramped up your change management and security habits to clinical OCD levels doesn&#8217;t mean that things can&#8217;t go horribly wrong. If an errant application stomps all over a database, do you know what you have? That&#8217;s correct. You have a highly available stomped-on database. You <em>do </em>have backups in some form, correct? Government agencies have been known to demand old emails on occasion. You do keep archived copies of all communication, right? You don&#8217;t have potentially sensitive information scattered around in PSTs, right? (*cough* <a title="Automatically import PSTs into Exchange 2010" href="http://www.red-gate.com/products/pst_importer_2010/?utm_source=sysadmintalk" target="_blank">Red Gate&#8217;s PST Importer 2010</a> *cough*) A configuration change on the spiffy firewall cluster hosed some ACLs. You do take regular configuration backups, right? Furthermore, you know how to restore those backups because you regularly practice restoration procedures&#8230; right??</p>
<p>Many times we SysAdmins can forget what we&#8217;re <em>not </em>protected from by HA systems. We are not protected from the necessity of backups, backup versions, archiving, restoration drills and disaster recovery plans.</p>
<p><strong>Seventh,</strong> a HA system does not guarantee that the product it is protecting will support being run in that configuration. Check the applications and services that you&#8217;re planning on making HA to see if they&#8217;re compatible with your plans. Some applications don&#8217;t like some kinds of HA. The clustering abilities of an operating system can be less robust than those of a third-party application. For instance, Windows Server 2008 offers some improved clustering features compared to previous versions, but some applications just don&#8217;t support being run in a Windows cluster. In that case, you might have to try a third-party clustering / synchronizing product either with software or hardware.</p>
<p>However, it&#8217;s no fun to purchase servers and set up a cluster only to have the application flop around like a freshly caught salmon. Is that transparent clustering tool really, <em>really </em>transparent in every way? Equally un-fun is having your support contract nullified while you&#8217;re on the phone during an emergency when the technician discovers you&#8217;re running it in an unsupported environment. Be careful and research your choices thoroughly.</p>
<p>Those are the seven ways that high availability systems will help you to fail in even more spectacular ways than you ever thought possible. Remember, technology is fun and useful, but it&#8217;s no silver bullet and it&#8217;s certainly no substitute for the scientific method and some good, ol&#8217; fashioned horse sense.</p>
]]></content:encoded>
			<wfw:commentRss>http://sysadmin-talk.org/2010/10/7-reasons-why-high-availability-will-help-you-fail-in-even-more-spectacular-ways-than-ever/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>High Availability: you *are* Joking, Right?</title>
		<link>http://sysadmin-talk.org/2010/02/high-availability-you-are-joking-right/</link>
		<comments>http://sysadmin-talk.org/2010/02/high-availability-you-are-joking-right/#comments</comments>
		<pubDate>Fri, 05 Feb 2010 10:28:07 +0000</pubDate>
		<dc:creator>Robert Chipperfield</dc:creator>
				<category><![CDATA[Disaster Recovery]]></category>
		<category><![CDATA[Exchange]]></category>
		<category><![CDATA[Exchange 2007]]></category>
		<category><![CDATA[Exchange Archiving]]></category>
		<category><![CDATA[Tutorials]]></category>
		<category><![CDATA[Backups]]></category>

		<guid isPermaLink="false">http://sysadmin-talk.org/?p=104</guid>
		<description><![CDATA[At the moment, I&#8217;m working in Red Gate&#8217;s Exchange Server Archiver team. One of the great things about the way we work is that I sit just across from Scott, who&#8217;s one of our sales guys, so as a developer I get to hear what customers and potential customers are asking about, what they wish [...]]]></description>
			<content:encoded><![CDATA[<p>At the moment, I&#8217;m working in Red Gate&#8217;s Exchange Server Archiver team. One of the great things about the way we work is that I sit just across from Scott, who&#8217;s one of our sales guys, so as a developer I get to hear what customers and potential customers are asking about, what they wish we did, and what they love.<span id="more-104"></span></p>
<p>Something that came up this afternoon was a question about recovery of archived messages in the event of a catastrophic failure of the Exchange environment (think: no backups, nothing, all gone). Doing some research around the area, we found the recommended practices for backing up one of the other major archiving tools. It went along these lines (name removed to protect the guilty):</p>
<p><code>PREBACKUP.BAT:<br />
REM ---------------------------------<br />
REM prebackup.bat<br />
REM This script is to put **** into read-only mode so we can run backups<br />
REM ---------------------------------<br />
net stop /y "*** Task Controller Service"<br />
net stop /y "*** Storage Service"<br />
net stop /y "*** Indexing Service"<br />
net stop /y "*** Shopping Service"<br />
regedit /s c:\readonly.reg<br />
net start "*** Storage Service"<br />
net start "*** Indexing Service"<br />
net start "*** Shopping Service"<br />
net start "*** Task Controller Service"</code></p>
<p>Hang on just one second&#8230; your recommended backup plan involves:<br />
a) Having to stop all your services, preventing users from retrieving their archived messages<br />
b) Running a regedit script?!<br />
c) Doing all this in a batch file?</p>
<p>I mean, really? In this day and age, that&#8217;s a recommended thing to do as part of your regular maintenance plan?</p>
<p>Whenever I&#8217;m supporting customers, I always get rather anxious if I think I&#8217;m even going to ask them to recycle an IIS application pool, which shouldn&#8217;t take anything down for more than a second if at all&#8230; and as for restarting a machine, well, that&#8217;s an absolute last resort.</p>
<p>Is it just me?</p>
]]></content:encoded>
			<wfw:commentRss>http://sysadmin-talk.org/2010/02/high-availability-you-are-joking-right/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
	</channel>
</rss>

