
Linux.com

Feature

Best practices for the Linux home office, part 3

By Corinne McKay and Daniel J. Urist on August 19, 2005 (8:00:00 AM)


In parts one and two of this series on using Linux in a small office, we covered what to look for in hardware and the operating system, security concerns, and choosing an Internet service provider. In this final installment, we'll talk about protecting the data on your hard drive.

The hard drive is the single most likely point of failure in your computer, and the most critical component. While power supplies also frequently fail, modern journaled file systems will generally keep you from losing your data if this happens. If your machine has a single hard drive and nonexistent or insufficient backups, losing the hard drive may literally mean losing your business. A close relative of ours learned this the hard way. He ran a home office on a single hard drive machine with no reliable backup system. When the hard drive died, he lost two years' worth of work, and spent several thousand dollars on data recovery that took several weeks and was only partially successful. Using RAID can turn a hard drive failure from a business-ending catastrophe into a minor inconvenience.

RAID, or Redundant Array of Independent (or Inexpensive) Disks, is a configuration of multiple hard drives to achieve fault tolerance and/or performance benefits on your system. For our purposes, the RAID level we're interested in is RAID 1, also known as mirroring, meaning that the two drives contain identical information. Performance-wise, RAID 1 is theoretically a little faster for reads and a little slower for writes than a single hard drive, but this depends on the implementation. Practically speaking, there's no significant performance difference between using RAID 1 and using a single disk. Both hard drives should have the same capacity; it's simplest to use two of the same model.

When adding a second hard drive to your machine, be aware that the additional heat generated by the second drive may require additional cooling fans. Refer to the case manufacturer's documentation for recommendations.

You can implement RAID 1 in two different ways, using software or hardware. Hardware RAID requires a RAID-capable hard drive controller. The RAID device is transparent to the operating system and looks and performs like a single disk as far as the OS is concerned. This means that all RAID administration is done through the hardware device, requiring software that talks directly to the hardware device, usually including a BIOS-level program to configure the device. Some higher-end hardware RAID devices also provide administrative software that runs under the operating system. However, the software, which always comes from the hardware RAID vendor, is usually not free, and it may not work with your version of Linux.

An advantage of hardware RAID is that many recent motherboards have built-in hardware RAID, and the user doesn't have to do any operating system configuration to set up RAID (although you will need a driver that supports the RAID device itself). Most popular hardware RAID devices are supported under Linux. A significant disadvantage is that it may be impossible to convert a non-RAID machine to a RAID machine without completely backing up the existing hard drive, rebuilding it with a second drive as a RAID device, and then restoring your data. Another major issue is failure notification. If there is operating system-level administrative software, it may provide automated failure notification, but otherwise, you'll have to carefully monitor your log files for failures.

Software RAID achieves the same end as hardware RAID, but software RAID is handled by the operating system rather than by a hardware RAID controller. Creating and installing onto software RAID devices is provided by most major Linux distributions. You can also convert an existing single disk installation into a RAID 1 installation, although it's tricky. Advantages of Linux software RAID include automated failure notification by email that is easy to set up, and the fact that software RAID can provide more redundancy than hardware RAID.
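As a rough illustration of what detecting a failed mirror involves, the sketch below scans text in the format the Linux md driver reports in /proc/mdstat and lists arrays with a missing member. The function name and the sample text are our own illustrative assumptions, and the parsing covers only the common layout in which a status field such as [UU] or [U_] follows the array's header line; it is not how mdadm itself is implemented.

```python
import re

# Hypothetical sample in the style of /proc/mdstat: md0 is healthy, md1 has lost a disk.
SAMPLE_MDSTAT = """\
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]

md1 : active raid1 sda2[0]
      78043648 blocks [2/1] [U_]

unused devices: <none>
"""


def degraded_arrays(mdstat_text):
    """Return the names of md arrays whose status field shows a missing disk."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        # An array stanza starts with a line such as "md0 : active raid1 ...".
        header = re.match(r"^(md\S*)\s*:", line)
        if header:
            current = header.group(1)
            continue
        # The following line carries a field such as "[UU]" (all members up)
        # or "[U_]" (one member missing).
        status = re.search(r"\[([U_]+)\]", line)
        if current and status:
            if "_" in status.group(1):
                degraded.append(current)
            current = None
    return degraded


print(degraded_arrays(SAMPLE_MDSTAT))
```

On a live system the same function could be given the contents of /proc/mdstat instead of the sample text.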

While most distributions allow you to set up software RAID at install time, you will probably have to manually configure the RAID subsystem to send you email alerts of failures. You can do this by editing the mdadm.conf file (usually /etc/mdadm.conf or /etc/mdadm/mdadm.conf) and adding a MAILADDR directive; see the mdadm man page for details. You will also need to have a working local mail subsystem.
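Before relying on email alerts, it can be worth confirming that the configuration file really names a recipient. The following is a minimal sketch that looks for MAILADDR lines in mdadm.conf-style text; the function name, the sample configuration, and the address admin@example.com are hypothetical, and the comment handling is a simplification of the rules in the mdadm.conf man page.

```python
# Hypothetical configuration text; the address is a placeholder.
SAMPLE_CONF = """\
# example configuration for illustration only
DEVICE partitions
ARRAY /dev/md0 devices=/dev/sda1,/dev/sdb1
MAILADDR admin@example.com  # where failure alerts go
"""


def mail_recipients(conf_text):
    """Return the addresses found on MAILADDR lines of mdadm.conf-style text."""
    recipients = []
    for line in conf_text.splitlines():
        words = []
        for word in line.split():
            # A word starting with "#" begins a comment; ignore the rest of the line.
            if word.startswith("#"):
                break
            words.append(word)
        if words and words[0].upper() == "MAILADDR":
            recipients.extend(words[1:])
    return recipients


print(mail_recipients(SAMPLE_CONF))
```

An empty result would suggest that no one will be told when a disk fails.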

It's critical to install the RAIDed hard drives on separate controller channels if you're using your motherboard's on-board IDE controller. If this isn't possible because you already have too many IDE devices, then you will need to install a second IDE controller. This has the added benefit of providing redundancy at the controller level, and this configuration is popular in enterprise-level installations. While a controller failure is far less likely than a hard drive or power supply failure, it will mean downtime if both of your RAID disks are attached to a single controller. However, it's rarely catastrophic, since it probably won't affect the data on your hard drives.

Backups and disaster recovery

Without a backup and disaster recovery plan, your small business is teetering on the edge of disaster. The first step toward safety is to assess your backup and recovery needs. Do you need data backups only, high availability, or a full disaster recovery and business continuation plan? Do you need to back up your applications and operating system, or just your user files? When developing your backup system, your guiding principle should be: how much data are you willing to lose, and how much effort and money are you willing to spend to make sure that your business isn't disrupted in the event of a catastrophic event?

Although it involves a great deal of worst case scenario planning, needs assessment is critical to the continuation of your business in the event of a disaster. Various factors, technical and otherwise, should enter into your decision. For instance, how long can you go without working before your clients look for someone else to do the job, or before you have to take another job to generate income? Consider whether the information you work with is available elsewhere. If not, backups are doubly important. Finally, consider whether your business needs a physical location to work from. In some business sectors, home office workers can get by with a laptop in a cafe with Wi-Fi, while those who have daily face-to-face meetings with clients must consider where to meet if the office is a smoking ruin.

The nature of your business will also dictate how long you need to keep your backups, and how long your backup media needs to last. For instance, if you work with financial records in the US, you may be legally obligated to have your data accessible for seven years, meaning your media needs to last that long. Recent legislation in the US, such as the Sarbanes-Oxley Act, has placed even more emphasis on corporate data security.

The primary purpose of backups is not to restore your machine in the event of an avoidable hardware failure. If you've done your homework and built a highly available machine with RAID and ECC RAM, the main purpose of backups should be to retrieve deleted files and recover from a true disaster, such as a flood or fire. An event such as a failed hard drive should be an inconvenience, not a disaster. It's a good idea to archive your whole system on a periodic basis, though you're unlikely to run backups every day if it involves sitting at your computer and swapping five CDs while the backup program runs. Your daily data backups are most likely to get done on schedule if they run unattended.

Your choice of Linux distribution will affect how you do backups. Choose a generic install, trim out the packages you don't need, and there's not much of an imperative to back up your OS; just download it again if you have to. Modern package managers such as APT and yum make it simple to reinstall packages when necessary. In addition, not backing up your OS makes it easier to fit all of your backed up material on one disk, so your backups can run unattended.

The next step is to choose your backup media. Excluding the option of third-party Internet-storage services, or simply emailing yourself copies of important work-in-progress, options for backup include tape, hard drives, and optical media such as CDs or DVDs. Let's look at the pros and cons of each.

Tape used to be the preferred backup medium, and digital linear tape (DLT) format is still the gold standard for enterprise backups. Tape is impractical for most home use because hard drive capacities have outstripped those of low-cost tape formats, leaving only the option of buying an extremely expensive enterprise-quality tape drive and expensive tapes to go with it. One advantage of modern DLT backup tapes is that they are extremely stable. Depending on the nature of your business, there are cases where buying a tape drive may make sense, particularly if you need to keep your backed-up data for a long time. DLT tapes are rated to last up to 30 years under optimal storage conditions.

Hard drives have recently gained in popularity as a backup medium due to their low cost, large capacity, and high speed. Options include using a removable USB hard drive, putting extra hard drives in your primary computer, or purchasing another computer to use as a backup device. While you might be able to buy an additional internal hard drive for as little as $50, the disadvantage of backing up to a hard drive is that the drive itself is relatively fragile, and it's not portable when it's installed inside a computer. For backing up very small amounts of data and if portability is a major concern, a keychain-style USB flash drive is a good option, since it is physically more stable than a conventional hard drive.

Optical media such as CDs and DVDs are good options for home office users. Almost every computer produced today has an optical media drive. The media itself is inexpensive, relatively stable, and has reasonable storage capacity. CDs and DVDs are also extremely portable, making them a good choice for off-site backups. Downsides include the fact that optical media is slow to write to and slower to read from than a hard drive. Optical media also does not last forever. Most unrecorded optical media is estimated to have a shelf life of between five and 10 years; how long your recorded disks will last depends on how you store them and whom you believe, so if this is a concern, you should check the manufacturer's information on the product. Estimates range from 20 to 100 years for CD-RW, although some reports advise not using CD-RW for long-term backups at all.

Selecting backup software

For making a complete clone of your system, a program such as Mondo Rescue is excellent. Traditional utilities such as dump, tar, and cpio also work but are less convenient to use. All these programs can back up to your choice of media.

There are fewer good, simple options for unattended backups of user data. In our home office, we use the application Sitback, which can back up to your choice of media. Sitback's virtues include easy scheduling of backups to run at a time when you aren't working and reliable notification of backup success and failure. Sitback stores its archives in tar format, so you don't need Sitback to read the archives; needing the original program is often an issue with proprietary backup software. In our case, we schedule Sitback to run at 3 a.m. and pare down our data to fit everything on one CD-RW. We run a full backup each night and no incremental backups. The rotation uses seven CD-RWs, each labeled for a day of the week, and we swap the disks each morning. Sitback does not directly support encryption, so it's not a good option for offices handling confidential data.

Another option for user data backups is the combination of DAR (Disk Archive), its scheduling program SaraB, and the GUI front end KDar. These programs work well for installations that won't fit on one piece of optical media and that alternate between full and incremental backups, or that need to encrypt backups. SaraB is a powerful scheduler and can support many configurations of backup scheduling and automated failure notification. KDar's easy-to-use restore function is especially useful if someone who works with you is less technically skilled and needs to be able to restore their own data. However, KDar cannot schedule unattended backups, and the DAR suite is not designed to work with tapes. It is also more work to set up than Sitback if you need only one disk's worth of backups made in the same way every day.

Whatever backup method you choose, if you keep data in an RDBMS such as MySQL or PostgreSQL, you will need to follow special procedures (usually dumping the databases or shutting down the server) to back up the databases.
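One way to handle the dumping step is to write the databases out as SQL files shortly before the file-level backup runs, so that the backup picks up a consistent snapshot. The sketch below is only an outline under stated assumptions: it uses the standard mysqldump and pg_dumpall tools, assumes they are installed and that credentials are already configured for the account running the script, and uses helper names (dump_command, dump_to_file) that are our own.

```python
import subprocess

# Standard dump tools for each server; extra options (users, hosts, and so on)
# depend on the installation and are deliberately left out.
DUMP_COMMANDS = {
    "mysql": ["mysqldump", "--all-databases"],
    "postgresql": ["pg_dumpall"],
}


def dump_command(engine):
    """Return the argument list for the dump tool of the given database engine."""
    try:
        return list(DUMP_COMMANDS[engine])
    except KeyError:
        raise ValueError(f"no dump command known for {engine!r}")


def dump_to_file(engine, out_path):
    """Run the dump tool and save its SQL output where the nightly backup will find it."""
    with open(out_path, "wb") as out:
        # check=True raises CalledProcessError if the dump tool reports failure.
        subprocess.run(dump_command(engine), stdout=out, check=True)
```

A call such as dump_to_file("postgresql", "/home/backup/pg.sql"), with a path of your choosing, would then be scheduled to run before the backup program starts.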

Offsite backups are worth considering, since all of your carefully labeled backup disks won't do much good if your landlord locks you out of your house or your office floods. "Offsite" can mean various things: colocating a machine to use for hard drive backups, storing some of your most critical data online with an Internet storage service, or keeping backup disks someplace other than your house. If your data isn't sensitive, this could be as simple as keeping the disks at a friend's house. If you have secure or confidential client data, a safe deposit box may be a better bet; if longevity is a concern, check the media manufacturer's storage condition recommendations, or consider using a data storage service, since some backup media must be stored under controlled conditions. If your offsite backups contain any confidential information, they should be encrypted as well.

Once you have backups in place, it's critical to test them. There's nothing as tedious as testing backups, but nothing as horrifying as realizing that you've lost a month's worth of work on a project due in an hour, and that all of your carefully labeled backup disks are blank. So, we offer some suggestions for testing your backups.

First, set up a test schedule; for example, test your backups on the first business day of every month. If you have a large number of backups, test a random sample to make sure you can read them, and that they have the data that you expect. In addition, practice preventive medicine; if you have any media errors, discard the problem media immediately. Never assume that an error message is your backup program malfunctioning; follow up every error until you figure out what went wrong, then fix it. Make testing a priority. As urgent as it seems to meet today's deadline, the month's worth of data on your backup media is probably more important to your business.
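For backups stored as tar archives, the random-sample check described above can be partly automated. The following sketch reads every member of each sampled archive and confirms that the files you expect are present; the function names are our own, and the check does not compare contents against the originals, so it complements rather than replaces an occasional trial restore.

```python
import random
import tarfile


def archive_is_readable(path, expected_names=()):
    """Return True if every member of the tar archive can be read in full
    and all expected member names are present."""
    seen = set()
    try:
        with tarfile.open(path) as archive:
            for member in archive:
                seen.add(member.name)
                if member.isfile():
                    data = archive.extractfile(member)
                    # Read the whole file in chunks so media errors surface here.
                    while data.read(1024 * 1024):
                        pass
    except (tarfile.TarError, OSError, EOFError):
        return False
    return all(name in seen for name in expected_names)


def check_sample(paths, sample_size, expected_names=(), rng=random):
    """Check a random sample of archives; return a dict mapping path to result."""
    paths = list(paths)
    sample = rng.sample(paths, min(sample_size, len(paths)))
    return {path: archive_is_readable(path, expected_names) for path in sample}
```

Any archive reported as False should be followed up immediately, in line with the advice above about never ignoring an error.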

With a backup plan in place, it's much easier to concentrate on your work rather than on the ominous grinding noise coming from your hard drive. A little bit of effort up front can save not only countless hours and dollars later, but also possibly your business itself in the event of a true disaster.

Conclusion

The most important best practice for the Linux home office is having an attitude of professionalism toward your computer equipment. Look at your computer as a business production system, not as a do-it-all home entertainment system that you also use for business. By having the same attitude of professionalism toward your home office as you do toward your business, you'll keep things running smoothly and profitably for a long time to come.


Comments

on Best practices for the Linux home office, part 3

Note: Comments are owned by the poster. We are not responsible for their content.

Linux orphans some important IDE Raid cards

Posted by: Anonymous Coward on August 20, 2005 10:14 PM
I have an assortment of older IDE Raid Cards from 2 companies.

Promise Fastrack (66 & 100)
2 AMI IDE Raid cards

These cards are functional and were used in some systems that we needed to use LINUX on (and wished to upgrade these existing systems with a newer version of LINUX)! The systems being upgraded were not heavy duty servers, but we did like the hardware RAID and the use we get from these old Promise controllers.

At least the Promise one used to be supported by various Linux distros. But when I went to install SuSE 9.3 on machines with these Raid cards... guess what... an install pop-up told me that the newer Linux kernel did not support them anymore.

This is not good for Linux to orphan such products! Why was it done... I have looked and looked for a reason why, and to this day I cannot find a reason why anyone's hardware that was supported before was suddenly not supportable with a Linux kernel upgrade.

Does anyone know the answer to this?

#

Same with Fedora (dropped old IDE RAID support)

Posted by: Anonymous Coward on August 20, 2005 10:18 PM
Tried Fedora core 3 too...
Same story... no support in newer kernel for these older Promise and AMI IDE Raid cards.

With Microsoft, I never ran into this problem.

What is the real story?

#

Re:Same with Fedora (dropped old IDE RAID support)

Posted by: Joseph Cooper on August 21, 2005 12:52 AM
Probably just nobody decided to update it.

Microsoft has people paid to develop all this stuff, Linux doesn't. Linux right now generally doesn't have the same kinda hardware support, not a whole lot one can do about that, unfortunately.

I'd figure either switch to BSD or Windows, or just set up Linux to run soft-RAID.

#

Re:Linux orphans some important IDE Raid cards

Posted by: Anonymous Coward on August 22, 2005 01:35 AM
Most likely is that no one uses them anymore so they went unmaintained.

If you truly have a need to retain the older hardware then I would suggest that you band together with other users and write your own updates for it. People who provide you with free software have no obligation to support your needs if their need is met. The nice thing is that you have the ability to get the support yourself if you wish.

As to the assertion that Windows wouldn't drop a driver I disagree. A huge amount of older software has been dropped from Windows over the years and you cannot get any support at all for older operating system versions. In that case you must either upgrade your hardware extensively or switch OS's because you won't be able to update the drivers you depend on.

Switching to a BSD isn't much better as they have many driver issues. It is harder to ensure driver support with the BSDs, not easier.

Finally, who believes that it is easier to switch operating systems than to just get a supported RAID system? If your box is doing anything non-trivial then this would be an inefficient solution to your problem. Before you decide to switch over a non-trivial installation ensure that you have a good plan and a tested migration path. If that is overkill then why are we having this conversation?

It is unreasonable to assume that free software owes you anything in support. Hey, it's FREE! If you want support spend the money with vendors who will support you. Listen to what they say. If they tell you that it would be better to upgrade then do it. If you choose not to follow their advice then either pony up for your own support or move to that buggy mainstream software and put up with crashes and reboots and forced upgrades of hardware and software. Whatever you do stop whining about it. Checking Google for this it appears that Promise drivers stopped supporting this at kernel 2.4.7-10 which is back in Redhat 7.x days. Change is a given with technology; much as I hated to give up my 60 MHz Pentium box, it really was a good choice.

#

Re:Linux orphans some important IDE Raid cards

Posted by: Anonymous Coward on August 23, 2005 01:50 AM
First, you just made a great case for any non-developer to avoid using F/OSS software. Most small businesses have enough to deal with as it is. Do-it-yourself software development is not an attractive proposition.

Second, these people are talking about Suse and Red Hat, both companies producing commercial distributions. It is quite possible that "it's free" wasn't even true for this person.

While it may be the case that upgrading hardware is the best option, it should be either the developer of the driver or the vendor of the OS who says so, and at that point within some kind of official support capacity and with a bit more finesse.

Unless, of course, you want the community to be seen as self-righteous and unhelpful.

#

The answer is simple

Posted by: Anonymous Coward on August 22, 2005 05:18 AM
You could have found the answer yourself if you had wanted to. The answer is that the drivers were not open source and it wasn't Red Hat or Suse that dropped support but Promise that stopped providing support for newer Linux versions.

#

Ok - then is there a Raid SATA or ATA card that...

Posted by: Anonymous Coward on August 22, 2005 07:23 AM
Ok - then is there a Raid SATA or PATA card that...
has open sourced the drivers for their low end RAID 1 with spare cards?

Just asking?

I really did like those cards and they still work?

Is there any way to breathe more life into them?

#

Re:Ok - then is there a Raid SATA or ATA card that

Posted by: Joseph Cooper on August 22, 2005 07:34 AM
Put them in a non-Linux system, and replace the ones in the Linux system.



If there's no drivers, there's no drivers, and the cost of porting them or creating new ones would far outweigh the cost of a new RAID card, which can probably be had for spare change on Pricewatch.com (http://www.pricewatch.com/).

#

Cheap backup tapes are unlikely to last 5 years

Posted by: Anonymous Coward on August 22, 2005 06:57 PM
Cheap backup tapes are unlikely to last 5 years, especially if they aren't stored in ideal conditions. Seriously, how many people actually go out and look for the best tape or even the most cost effective tape? Almost none. Most go for what's on sale when they buy or what's the cheapest; quality or lack thereof does not come into play.


I've been through this several times with different groups. One was even bragging about having 6 years of data on tape. However, when it was time to put their money where their mouth was, the tapes no longer worked (even after finding the right model tape drive on the second hand market).


Over time the magnetic recording medium separates from the plastic. Also, the lubricants evaporate, making the tape more fragile; re-tensioning helps some, but eventually there will be none left. If the tape is not re-tensioned (rewound) frequently enough, then the magnetic fields can get muddled, too. Or the substrate can stretch. Anyone who's had audio cassette tapes has heard that warble from an old tape as it stretches.
See "How Long Will Magnetic Media Last?" (http://www.clir.org/pubs/reports/pub54/4life_expectancy.html) for more. If data is to be retained offline (off the hard drives) for any period of time over 3 years then a plan to periodically copy old data to new tapes must be active.

#

Proprietary format, DRM hinder legal compliance

Posted by: Anonymous Coward on August 22, 2005 07:13 PM
Closed formats, which are not a problem in the specific examples given by the authors, could very well be a problem elsewhere. It is possible to have closed source and closed formats on Linux.


Closed formats and protocols are not a major problem technically, just a pain to reverse engineer when the time comes. However, the big problems come because of legislation, like the EUCD or DMCA, which make this type of circumvention highly illegal. If a company or its creditors own a format, access to that data is completely up to them.


DRM makes that more so. In order for DRM to work, control over the format and its lifecycle are in the hands of a third party and, in effect, the lifecycle of the data as well.


However, this makes a stronger case for businesses and most agencies to drop, if nothing else, MS Office and other proprietary formats and move to suites which comply with open, standard formats like OpenDocument. OpenOffice.org 2 (http://www.openoffice.org/), among others, supports OpenDocument.

#

DAR link is 404

Posted by: Anonymous Coward on August 24, 2005 10:49 PM
correct link is http://dar.linux.free.fr/ (you have a leading "www." which doesn't work)

#




 