
Introduction to backups for artists

by Dawid Michalczyk
Updated: 23 November 2020


Summary: An introduction to taking backups with emphasis on the data commonly used by artists.


A backup, in computer lingo, refers to making a copy of important data for the purpose of data recovery. The word "data" refers to anything stored on a computer system: images, programs, documents, videos, etc. Should the important data get damaged or lost, a properly made backup will restore it all.


Common backup types


A backup does not need to rely on dedicated software. Making a copy of a file is a basic form of backup.
The best backup methods rely on simple, time-proven concepts. The simpler the procedure, the more likely it is to work correctly. New or unnecessary technologies are best avoided until proven reliable and necessary.

A full backup consists of making a copy of all important data. When you copy a folder with important files from, say, a hard drive to a USB memstick, you actually make a full backup of those files. Due to its simplicity, this approach is the most reliable of all backup types. Its main advantage is the ease of backup creation and restoration, since no special backup software is needed. The main disadvantage is that each backup will use as much space as the important data that was copied. If the data is large, the backup process can be very resource intensive in terms of time, backup space requirements, and the processing power needed to carry it out. Imagine the time needed to do a full backup of a digital library consisting of thousands of HD movies. Such an operation can take days.
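
As an illustration only, copying a folder can be scripted in a few lines. The sketch below uses Python's standard shutil module; the function name full_backup and the timestamped folder naming are assumptions made for this example, not part of any particular backup tool.

```python
import shutil
from datetime import datetime
from pathlib import Path


def full_backup(source: Path, backup_root: Path) -> Path:
    """Copy the whole source folder into a new, timestamped folder."""
    stamp = datetime.now().strftime("%Y-%m-%d_%H%M%S")
    target = backup_root / f"{source.name}_{stamp}"
    # copytree copies every file and subfolder. Its default copy function
    # (shutil.copy2) also tries to preserve modification times.
    shutil.copytree(source, target)
    return target


# Hypothetical usage (paths are placeholders):
# full_backup(Path("/home/me/artwork"), Path("/media/usb/backups"))
```

Every run produces a complete, independent copy, which is exactly why this approach is simple to restore from and expensive in space.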

An incremental backup works differently in that it backs up only the files modified or added since the last backup. When using this method, a full backup is created first and then incremental backups are run on a regular basis. For large amounts of data, incremental backup is often the only practical way to do backups. It requires much less space than taking full backups and is less resource intensive to run. On the other hand, unlike full backups, incremental backups need dedicated backup software to keep track of which files to back up.
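
The core idea of "copy only what changed" can be sketched as follows. This is a simplified illustration, not how any specific backup program works: it assumes that a file with the same size and a backup copy that is not older than the source is unchanged, and it updates a single mirror folder rather than keeping separate increments. The function name incremental_backup is hypothetical.

```python
import shutil
from pathlib import Path


def incremental_backup(source: Path, target: Path) -> list:
    """Copy only files that are new or changed since the last run."""
    copied = []
    for src_file in source.rglob("*"):
        if not src_file.is_file():
            continue
        dst_file = target / src_file.relative_to(source)
        if dst_file.exists():
            src_stat, dst_stat = src_file.stat(), dst_file.stat()
            # Assumption: same size and a backup copy that is not older
            # than the source means the file has not changed.
            if (src_stat.st_size == dst_stat.st_size
                    and src_stat.st_mtime <= dst_stat.st_mtime):
                continue
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src_file, dst_file)  # copy2 keeps the modification time
        copied.append(dst_file)
    return copied
```

The first run behaves like a full backup; later runs touch only new or modified files. The target folder should live outside the source folder.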

Compressing the backup data is a popular option, since it reduces the amount of space needed on the backup media. Although compression adds a layer of complexity, it can be a good solution if used wisely, and is sometimes a necessary one.


Essential backup strategies


Regardless of the backup type and data, the following backup strategies should always be followed:

  • backup should be taken on a regular basis
  • backup should be automatic and need as little human supervision as possible
  • backup should be stored in a safe remote location
  • backup should rely on well established hardware and software technologies

Backup should be taken on a regular basis. The more frequently the data changes, the more often it should be backed up.

Backup should be automatic. Except for the initial configuration of the backup program and the occasional supervision, the whole backup process should be automatic and completely transparent. That is, the backup should run by itself without attracting any attention unless necessary.

Backup should be stored in a safe remote location. Should the location of the important data get damaged, destroyed, or exposed to theft, a remotely stored backup becomes invaluable. How remote? Disasters like fire, flood, tornado, earthquake, etc., can cause widespread damage. Ideally, a backup should be stored in a low-risk location far enough away to be unaffected by the same disaster.

Backup should rely on well established hardware and software technologies. Such technologies are typically in widespread use, and are thus cheaper and easier to troubleshoot or get help with in the event of failure. As the established technologies are gradually replaced by new and better ones, so should the backup media and hardware be replaced, along with any software used to store and restore the data. There is no guarantee that the common backup media of today, like optical discs or USB memsticks, will be in widespread use in ten years. The same is true for software. Thus, a good data preservation strategy should include continual migration of the backup data to mature and well established technologies of the time.


A bit about data compression


Compression makes data smaller and thus is a popular option when saving files because less space is used. The downside is the extra time needed to compress the data and later to uncompress it when opening the compressed data.

Data compression is done by a compression algorithm, a method for reducing data size. There are two types of data compression algorithms: lossless and lossy. Lossless compression reduces the size of the data without changing its content: decompressing gives back exactly the original. Lossy compression discards some of the content, which allows for even smaller files than lossless compression. Within each type there are many different algorithms. For example, the PNG and TIFF image formats can both compress image data losslessly, but PNG uses the DEFLATE algorithm, while TIFF supports several schemes, of which LZW is a common choice.
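
What "lossless" means in practice can be shown with a small round trip. The sketch below uses Python's standard zlib module, which implements DEFLATE (the algorithm PNG and ZIP are built on); the sample data is made up for the example.

```python
import zlib

# Made-up, highly repetitive sample data.
original = b"The same brush stroke, repeated. " * 1000

compressed = zlib.compress(original, 9)  # 9 = maximum compression level
restored = zlib.decompress(compressed)

print(len(original), "bytes before,", len(compressed), "bytes after")
print("identical after round trip:", restored == original)
```

The restored bytes are identical to the original, which is the defining property of lossless compression. A lossy format gives no such guarantee.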

Some file formats that rely on compression, like MP3 or JPG, are highly specialized. They use lossy algorithms and produce very small files, but can only compress a particular type of data. General purpose tools and formats, like ZIP or BZIP2, rely on lossless compression algorithms and can work on any data. However, on music or photographic images they will generally not produce files as small as special purpose formats like MP3 or JPG.

Because lossy compression changes the data, formats like JPG, MP3 or any other lossy format degrade the original data to some extent. In other words, saving an image or music in a lossy file format will make it different than the original. Usually, the difference (called compression artifacts) is so small that most people don't see it or hear it. However, this largely depends on the compression settings. The more the data is compressed, the easier it is to notice the difference.

For the above reasons, lossy compression should never be used when saving important master / original data. Only lossless compression is suitable for that. PNG and TIFF are examples of image file formats that have lossless compression. Such formats are ideal for storing high-resolution master images of finished artwork.

A lot of space can be saved thanks to compression. I took one of my images and saved it in different image formats: BMP (which has no compression), TIFF (lossless compression), PNG (lossless compression) and JPG (lossy compression). All lossless compression was done with maximum compression settings [1]. I then compressed those files with three general purpose lossless compressors: ZIP, BZIP2, 7ZIP. Since JPG is a lossy format it is only included for the sake of comparison. The Book.txt is Sun Tzu's The Art of War.


file format    size in bytes    zip'ed            bzip2'ed          7zip'ed
-----------    -------------    --------------    --------------    --------------
Colony.bmp     1 440 054        911 154 (63%)     693 481 (48%)     713 287 (49%)
Colony.tiff      662 948        652 315 (98%)     655 239 (99%)     652 955 (98%)
Colony.png       611 676        611 923 (100%)    613 466 (100%)    610 711 (100%)
Colony.jpg       303 217        302 933 (100%)    302 852 (100%)    300 268 (99%)
Book.txt         343 695        130 340 (38%)     100 696 (29%)      91 187 (26%)

(The percentage in the table above shows the compressed size as a share of the original size. The smaller the better.)
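
The percentages are simply the compressed size divided by the original size. For example, for Colony.bmp with zip: 911 154 / 1 440 054 is about 0.63, or 63%. A comparison of this kind can be roughly reproduced with the compression modules in Python's standard library. Note the assumptions: zlib, bz2 and lzma are related to, but not the same as, the zip, bzip2 and 7zip command-line tools and switches used for the table, so the numbers will differ somewhat. The function name compressed_percentages is hypothetical.

```python
import bz2
import lzma
import zlib


def compressed_percentages(data: bytes) -> dict:
    """Return the compressed size as a percentage of the original size.

    The input must not be empty (the original size is the divisor).
    """
    compressors = {
        "zlib": lambda d: zlib.compress(d, 9),  # DEFLATE, as used in ZIP
        "bz2": lambda d: bz2.compress(d, 9),    # same algorithm as bzip2
        "lzma": lzma.compress,                  # LZMA, as used by 7z and xz
    }
    return {name: round(100 * len(compress(data)) / len(data))
            for name, compress in compressors.items()}


# Hypothetical usage (file name is a placeholder):
# with open("Book.txt", "rb") as f:
#     print(compressed_percentages(f.read()))
```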

The compression times vary somewhat, but not enough to make any of the tools impractical. Among the lossless image formats, PNG is the clear winner: it uses about 58% less space than BMP! Notice that only one of the general purpose compression tools, 7ZIP, further compressed (slightly) the already compressed PNG file. The book file was compressed down to about 26-38% of its original size, which is typical for text compression.

Generally, text files (TXT, HTML, XML, etc.) can be compressed the most of all file types. Images that have been compressed with their own algorithms (PNG, JPG, TIFF, etc.) can't later be compressed much, if at all. Images which don't have their own compression (BMP, RAW, etc.) can often be compressed quite a bit, though this depends on the actual image data.

ZIP is a commonly used compressed archive format - it's fast and compresses well. It can be used on a single file or a whole directory structure. It's been around for a long time and is universally available. But there are other, lesser-known, good alternatives. For example, 7ZIP, RAR, and BZIP2 often compress noticeably better than ZIP (see the BMP and text results above) but are a little slower.

One popular backup solution is to compress a whole directory structure into a ZIP archive and copy the archive to a backup medium. The problem with this approach is the possibility of losing all files in the archive if the archive gets corrupted and cannot be recovered. It's safer to compress and store each file individually: corruption then usually affects only the damaged file, and the probability that every file becomes unrecoverable is much smaller.
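
Compressing files one by one, instead of into a single archive, might look like the sketch below. It uses Python's standard bz2 module and mirrors the folder layout, adding a .bz2 suffix to each file. The function name compress_individually is an assumption for this example, and the target folder is assumed to be outside the source folder.

```python
import bz2
import shutil
from pathlib import Path


def compress_individually(source: Path, target: Path) -> int:
    """Write one .bz2 file per source file, mirroring the folder layout."""
    count = 0
    for src_file in source.rglob("*"):
        if not src_file.is_file():
            continue
        rel = src_file.relative_to(source)
        dst_file = target / rel.parent / (rel.name + ".bz2")
        dst_file.parent.mkdir(parents=True, exist_ok=True)
        # Stream the data so large files do not have to fit in memory.
        with open(src_file, "rb") as f_in, bz2.open(dst_file, "wb") as f_out:
            shutil.copyfileobj(f_in, f_out)
        count += 1
    return count
```

If one of the resulting .bz2 files is damaged, the others can still be decompressed independently.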


What backup media to use


External hard drives are a popular backup media due to large capacity and speed.
The commonly used backup media of today are external hard drives, USB memsticks, tapes, optical discs (DVD, Blu-ray, etc.), and online cloud storage. Every medium has its pros and cons. External hard drives are the fastest and often the best option for large amounts of data. They are also the most expensive and not very durable. Tapes are slow but can store a lot of data and can last decades. USB memsticks are physically very small, but can get lost easily and may not offer enough space for your data. Optical discs are probably the most common backup media due to their very low cost. Unfortunately, they are not very reliable, and most have a relatively short expected life span of two to five years. Online backup solutions are limited by the speed and availability of your internet connection, and may not offer sufficient space for your needs. However, backing up data online is very convenient.

Reliability is important to consider when choosing the backup media. How robust is the media and for how long can it retain the data? The quality of the media plays a significant role here. All media degrade over time, but some degrade more than others. For example, most of the low cost burnable DVDs have a life span of around two years. Higher quality DVDs can last up to five. Very high quality DVDs with a golden layer are expected to last decades. Generally, if the handling and storage conditions are good, quality media should last at least a few years without data loss.

A combination of different media may often be the ideal solution. For example, my own backup practice includes using an external hard drive, USB memsticks, and online storage. Because everybody has different needs, I recommend evaluating different backup media in order to decide which suits your needs best. Keep in mind that using high quality products will minimize the possibility of a backup failure.


The necessity of verifying backups


One of the most important aspects of taking backups is making sure they are error free. The backup data may prove useless if corrupted due to a media or other error. It is therefore essential to test the backup for validity immediately after it is made. That way, errors are detected while the original data is still available, and a new backup can be taken right away. Any respectable backup program provides an option for data verification. What good is a backup if its data is corrupted?
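
For backups made by plain copying, verification can be as simple as comparing checksums of the originals and the copies. The following is a minimal sketch under that assumption (an uncompressed copy with the same folder layout); the function names sha256_of and verify_backup are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(source: Path, backup: Path) -> list:
    """Return relative paths that are missing or differ in the backup."""
    problems = []
    for src_file in source.rglob("*"):
        if not src_file.is_file():
            continue
        rel = src_file.relative_to(source)
        copy = backup / rel
        if not copy.is_file() or sha256_of(copy) != sha256_of(src_file):
            problems.append(rel)
    return problems
```

An empty list means every source file has a byte-identical copy in the backup. Anything else should be backed up again while the original is still intact.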


Final notes


Depending on your needs, dedicated backup software may be a necessary investment. Make sure to research this carefully. Usually, products from reputable companies that specialize in backup solutions are best. There are also many good open source or free software alternatives.

The quality of the backup hardware, media and software are equally important.

It's best to avoid products which rely on proprietary or closed solutions. For example, a commercial backup program may store the backup data in an undocumented format supported only by that particular program. Avoid that. If the company goes out of business and the backup, or the backup software, breaks, your backup data may be lost forever. Look for products that rely on well known, mature, and ideally open technologies. For example, PNG is an open format for storing image data. This means that the specification, or blueprint, for the format is publicly available for anyone to use. This increases compatibility and reduces reliance on any specific vendor or product.

For most artists the important data consists mainly of images and 3D files. To save space, rely on compressed bitmap image formats: PNG or TIFF for master images, and JPG only for copies where some loss of quality is acceptable. Vector images and 3D files can be compressed individually if needed. Basic incremental backup software that regularly copies the important files from your hard drive to a backup medium may be all that is needed. It's best to make two sets of the backup data and store each at a different location: one close to home, like a friend's place or a bank box, and the other far away.

Setting up a proper backup procedure may initially require a significant amount of time and cost money. There is a lot to research and consider. In the end however, a good backup procedure will prove an exceptionally valuable investment. As you read this, your screen could go blank due to a hard drive crash. All your valuable data - artwork, reference images, documents, photo albums, etc. - could be lost forever. Unless you are prepared and have a backup.


RESOURCES

zip - the zip file format, a popular archiver with compression.
7zip - a file archiver supporting a variety of archive formats and compression algorithms.
bzip2 - my favorite file compression tool.
png - a highly versatile, and my favorite image file format.
rsync - synchronizes files and directories from one location to another.
tar - an archive file format designed for tapes but commonly used for many other media.
backup software - a list of commercial and free backup programs.
storage review - a good source of hard drive reviews.

FOOTNOTES


1. The LZW algorithm was used for the TIFF image. The PNG image was compressed with maximum compression. The JPG was saved with the lowest compression setting (100). Zip, bzip2 and 7zip were all set to use maximum compression. The following switches were used:

    zip: -9 (used on all test files)
    bzip2: -9 (used on all test files)
    7zip: -m0=ppmd:o=4 (used on Colony.bmp)
    7zip: -m0=lzma:a=1:d=0:lc=8:LP0:PB0:mf=bt2 (used on Colony.tiff)
    7zip: -m0=lzma:a=1:d=0:lc=8:LP0:PB0:mf=bt2 (used on Colony.png)
    7zip: -m0=lzma:a=1:d=0:lc=8:LP0:PB0:mf=bt2 (used on Colony.jpg)
    7zip: -m0=ppmd:o=20:mem=26 (used on Book.txt)


Dawid Michalczyk is a freelance illustrator and an artist. To see examples of his artwork and writings visit his website at http://www.art.eonworks.com
Copyright © 2006 Dawid Michalczyk. All Rights Reserved. This content may be copied in full, with copyright, contact, creation, information and links intact, without specific permission, when used only in a not-for-profit format.


