How to Install and Use Duplicity to Automate Backups


Duplicity is a powerful open source backup tool that allows you to perform encrypted and incremental backups. It supports a variety of backends for storing backup data including local or remote file systems, FTP, SSH, WebDAV and cloud storage services. Duplicity uses GnuPG for encryption and signing of backup archives.

In this guide, we will cover how to install and configure Duplicity and then use it to set up automated backups on Linux.

Installing Duplicity

Duplicity is available in the default repositories of most Linux distributions.

On Debian/Ubuntu

$ sudo apt update
$ sudo apt install duplicity

On CentOS/RHEL

$ sudo yum install epel-release
$ sudo yum update
$ sudo yum install duplicity

On Arch Linux

$ sudo pacman -S duplicity

On Fedora

$ sudo dnf install duplicity

For other Linux distributions, consult your package manager.

Once installed, verify that Duplicity is available by checking the version:

$ duplicity --version

Generating GPG Keys

Duplicity uses GnuPG keys to encrypt and/or sign backup archives. We need to generate a keypair for this purpose.

Install the GnuPG package if it is not already installed:

$ sudo apt install gnupg

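Generate a new keypair. On GnuPG 2.1 and later, --gen-key applies default settings without asking for a key type, so use --full-generate-key to get the full dialog:

$ gpg --full-generate-key

Choose the key type “RSA and RSA” with a size of 4096 bits. Set an expiry period if desired.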

Provide your details for user ID such as name, email etc. Add a secure passphrase for the keys. This will generate a public/private keypair for you.

List the keys to find the ID:

$ gpg --list-keys

Export the public key for backup. Substitute the ID appropriately:

$ gpg -a --export 1234ABCD > public.gpg

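Duplicity encrypts archives locally before uploading them, so the remote backend does not need a copy of either key. Keep the exported public key with your other configuration, and store a secure copy of the private key away from the machine being backed up: it is needed both to sign backups and to restore encrypted ones, and without it the archives cannot be decrypted.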
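Note that the generated key is only used when Duplicity is told about it; without --encrypt-key, Duplicity falls back to symmetric encryption with the value of PASSPHRASE. The following is a minimal sketch of a key-based backup. The function name, the key ID 1234ABCD (taken from the export example) and the paths are placeholders, and the DUPLICITY variable is an illustrative override for previewing the command.

```shell
#!/bin/sh
# Sketch: run a backup that encrypts to, and signs with, a GnuPG key.
# The key ID, source and target URL are placeholders.
# Set DUPLICITY=echo to print the arguments instead of running a backup.
DUPLICITY="${DUPLICITY:-duplicity}"

backup_with_key() {
    key_id="$1"; source_dir="$2"; target_url="$3"
    # --encrypt-key: encrypt archives to this public key
    # --sign-key:    sign archives with the matching private key
    "$DUPLICITY" --encrypt-key "$key_id" --sign-key "$key_id" \
        "$source_dir" "$target_url"
}

# Example call (not run automatically):
# backup_with_key 1234ABCD /path/to/source file:///path/to/destination
```

Signing needs the private key's passphrase, which Duplicity reads from PASSPHRASE or asks for interactively.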

Configuring Duplicity

Duplicity supports a variety of storage backends such as local, SSH, FTP, WebDAV etc. We will look at configuring for some common backends:

Local Filesystem

To back up to a local directory, set the backup destination URL like:

file:///home/user/backups

SSH

To back up to a remote system over SSH:

ssh://user@host//path/to/backup

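This assumes SSH access is already set up between the two systems. The double slash after the host marks an absolute path on the remote system; a single slash makes the path relative to the user's home directory.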

Amazon S3

To back up to an Amazon S3 bucket:

s3://s3-bucket-name[/prefix] 

The S3 credentials can be supplied via a ~/.boto config file or environment variables.

See the Duplicity S3 Backend documentation for details.
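For the environment-variable route, a minimal sketch looks like this. AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are the standard AWS variable names; the values shown are placeholders, not real credentials.

```shell
#!/bin/sh
# Sketch: supply S3 credentials through the environment.
# Replace the placeholder values with your own keys.
export AWS_ACCESS_KEY_ID="your_access_key_id"
export AWS_SECRET_ACCESS_KEY="your_secret_access_key"

# A backup started from this shell then picks the keys up, e.g.:
# duplicity /path/to/source s3://s3-bucket-name/prefix
```

Avoid committing these lines to version control; keep them in a file that only your user can read.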

Google Cloud Storage

For backing up to Google Cloud Storage:

gs://cloud-storage-bucket[/prefix]

Authentication can be done by various means including service account files, ADC JSON files or environment variables.

Refer to the GCS Backend section in the docs for more details.

Swift

To back up to an OpenStack Swift container:

swift://container_name[/prefix]

Authentication is through environment variables. See the Swift Backend documentation.

WebDAV

To use a WebDAV server as backend:

webdav[s]://hostname[:port]/path

Duplicity will prompt for username/password as required.

See WebDAV Backend for details.

In this manner, you can configure backups to a variety of storage endpoints. Now we are ready to create our first backup.

Creating Backups

With the backend URL configured, we can now make a full backup.

Set the following environment variable to avoid interactive prompts:

export PASSPHRASE=your_passphrase 
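Typing the passphrase on the command line leaves it in the shell history. As one possible alternative, the sketch below reads it from a file that only the owner can read. The function name and the file location are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: load PASSPHRASE from a private file instead of typing it.
# Create the file beforehand and restrict it with: chmod 600 FILE
load_passphrase() {
    pass_file="$1"
    if [ ! -r "$pass_file" ]; then
        echo "passphrase file not readable: $pass_file" >&2
        return 1
    fi
    # Use the first line; command substitution drops the trailing newline.
    PASSPHRASE="$(head -n 1 "$pass_file")"
    export PASSPHRASE
}

# Example (not run automatically):
# load_passphrase "$HOME/.duplicity/passphrase" && duplicity /path/to/source file:///path/to/destination
# unset PASSPHRASE   # clear it once the backup has finished
```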

Then run the first backup. Because no earlier backup exists at the destination, Duplicity makes it a full one:

$ duplicity /path/to/source file:///path/to/destination

This recursively backs up the /path/to/source directory to the local destination /path/to/destination.

To back up to a remote location over SSH:

$ duplicity /path/to/source ssh://user@host//backup/path

For backing up to cloud storage:

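$ duplicity /local/source s3://s3-bucket/prefix

Duplicity may prompt for an SSH password if one is required. Cloud credentials such as AWS keys are not prompted for; supply them through the config file or environment variables described above. The backup is encrypted before it is stored in the destination.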

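To back up only certain folders, pass each one with --include and exclude everything else with --exclude '**'. The included paths must lie inside the source directory. For example:

$ duplicity --include /path/to/source/folder1 --include /path/to/source/folder2 --exclude '**' /path/to/source file:///path/to/destination

This backs up only the two specified folders from the source.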
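When the list of folders grows, building the options by hand becomes error-prone. The sketch below turns a list of subfolder names into --include options followed by a final --exclude '**'. Duplicity applies the first matching rule, so the order matters. The function name and the DUPLICITY override are illustrative, not part of Duplicity.

```shell
#!/bin/sh
# Sketch: back up only selected subfolders of a source directory.
# Usage: backup_selected SOURCE TARGET_URL SUBFOLDER...
# Set DUPLICITY=echo to print the arguments instead of running a backup.
DUPLICITY="${DUPLICITY:-duplicity}"

backup_selected() {
    source_dir="$1"; target_url="$2"
    shift 2
    n=$#
    # Append an "--include PATH" pair for every subfolder name. The loop
    # iterates over the original list, which was expanded before it started.
    for sub in "$@"; do
        set -- "$@" --include "$source_dir/$sub"
    done
    # Drop the original names, leaving only the --include pairs.
    shift "$n"
    # Includes come first; the trailing --exclude '**' drops everything else.
    "$DUPLICITY" "$@" --exclude '**' "$source_dir" "$target_url"
}

# Example call (not run automatically):
# backup_selected /path/to/source file:///path/to/destination folder1 folder2
```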

After the initial full backup, consecutive backups will be incremental. This saves time and storage space. Duplicity uses librsync to efficiently determine changed content.

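To start a new full backup periodically instead of extending the incremental chain indefinitely, use the --full-if-older-than option:

$ duplicity --full-if-older-than 60D /path/to/source ssh://user@host//path/to/backup

This performs a full backup if the last full backup is more than 60 days old, and an incremental backup otherwise. To force a full backup immediately, run duplicity full with the same source and destination.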

Backup Scheduling with Cron

We can automate the Duplicity backups using Cron jobs.

Open the crontab for editing:

$ crontab -e

Add a cron schedule like:

0 1 * * * /usr/bin/duplicity /path/to/source ssh://user@host//backup/path

This will run a backup job every day at 1 AM.

For weekly backups:

0 1 * * 0 /usr/bin/duplicity /path/to/source ssh://user@host//backup/path

This will trigger the backup every Sunday at 1 AM.

Similarly you can schedule monthly, yearly backups etc.

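For more granular control, you can schedule separate full and incremental jobs. Both must use the same destination, because each incremental backup depends on the full backup stored alongside it:

0 1 * * * /usr/bin/duplicity --full-if-older-than 30D /path/to/source ssh://user@host//backup/path
0 */4 * * * /usr/bin/duplicity incremental /path/to/source ssh://user@host//backup/path

The first job runs daily at 1 AM and starts a new full backup whenever the last one is more than 30 days old. The second runs an incremental backup every 4 hours.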

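Cron mails any output a job produces, so redirect it to a log file, for example by appending >> /var/log/duplicity.log 2>&1 to each command. Note also that cron jobs do not inherit your shell's environment, so PASSPHRASE must be set in the crontab or in a wrapper script.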
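A wrapper script keeps the crontab entry short and gives one place to handle logging and overlapping runs. The following is a rough sketch under assumed paths. The function name, log file, lock directory and the DUPLICITY override are all hypothetical choices.

```shell
#!/bin/sh
# Sketch of a cron wrapper. All paths are placeholders.
# Set DUPLICITY=echo for a trial run that only logs the arguments.
DUPLICITY="${DUPLICITY:-/usr/bin/duplicity}"

run_backup() {
    source_dir="$1"; target_url="$2"; log_file="$3"; lock_dir="$4"

    # mkdir is atomic, so it doubles as a simple lock: skip this run
    # if a previous backup is still in progress.
    if ! mkdir "$lock_dir" 2>/dev/null; then
        echo "$(date) previous backup still running, skipping" >> "$log_file"
        return 0
    fi

    # Append stdout and stderr to the log so cron has nothing to mail.
    "$DUPLICITY" --full-if-older-than 30D "$source_dir" "$target_url" \
        >> "$log_file" 2>&1
    status=$?

    rmdir "$lock_dir"
    return "$status"
}

# Example call inside the script (not run automatically):
# run_backup /path/to/source ssh://user@host//backup/path /var/log/duplicity.log /tmp/duplicity.lock
# Matching crontab entry, assuming the script is saved at this path:
# 0 1 * * * /usr/local/bin/duplicity-backup.sh
```

If the machine crashes mid-backup, the lock directory is left behind and has to be removed by hand before the next run will proceed.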

Restoring Backups

To restore the latest backup version:

$ duplicity restore ssh://user@host//backup/path /local/restore/path

This restores the backup stored in the remote location to the specified local path.
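Often only one file or folder is needed rather than the whole backup. A minimal sketch using the --file-to-restore option follows. The path is given relative to the backed-up source directory; the function name, example paths and DUPLICITY override are illustrative. Option names can differ between releases, so check duplicity --help for your version.

```shell
#!/bin/sh
# Sketch: restore a single file or folder from a backup.
# Set DUPLICITY=echo to print the arguments instead of restoring.
DUPLICITY="${DUPLICITY:-duplicity}"

restore_path() {
    rel_path="$1"; backup_url="$2"; dest="$3"
    # rel_path is relative to the root of the backed-up source directory.
    "$DUPLICITY" restore --file-to-restore "$rel_path" "$backup_url" "$dest"
}

# Example call (not run automatically):
# restore_path docs/report.txt ssh://user@host//backup/path /local/restore/report.txt
```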

To restore an earlier version from a specific date:

$ duplicity restore --time 2020-01-01T12:30:00 ssh://user@host//backup/path /local/restore/path

List all stored backup versions:

$ duplicity collection-status ssh://user@host//backup/path

Delete old backups:

$ duplicity remove-older-than 6M --force ssh://user@host//backup/path

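This deletes all backup sets older than 6 months, except those that newer backups still depend on. Without --force, Duplicity only lists the sets that would be deleted.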
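Retention can also be expressed as a number of full backups to keep rather than an age, using the remove-all-but-n-full command. The sketch below wraps it with a basic sanity check on the count. The function name and the DUPLICITY override are assumptions for illustration.

```shell
#!/bin/sh
# Sketch: keep only the N most recent full backups and their increments.
# Set DUPLICITY=echo to print the arguments instead of deleting anything.
DUPLICITY="${DUPLICITY:-duplicity}"

prune_backups() {
    keep_fulls="$1"; backup_url="$2"
    # Reject empty or non-numeric counts before touching the archive.
    case "$keep_fulls" in
        ''|*[!0-9]*)
            echo "keep count must be a positive integer" >&2
            return 1 ;;
    esac
    if [ "$keep_fulls" -le 0 ]; then
        echo "keep count must be a positive integer" >&2
        return 1
    fi
    "$DUPLICITY" remove-all-but-n-full "$keep_fulls" --force "$backup_url"
}

# Example call (not run automatically):
# prune_backups 3 ssh://user@host//backup/path
```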

In this manner you can manage your backup archives and restore selected versions whenever required.

Duplicity on Mac with Homebrew

On macOS, Duplicity can be installed via Homebrew:

$ brew install duplicity

Usage remains the same as on Linux. Note that the destination must be given as a URL:

$ duplicity /path/to/source file:///path/to/destination

Schedule backups in the same way using the native crontab:

$ crontab -e

Duplicity on Windows

Duplicity can be installed on Windows using the Cygwin Linux environment.

First install Cygwin with the rsync and python packages.

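Then install Duplicity via pip from the Cygwin terminal:

$ pip install duplicity

Now you can use Duplicity to back up files locally or to mounted Windows shares. Inside Cygwin, Windows drives appear under /cygdrive, and the destination must be given as a URL:

$ duplicity /cygdrive/c/Users/user/Documents file:///cygdrive/e/Backups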

Automate the scheduled backups using the Task Scheduler.

A Windows native port of Duplicity called cwDup is also available though with reduced functionality.

Duplicity Best Practices

Here are some best practices to follow when using Duplicity:

  • Back up to remote or offline storage for protection against malware, ransomware etc.
  • Encrypt and sign backups to ensure security. Passphrase protect your GPG private key.
  • Validate backups by performing restores periodically.
  • Retain multiple versions but prune old backups to save space.
  • Store GPG keys and configs separately from backup data.
  • Test a new backup job with a trial restore before relying on it.
  • Automate on a schedule but also back up manually after important changes.
  • Send backup logs/notifications to monitor job status.
  • Split large volumes across multiple disk drives for faster throughput.
  • Take filesystem snapshots to avoid backing up open files in an unstable state.
  • Isolate backups from the network with an air gap for maximum security.
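Periodic validation can be partly automated with Duplicity's verify command. The following is a minimal sketch that checks a backup against the local files and reports the result. The function name, log file and DUPLICITY override are hypothetical.

```shell
#!/bin/sh
# Sketch: verify a backup and report success or failure.
# Set DUPLICITY to another command to exercise the function without a real backup.
DUPLICITY="${DUPLICITY:-duplicity}"

verify_backup() {
    backup_url="$1"; local_dir="$2"; log_file="$3"
    # --compare-data also compares file contents with the local copies.
    if "$DUPLICITY" verify --compare-data "$backup_url" "$local_dir" \
        >> "$log_file" 2>&1; then
        echo "verify OK: $backup_url"
    else
        echo "verify FAILED: $backup_url (see $log_file)" >&2
        return 1
    fi
}

# Example call (not run automatically):
# verify_backup ssh://user@host//backup/path /path/to/source /var/log/duplicity-verify.log
```

With --compare-data, files changed since the last backup are reported as differences, so a failure here does not always mean the archive is damaged. An occasional full restore to a scratch directory remains the stronger test.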

Conclusion

Duplicity is a robust open source solution for encrypted incremental backups. It provides a lot of flexibility in backend storage options. Utilizing GPG encryption allows for secure transfers and storage of backup archives.

With this guide you should now be able to set up automated Duplicity backup jobs to a local or remote location. Storing backups off-site or in cloud storage protects against local disasters and provides redundancy.

Regular testing and validation of backups ensures your data is protected when needed for restores. Following best practices around security, validation and monitoring helps maintain robust backups.
