How to Use Tar in Linux (Ubuntu, CentOS, Red Hat)

Use Tar in Linux Ubuntu 18.04 / 20.04 / 22.04, Debian, CentOS 7 / 8 or Red Hat 7

Tar is a utility for archiving and compressing files in Linux and UNIX-like operating systems. It allows you to bundle multiple files and directories into a single .tar archive file, while preserving permissions and directory structures.

Tar is extremely useful for backing up data, moving groups of files between systems, and preparing source code distributions. In this comprehensive guide, we will cover the basics of using tar, as well as some more advanced features and examples.

An Overview of Tar

Tar stands for Tape ARchive. It was originally used to archive data onto tape drives for long-term storage. While tar can still write archives to tape drives, it is more commonly used with regular files and pipes today.

Some key facts about tar:

  • Tar archives bundle multiple files and directories into a single .tar file.
  • Archives can be compressed using gzip or bzip2 to save space.
  • Tar preserves permissions, ownership, file modification times, etc.
  • Archives can span multiple tapes/volumes (for backing up to tape drives).
  • Tar is standardized – archives created on one UNIX system can be extracted on any other compatible UNIX system.
  • Tar is a very old utility that exists on every Linux and UNIX platform. It has decades of widespread use which makes it very stable and reliable.

Let’s go over some common tar terminology:

  • Archive – The .tar file created by bundling files using tar. This file contains the archived content. Archives can be compressed using gzip/bzip2 which are signified by .tar.gz or .tar.bz2 file extensions.
  • Bundle – Synonym for archive.
  • Tarball – A tarball is also just another way to refer to a .tar.gz or .tar.bz2 archive file.
  • Extract – The process of unbundling an archive and writing the extracted files to disk.
  • Compression – Tar supports optional compression with gzip or bzip2 to save space. The compressed archives have .tar.gz or .tar.bz2 extensions.
  • Append – Adding files to an existing archive. Does not affect existing content.
  • Concatenation – Combining two archives end-to-end.

Now that we understand tar terminology and basics, let’s move on to some usage examples.

Creating Archives

To create a new tar archive, use the tar -cvf command. This breaks down as:

  • -c – Creates a new archive.
  • -v – Verbose output. Lists files processed.
  • -f <archive-name> – Output filename.

For example:

$ tar -cvf archive.tar /path/to/folder

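This will archive the given folder recursively into archive.tar. GNU tar strips the leading / from member names, so the archive later extracts relative to the current directory. You can give multiple file/folder paths to add multiple entries: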

$ tar -cvf archive.tar /path/one /path/two /path/three

To compress the archive using gzip, use -czvf instead of just -cvf:

$ tar -czvf archive.tar.gz /path/to/folder

For bzip2 compression, use -cjvf:

$ tar -cjvf archive.tar.bz2 /path/to/folder

You can control the verbose output using -v. Omit it to hide the file listings:

$ tar -cf archive.tar /path/to/folder
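
As a minimal sketch of the commands above, the script below builds a small sample tree in a temporary directory and archives it twice, once plain and once with gzip. The directory and file names (project, readme.txt, notes.txt) are made up for illustration, and -C is used so that member names are stored relative to the scratch directory rather than as full paths.

```shell
#!/bin/sh
set -eu

# Hypothetical scratch area so the example never touches real data.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

mkdir -p "$work/project/docs"
echo "hello" > "$work/project/readme.txt"
echo "notes" > "$work/project/docs/notes.txt"

# -C changes into $work first, so members are stored as project/...
tar -cf "$work/project.tar" -C "$work" project

# Same content, compressed with gzip (-z).
tar -czf "$work/project.tar.gz" -C "$work" project

ls -l "$work/project.tar" "$work/project.tar.gz"
```

The compressed file should come out noticeably smaller, since an uncompressed tar archive is padded with zero-filled blocks.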

Viewing Archive Contents

To view files contained within a tar archive without extracting it, use:

$ tar -tf archive.tar

The -t lists the contents.

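For compressed archives, add the matching compression flag (GNU tar and bsdtar detect the compression automatically when reading a file, so the flag is optional there):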

$ tar -tzf archive.tar.gz
$ tar -tjf archive.tar.bz2
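
Listing is also handy in scripts, for example to confirm that a file made it into a backup. The sketch below assumes a hypothetical site/ directory created only for the demonstration; it prints a long listing and then checks for one member without extracting anything.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical sample content.
mkdir "$work/site"
echo "<p>hi</p>" > "$work/site/index.html"
tar -czf "$work/site.tar.gz" -C "$work" site

# Adding -v to -t gives a long listing: mode, owner, size, time, name.
tar -tzvf "$work/site.tar.gz"

# Look for one member by its exact stored name.
if tar -tzf "$work/site.tar.gz" | grep -qx 'site/index.html'; then
    echo "index.html is in the archive"
fi
```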

Extracting Archives

To extract an archive, use -xf:

$ tar -xf archive.tar

This will extract the contents of archive.tar in the current directory while preserving permissions and attributes.

For compressed archives:

$ tar -xzf archive.tar.gz
$ tar -xjf archive.tar.bz2 

You can extract to a specific directory using -C:

$ tar -xf archive.tar -C /tmp/extract-here

This will extract the archive into /tmp/extract-here. The target directory must already exist; tar does not create it.
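
A short sketch of extraction with -C, using invented file names (src/conf/app.ini, src/CHANGES). It creates the target directories first, extracts the whole archive into one of them, and then extracts a single named member into the other.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical sample tree and archive.
mkdir -p "$work/src/conf"
echo "port=8080" > "$work/src/conf/app.ini"
echo "changelog" > "$work/src/CHANGES"
tar -cf "$work/src.tar" -C "$work" src

# The -C target has to exist before extraction.
mkdir -p "$work/full" "$work/partial"
tar -xf "$work/src.tar" -C "$work/full"

# Naming a member after the options extracts only that member.
tar -xf "$work/src.tar" -C "$work/partial" src/conf/app.ini

find "$work/full" "$work/partial" -type f | sort
```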

Appending to Archives

You can append files/directories to an existing tar archive using -rvf instead of -cvf:

$ tar -rvf archive.tar /new/folder

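This will add /new/folder recursively to archive.tar without affecting existing contents. Appending works only on uncompressed archives; a .tar.gz or .tar.bz2 file has to be decompressed first.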
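
The following sketch shows an append in isolation. The two file names are placeholders; the archive starts with one member and gains a second through -r.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical sample files.
echo one > "$work/one.txt"
echo two > "$work/two.txt"

# Start with a single member, then append a second one.
tar -cf "$work/files.tar" -C "$work" one.txt
tar -rf "$work/files.tar" -C "$work" two.txt

# The listing shows members in the order they were added.
tar -tf "$work/files.tar"
```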

Updating Archives

To update existing files in an archive or add new files, use -uvf:

$ tar -uvf archive.tar /path/to/update

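This appends any files under /path/to/update that are new or have a newer modification time than the copy already in the archive. The older copies are not removed; on extraction the most recently added copy overwrites the earlier ones. Like -r, this works only on uncompressed archives.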
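
To make the behaviour of -u concrete, here is a small sketch with one hypothetical file, config.txt. The modification times are set explicitly with touch so that the second version is clearly newer than the archived one. After the update the archive lists the file twice, and extraction yields the newer content.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT
cd "$work"

# First version, with a fixed older timestamp.
echo "v1" > config.txt
touch -t 202001010000 config.txt
tar -cf conf.tar config.txt

# Second version, with a newer timestamp than the archived copy.
echo "v2" > config.txt
touch -t 202101010000 config.txt
tar -uf conf.tar config.txt

# Both copies are present in the archive.
tar -tvf conf.tar

# Extraction processes members in order, so the later copy wins.
mkdir out
tar -xf conf.tar -C out
cat out/config.txt
```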

Deleting from Archives

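GNU tar can remove members from an uncompressed archive in place with --delete. A portable alternative, which also works for compressed archives, is to create a new archive without those files.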

First, extract the archive contents to a temporary location:

$ mkdir /tmp/archive-temp
$ tar -xf archive.tar -C /tmp/archive-temp

Then delete the file you want removed from the temporary folder.

Finally, create a new archive from the temporary folder, using -C so that the temporary path itself is not stored in the member names:

$ tar -cf new-archive.tar -C /tmp/archive-temp .

new-archive.tar will now contain the archive contents minus the deleted file.
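
The three steps can be sketched end to end as follows. The names data/keep.txt and data/secret.txt are invented for the example, and a mktemp directory stands in for /tmp/archive-temp.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical archive containing one file we want to drop.
mkdir "$work/data"
echo keep > "$work/data/keep.txt"
echo drop > "$work/data/secret.txt"
tar -cf "$work/archive.tar" -C "$work" data

# 1. Unpack into a scratch directory.
mkdir "$work/archive-temp"
tar -xf "$work/archive.tar" -C "$work/archive-temp"

# 2. Remove the unwanted file.
rm "$work/archive-temp/data/secret.txt"

# 3. Repack; -C keeps the member names the same as before (data/...).
tar -cf "$work/new-archive.tar" -C "$work/archive-temp" data

tar -tf "$work/new-archive.tar"
```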

Excluding Files/Paths

To exclude certain files/paths when creating an archive, use --exclude:

$ tar -cvf archive.tar --exclude=/path/to/exclude /path

This will prevent /path/to/exclude from being added to archive.tar. Placing --exclude before the paths it applies to is the safest ordering across tar versions.

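You can pass several --exclude options, and each one may be a pattern. Quote patterns so the shell does not expand them before tar sees them. For example, to exclude all .log files:

$ tar -cvf archive.tar --exclude='*.log' /path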
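
Here is a small sketch combining a quoted wildcard exclude with a path exclude. The app/ tree and its contents are hypothetical; the point is that neither the .log file nor the cache directory appears in the listing.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical application tree with files we do not want to archive.
mkdir -p "$work/app/logs" "$work/app/cache"
echo code  > "$work/app/main.py"
echo noise > "$work/app/logs/run.log"
echo tmp   > "$work/app/cache/blob.bin"

# Quoted pattern plus a directory path; both placed before the input path.
tar -cf "$work/app.tar" -C "$work" \
    --exclude='*.log' --exclude='app/cache' app

tar -tf "$work/app.tar"
```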

Including Only Matched Paths

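Instead of excluding certain paths, you can list exactly what to include in a file and pass it with -T:

$ tar -cvf archive.tar -T include-list.txt

Here include-list.txt contains one path per line. When creating an archive, tar takes these names literally, so a pattern like *.py in the list is not expanded; to select files by extension, generate the list with a tool such as find.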
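
One way to build such a list, sketched here with made-up file names, is to let find select the files and write their paths to include-list.txt. This simple form assumes that no file name contains a newline.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical source tree with mixed file types.
mkdir -p "$work/src/pkg"
echo "print('a')" > "$work/src/a.py"
echo "print('b')" > "$work/src/pkg/b.py"
echo "docs"       > "$work/src/README.md"

cd "$work/src"

# find does the pattern matching; tar only reads the resulting names.
find . -type f -name '*.py' > "$work/include-list.txt"
tar -cf "$work/python-only.tar" -T "$work/include-list.txt"

tar -tf "$work/python-only.tar"
```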

Compression Options

Tar does not compress by default; you select a compressor with a flag. In GNU tar:

  • For gz (gzip): -z
  • For bz2 (bzip2): -j
  • For xz: -J
  • For lzma: --lzma
  • For lzop: --lzop
  • For the legacy compress format (.Z): -Z

For example:

$ tar -cjf archive.tar.bz2 /path # bzip2 compression

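Most compressors accept a level from 1 to 9 (higher = better compression but slower). GNU tar has no dedicated level flag, and in "-czf -9" the -9 would be taken as the archive name, so pipe the archive through the compressor instead:

$ tar -cf - /path | gzip -9 > archive.tar.gz # gzip level 9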
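
As a rough sketch of choosing a level, the script below writes the same hypothetical logs/ directory at gzip levels 1 and 9 by piping tar's output through gzip, then prints both sizes for comparison. It assumes gzip is installed; the sample data is generated only to give the compressor something repetitive to work on.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Generate repetitive sample data.
mkdir "$work/logs"
i=0
while [ "$i" -lt 2000 ]; do
    echo "line $i: the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done > "$work/logs/sample.txt"

# tar writes the archive to stdout (-f -); gzip applies the chosen level.
tar -cf - -C "$work" logs | gzip -1 > "$work/fast.tar.gz"
tar -cf - -C "$work" logs | gzip -9 > "$work/small.tar.gz"

ls -l "$work/fast.tar.gz" "$work/small.tar.gz"
```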

Archive Verification

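GNU tar can verify an archive against the file system right after writing it, using -W (--verify). The option has to accompany an operation such as -c; it is not a standalone check:

$ tar -cWvf archive.tar /path/to/folder

This re-reads the archive once it is written and compares it with the original files.

-W cannot be combined with compression. For a compressed archive, test the compressed stream and confirm the contents can be listed instead:

$ gzip -t archive.tar.gz
$ tar -tzf archive.tar.gz > /dev/null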
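
One possible integrity check for gzip-compressed archives is sketched below. It does not rely on -W: gzip -t checks the compressed stream, and tar -t confirms that the member headers can be read. The helper name check_archive and the sample files are hypothetical, and damage is simulated by truncating a copy of the archive.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical sample archive.
mkdir "$work/docs"
echo "important" > "$work/docs/report.txt"
tar -czf "$work/docs.tar.gz" -C "$work" docs

# Succeeds only if the gzip stream is intact and tar can list every member.
check_archive() {
    gzip -t "$1" 2>/dev/null && tar -tzf "$1" >/dev/null 2>&1
}

# Simulate corruption by keeping only the first 40 bytes of a copy.
head -c 40 "$work/docs.tar.gz" > "$work/broken.tar.gz"

for f in "$work/docs.tar.gz" "$work/broken.tar.gz"; do
    if check_archive "$f"; then
        echo "OK      ${f##*/}"
    else
        echo "CORRUPT ${f##*/}"
    fi
done
```

This only shows that the archive is readable. Comparing against the original files is a separate step, for example with tar -d (--diff) in GNU tar.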

Tar Over Pipes and Remote Access

Tar can read/write archives locally or remotely via stdin/stdout pipes.

For example, to create a tar over SSH:

$ ssh user@host 'tar -cf - /path/to/archive' | tar -xvf -

This pipes the tar output over SSH to extract locally.

You can also create an archive locally and pipe it over SSH for remote extraction:

$ tar -cf - /path/to/archive | ssh user@host 'tar -xvf -'

Compressing the stream before it goes through SSH and decompressing it on the other side can speed up transfers over slow links:

$ tar czf - /path/to/archive | ssh user@host 'tar xvzf -'

These are just some examples – tar gives you a lot of flexibility with pipes.
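
The same producer and consumer pattern can be tried locally without a remote host. The sketch below copies a hypothetical directory tree through a pipe. Wrapping either side in ssh user@host '...' as in the commands above would move that half to another machine.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical source tree and an empty destination.
mkdir -p "$work/source/sub" "$work/dest"
echo payload > "$work/source/sub/file.txt"

# Left side writes the archive to stdout (-f -);
# right side reads it from stdin and unpacks into dest.
tar -cf - -C "$work/source" . | tar -xf - -C "$work/dest"

# Confirm that both trees have the same content.
diff -r "$work/source" "$work/dest" && echo "trees match"
```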

Splitting/Spanning Archives

If your archive does not fit onto a single volume like a tape drive or disk, you can split tar archives into multiple chunks.

To split by size, write the archive to stdout and pipe it into split:

$ tar -cvf - /path | split -b 1G - archive.tar.

This writes the archive in 1 GB chunks named archive.tar.aa, archive.tar.ab, etc.

You can also use numeric suffixes:

$ tar -cvf - /path | split -b 100m -d -a 5 - archive.tar.

This writes 100 MB chunks. The -d flag selects numeric suffixes and -a 5 sets the suffix length to five digits (it does not set the number of parts), giving archive.tar.00000, archive.tar.00001, and so on.

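To reconstruct the archive from the chunks, use cat to concatenate them back in order. The shell sorts the glob matches, which puts both alphabetic and numeric suffixes in the right order:

$ cat archive.tar.* > archive.tar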

Then extract as usual with tar -xf archive.tar.
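
A round trip through split and cat can be sketched as follows. The 4096-byte chunk size is deliberately tiny so that a small demo archive still produces several pieces; real backups would use something like 1G. The file and prefix names are placeholders.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical sample data.
mkdir "$work/data"
echo "chunk me" > "$work/data/file.txt"

# Stream the archive into split; pieces get the suffixes aa, ab, ...
tar -cf - -C "$work" data | split -b 4096 - "$work/backup.tar."
ls "$work"/backup.tar.*

# Glob expansion is sorted, so cat reassembles the pieces in order.
cat "$work"/backup.tar.* > "$work/rebuilt.tar"
tar -tf "$work/rebuilt.tar"
```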

Archiving Special Files

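  • Device files like /dev/sdb are stored by GNU tar as device nodes (not their contents) without any extra option; recreating them on extraction generally requires root.
  • Hard links are tracked by default and stored as links. Use --hard-dereference if you want GNU tar to store a full copy of the file content for each link instead.
  • For archiving extended attributes, use --xattrs; GNU tar also provides --acls and --selinux for ACLs and SELinux contexts.
  • Empty directories are kept in the archive by default. --keep-directory-symlinks is an extraction option: it keeps existing symlinks to directories instead of replacing them.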

Refer to the tar documentation for details on these and other specialty archiving options.

Useful Tar Flags/Examples

Here is a quick reference of some useful tar flags and operations:

# Create archive
$ tar -cf archive.tar /path/to/files
# Compressed archive 
$ tar -czf archive.tar.gz /path/to/files
# View archive contents
$ tar -tf archive.tar
# Extract archive 
$ tar -xf archive.tar
# Extract to specific folder
$ tar -xf archive.tar -C /tmp 
# Append files to archive
$ tar -rvf archive.tar file1 file2
# Update files in archive
$ tar -uvf archive.tar file1
# Delete file from archive
$ tar --delete -f archive.tar file_to_delete 
# Archive a remote folder over SSH
$ ssh user@host 'tar -cf - /path/to/archive' | tar -xvf -
# Verify archive while creating it (GNU tar, uncompressed only)
$ tar -cWf archive.tar /path/to/files
# Compression levels 1-9
$ tar -cf - /path | gzip -9 > archive.tar.gz
# Split archive into chunks 
$ tar -cf - /path | split -b 100m -d -a 5 - archive.tar.

This covers a wide range of tar usage examples. Be sure to refer to the man pages for your specific tar implementation for more details and supported flags.

Conclusion

Tar is an essential tool for working with groups of files on Linux/UNIX systems. It allows you to bundle any number of files, directories, and special files into a single portable archive that preserves permissions and attributes. It has many everyday uses: file backups, transfers, Docker and CI/CD workflows, and software distribution. It is a long-standardized UNIX utility that ships with virtually every Linux distribution and with macOS.

Hopefully this guide gave you a broad overview of tar and how to use it effectively for managing archives on a Linux system like Debian, Ubuntu 18.04 / 20.04 / 22.04, CentOS 7 / 8 or Red Hat 7.

If you have any questions or would like to know more about this article, please post your question in the comments.