- A tarball groups and compresses source code while preserving structure and permissions, making it the standard format for distributing projects on GNU/Linux.
- The key tar options (c, x, t, v, f, z, j) let you create, list, and extract .tar, .tar.gz, and .tar.bz2 archives efficiently and flexibly.
- Most programs are compiled from tarballs using systems like Autotools or CMake, following the configure, compile, and install flow.
- Well-named and structured tarballs are the basis for generating .deb, .rpm packages and universal formats like Snap or Flatpak.
If you work with Linux, sooner or later you're going to come across a code tarball: that typical .tar, .tar.gz, or .tar.bz2 file containing a complete project ready to compile or package. At first glance it might seem a bit intimidating, but once you understand how to create and use one, it becomes an essential tool in your daily work.
In this guide you will see how to create a code tarball package step by step: how to compress and decompress with tar, how to prepare a project for distribution as source code, and how to take that tarball a step further by packaging it as .deb, .rpm, or even in universal formats like Snap, Flatpak, or AppImage. Everything is explained in detail, but in simple terms, so you can put it into practice with confidence.
What exactly is a tarball and why is it used so much?
A tarball is nothing more than an archive generated with the tar tool, which groups many files and directories into one. Originally, "tar" came from tape archive, because it was used to dump data onto tapes, but today it is mainly used to package source code and make backups.
When you see extensions like .tar, .tar.gz, .tgz, or .tar.bz2, you're actually seeing a tar file plus a compression layer. The typical process is: first, all the files are grouped with tar, and then that tar is compressed with gzip (.gz), bzip2 (.bz2), or similar to reduce the size.
In the GNU/Linux ecosystem, tarballs are used to distribute the original source code of programs. They are also used to move large directory trees between servers, to back up websites or data, and to share projects independently of the distribution (they don't depend on .deb, .rpm, etc.). That's why they are so universal.
A key point is that tar preserves permissions, owners, and directory structure. This is crucial when you package code or configurations that need to be installed as-is on another machine.
Basic commands with tar: create, list and extract
The tar command is one of the most used in Linux, and despite having many options, remembering just a few covers almost all everyday use. The basic syntax always follows the same pattern:
tar [options] file.tar paths…
Among the most common options for working with code tarballs are c, x, t, v, f, z, and j. Each combination allows you to create, browse, or extract different types of .tar, .tar.gz, or .tar.bz2 files without too much hassle.
Create a simple .tar file
To package uncompressed code, for example the directory ~/projects/miprog-1.0, you can run:
tar -cvf miprog-1.0.tar ~/projects/miprog-1.0
The options in this tar command are very typical: c creates a new archive, v shows on screen what is being added (verbose mode), and f indicates the name of the output file (miprog-1.0.tar). The result is an uncompressed tarball that you can then distribute or keep as a backup.
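As a self-contained sketch of that same command (using a throwaway demo/ directory and a hypothetical main.c instead of a real project under ~/projects):

```shell
# Create a throwaway project tree so the example is reproducible
mkdir -p demo/miprog-1.0
echo 'int main(void){return 0;}' > demo/miprog-1.0/main.c

# c = create, v = verbose, f = name of the output archive.
# -C demo makes tar store paths relative to demo/, so the archive
# contains miprog-1.0/... instead of the full path.
tar -cvf miprog-1.0.tar -C demo miprog-1.0

# Confirm the archive exists and holds the expected paths
tar -tf miprog-1.0.tar
```

Note that when you pass an absolute path such as ~/projects/miprog-1.0, tar strips the leading / and stores the whole relative path; -C is a common way to keep member names short.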
Create a compressed .tar.gz or .tgz tarball
In practice, it's almost always desirable for the code tarball to take up less space. That's what gzip is for, via tar's -z option, so that both steps (archiving + compression) happen in a single command:
tar -cvzf miprog-1.0.tar.gz ~/projects/miprog-1.0
Here the z option indicates that gzip should be used as the compressor. The generated file can be called .tar.gz or .tgz; for practical purposes they are equivalent. The content remains the same, but compressed, ideal for uploading to a downloads page, attaching to an email, or publishing as a release in a repository.
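A quick way to see the effect of -z is to build the same tree both ways and compare sizes (a sketch using a hypothetical demo/ directory filled with compressible text):

```shell
mkdir -p demo/miprog-1.0
seq 1 1000 > demo/miprog-1.0/data.txt            # some compressible content

tar -cvf  miprog-1.0.tar    -C demo miprog-1.0   # plain tar
tar -cvzf miprog-1.0.tar.gz -C demo miprog-1.0   # tar + gzip in one step

ls -l miprog-1.0.tar miprog-1.0.tar.gz           # the .gz is clearly smaller
```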
Create a compressed tarball .tar.bz2
If you need more aggressive compression (slightly smaller files at the cost of taking a little longer to compress and decompress), you can use bzip2 with the -j option:
tar -cvjf miprog-1.0.tar.bz2 ~/projects/miprog-1.0
The j flag tells tar to use bzip2. It is common to find name variations such as .tbz or .tb2, but the idea is identical: a tar compressed with bzip2 and ready to distribute.
List the contents of a tarball
Before extracting or touching anything, you might want to see what's inside the tarball. That's what the t option (for table of contents) is for:
tar -tvf miprog-1.0.tar
This command will give you a detailed list with file paths, permissions, and timestamps. The same pattern applies to .tar.gz and .tar.bz2, since GNU tar detects the compression automatically when reading an archive:
tar -tvf miprog-1.0.tar.gz
tar -tvf miprog-1.0.tar.bz2
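For example, with a small archive built on the spot (hypothetical names), you can check that t lists the contents without touching the filesystem:

```shell
mkdir -p demo/miprog-1.0
echo 'docs' > demo/miprog-1.0/README
tar -czf miprog-1.0.tar.gz -C demo miprog-1.0

# t = table of contents; v adds permissions, owner, size and date.
# GNU tar detects the gzip layer on its own when reading.
tar -tvf miprog-1.0.tar.gz
```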
Extracting files from a tarball
To extract the contents into the current directory, you only need to replace the c with an x, keeping the rest the same:
tar -xvf miprog-1.0.tar
If you want to extract into a particular directory, add the -C option followed by the destination path. This way you can keep your sources and backups organized:
tar -xvf miprog-1.0.tar -C /home/user/src/
The same applies to tarballs compressed with gzip or bzip2, simply by adding the corresponding compression option, as in tar -xvzf file.tar.gz or tar -xvjf file.tar.bz2.
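A minimal sketch of extraction with -C, using a throwaway archive and a dest/ directory (both hypothetical names):

```shell
mkdir -p demo/miprog-1.0 dest
echo 'hello' > demo/miprog-1.0/hello.txt
tar -cf miprog-1.0.tar -C demo miprog-1.0

# x = extract; -C dest/ switches to the destination before unpacking
tar -xvf miprog-1.0.tar -C dest/

ls dest/miprog-1.0   # the tree reappears under dest/
```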
Extract specific files and use wildcards
One of the advantages of tar is that you can extract only some files instead of decompressing everything. For example, if inside sampleArchive.tar there is a script called example.sh that you want to recover:
tar -xvf sampleArchive.tar example.sh
This same approach applies to compressed tarballs, adding z or j as appropriate. You can also use the --wildcards option to extract only files that match a pattern, which is very practical for things like extracting only .jpg images:
tar -xvf sampleArchive.tar --wildcards '*.jpg'
In the case of .tar.gz or .tar.bz2, the command would be similar, changing to -xvzf or -xvjf to match the compression method used in the original file.
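A sketch of the --wildcards behavior, with a throwaway archive containing a mix of .jpg and other files (hypothetical names):

```shell
mkdir -p photos out
touch photos/a.jpg photos/b.jpg photos/notes.txt
tar -cf sample.tar photos

# Only members matching the pattern are extracted; GNU tar's default
# matching lets '*' cross the directory separator, so '*.jpg' also
# matches photos/a.jpg
tar -xvf sample.tar -C out --wildcards '*.jpg'

ls out/photos   # only the .jpg files made it out
```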
Create a source code tarball from scratch
Beyond compressing individual folders, the classic use of the tarball in the Unix world is to distribute the source code of a program. The idea is very simple: the author prepares a clean directory with the project, packages it as a tar.gz or tar.bz2 file, and publishes it. Users download this tarball, extract it, and compile it locally.
In many cases, that code tree has a name of the form program-version. For example, hello-sh-1.0, so that the resulting tarball is called hello-sh-1.0.tar.gz and, when decompressed, it generates exactly that directory with everything needed inside.
A very typical exercise for practice is to create a mini project with a simple script, give it execute permissions, and then package it. For example:
$ mkdir -p hello-sh/hello-sh-1.0; cd hello-sh/hello-sh-1.0
$ cat > hello << EOF
#!/bin/sh
# (C) 2011 Foo Bar, GPL2+
echo "Hello!"
EOF
$ chmod 755 hello
$ cd ..
$ tar -cvzf hello-sh-1.0.tar.gz hello-sh-1.0
With this procedure you have generated a trivial source code tarball which you could then unpack on any Unix-like system and run the script without further ado.
Download, unpack, and prepare the source code
It is normal nowadays for the author of a free program to publish their code as tar.gz or tar.bz2 on their website or on GitHub-type platforms. These files contain the source tree, documentation (README, INSTALL, etc.) and, often, the files needed to generate Makefiles and compilation scripts.
If the project doesn't come as a tarball but from version control (Git, Subversion, CVS…), then you'll need to clone the repository with `git clone`, `svn co`, or `cvs co`, and then package that directory yourself with `tar`, preferably using the `--exclude-vcs` option to avoid including version control system metadata:
tar -cvzf project-1.0.tar.gz --exclude-vcs project-1.0
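The effect is easy to verify with a fake working copy (hypothetical names; the .git directory stands in for the repository metadata):

```shell
mkdir -p project-1.0/.git project-1.0/src
echo 'ref: refs/heads/main' > project-1.0/.git/HEAD
echo 'int main(void){return 0;}' > project-1.0/src/main.c

# --exclude-vcs skips .git, .svn, CVS and similar metadata directories
tar -cvzf project-1.0.tar.gz --exclude-vcs project-1.0

tar -tzf project-1.0.tar.gz   # listing shows src/ but no .git/
```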
Sometimes the source is compressed in other formats such as .zip or .Z. In that case, you decompress it with the appropriate tools and, if you want to follow the classic distribution standard in GNU/Linux, you repackage it as a tarball (tar.gz or tar.bz2) maintaining the program-version structure.
Install compilation tools and read the documentation
Before you start compiling anything from a tarball, it's important that your system has basic development tools. In Debian, Ubuntu, and derivatives, this is usually as simple as installing the build-essential metapackage:
sudo apt install build-essential
In other distributions the names change but the idea is the same: in Fedora/Red Hat you can use sudo dnf groupinstall "Development Tools", and in Arch Linux, sudo pacman -S base-devel. The important thing is that you have a compiler (gcc, g++), make, and the standard headers.
Once the tarball is decompressed, the first thing to do is enter the project directory and read the README and INSTALL files very carefully, because that's where the compilation method, dependencies, and any other peculiarities are usually explained. You can use less, cat, or your preferred editor for that.
Compiling source code from a tarball: classic method
Most simple projects distributed as tarballs include a Makefile (you can learn to create and master Makefiles) or a build system based on Autotools. In these cases, the typical workflow consists of three steps: configure, compile, and install, with some variations depending on the project.
In many cases, after decompressing the tarball, you simply need to run something like ./configure && make && sudo make install in the root directory of the code. But underneath there are quite a few details worth knowing so you don't go in blind.
Autotools: configure, Makefile and company
Many free programs use the GNU Autotools (Autoconf, Automake, Libtool, gettext...) to generate portable Makefiles for various platforms. If you see files like configure.ac, Makefile.am, or Makefile.in in the source tree, it's almost certain that the project is based on this system.
The author usually runs autoreconf -i -f in their environment to generate, from those source files, the configure script and the corresponding Makefile.in files. Then, on your machine, you call ./configure, which creates the final Makefiles adapted to your system (detecting libraries, paths, compiler, etc.).
The simplified flow looks like this: starting from configure.ac and Makefile.am, the configure script and Makefile.in files are built; then, by running ./configure, the Makefile and config.h files are generated; and finally, a simple make compiles the binary. If you need to adjust installation directories, you can pass options to the configure script, for example ./configure --prefix=/usr to change the standard installation prefix.
Compiling with CMake as a modern alternative
Another build system that is very widespread today is CMake. You'll recognize it by the presence of a CMakeLists.txt file in the project root. The pattern for compiling from a tarball usually involves creating a separate build directory, generating the configuration there, and compiling within it:
mkdir build
cd build
cmake ..
make
sudo make install
This approach keeps the source tree cleaner and allows you to generate projects for different platforms or IDEs using the same CMakeLists.txt files. The logic, however, remains the same as with Autotools: configure, compile and install.
Resolve dependency errors when configuring
It's quite common for the first pass of ./configure to fail because a library, header, or development dependency is missing. The script usually displays a clear message such as "libexample not found" or "cannot find header xyz.h".
In many distributions, libraries are packaged in three variants: the binary, the library, and the development library. Following the example pattern, you might have packages named example, libexample, and libexample-dev. To compile from the tarball, you usually need the -dev variant (it includes the headers and files necessary for linking), so if configure complains about libexample, the practical solution is usually to install libexample-dev and try again.
It's possible that after installing a -dev library, you run configure again and it complains about a different one. That's normal; it's part of the process: trial and error until you meet all the requirements. Once you get past that phase, configure finishes successfully and you can start compiling.
Compile with make and install with make install
Once the configuration step has gone well, you can run make to compile the project. Depending on the size of the program and the power of your machine, this process can take from a few seconds to several minutes, and you will see the compilation and linking lines appear in the terminal as they are executed.
If everything goes correctly and no compilation errors appear, all that remains is to install the result on the system with sudo make install. By convention, many projects install under /usr/local (binaries in /usr/local/bin, libraries in /usr/local/lib, configuration in /usr/local/etc…), so that they don't mix with the distribution's own packages.
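To see where files would land without touching the real /usr/local, many Makefiles honor the DESTDIR convention alongside prefix. A toy sketch (the miprog-1.0 project and its hand-written Makefile are invented; real generated Makefiles are far more elaborate):

```shell
mkdir -p miprog-1.0
printf '#!/bin/sh\necho "Hello!"\n' > miprog-1.0/hello
chmod 755 miprog-1.0/hello

# A stand-in Makefile with install/uninstall targets honoring
# prefix and DESTDIR (printf keeps the required literal tabs)
printf 'prefix ?= /usr/local\ninstall:\n\tmkdir -p $(DESTDIR)$(prefix)/bin\n\tcp hello $(DESTDIR)$(prefix)/bin/\nuninstall:\n\trm -f $(DESTDIR)$(prefix)/bin/hello\n' > miprog-1.0/Makefile

# Stage into a local directory instead of the real /usr/local (no sudo)
make -C miprog-1.0 install DESTDIR="$PWD/stage"
ls stage/usr/local/bin
```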
Some projects also implement the make uninstall target, which allows you to revert the installation and delete the files copied by `make install`. Not all projects include it, but if it's listed in the documentation, it's a relatively convenient way to uninstall what you've compiled from the tarball.
When and how to create a tarball for packaging in Debian
In the Debian world and its derivatives, the original code tarball is usually called name_version.orig.tar.gz. The idea is to clearly separate the "upstream" source code from the Debian-specific modifications (patches, control files, post-installation scripts, etc.) that are added in the debian directory of the source package.
If the original developer named their tarball gentoo-0.9.12.tar.gz, for example, as a Debian maintainer you can rename it to gentoo_0.9.12.orig.tar.gz and work from there. On top of that source, you will add the debian subdirectory with all the necessary files to build the .deb files.
The dh_make tool makes this task much easier: starting from gentoo-0.9.12.tar.gz, it decompresses the source code, generates the appropriate .orig tarball, and creates templates in debian/ that you can then adjust. After running it, you'll see that your working directory contains both the renamed original and the source tree ready for packaging.
Properly name the package and version on tarballs and packages
When distributing source code as a tarball, it is essential that the file name and the internal directory follow a consistent convention. In the Debian ecosystem, for example, it is recommended that the package name use only lowercase letters, digits, plus and minus signs, and periods, with a reasonable length and always starting with an alphanumeric character.
The version number, for its part, should contain only alphanumeric characters, +, ~, and periods, always starting with a digit. This makes it easier for tools like dpkg to compare versions correctly. If the program author uses unusual strings (dates with text, Git hashes, etc.), you can record that information in the Debian changelog, but it's best to keep the "upstream version" field of the tarball clean and tidy.
For pre-releases or release candidates, a ~ (tilde) is used so that dpkg can interpret the ordering correctly. For example, if you have a pre-release gentoo-0.9.12-ReleaseCandidate-99.tar.gz, you can rename it to gentoo-0.9.12~rc99.tar.gz, so that 0.9.12 is considered higher than 0.9.12~rc99 when the final version arrives.
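On Debian systems you can verify an ordering with dpkg --compare-versions; GNU sort -V implements essentially the same rules, including the special tilde, so the effect is easy to see anywhere:

```shell
# '~' sorts before anything, even the end of the string, so the
# release candidate comes before the final 0.9.12
printf '0.9.12\n0.9.12~rc99\n' | sort -V
```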
From tarball to binary package: package as .deb
A code tarball is just the first step if your goal is to distribute the program as an installable .deb package, one of the most popular software package formats. Starting from the original tar.gz, you can create the necessary structure to generate a .deb, add the compiled binary and metadata, and build the package with dpkg-deb.
Inside, a .deb file basically contains three parts: debian-binary (a minimal text file with the format version), control.tar.gz (metadata and control scripts) and data.tar.gz (the binaries and files that will be installed). With the `ar` utility you can inspect a .deb file and see exactly these components.
To package something very simple, like a "hello world" program in C, you could first compile the binary from the source code in the tarball and then assemble the directory structure of the .deb manually. You would create a directory called hola/, inside it a DEBIAN subdirectory with the control file, and in the appropriate path (for example, hola/usr/bin/) you would copy the executable. Then you would simply call dpkg-deb --build hola to generate hola.deb.
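A minimal sketch of that layout, using a shell script instead of a compiled binary so it stays self-contained (package name, maintainer, and paths are invented; requires dpkg-deb, standard on Debian/Ubuntu):

```shell
mkdir -p hola/DEBIAN hola/usr/bin

# The payload: in a real case this would be the binary you compiled
printf '#!/bin/sh\necho "Hola!"\n' > hola/usr/bin/hola
chmod 755 hola/usr/bin/hola

# Minimal control file with the mandatory fields
cat > hola/DEBIAN/control << 'EOF'
Package: hola
Version: 1.0
Architecture: all
Maintainer: Foo Bar <foo@example.com>
Description: Trivial hello world package
EOF

dpkg-deb --build hola     # produces hola.deb
dpkg-deb --info hola.deb  # shows the metadata we just wrote
```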
From tarball to RPM package and other formats
Something very similar happens in the Red Hat world with RPM packages. The source code is usually also distributed in tarballs, and from there SRPMs (source RPM packages) are generated that contain the original tarball, one or more patches, and the .spec file that describes how to compile and install the program.
In this case, the working directory structure includes paths such as RPMS, SRPMS, SOURCES, SPECS… The original tarball goes into SOURCES, the .spec file into SPECS, and from there you can use rpmbuild to generate the binary .rpm and, if desired, the .src.rpm. The .spec file contains data such as the package name, version, dependencies, build commands, and the list of files that should end up on the system.
The end result is very similar to a .deb: a binary package that the user can install with their usual package manager, but whose origin is usually an organized and well-versioned source code tarball.
Create tarballs and package in universal formats
In recent years, formats such as Snap, Flatpak, and AppImage have aimed to offer "universal" packages for Linux, avoiding much of the fragmentation between distributions. Although each has its own mechanics, all of them can take as their starting point a code tree that you could also distribute as a tarball.
In Snap, for example, the snapcraft.yaml file defines how your application is built: name, version, source (which can be a URL to a remote tarball), build plugins (autotools, cmake, etc.), and the final command to execute. Snapcraft takes care of downloading the tarball, compiling it, and packaging it along with its dependencies.
Flatpak works with a JSON or YAML manifest that describes the runtime, the SDK, and the modules to be built, also specifying the source (which can be a local file or a remote tarball). Again, the tarball is the natural container for the source code that will later be integrated into the universal package.
In AppImage, although you often start from pre-compiled binaries, you can also get there from a classic code tree: you compile from the tarball, generate the AppDir structure, and finally create the self-executing image that the user can launch without any actual installation.
Tarballs, direct compression, and other practical uses
Besides packaging code, tar is used extensively to compress and decompress generic data: configuration directories, personal folders, one-off backups, etc. On Unix-like systems, including macOS and many versions of Android, tar usually comes as standard and, combined with gzip, offers a very reasonable compression ratio, often around 50% for text-like data, at fairly high speed.
But be aware, there are nuances: tools like gzip or bzip2 only compress single files, not directories. When you run `tar -czf` or `tar -cjf`, tar actually first groups the directories into a stream and then passes the result to gzip or bzip2. That's why it's so practical to use tar as a front end for these utilities.
If you want to quickly check the size a tarball will have or measure the impact of compression, you can chain tar with wc -c to count bytes, using a dash (-) so that tar writes to standard output. For example:
tar -czf - sampleArchive.tar | wc -c
This trick makes it easy to get an idea of how much weight you're saving by packaging a specific directory, which is very useful when preparing code distributions or backups that you're going to move over the network.
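A self-contained version of the trick, comparing the raw and gzip-compressed sizes of a hypothetical demo/ directory:

```shell
mkdir -p demo
seq 1 5000 > demo/data.txt        # some compressible content

raw=$(tar -cf  - demo | wc -c)    # tar stream, no compression
gz=$(tar -czf - demo | wc -c)     # same stream through gzip
echo "uncompressed: $raw bytes, gzipped: $gz bytes"
```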
As you can see, mastering the use of tarballs to package and distribute code step by step opens many doors: from creating simple compressed copies of your projects to preparing clean sources to package as .deb, .rpm, or universal formats, all while maintaining permissions, directory structure, and good versioning practices that will make your life (and that of those who use your software) much easier.