In this tutorial, we will show how to compress files with GZIP in Java. As Wikipedia's description of GZIP notes, the format is normally used to compress a single file. To compress multiple files, we first bundle them into a tarball (.tar) and then compress the tarball with GZIP.
For single-file compression, we don’t need to add any libraries or dependencies to our project. The API is already available in the JDK, in the java.util.zip package.
Below is sample code that compresses a file with GZIP using GZIPOutputStream.
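    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.zip.GZIPOutputStream;

    public static void compressGZip(Path fileToCompress, Path outputFile) throws IOException {
        try (GZIPOutputStream gzipOutputStream = new GZIPOutputStream(Files.newOutputStream(outputFile))) {
            // Stream the file into the GZIP stream instead of loading it all into memory
            Files.copy(fileToCompress, gzipOutputStream);
        }
    }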
To test this, we simply call the method, passing the file we want to compress and the destination GZIP file. The code below compresses the pom.xml file in our project directory.
    Path fileToCompress = Paths.get("pom.xml");
    Path outputFile = Paths.get("pom.xml.gzip");
    compressGZip(fileToCompress, outputFile);
This creates a new file, pom.xml.gzip, in our project directory.
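A quick way to confirm that the new file really contains GZIP data is to look at its first two bytes, which are 0x1f and 0x8b for this format. The following is only a sketch: the class GzipFileCheck and its method names are made up for this example, and it works on temporary files with filler data rather than on pom.xml.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of the tutorial's project.
final class GzipFileCheck {

    // Compresses source into target using the JDK's GZIPOutputStream.
    static void compress(Path source, Path target) throws IOException {
        try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(target))) {
            Files.copy(source, out);
        }
    }

    // Returns true if the file starts with the GZIP magic bytes 0x1f 0x8b.
    static boolean looksLikeGzip(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            int first = in.read();
            int second = in.read();
            return first == 0x1f && second == 0x8b;
        }
    }

    public static void main(String[] args) throws IOException {
        // Filler data: 10,000 identical bytes, which compress very well.
        byte[] data = new byte[10_000];
        Arrays.fill(data, (byte) 'a');

        Path source = Files.createTempFile("sample", ".txt");
        Path compressed = Files.createTempFile("sample", ".gz");
        Files.write(source, data);
        compress(source, compressed);

        System.out.println("Looks like GZIP: " + looksLikeGzip(compressed));
        System.out.println("Original size:   " + Files.size(source));
        System.out.println("Compressed size: " + Files.size(compressed));
    }
}
```

Comparing the two sizes also gives a rough idea of how well a particular file compresses; highly repetitive input like the filler data here shrinks far more than typical files do.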
To decompress, we wrap the compressed file in a GZIPInputStream and copy its contents to the output file.
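    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.zip.GZIPInputStream;

    public static void decompressGZip(Path fileToDecompress, Path outputFile) throws IOException {
        try (GZIPInputStream gzipInputStream = new GZIPInputStream(Files.newInputStream(fileToDecompress))) {
            // Throws FileAlreadyExistsException if outputFile already exists
            Files.copy(gzipInputStream, outputFile);
        }
    }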
To test this, we run the code below to decompress the GZIP file we created earlier.
    Path output = Paths.get("pom2.xml");
    Path input = Paths.get("pom.xml.gzip");
    decompressGZip(input, output);
This decompresses the pom.xml.gzip file into pom2.xml.
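To check that nothing was lost along the way, we can compress a file, decompress it again, and compare the result with the original byte for byte. The sketch below does this with temporary files. The class GzipRoundTrip and its method names are made up for this example, and it assumes that overwriting the target is acceptable, which is why it passes StandardCopyOption.REPLACE_EXISTING to Files.copy.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.Arrays;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class, not part of the tutorial's project.
final class GzipRoundTrip {

    static void compress(Path source, Path target) throws IOException {
        try (OutputStream out = new GZIPOutputStream(Files.newOutputStream(target))) {
            Files.copy(source, out);
        }
    }

    static void decompress(Path source, Path target) throws IOException {
        try (InputStream in = new GZIPInputStream(Files.newInputStream(source))) {
            // Assumption: replacing an existing target file is what we want here.
            Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    // Compresses and decompresses the file, then compares the bytes with the original.
    static boolean roundTripMatches(Path original) throws IOException {
        Path compressed = Files.createTempFile("roundtrip", ".gz");
        Path restored = Files.createTempFile("roundtrip", ".out");
        try {
            compress(original, compressed);
            decompress(compressed, restored);
            return Arrays.equals(Files.readAllBytes(original), Files.readAllBytes(restored));
        } finally {
            Files.deleteIfExists(compressed);
            Files.deleteIfExists(restored);
        }
    }

    public static void main(String[] args) throws IOException {
        // Defaults to pom.xml in the working directory if no argument is given.
        Path original = Paths.get(args.length > 0 ? args[0] : "pom.xml");
        System.out.println("Round trip matches: " + roundTripMatches(original));
    }
}
```

Reading both files fully into memory keeps the comparison short, but it is only suitable for small files such as a pom.xml.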
Since GZIP compresses a single stream of data, we cannot simply add several files to a GZIP file. We first need to create a tarball that contains the files we want to compress.
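To see why a plain GZIP stream is not enough for several files, consider what happens when we write the contents of two files into one GZIPOutputStream: after decompression we get back a single run of bytes, with no file names and no boundary between the two. The sketch below shows this in memory. The class GzipSingleStream and the sample strings are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical demo class, not part of the tutorial's project.
final class GzipSingleStream {

    // Writes all chunks, one after another, into a single GZIP stream.
    static byte[] gzip(byte[]... chunks) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
            for (byte[] chunk : chunks) {
                out.write(chunk);
            }
        }
        return buffer.toByteArray();
    }

    // Decompresses a GZIP byte array back into the raw bytes.
    static byte[] gunzip(byte[] compressed) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] block = new byte[8192];
            int read;
            while ((read = in.read(block)) != -1) {
                buffer.write(block, 0, read);
            }
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] first = "contents of a.txt\n".getBytes(StandardCharsets.UTF_8);
        byte[] second = "contents of b.txt\n".getBytes(StandardCharsets.UTF_8);

        byte[] restored = gunzip(gzip(first, second));

        // The two inputs come back as one block; nothing records where one ends.
        System.out.print(new String(restored, StandardCharsets.UTF_8));
    }
}
```

A tar archive fills this gap by storing a header with the name and size of each file ahead of its contents, which is why the two formats are usually combined.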
The JDK doesn’t include an API for creating .tar files, so we will use the Apache Commons Compress library in our project.
To start, add the dependency to your project. If you are using Maven, add the following to your pom.xml file:
    <dependency>
        <groupId>org.apache.commons</groupId>
        <artifactId>commons-compress</artifactId>
        <version>1.20</version>
    </dependency>
Here is sample code that uses the library to create the tarball and compress it with GZIP.
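    import java.io.IOException;
    import java.io.OutputStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
    import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;
    import org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream;

    public void compressTarGzip(Path outputFile, Path... inputFiles) throws IOException {
        try (OutputStream outputStream = Files.newOutputStream(outputFile);
             GzipCompressorOutputStream gzipOut = new GzipCompressorOutputStream(outputStream);
             TarArchiveOutputStream tarOut = new TarArchiveOutputStream(gzipOut)) {
            for (Path inputFile : inputFiles) {
                // The entry name is derived from the path of the input file
                TarArchiveEntry entry = new TarArchiveEntry(inputFile.toFile());
                tarOut.putArchiveEntry(entry);
                Files.copy(inputFile, tarOut);
                tarOut.closeArchiveEntry();
            }
            tarOut.finish();
        }
    }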
To test this, we pass the output path for our .tar.gz file, followed by the paths of the files we want to compress.
    Path tarGZOut = Paths.get("tarball.tar.gz");
    compressTarGzip(tarGZOut, Paths.get("pom.xml"), Paths.get("compress-file-example.iml"));
In the code above, we compress both the pom.xml file and the .iml file in our project directory. Running this code creates the tarball.tar.gz file.
To decompress a .tar.gz file that contains multiple files, we iterate over each archive entry in the tarball and save it to our destination directory. Here’s how you can do it:
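    import java.io.IOException;
    import java.io.InputStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import org.apache.commons.compress.archivers.ArchiveEntry;
    import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
    import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;

    public void decompressTarGzip(Path fileToDecompress, Path outputDir) throws IOException {
        try (InputStream inputStream = Files.newInputStream(fileToDecompress);
             GzipCompressorInputStream gzipIn = new GzipCompressorInputStream(inputStream);
             TarArchiveInputStream tarIn = new TarArchiveInputStream(gzipIn)) {
            ArchiveEntry entry;
            while ((entry = tarIn.getNextEntry()) != null) {
                Path outputFile = outputDir.resolve(entry.getName()).normalize();
                // Skip directory entries and refuse names that would escape outputDir
                if (entry.isDirectory()) continue;
                if (!outputFile.startsWith(outputDir.normalize())) {
                    throw new IOException("Bad entry: " + entry.getName());
                }
                Files.createDirectories(outputFile.getParent());
                Files.copy(tarIn, outputFile); // save to output directory
            }
        }
    }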
For example, to decompress the tarball.tar.gz file we created earlier, we can use the code below. Note that I first create the folder in which the extracted files will be saved.
    Files.createDirectory(Paths.get("tar"));
    decompressTarGzip(Paths.get("tarball.tar.gz"), Paths.get("tar"));
This creates a folder named tar inside our project directory, containing the files that were extracted.
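Two small things are worth knowing when running this more than once. Files.createDirectory throws a FileAlreadyExistsException if the folder is already there, whereas Files.createDirectories does nothing in that case. We can also list the folder's contents from code instead of checking it by hand. The sketch below shows both. The class ExtractedFilesListing and its method name are made up for this example, and the folder name tar is only the default taken from the snippet above.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class, not part of the tutorial's project.
final class ExtractedFilesListing {

    // Returns the regular files under dir, as sorted paths relative to dir.
    static List<String> listRelativePaths(Path dir) throws IOException {
        try (Stream<Path> paths = Files.walk(dir)) {
            return paths.filter(Files::isRegularFile)
                    .map(path -> dir.relativize(path).toString())
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // createDirectories does not fail if the folder already exists.
        Path outputDir = Files.createDirectories(Paths.get(args.length > 0 ? args[0] : "tar"));

        // Extraction would happen here; afterwards we print what is in the folder.
        listRelativePaths(outputDir).forEach(System.out::println);
    }
}
```

Note that extracting into a folder that still holds files from an earlier run can fail as well, because Files.copy refuses to overwrite an existing file unless it is given StandardCopyOption.REPLACE_EXISTING.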
That concludes our tutorial on compressing files with GZIP in Java. The next tutorial covers how to compress files in Zip format in Java.