In 1991, Phil Katz introduced a new compression algorithm into his shareware PKZIP utility. The algorithm was called DEFLATE, and it outperformed everything else available at the time. The compression capability of DEFLATE led directly to the widespread dominance of PKZIP's ZIP file format. With the arrival of DEFLATE, many earlier compression methods fell into disuse or survived only as internal components of established file formats. Formats such as ARC and LHA quickly faded into obscurity and obsolescence. The CAB file survived - and continues to survive, regrettably - only as a resource file for Windows drivers. Almost every other archive compressor of the 1980s and 1990s was displaced by the superior technology of ZIP.
More than three decades later, DEFLATE remains the most frequently encountered means of losslessly compressing data. It is found not only in ZIP but in PNG, EPUB, CBZ, DOCX and JAR, in the gzip utility common to any Unix environment, in the WOFF font format, as a common compression method within PDF files, as a transparent compression option in the BTRFS and ZFS filesystems, and in compressed HTTP traffic. Those three decades and counting have seen the introduction of BZIP2, LZMA, PPMd, zstd, Brotli and a host of more exotic compression methods, and a problem has become apparent: the problem of 'good enough.'
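That ubiquity is easy to see in code. Here is a minimal sketch using only Python's standard library (corpus.txt is a stand-in for any file): the gzip file, the zlib stream used by PNG and HTTP, and the raw payload of a ZIP entry all carry the same DEFLATE bitstream, differing only by a few bytes of wrapper.

```python
import gzip
import zlib

data = open("corpus.txt", "rb").read()  # any sample file

# Raw DEFLATE, as stored inside ZIP entries (negative wbits = no wrapper).
co = zlib.compressobj(9, zlib.DEFLATED, -15)
raw_deflate = co.compress(data) + co.flush()

# The zlib wrapper (PNG, HTTP 'deflate') adds a 2-byte header and an
# Adler-32 checksum around that same bitstream.
zlib_stream = zlib.compress(data, 9)

# The gzip wrapper (.gz files, HTTP 'gzip') adds a 10-byte header and a
# CRC-32 trailer around it instead.
gzip_stream = gzip.compress(data, 9)

print(len(raw_deflate), len(zlib_stream), len(gzip_stream))
```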
When a technology is 'good enough', it acts to inhibit further progress in the field. There is little reason to adopt a new and improved technology that is not widely supported, and no reason to support something which is seldom used. To justify the trouble of compatibility-wrangling, a new technology must offer a compelling advantage. In this way a technology which is Good Enough becomes a barrier to further improvement. Although new advances continue to flow from hard-working mathematicians and developers, those advances remain obscure and unused. This is why DEFLATE remains the most commonly used of all general-purpose lossless compression algorithms, even though alternatives exist that outperform it with ease in almost any use case: many achieve significantly higher compression ratios, while others compress slightly better at far higher speed. This inability of better technology to displace the good-enough has led to a sort of technological stasis in some fields.
For another example, take the common MP3 file - a prime example of a 'good enough' technology. The MP3 format was introduced in 1993. For most of that decade it remained fairly obscure: few people had the hardware for digital music, and no means existed to obtain MP3 files beyond certain underground forums on the then-new internet. That changed with the sudden arrival of Napster in 1999, and MP3 instantly became the dominant means of distributing digital audio. For many people, 'MP3' simply means music: when they want to download (often illegally) a song, they head to a search engine and type the title plus 'MP3', or they find the track on YouTube and search for a 'youtube to MP3 downloader.' Yet the MP3 format's performance - in terms of bitrate versus quality - is far inferior to more modern codecs. Many formats have tried to dethrone MP3 as the dominant standard for digital audio, but MP3 remains stubbornly common. Some of the challengers are open standards, such as Vorbis and Opus; others are proprietary, such as AC-3, ATRAC and WMA. None has achieved anything like the success of MP3, though all of them can provide better quality audio in the same size of file, or a smaller file at the same quality, and most offer superior metadata handling. This means that, for many users, audio compression technology has barely advanced since 1993. Some improvements have been made to MP3 encoding itself, but the 'good enough' format has inhibited the widespread adoption of better alternatives.
The modern Opus codec is generally regarded as matching the quality of an MP3 of twice the bitrate. To anyone with a large library of music or audiobooks, this matters.
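To put rough numbers on that (back-of-the-envelope arithmetic with illustrative figures, not a benchmark):

```python
# Storage for a hypothetical 200-hour audiobook library, with Opus encoded
# at roughly half the MP3 bitrate for comparable quality.
hours = 200
mp3_kbps, opus_kbps = 128, 64

def size_gb(kbps: float, hours: float) -> float:
    # kilobits/second -> gigabytes: kbps * seconds / 8 bits / 1e6
    return kbps * hours * 3600 / 8 / 1e6

print(f"MP3  @ {mp3_kbps} kbps: {size_gb(mp3_kbps, hours):.1f} GB")   # ~11.5 GB
print(f"Opus @ {opus_kbps} kbps: {size_gb(opus_kbps, hours):.1f} GB")  # ~5.8 GB
```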
This is a pattern that recurs again and again in the field of data compression. With storage and bandwidth costs constantly falling, there has been little motivation to adopt better technologies, allowing old ones to entrench and remain in use decades after they should have been consigned to the legacy bin.
The story of MP3's endurance is paralleled by that of JPEG. The format was introduced in 1992 and remains by far the most popular format for lossy-compressed images. Superior technologies have repeatedly attempted to displace it. None has succeeded.
The attempted replacements include JPEG2000, AVIF, HEIC and JPEG-XL. All of them out-compress JPEG, matching its quality in a smaller file, yet none has achieved anywhere close to JPEG's universal support. JPEG2000 died in the 2000s through a combination of patent concerns and low adoption. The remaining three are caught in a format war, backed by rival consortiums waving threats of patents or promises of sanctuary. Even with the might of Google, Apple and the JPEG committee backing AVIF, HEIC and JPEG-XL respectively, none has made more than the slightest progress against the unassailable rule of a format released when dial-up modems topped out at 14.4kbps.
A quirk of history lurks within the JPEG specification. When the format was first introduced, it supported two methods for the final entropy-coding stage: Huffman coding and arithmetic coding. Arithmetic coding offered superior compression, by around ten percent, but its processing demands were excessive for a 1992 home computer, and the technique was covered by a patent owned by IBM, who refused to license it on royalty-free terms. Thus the less efficient but fast and unpatented Huffman coding was also supported. Because of the patent, almost no JPEG decoders implemented arithmetic coding - and once the patent expired, no software was written to create arithmetic-coded JPEGs, because the results would not have been viewable in existing software. Thanks to this chicken-and-egg problem, arithmetic coding remains an unused corner of the JPEG specification - leaving only the strange quirk that it is possible to losslessly shrink almost any JPEG by somewhere between 5% and 20%, by converting it into a perfectly valid, specification-compliant JPEG that almost no software can open.
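You can try this yourself with jpegtran, libjpeg's lossless transcoder. A sketch, assuming a jpegtran build with arithmetic coding enabled and a hypothetical photo.jpg:

```python
import subprocess
from pathlib import Path

def to_arithmetic(src: str, dst: str) -> float:
    """Losslessly re-encode a JPEG with arithmetic coding; return size ratio."""
    subprocess.run(
        ["jpegtran", "-arithmetic", "-copy", "all", "-outfile", dst, src],
        check=True,
    )
    return Path(dst).stat().st_size / Path(src).stat().st_size

ratio = to_arithmetic("photo.jpg", "photo-arith.jpg")
print(f"arithmetic-coded copy is {ratio:.0%} of the original size")
# The image data is unchanged; only the entropy coding differs - and most
# viewers will refuse to open the result.
```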
A similar legacy lies within ZIP files. When Phil Katz designed the format, he anticipated advances in compression technology and made plans to accommodate whatever the future might bring. To this end, ZIP was designed to allow a choice from a variety of compression methods - including those not yet invented. One of the first to be added in this way was Katz's own DEFLATE, which led to the rapid rise of the ZIP format. BZIP2 was later added to the supported list, followed by LZMA and PPMd. But these three came too late: by then, ZIP was already in widespread use. It was seldom practical to use the higher-compression methods, because the resulting files would be unreadable to anyone without one of the rare decompression programs that supported them - and it was never worthwhile for a decompressor to implement an algorithm which no real-world ZIP file would ever employ. Thus the chicken-and-egg problem arises once more. DEFLATE remains the only compression method in ZIP files that can be confidently read by all software. Many formats which use ZIP as a container, such as EPUB, explicitly forbid any compression algorithm other than DEFLATE or the identity-transform 'store' method. Even today, the ZIP extraction capability built into Microsoft Windows supports DEFLATE and no other compression method.
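Python's standard zipfile module can write these later method IDs (store=0, deflate=8, bzip2=12, lzma=14), which makes the problem easy to demonstrate - a sketch, with corpus.txt standing in for any file:

```python
import os
import zipfile

data = open("corpus.txt", "rb").read()

for name, method in [("deflate.zip", zipfile.ZIP_DEFLATED),
                     ("bzip2.zip",   zipfile.ZIP_BZIP2),
                     ("lzma.zip",    zipfile.ZIP_LZMA)]:
    with zipfile.ZipFile(name, "w", compression=method) as zf:
        zf.writestr("corpus.txt", data)
    print(f"{name}: {os.path.getsize(name)} bytes")
```

All three archives are perfectly valid ZIP files, and the last two are usually smaller - but try opening them in Windows Explorer and only deflate.zip will extract.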
The JPEG2000 image format was finalised and published in 2000 by the same organisation responsible for the original JPEG. Its wavelet-based compression easily outperformed JPEG, delivering better image quality at equal size or equal quality in a smaller file. For a time the format was widely regarded as the obvious successor to JPEG, but it never achieved widespread adoption. A large part of this was concern over software patents, but like every other would-be JPEG replacement it faced the same compatibility problem: few people or software publishers wished to endure the trouble of incompatibility just to achieve smaller files. Despite JPEG2000's apparent failure, though, it does remain widely used in one area: it is a supported image compression method within PDF files, and for this reason PDF-viewing software still calls upon a JPEG2000 decoding library.
Some success has been had. The venerable GIF format is, very slowly, disappearing. This image format from 1987 is responsible for bringing animation to the internet. By limiting images to an 8-bit palette combined with the then-cutting-edge LZW compression scheme, it could shrink images to something small enough for even dialup internet users to accept - and its transparency support was highly valued by early web designers. When the format was extended in 1989 to support animations, it exposed a generation of internet users to the obnoxious dancing graphics that became a hallmark of 90s web design. Problems soon arose with GIF, though: images are limited to a 256-color palette, so gradients and photographs almost always appear degraded, and the simple LZW compression without a predictive element - revolutionary at its introduction - was very rapidly superseded. An obvious successor arrived in 1996: PNG, a format specifically designed to replace GIF. Featuring 24-bit color and a vastly superior compression system based on per-line predictive filtering followed by DEFLATE, PNG was superior to GIF in every way except animation. Even so, after more than three decades, the decrepit GIF format is still widely used - though not as widely as it once was.
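The value of that predictive filtering is easy to show. Here is a toy version of PNG's 'Sub' filter (each byte stored as its difference from its left neighbour) applied to a synthetic 'photographic' scanline - an illustration of the principle, not a real PNG encoder:

```python
import random
import zlib

random.seed(0)

# One synthetic scanline: a smooth ramp with a little sensor noise,
# the kind of data plain LZ compression handles poorly.
width = 4096
pixels = [min(255, max(0, x // 16 + random.randint(-2, 2))) for x in range(width)]
raw = bytes(pixels)

# The 'Sub' filter turns the slow ramp into a stream of tiny residuals,
# which DEFLATE compresses far more effectively.
filtered = bytes([pixels[0]] + [(pixels[i] - pixels[i - 1]) % 256
                                for i in range(1, width)])

print("unfiltered:", len(zlib.compress(raw, 9)), "bytes")
print("filtered:  ", len(zlib.compress(filtered, 9)), "bytes")
```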
GIF has endured so long that it has even shaped language: any short animated image is now commonly called a 'gif,' even when the data inside is PNG, MP4 or WebP. And just as GIF has outstayed its welcome, PNG's turn may be coming: the WebP format's lossless mode offers many advantages over PNG, including superior compression on most images, yet it still struggles to displace the good-enough, widely supported incumbent.
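For anyone willing to migrate, the conversion is nearly a one-liner - a sketch using Pillow, assuming a build with WebP support and a hypothetical diagram.png:

```python
import os

from PIL import Image  # Pillow

src, dst = "diagram.png", "diagram.webp"
# lossless=True keeps the pixels bit-identical; method=6 is the slowest,
# most thorough encoder setting.
Image.open(src).save(dst, lossless=True, quality=100, method=6)
print(os.path.getsize(src), "->", os.path.getsize(dst), "bytes")
```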
One community which does care about compression performance, and is happy to adopt cutting-edge technology, is the world of pirates. The combination of technically skilled users and a pressing need to move huge amounts of data on a budget of almost zero created the perfect environment for innovation - making pirates among the first groups to abandon ZIP and its ageing DEFLATE compression in favor of Eugene Roshal's RAR format and its support (since WinRAR 3.0) for PPMd compression. PPMd was, and remains, one of the most capable compression algorithms to date, especially on text-heavy content. Despite criticism of the closed nature of the RAR specification and software, it became the de facto standard for software piracy. The format saw little adoption elsewhere: the older ZIP format was good enough, and more widely supported.
There is, as yet, no true successor to RAR. The obvious contender is 7-Zip, which can provide marginally better compression on maximum settings. Thanks to that edge, combined with an entirely open specification and reference implementation, 7-Zip is very gradually displacing RAR. Where RAR pairs its own LZ-based algorithm with the highly capable PPMd, the 7z format supports both PPMd and LZMA, allowing users to select whichever performs better on their data.
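The gap these methods open over DEFLATE is easy to measure with the standard library alone - a sketch in which bz2 and LZMA stand in for the modern methods (Python ships no PPMd binding), with corpus.txt as any large text file:

```python
import bz2
import lzma
import zlib

text = open("corpus.txt", "rb").read()

for name, compress in [("deflate", lambda d: zlib.compress(d, 9)),
                       ("bzip2",   lambda d: bz2.compress(d, 9)),
                       ("lzma",    lambda d: lzma.compress(d, preset=9))]:
    ratio = len(compress(text)) / len(text)
    print(f"{name:8}{ratio:7.1%}")  # compressed size as a share of the original
```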
Another field which has embraced advancements readily is video compression, where the inconveniently large size of typical files makes even a marginal improvement in compression performance highly valued. While image, audio and general-purpose lossless compression have advanced only slowly in practice, video has raced ahead over the decades, with new and better codecs entering use as soon as they become available.
These outdated technologies have a cost. It comes in the time and power needed to push files across the internet that are larger than they need to be, and in the hardware to store them. It comes in the extra code that software such as web browsers must carry to support new technologies while retaining all of the old. It comes in slower downloads, and in fewer audiobooks fitting on a phone.
So I would issue a call to the world: it is time to start phasing out these old formats in earnest. Many were excellent in their day, and some changed the world - but that was another era. It is time to place them in the museum, to be admired as the legacy upon which our present was built. We should be creating Opus files instead of MP3, converting our GIFs and PNGs to WebP, and abandoning RAR and ZIP in favor of 7z. Ensure your web server supports Brotli. Abandon the legacy technology that holds back greater things, and help to build a faster, more efficient internet.
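On that last point, a quick way to see what Brotli buys over gzip for a typical page - a sketch using the third-party brotli package from PyPI, with index.html as a stand-in asset:

```python
import gzip

import brotli  # pip install brotli

html = open("index.html", "rb").read()
print("gzip:  ", len(gzip.compress(html, 9)), "bytes")
print("brotli:", len(brotli.compress(html, quality=11)), "bytes")
```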