We use it every day without realizing it. From the latest rap album to that new film everyone’s talking about (which you may have downloaded through questionable means, but hey, I’m not judging), digital compression is everywhere. We take these mathemagical, file-shrinking formulas for granted. For the average user, it might mean saving a few hundred megabytes of hard drive space, but for large corporations housing massive databases of user-uploaded information, it can mean a difference to the tune of several million dollars. From Morse code using shorter dits and dahs for common letters like “e” and “a”, all the way to advanced motion-compensation-based video codecs like H.265 and VP9, data compression has indeed come a long way.

However, like all good things in life, data compression comes at a cost. First, it requires processing power to encode and decode compressed data. Every time you open a JPEG file, your computer isn’t just opening an image file; it’s rebuilding the image using fancy math like the discrete cosine transform (DCT) and entropy coding. Second, some schemes throw data away entirely (lossy compression). In lossy formats like JPEG and MP3, there is an adjustable trade-off between file size and quality. We’re going to be focusing on the latter.
Anyone who owns a site or has Big Data resting on their servers knows it’s efficient to use data compression schemes to save hard drive space, speed up website load times, and save on bandwidth. It’s the reason why JPEG images are much more common on the internet than PNGs, for example. It’s also the reason why pictures on Facebook look like they’ve been taken with a potato from 1995.
It’s been a constant displeasure to be presented with the artifact-ridden, downscaled, potato-like quality of my images after uploading them to Facebook. Unable to defeat the wrath of Facebook’s ruthless JPEG compression, I eventually succumbed to the merciless compression engine. In fact, Facebook’s 1995 image filter/simulator is what prompted me to write this article.
Facebook’s JPEG compression, coupled with content freebooters who have limited knowledge of (or concern for) image quality screenshotting and re-uploading pre-compressed images, makes the issue worse with every generation.
Facebook is by no means the only website guilty of overly lossy encodes. Other major sites such as Imgur and Twitter employ image compression as well, although not as aggressively as Facebook.
The purpose of this article is to explore the image compression quality of various websites, including any resolution downscaling, and to provide a comprehensive analysis of the extent of the quality loss (if there is any compression involved at all). This information can help users better select the websites they upload their images to. You can also benefit from this article by learning the basics of JPEG compression and how to detect the degree of compression in a JPEG image.
When you create a JPEG in Photoshop, GIMP, or whatever, you are presented with the option of choosing the quality level of the exported image. Usually it’s a number between 1-10 or 1-100, or a percentage. The problem is that these arbitrary quality numbers differ from program to program: 60% quality in Photoshop is not the same as 60% quality in GIMP. So how can we figure out how much an image was actually compressed? We can compare the file sizes of the original and compressed images, and we will be doing that too. Another option is to download a program that analyzes the image and outputs a number or percentage representing the estimated quality level, but an estimate will not be enough for our needs. A better option is to extract the quantization tables from the image. JPEG uses quantization tables to lossily compress image data. A quantization table is essentially a table of numbers used to divide its respective Discrete Cosine Transform (DCT) coefficient matrix in order to reduce or eliminate high-frequency information and shrink the file size. JPEGs contain two quantization tables, one for luminance (brightness) and one for chrominance (color). If none of that makes any sense to you, don’t worry. For the sake of this article, all you need to know is that the lower the numbers in the tables, the better the image quality. Here’s what the quantization tables look like:
The top table is luminance, the bottom is chrominance. The image shown at the top of this article has values of 255 throughout the two tables.
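DJPEG isn’t the only way to see these tables. If you have Python handy, Pillow exposes a JPEG’s quantization tables directly. This is a minimal sketch that generates its own throwaway JPEGs rather than assuming any particular file; the image contents and quality values are purely illustrative. For an existing file, `Image.open("photo.jpg").quantization` returns the same dict.

```python
from PIL import Image
import io

def quant_tables(quality):
    """Round-trip a throwaway image through JPEG at the given quality and
    read back its quantization tables. Pillow exposes them as a dict:
    key 0 is the luminance table, key 1 the chrominance table (64 values
    each; depending on the Pillow version they may be listed in zigzag
    rather than row order)."""
    img = Image.new("RGB", (64, 64), (120, 80, 40))
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf).quantization

# Lower quality -> larger divisors -> more DCT information thrown away.
for q in (90, 10):
    lum = quant_tables(q)[0]
    print(f"quality {q}: mean luminance divisor = {sum(lum) / 64:.1f}")
```

The printout makes the rule from above concrete: the low-quality table’s divisors are far larger than the high-quality table’s.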
The quantization tables can only tell us so much. For example, they don’t tell us whether the image has already been compressed before. An image that’s been compressed five times will still show low quantization numbers if the last compression was high quality, but it may still end up looking like complete shit. Another limitation is that these tables describe the compression, and the compression only. The image may have had low spatial resolution, noise, banding, or other artifacts present in the first place.
Image subtraction is literally what its name implies. You take one image, and subtract it from another. The result is the difference between the two images. This can be helpful in observing the differences between the compressed and original image. Here’s an example of two images being subtracted, and their result:
The second image is the first image compressed at 10% quality. The third image shows the areas with the most compression loss, represented by brightness: it lets you see what data was lost during the compression process.
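If you don’t have MATLAB, the same subtraction can be done with Pillow’s `ImageChops.difference`, which computes the absolute per-pixel difference. This sketch builds its own test image and a low-quality JPEG copy so it runs standalone; with real files you’d simply `Image.open()` the original and the compressed version instead.

```python
from PIL import Image, ImageChops
import io

# Stand-in for a real photo: Pillow's built-in 256x256 radial gradient.
orig = Image.radial_gradient("L").convert("RGB")

# Round-trip through heavy JPEG compression, like the 10% example above.
buf = io.BytesIO()
orig.save(buf, format="JPEG", quality=10)
buf.seek(0)
comp = Image.open(buf).convert("RGB")

# |original - compressed|, channel by channel; bright pixels = more loss.
diff = ImageChops.difference(orig, comp)

# The raw difference is nearly black, so amplify it for viewing
# (the same reason the GIMP brightness/contrast boost is applied later).
diff.point(lambda v: min(255, v * 8)).save("difference.png")
```

The factor of 8 is an arbitrary viewing aid; any scaling that keeps the bright spots distinguishable works.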
I will be choosing the following RAW image I took with my Galaxy S6 to upload to various websites:
This image is compressed, obviously (I’m not going to upload a 30MB raw image on this site) but I’ve linked the original images at the bottom of this article, if you’re interested.
There are several reasons why I chose this image:
- It’s huge (5328 x 3000). This massive resolution will test which sites will accept the entire image without downscaling the resolution.
- It’s uncompressed RAW. No point in double compressing a compressed file.
- It’s complex. There are many curves, edges, shapes, and colors to be lost during the encoding.
- It’s one of those rare half-decent pictures I’ve taken. I’m by no means an avid photographer.
I’ve converted the image to PNG in order to compress it losslessly and increase compatibility with websites. I don’t know of many sites that accept .DNG files; not even the gallery app on my phone can open them.
I’ve also made three copies of the image. One is the original 16MP file, and the other two are down-scaled versions in 4K and 1080p respectively. (27.4MB, 17.5MB, 4.36MB)
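The downscaling itself was done in GIMP, but for reproducibility, here’s roughly the equivalent step in Python with Pillow. The blank stand-in image and the 3840/1920 target widths are my assumptions based on the sizes mentioned above, not taken from the article’s exact workflow.

```python
from PIL import Image

def downscale(img, width):
    """Resize to the given width, keeping the aspect ratio
    (Lanczos filter, comparable to GIMP's high-quality scaling)."""
    height = round(img.height * width / img.width)
    return img.resize((width, height), Image.LANCZOS)

# Blank stand-in at the article's original 5328 x 3000 resolution.
original = Image.new("RGB", (5328, 3000))
for label, width in (("4k", 3840), ("1080p", 1920)):
    downscale(original, width).save(f"{label}.png")
```

PNG output keeps the downscaled copies lossless, so any compression seen later comes from the websites, not from this step.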
Here’s a list of tools used, and what they’re used for:
- A computer with an internet connection (who would’ve guessed?)
- GIMP, used for downscaling the image and converting to PNG
- DJPEG, used to extract the quantization tables
- MATLAB, used for performing image subtraction
- Firefox, used to download/upload images and extract the link to the original image from the page source
All three images will be uploaded to various websites. The results will be organized in a table as follows:
Cut To The Chase
Alright, enough metadata. Time to actually begin the tests. Let’s start with Facebook.
*Keep in mind these tests were performed with PNG images only. Different formats may yield different results.
| | 16MP | 4K | 1080p |
| --- | --- | --- | --- |
| Resolution | 2048 x 1153 | 2048 x 1153 | Same |
16MP image quantization table:
4K image quantization table:
1080p Image quantization table:
The quantization tables for the three images uploaded to Facebook are fairly consistent. The 4K and 1080p tables are identical, while the 16MP table is slightly better (lower values).
This is the difference between the original 1080p image and the Facebook encode. Subtraction was performed with “imsubtract(x,y)” in MATLAB. The resulting image’s brightness and contrast were increased by 127 and 123 respectively in GIMP, because the original was too dark to view.
| | 16MP | 4K | 1080p |
| --- | --- | --- | --- |
| Resolution | 2047 x 1153 | 2048 x 1152 | Same |
For some odd reason, Twitter shaved one pixel of width off the 16MP image and one pixel of height off the 4K image.
The quantization tables for all three images are identical on Twitter, and they indicate that Twitter’s JPEG quality is somewhat better than Facebook’s.
Image subtraction between original 1080p and Twitter’s version: (same brightness and contrast settings)
*For Facebook and Twitter, only the 1080p subtraction could be performed, due to the different resolutions of the other two images. Upscaling the compressed image or downscaling the original to match would skew the results, so it is better left undone.
Imgur is unique in that it stores images under 10MB losslessly; anything above that is JPEG-compressed. One interesting thing to note is that while Imgur employs compression, it retains the original resolution of the image.
Here is the quantization table for the 4K and 1080p images: (They share the same values)
Imgur’s quality level is very similar to Facebook’s. However, since Imgur retains the original resolution, that makes Imgur superior to Facebook in terms of overall image quality.
Here is the subtraction between the original 4K image and Imgur’s 4K version:
| | 16MP | 4K | 1080p |
| --- | --- | --- | --- |
| File size | Doesn’t matter | Doesn’t matter | Doesn’t matter |
| Resolution | 1024 x 577 | 1024 x 577 | 1024 x 577 |
Photobucket has the largest resolution downscale on this list so far. It doesn’t compress the images; instead, it downscales them to 1024 x 577 (0.6MP). I can’t speak for all uploads, but mine were downscaled to that extent without any additional compression. I used the “download” button to save the image, so I can’t say whether a direct link serves anything larger, but I was unable to find any higher-resolution variants while digging through the page source.
Tumblr has a 10MB limit on image uploads, so I was unable to upload the 4K and 16MP images. Otherwise, it doesn’t compress the images.
Pinterest refused to upload the 4K and 16MP images due to its file size restrictions.
Reddit has an upload limit of 20MB. No compression or other modification is added on their part.
From what I’ve tested, Google, LinkedIn, and Flickr all retain the three images exactly as uploaded: not downscaled, compressed, or converted in any form, and still in their original PNG format.
Suggestions for improvement
I understand space is a huge concern for websites, as storage is expensive. What I can suggest is for more sites to follow Imgur’s philosophy: let users upload lossless images under a certain file size threshold, and compress only the images that exceed it. That way, images that are already efficiently compressed without loss, such as graphic artwork and low-resolution images, wouldn’t have to undergo additional compression.
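That policy fits in a few lines. The sketch below is purely hypothetical: the 10MB threshold echoes Imgur’s limit, and the 75% JPEG quality is an arbitrary stand-in for whatever a real site would choose.

```python
from PIL import Image
import io

LOSSLESS_LIMIT = 10 * 1024 * 1024  # hypothetical threshold, like Imgur's 10MB

def process_upload(data: bytes, limit: int = LOSSLESS_LIMIT) -> bytes:
    """Store small uploads untouched; recompress oversized ones as JPEG."""
    if len(data) <= limit:
        return data  # already small enough: keep it byte-for-byte lossless
    img = Image.open(io.BytesIO(data)).convert("RGB")
    out = io.BytesIO()
    img.save(out, format="JPEG", quality=75)  # quality is a policy choice
    return out.getvalue()
```

Small files pass through untouched; anything over the limit is re-encoded, which also discards transparency and metadata — exactly the kind of detail a production implementation would need to handle more carefully.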
What conclusion can we draw from this?
- Different websites treat your images very differently from one another.
- The maximum file size permitted varies drastically as well.
- Many sites employ no compression at all, which I found to be pleasantly surprising.
- Many websites will downscale your image’s resolution if it is above a certain threshold.
- Choose your image-sharing site carefully if you’re serious about the quality of your images.
If you’re interested in downloading or viewing the images used in this article, here is the link to the folder containing them: