Are tablets up to the task of accurate color testing?

Finally getting around to posting a follow-up to a follow-up to John The Math Guy’s recent series on color gamut size, colorblindness and tablet displays. I thought I might be able to at least shed a little more light on his question about the differences in color accuracy between some of these devices.

In his testing, John found no statistically significant difference in scores among different people taking the EnChroma colorblindness test on different devices. I found this somewhat surprising since, in my experience, even tablets with similar color gamuts tend to show colors with very different levels of accuracy.

iPad mini color gamut and Gretag Macbeth colors against sRGB in CIE1976

To show what I mean by that, I measured how two different tablets show the colors found in the Gretag Macbeth color checker chart.

Nexus 7 color gamut and Gretag Macbeth colors against sRGB in CIE1976

As you can see, the iPad mini and Nexus 7 each produce very different colors, even for those colors that are actually inside their gamuts.

For example, even though the iPad mini has enough gamut coverage to accurately display the Gretag chart’s deepest blue, it cannot do so without distorting the image in another way. This is because of how the underlying content is encoded: most content today uses the sRGB standard. If the iPad were to show that Gretag blue correctly, it would not have enough color saturation headroom left over to show you a different color if a deeper blue, say right at the bottom of the sRGB triangle, were called for.

A good real world example of this can be found in the picture below of my bloodhound, Louisa, racing down the beach at Carmel, CA. The middle of the sky in this image is right on the edge of the iPad’s color gamut, very similar to the Gretag blue in the charts above, while the deepest blues found in the ocean fall outside the iPad’s gamut.

Out of gamut colors at beach

If the iPad were striving for accuracy at all costs, it might map both colors right on top of each other at the edge of the gamut. There’d be no visible difference between the two in this case, and the quality of the image would suffer, but at least the sky would be accurate. In order to avoid this scenario, the designers of these devices have decided to compromise on accuracy so they can show a full range of color differences to the user.

They do this by remapping colors inward, away from the edges of the gamut, effectively compressing the gamut even further so that otherwise out-of-gamut colors can be seen. This is a good solution given the gamut limitations of the device since it results in more pleasing, if less accurate images.
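The trade-off between clipping and remapping can be sketched with a toy one-dimensional model. This is only an illustration under stated assumptions, not how any tablet actually processes color: saturation is treated as a single number where 1.0 is the edge of the device gamut, the source content is assumed to reach 1.25, and the `sky` and `ocean` values and both function names are hypothetical.

```python
def clip_saturation(s, device_max=1.0):
    """Accuracy-first approach: keep in-gamut colors exact, clamp the rest to the edge."""
    return min(s, device_max)


def compress_saturation(s, source_max, device_max=1.0):
    """Remapping approach: scale every color inward so the whole source range fits."""
    return s * device_max / source_max


# Hypothetical saturations: the sky sits right on the device edge, the ocean beyond it.
sky, ocean = 1.0, 1.2
source_max = 1.25  # assumed edge of the source encoding, relative to the device edge

clipped = [clip_saturation(s) for s in (sky, ocean)]
compressed = [compress_saturation(s, source_max) for s in (sky, ocean)]

print("clipped:   ", clipped)     # sky and ocean land on the same value
print("compressed:", compressed)  # both shift inward but stay distinguishable
```

With clipping the sky stays accurate but becomes indistinguishable from the ocean; with compression neither color is accurate, but the difference between them survives.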

As newer devices trend towards wider color gamuts, this kind of compromise should become a thing of the past. In fact, tablet designers may be working on the reverse issue: how to avoid oversaturating images that were encoded for smaller gamuts.

Great, how does this relate to colorblindness again?

iPad mini vs Nexus 7 color accuracy comparison in CIE 1976


Taking another look at the Gretag results from the two devices plotted on top of each other, there clearly are major differences. But, in the reds and greens, two colors associated with a common form of color blindness, the devices are relatively close. So, the simple answer may just be that colorblindness tests do not require pinpoint accuracy to be effective, at least as basic screening tools.

Color Space Confusion

For many who are new to the world of display measurement, the prevalence of two distinct, but often-interchanged color spaces can be a source of confusion. Since my recent post about the color performance of Apple’s new iPad, a number of people have asked about this topic, so I thought it would be worth a closer look.

In the world of displays and color images, there exists a variety of separate standards for mapping color, CIE 1931 and CIE 1976 being the most popular among them. Despite its age, CIE 1931, named for the year of its adoption, remains a well-worn and familiar shorthand throughout the display industry. As a marketer of high color gamut display components, I can tell you from firsthand experience that CIE 1931 is the primary language of our customers. When a customer tells me that their current display “can do 72% of NTSC,” they implicitly mean 72% of NTSC 1953 color gamut as mapped against CIE 1931.
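As a rough illustration of what a “percent of NTSC” figure means, the sketch below computes the area of a gamut triangle in CIE 1931 (x, y) coordinates and divides it by the area of the NTSC 1953 triangle. The primaries are the published chromaticities for NTSC 1953 and sRGB. The function names are illustrative, and the simple area-ratio convention is an assumption about how such figures are usually quoted, not a description of any particular vendor’s method.

```python
# Published CIE 1931 (x, y) chromaticities of the red, green and blue primaries.
NTSC_1953 = [(0.67, 0.33), (0.21, 0.71), (0.14, 0.08)]
SRGB = [(0.64, 0.33), (0.30, 0.60), (0.15, 0.06)]


def triangle_area(primaries):
    """Area of the triangle spanned by three chromaticity points (shoelace formula)."""
    (x1, y1), (x2, y2), (x3, y3) = primaries
    return abs(x1 * (y2 - y3) + x2 * (y3 - y1) + x3 * (y1 - y2)) / 2.0


def percent_of_ntsc(primaries):
    """Gamut area as a percentage of the NTSC 1953 area, measured in the same diagram."""
    return 100.0 * triangle_area(primaries) / triangle_area(NTSC_1953)


print(f"sRGB area is {percent_of_ntsc(SRGB):.1f}% of the NTSC 1953 area in CIE 1931")
```

Note that this is a ratio of areas, not a measure of overlap: a gamut can score well on area while still missing parts of the reference triangle.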

However, from the SID International Committee for Display Metrology’s (ICDM) recent, authoritative Display Measurement Standard:

“…we strongly encourage people to abandon the use of the 1931 CIE color diagram for determining the color gamut… The 1976 CIE (u’,v’) color diagram should be used instead. Unfortunately, many continue to use the (x,y) chromaticity values and the 1931 diagram for gamut areas.”

So why are there two standards, and why are we trying to declare one of them obsolete? Let me explain.

What is a color space?

First, a little background on color spaces and how they work.

While there are a number of different types of color spaces, we are specifically interested in chromaticity diagrams, which only measure color quality, independent of other factors like luminance. A color space is a uniform representation of visible light. It maps all of the colors visible to the human eye onto an x-y grid and assigns them measurable values. This allows us to make uniform measurements and comparisons between colors and, when used to create color gamut standards, offers confidence that images look the same from display to display.

In 1931, the Commission internationale de l’éclairage or CIE (International Commission on Illumination in English) defined the most commonly used color space. Here’s a look at the anatomy of the CIE 1931 color space:

What makes a good color space?

An effective color space should map with reasonable accuracy and consistency to the human perception of color. Content creators want to be sure that the color they see on their display is the same color you see on your display.

This is where the CIE 1931 standard falls apart. From the work of David MacAdam in the 1940s, we learn that the variance in perceived color, when mapped in the CIE 1931 color space, is not uniform from color to color. In other words, if you show a group of people the same green, then map what they see against the CIE 1931 color space, they will report a wide range of different hues of green. However, if you show the same group a blue image, there will be much more agreement on what color blue they are seeing. This unevenness creates problems when trying to make uniform measurements with CIE 1931.

The result of MacAdam’s work is visualized by the MacAdam ellipses. Each ellipse represents the range of colors respondents reported seeing when shown a single color, which was the dot in the center of each ellipse:

A better standard

It was not until 1976 that the CIE was able to settle on a significantly more perceptually uniform color space. If we reproduce MacAdam’s work using the new standard, variations in perceived color are minimized, and the MacAdam ellipses mapped on a 1976 CIE diagram appear much more evenly sized and circular, as opposed to oblong. This makes color comparisons using CIE 1976 significantly more meaningful.

The difference in the CIE 1976 color space, particularly in blue and green, is immediately apparent. As an example, let’s look at the color gamut measurements of the iPad 2 and new iPad we used in an earlier article. Both charts do a reasonably good job of conveying the new iPad’s increased gamut coverage at all three primaries. But the 1976 chart captures the dramatic perceptual difference in blue (from aqua to deep blue) that you actually see when looking at the displays side by side:

The increased gamut of the new iPad is worth testing. Next time you find yourself in an Apple store, grab an iPad 2, hold it alongside a new iPad, Google up a color bar image and see the difference for yourself.

So, why do we still use CIE 1931 at all?  The only real answer is that old habits die hard.  The industry has relied on CIE 1931 since its inception, and change is coming slowly.

Fortunately, CIE 1931’s grip is loosening over time. The ICDM’s new measurement standard should eventually force all remaining stragglers to switch over to the more accurate 1976 standard. Until then, you can familiarize yourself with a decent color space conversion calculator, such as the handy converter we built just for this purpose:
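The conversion itself is small. The following is a minimal sketch of it, not the converter mentioned above: it applies the standard CIE relationship between 1931 (x, y) and 1976 (u’, v’) chromaticity coordinates, with illustrative function names, and uses the D65 white point as a sample input.

```python
def xy_to_uv(x, y):
    """Convert CIE 1931 (x, y) chromaticity to CIE 1976 (u', v')."""
    denom = -2.0 * x + 12.0 * y + 3.0
    return 4.0 * x / denom, 9.0 * y / denom


def uv_to_xy(u, v):
    """Convert CIE 1976 (u', v') chromaticity back to CIE 1931 (x, y)."""
    denom = 6.0 * u - 16.0 * v + 12.0
    return 9.0 * u / denom, 4.0 * v / denom


# D65 white point in CIE 1931 coordinates, used here only as a sample input.
u, v = xy_to_uv(0.3127, 0.3290)
print(f"D65 in CIE 1976: u'={u:.4f}, v'={v:.4f}")

# Converting back should recover the original (x, y) values.
x, y = uv_to_xy(u, v)
print(f"Round trip:      x={x:.4f}, y={y:.4f}")
```

Because the mapping is a projective transformation, straight lines stay straight, so a gamut triangle in one diagram is still a triangle in the other. Only its shape and relative area change.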

Apple’s new iPad display; what does 44% more color get you?

Last Friday Apple released an updated version of one of their hottest products, called simply “the new iPad.” Central to the update is a brand new display featuring significantly more resolution and color saturation. Since the resolution bit has been covered to death by others, and we’re interested in color here, we thought we’d take a closer look at Apple’s color saturation claims.

Our new iPad arrived on Friday and since then we’ve submitted it to several tests using our Photo Research PR 655 Spectroradiometer.

Using the new iPad, particularly next to an “iPad 2,” the reds and greens are noticeably better, but the blues in particular are quite striking. It actually makes the blue on the iPad 2 seem more ‘aqua’ than pure blue. The color data bears this out.  According to our measurements, Apple has significantly increased the saturation in all three primaries, most notably in blue:

The key color claim that Apple made on stage at the iPad announcement was that the new iPad has 44% more color saturation.  What they mean by that of course depends on the context.  There are a couple of different color measurement standards that Apple could be gauging the performance of the new iPad against such as CIE 1931 or CIE 1976.

An easy way to think about these standards is a bit like the temperature measures that we are all familiar with, Celsius and Fahrenheit, in that they are different ways of communicating the same information. Saying, “it’s 5 degrees warmer today” means something very different to users of each system, and it’s much the same way with color spaces, only we’re talking about measuring how the eye perceives color, not how warm it is outside.

We should also note that when people in the display industry talk about color saturation as a percentage, it is common practice to refer to a color gamut standard within a CIE color space. There are many color gamut standards in use today, including NTSC, sRGB, Adobe RGB (1998), DCI-P3, and Rec. 709. Each of these standards is a subset of a CIE color space. They are typically used by content creators to ensure the compatibility of their work from device to device. For example, if I create an image in Adobe RGB, I would like to display it on a screen that can show all of the colors in Adobe RGB in order to make sure it accurately reproduces all the colors in my original shot.

Based on our measurements it looks like Apple is referring to the NTSC gamut within a color space. But which color space do they mean?

A 44% improvement within the CIE 1931 color space would put the new iPad on par with the sRGB standard used by HDTV broadcasts, Blu-ray and much of the web. Given the significance of achieving that standard, some thought Apple must have been trying to say “sRGB” without confusing consumers by describing the meaning of various color standards.

According to our data, this is not the case. The new iPad only manages about 26% more saturation over the iPad 2 when measured against the NTSC gamut in the CIE 1931 color space. However, the unit we measured showed a 48% increase in saturation when measured in the CIE 1976 color space, so that must be Apple’s frame of reference.
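To see why the same two displays can produce two different “percent more color” figures, here is a small sketch that computes the relative gain in gamut area in both diagrams. It assumes the gain is a simple ratio of triangle areas (the NTSC reference area cancels out of that ratio). The `OLD_PANEL` primaries are invented for illustration and are not our iPad 2 measurements; `NEW_PANEL` uses the published sRGB primaries as a stand-in for a wider panel.

```python
def xy_to_uv(x, y):
    """Convert CIE 1931 (x, y) chromaticity to CIE 1976 (u', v')."""
    denom = -2.0 * x + 12.0 * y + 3.0
    return 4.0 * x / denom, 9.0 * y / denom


def triangle_area(points):
    """Area of the triangle spanned by three chromaticity points (shoelace formula)."""
    (a1, b1), (a2, b2), (a3, b3) = points
    return abs(a1 * (b2 - b3) + a2 * (b3 - b1) + a3 * (b1 - b2)) / 2.0


def gamut_gain(old_xy, new_xy, space="1931"):
    """Relative increase in gamut area from old to new, in the chosen CIE diagram."""
    if space == "1976":
        old_xy = [xy_to_uv(x, y) for x, y in old_xy]
        new_xy = [xy_to_uv(x, y) for x, y in new_xy]
    return triangle_area(new_xy) / triangle_area(old_xy) - 1.0


# HYPOTHETICAL narrow-gamut panel (red, green, blue), invented for illustration only.
OLD_PANEL = [(0.62, 0.34), (0.32, 0.56), (0.15, 0.12)]
# Published sRGB primaries, standing in for a wider-gamut panel.
NEW_PANEL = [(0.64, 0.33), (0.30, 0.60), (0.15, 0.06)]

gain_1931 = gamut_gain(OLD_PANEL, NEW_PANEL, space="1931")
gain_1976 = gamut_gain(OLD_PANEL, NEW_PANEL, space="1976")
print(f"Gain in CIE 1931: {gain_1931:.0%}")
print(f"Gain in CIE 1976: {gain_1976:.0%}")
```

With these made-up primaries, most of the improvement is in blue, and the same pair of panels yields a noticeably larger percentage gain in CIE 1976 than in CIE 1931, because the 1976 diagram gives the blue region more room.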

Measurements and standards aside, the new display looks great. The improvement in color performance will greatly enhance the user experience and, as we discussed yesterday, shows what Apple is betting on for the functionality of future devices.

In our next post we will explain exactly how Apple achieved this improved color performance and look at ways they can improve the next generation.