E-M1 Mk.II High Resolution Mode

Does it Work and How?

My other articles related to the Olympus OM-D System.

See also: More on the E-M1 Mk.II High Res Mode, a sequel to this article.


Sensor Shift as a Way to Increase Resolution

The sensor-shift technique has been used for a few years already in some applications of medium-format photography, at the expense of some complication and a serious extra cost. Olympus was the first manufacturer to introduce it into a product normal people, like you and me, can afford and use.

This was the E-M5 Mk.II in 2015. Just months later Pentax offered a similar feature, called Pixel Shift Resolution in their K-3 II model. We will come back to that later.

The method employs a typical image sensor, but uses it in an innovative way. To understand it, we must start from how such a sensor usually works.

Pixels, Photosites, and the Bayer Matrix

The most common arrangement is the so-called Bayer matrix: an array of photosites (light receptors), arranged in a rectangular grid, with each photosite responding to, roughly speaking, one component (or spectral region) of light: Red, Green, or Blue.

In the Bayer matrix, the photosites are arranged into a repetitive pattern of 2×2 squares. Each includes one photosite for Red, one for Blue, and two for the Green component.

Having two Greens for each Red or Blue makes it easier to arrange photosites into a repetitive pattern. There is also a nice excuse to do it this way: our eye is more sensitive to Green, so it carries more luminance (brightness) information than either of the other two. How convenient.
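As a quick illustration of the arrangement just described, here is a minimal Python sketch (the RGGB corner ordering is an assumption made for illustration; actual sensors differ in which corner holds which color):

```python
import numpy as np

# A sensor of height x width photosites, each sensitive to exactly one
# RGB component, tiled with the repetitive 2x2 Bayer pattern. RGGB
# (Red/Green on one row, Green/Blue on the next) is assumed here.
def bayer_pattern(height, width):
    pattern = np.empty((height, width), dtype="<U1")
    pattern[0::2, 0::2] = "R"   # even rows, even columns
    pattern[0::2, 1::2] = "G"   # even rows, odd columns
    pattern[1::2, 0::2] = "G"   # odd rows, even columns
    pattern[1::2, 1::2] = "B"   # odd rows, odd columns
    return pattern

p = bayer_pattern(4, 4)
# Each 2x2 square holds one Red, two Green, and one Blue photosite:
counts = {c: int((p[:2, :2] == c).sum()) for c in "RGB"}
print(counts)  # {'R': 1, 'G': 2, 'B': 1}
```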

Photosites are often confused with (and referred to as) pixels. This is wrong and misleading, often resulting in false conclusions.

The principal distinction is that while photosites are physical entities, really existing on the sensor surface, pixels are purely abstract constructs, stored in memory, and arranged in a rectangular grid.

A photosite responds to received light (one RGB component) at its physical location by generating a signal, translated into a numeric value. A pixel stores three such values as RGB components, describing brightness and color of light at its assigned logical location.

This means that three photosites capture the information equivalent to that contained in one RGB pixel. How many photosites are actually used for that and how, depends on the associated software.

Usually the image created by the camera has a number of pixels equal to that of photosites, and that's the number listed in the specs.

In this arrangement, the abstract grid of pixels is mapped onto the physical grid of photosites: every pixel has an associated photosite. During the so-called demosaicing (a part of raw-to-RGB conversion) it is that photosite which supplies its color component's value to the pixel, while its neighbors are used to interpolate the two other components.

Yes, that's right: for each pixel only one component is known from its associated photosite; two others are computed by interpolation from its nearest (two or four) neighbors responsive to the missing color.

This, actually, is nothing else but upsampling. Your 24 MP image is really an 8 MP image upsampled to 24 MP, period.
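For the curious, the interpolation just described can be sketched in a few lines of Python. This toy uses plain neighborhood averaging over an assumed RGGB layout, which is only a stand-in for the far more sophisticated algorithms used in real raw converters:

```python
import numpy as np

# Toy demosaicing: each photosite supplies one measured value; the two
# missing components of every pixel are averaged from nearby photosites
# of the missing color.
def box3(a):
    # Sum over each pixel's 3x3 neighborhood, zero-padded at the edges.
    p = np.pad(a, 1)
    h, w = a.shape
    return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3))

def demosaic(mosaic):
    h, w = mosaic.shape
    r = np.zeros((h, w), bool); r[0::2, 0::2] = True   # Red photosites
    b = np.zeros((h, w), bool); b[1::2, 1::2] = True   # Blue photosites
    g = ~(r | b)                                       # the two Greens
    rgb = np.zeros((h, w, 3))
    for ch, mask in enumerate((r, g, b)):
        known = np.where(mask, mosaic, 0.0)
        avg = box3(known) / box3(mask.astype(float))
        # Keep the measured component; interpolate the other two.
        rgb[..., ch] = np.where(mask, mosaic, avg)
    return rgb

# Sanity check: uniform light must survive demosaicing unchanged.
mosaic = np.full((4, 4), 5.0)
rgb = demosaic(mosaic)
print(np.allclose(rgb, 5.0))  # True
```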

This, however, does not have to be the case.

Sensor Shift

This method uses multiple exposures, each followed by information readout, to generate a single image. Here is how it works.

The first exposure is done with the sensor in its original position. When the camera reads the signal from the photosites, we store for each pixel just one of its RGB components, the one from the photosite assigned to that pixel.

Next, the whole sensor is moved by one photosite grid step in one of the four principal directions. At each original grid node location there is now another photosite, sensitive to another color component. Another exposure is made; the camera does the readout again and sends this signal not to each site's associated pixel, but to the one bound to its original location (before moving the sensor). Now every pixel has two readouts, of different RGB components, measured at the same physical spot!

Repeating this step twice more in a square pattern, we end up with every pixel receiving four RGB values, recorded by each member of the 2×2 photosite team at the time when that member was at that pixel's location (more exactly: at the original location of the photosite associated with that pixel).

The only problem is that each of the four readings was done at a slightly different time.

Having all RGB components as they were read at the right location makes the demosaicing interpolation no longer needed. Just store them in the pixel (averaging the two Greens, one too many).
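The square dance above can be simulated numerically. The sketch below (Python, with an assumed RGGB layout and shift order) shows that, with a perfectly static scene, the four exposures recover all three components at every pixel exactly, with no interpolation:

```python
import numpy as np

# 'scene' holds the true RGB light at each photosite location.
# capture() simulates one Bayer exposure with the sensor shifted by
# (dy, dx) grid steps; pixel_shift_merge() runs the four-exposure dance.
def capture(scene, dy, dx):
    h, w, _ = scene.shape
    out = np.zeros((h, w))
    chan = np.zeros((h, w), int)
    for y in range(h):
        for x in range(w):
            # Which photosite now sits at location (y, x)? RGGB assumed.
            sy, sx = (y - dy) % 2, (x - dx) % 2
            c = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 2}[(sy, sx)]
            out[y, x] = scene[y, x, c]
            chan[y, x] = c
    return out, chan

def pixel_shift_merge(scene):
    # Four exposures: original position plus three one-step shifts,
    # tracing a square. Each reading is credited to the pixel at the
    # location where it was measured.
    acc = np.zeros(scene.shape)
    cnt = np.zeros(scene.shape)
    for dy, dx in [(0, 0), (0, 1), (1, 1), (1, 0)]:
        val, chan = capture(scene, dy, dx)
        for c in range(3):
            acc[..., c] += np.where(chan == c, val, 0.0)
            cnt[..., c] += (chan == c)
    return acc / cnt   # the two Green readings get averaged

rng = np.random.default_rng(0)
scene = rng.random((4, 4, 3))
merged = pixel_shift_merge(scene)
print(np.allclose(merged, scene))  # True: every component measured in place
```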

This little square dance alone should bring, in theory at least, significant advantages to the image quality. The total amount of information is quadrupled, with each pixel using the actual readings for all color components from the right physical location.

Quadrupling the light reaching the sensor (four exposures) should also cut the random noise in half. Last but not least, false-color artifacts often arising in the demosaicing process are no longer a problem.
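The noise claim is easy to verify numerically: averaging four independent readings of the same signal divides the random noise amplitude (standard deviation) by the square root of 4, i.e. by 2. A quick simulation, with arbitrary signal and noise levels chosen just for illustration:

```python
import numpy as np

# Four independent "exposures" of the same signal, each with Gaussian
# noise of standard deviation 10. Averaging them should roughly halve
# the noise amplitude.
rng = np.random.default_rng(42)
readings = 100.0 + rng.normal(0.0, 10.0, size=(4, 100_000))

single_noise = readings[0].std()              # one exposure: about 10
averaged_noise = readings.mean(axis=0).std()  # four averaged: about 5
print(round(single_noise / averaged_noise, 1))  # close to 2.0
```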

What was described above is exactly what Pentax does in their Pixel Shift mode. The difference is that while Pentax exits the process at this moment (cashing in the winnings: better pixels), Olympus continues to the next stage.

Now comes the second part. The camera shifts the sensor diagonally by approximately 0.71 of the grid step (or half a step each in two perpendicular directions). The new location of each photosite is now smack in the center of one of the original photosite squares. From there it starts a new round of the same square dance, ending up with a second grid of full-information pixels, shifted by half a row and half a column from the original.

The last sensor move returns it to the starting position, as if nothing happened. This is the second fractional move out of the total of eight; all others (those within each square dance) are by a full grid step, or photosite spacing.

The only thing remaining is the messy job (trust me, numerical methods are my thing) of merging these two diagonally offset grids into one, with nice, even spacing in both directions. The pixel count of this new grid should be double the original, or perhaps somewhat more, to account for the fact that the new pixels are not interpolated in two out of three components.
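For illustration only, here is one naive way such a merge could be done: interleave the two grids into a grid of doubled dimensions, then fill the positions measured by neither grid from their four edge-adjacent neighbors. Olympus' actual algorithm is certainly more sophisticated; this is just a sketch of the geometry:

```python
import numpy as np

# Grid 'a' (first square dance) lands on even-even positions of the
# doubled grid; grid 'b' (offset by half a step diagonally) on odd-odd
# positions. The remaining positions are averaged from their known
# up/down/left/right neighbors.
def merge_quincunx(a, b):
    h, w = a.shape
    out = np.zeros((2 * h, 2 * w))
    known = np.zeros((2 * h, 2 * w), bool)
    out[0::2, 0::2] = a; known[0::2, 0::2] = True
    out[1::2, 1::2] = b; known[1::2, 1::2] = True
    vp = np.pad(out, 1)                       # values, zero-padded
    kp = np.pad(known, 1).astype(float)       # known mask, zero-padded
    # Sums and counts over the four edge-adjacent neighbors:
    nsum = vp[:-2, 1:-1] + vp[2:, 1:-1] + vp[1:-1, :-2] + vp[1:-1, 2:]
    ncnt = kp[:-2, 1:-1] + kp[2:, 1:-1] + kp[1:-1, :-2] + kp[1:-1, 2:]
    out[~known] = (nsum / np.maximum(ncnt, 1))[~known]
    return out

# Sanity check: two flat grids must merge into a flat doubled grid.
flat = merge_quincunx(np.full((3, 3), 3.0), np.full((3, 3), 3.0))
print(flat.shape, np.allclose(flat, 3.0))  # (6, 6) True
```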

Olympus engineers chose this "somewhat" as an extra 25% on top of the obvious doubling, so the total pixel count was increased by a factor of 2.5×.

The square root of 2.5 is about 1.58, and this is, approximately, the factor by which both pixel dimensions of the final images were raised. Standard Mk.II images are 20 MP in size (5,184×3,888 pixels); HR ones are 50 MP (8,160×6,120).
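The arithmetic is easy to check; note that the in-camera dimensions correspond to a scaling factor slightly below the exact square root:

```python
import math

# Native and in-camera HR dimensions as quoted in the text.
native = (5184, 3888)
hr_jpeg = (8160, 6120)

print(round(math.sqrt(2.5), 3))          # 1.581, the theoretical factor
print(round(hr_jpeg[0] / native[0], 3))  # 1.574, the actual factor used

native_mp = native[0] * native[1] / 1e6  # about 20.2 MP
hr_mp = hr_jpeg[0] * hr_jpeg[1] / 1e6    # about 49.9 MP
print(round(hr_mp / native_mp, 2))       # 2.48, i.e. the "2.5x" increase
```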

This is, however, true only if you let the camera do the raw-to-RGB conversion. If that conversion is done off-camera (using the Olympus Viewer or a similar application), the HR grid will have both dimensions doubled as compared to the original one (10,368×7,776), so that the final pixel count is 80 MP, four times the original.

I will be discussing that case in a sequel to this article, also providing some image samples and comparisons.

Note that the final pixel grid is entirely interpolated. This does not bother me, as the spacing between interpolation nodes is close to that of the new grid, and each node contains all three RGB components.

The Implementation

Olympus first introduced the HR Mode in the E-M5 Mk.II, and then, with some tweaks, in the E-M1 Mk.II, the new flagship. I have used it only on the latter, so my experience may be somewhat limited.

The HR Mode is provided as one of the drive modes, denoted with a nice, double prison window icon. To eliminate camera shake, it uses solely the electronic shutter (gating); it also allows you to add a triggering delay of up to 30 seconds.

The sensor stepping is done by the same motor used for image stabilization, so there is no IS option in HR Mode. Nothing is lost here, because HR requires both camera and subject to stay perfectly steady over the time from the first to the eighth exposure. This usually takes a fraction of a second plus eight times the exposure time used. After that, in-camera processing takes two seconds or so.

The latest version of the HR Mode adds a smart trick to protect you from a minor change in the scene ruining the result; for example, someone walking through it. The algorithm checks for consistency between the eight frames, and if it detects significant enough changes, it replaces the offending area in the merged image with a proper piece of the upsampled original (the first one in the sequence, I believe, but I'm not sure). From examples seen on the Internet I can only say that sometimes this works.

There are two other limitations in HR shooting:

  • The lens cannot be stopped down beyond F/8;
  • Only ISO settings up to 1600 can be used.

I am sure there is a good rationale behind each of those, but it would be bloody nice to know it.

The simplest way to get an HR image is to let the camera save it as a JPEG file; there are some raw file options, too, but I never bothered with them.

Image samples

I will be presenting here three pairs of image samples, all shot with the E-M1 Mk.II camera. In each pair one picture is shot in the native resolution of 20 MP (5184×3888), and the other — in the HR Mode, saved in-camera as a JPEG file of 50 MP (8160×6120).

The 2.5× increase in pixel count means an increase in resolution by a factor of 1.58× — but this is just the upper, theoretical limit, assuming that pixel density is the only limiting factor. There are others, though, of which lens quality first comes to mind, followed by camera shake or vibration.

The procedure

All sample sessions were arranged indoors, using a medium tripod (admittedly, a heavy one would be much better). The Android OI Share application turned out to be useless, as it ignores the HR Mode, so I used a delayed HR release (8 seconds), for the last session switching to Olympus Capture in Windows, which does HR just fine, and allows for very precise focus adjustments under large screen magnifications.

Using Olympus Capture would be absolutely my recommended way for High Resolution shooting — if you can accept the limitation of tethered camera control. Also, throw away the crude, thick and rigid USB cable supplied with the camera and buy something better (longer, too).

To rule out a possibility of AF errors or camera shake, I shot multiple copies of every sample. Very few (less than 5%) had to be disqualified for any reason.

The camera's image processing was set to the Natural Picture Mode, with sharpness, contrast and saturation at zero (default). Noise Filtering was at Low, which is one step below the factory setting. WB and ISO were set individually for each scene (sample pair).

Usually my preference with this camera is to keep Sharpness at -1, but here I left it at 0, because the images were not to be affected by postprocessing.

Image Sample Set #1

Here is my first sample: a close-up scene showing an old bank note, a subject with lots of detail. It was shot with the MZD 75/1.8 ED lens stopped down to F/8.0.

The top row is just for illustration. Full frames are reduced to fit the page, and red boxes show from where the samples presented below are taken. Note that the high-res box is smaller in this picture, for obvious reasons.

The second row is where we can compare the images. These are 480×360-pixel fragments from original images, otherwise unaltered. Here the difference in image pixel size shows quite clearly, and it makes the comparisons less straightforward. Tough; these are facts of life.

To make up for the difference in magnification, there is the third row, in which the original (non-HR) image fragment has been resized (upsampled) to match the magnification of the HR version. The right-hand sample is just a repetition of the one above it, for comparison convenience.

While it may seem that comparing between samples in this row may be easiest and most meaningful, this comes with some caveats. Upsampling introduces some unsharpness. Most of it is fully legitimate and comes just from stretching the same amount of detail over a larger area, but some more may be due to antialiasing and, possibly, other tricks used in the upsampling algorithm. This is why many authors apply a sharpen filter after upsampling an image.

The decision how much of that sharpening to apply is, almost entirely, up to the person performing the comparison, and it may be, within some limits, quite arbitrary; too arbitrary for my taste. The resulting image may look sharp (no fuzziness in high-contrast lines) while, at the same time, missing a considerable amount of detail (the lines look smoother in the picture than they actually are in the subject).

After a day of deliberation and experimentation, I settled on not applying any explicit sharpening after resizing, just allowing the software used (Paint Shop Pro X9) to use its default (50%) amount of sharpening as a part of the upsampling process. I believe the program writers intended this to be a reasonably neutral setting, and some of my experiments (viewing upsampled images from a greater distance is quite informative here) seem to confirm that feeling.

Now the samples. My comments will follow afterwards.

Standard resolution | High resolution mode

[1.1]

[1H.1]

Full frames, reduced in size, with sample origins shown.
Aperture priority (+.3 EV): 1/15 s at F/8, ISO 200

[1.2] 1:1 sample from the original

[1H.2] 1:1 sample from the original

[1.3] As above, upsampled to match the HR size

[1H.2] (yes, not a typo!) Same as above

This is quite interesting. Comparing unaltered 1:1 fragments, we see that the native-resolution [1.2] seems a bit less fuzzy than the high-res version [1H.2]. On the other hand, the latter shows more detail; for example, look closely at the edge of letter 'H', or its ink texture.

What that means is that HR provides some increase in resolution, but it is smaller than the increase allowed by the new pixel density.

Upsampling [1.2] into [1.3] to compare it, again, with [1H.2] confirms this observation. The image becomes fuzzier (as it should), and while contours seem quite sharp, there is no contest in terms of detail: see the letter 'H' again.

Conclusion: Sample Set #1 shows that the HR Mode provides more sharpness and detail than the native resolution; the gain is, however, not as high as the new pixel resolution would allow, most probably limited by lens quality.

Image Sample Set #2

Here I used the same subject, but shot it with the MZD 12-100/4.0 ED IS PRO lens at 100 mm, F/8. While this is a zoom, it has very good press (I like it, too) and I was hoping it would be able to make better use of the HR Mode.

I am re-including these samples in this article. True, they do not bring anything new compared to Sample Set #1, but this lens seems to out-resolve the ZD 50/2.0 Macro at normal shooting distances, while at closer ones the situation seems to be reversed (see below), so some people may want to have a look.

Standard resolution | High resolution mode

Full frame reduced in size

Full frame reduced in size

1:1 sample from the original

1:1 sample from the original

As above, upsampled to match the HR size

Same as above

The bottom line is the same: the HR Mode resolution is improved, but less than expected. Is this the lens again, or maybe some other reason?

Image Sample Set #3

This calls for drastic measures, I thought, and brought in my secret weapon, the sharpest lens I have: the ZD 50/2.0 ED Macro. True, I have to use it with a FT->μFT adapter, and AF is not as responsive as that with native μFT lenses (even on E-M1 bodies), but this was, I hoped, like the U.S. cavalry showing up in a Western movie.

No more talk, just samples.

Standard resolution | High resolution mode

[3.1]

[3H.1]

Full frames, reduced in size, with sample origins shown.
Aperture priority (0 EV): 1/15 s at F/8, ISO 200

[3.2] 1:1 sample from the original

[3H.2] 1:1 sample from the original

[3.3] As above, upsampled to match the HR size

[3H.2] Same as above

One look at the second row and we need nothing more: even in the 1:1 pixel view, sample [3H.2] is as sharp as [3.2], in spite of the magnification being 1.6× higher. I was hoping this lens would bring some improvement, but not this much!

We don't even need to look at the third row; it just confirms the first impression.

Conclusions:

  • The HR Mode brings in a possible increase in resolution by 60% or so;
  • This potential increase will become real only if the lens provides enough resolving power;
  • The legacy Four Thirds ZD 50/2.0 Macro ED does that.

Now, look at this. This lens was introduced on the market in 2003, together with the very first Four Thirds camera, the E-1, which had a resolution of 5 MP. Now, fourteen years later, it proves capable of filling 50 MP images with detail. That's ten times the pixel count, or three times the linear resolution.

Image Sample Set #4

Now, that we are done with resolution, let's have a look at another area where the HR Mode may offer something: the noise. We discussed this in the first part of the article, so what remains is just to see the image samples.

Here I decided to go for the pencil box scene, with smooth, out-of-focus gradient transitions being the favorite playground for noise demons. As opposed to my previous arrangements of the same subject, this time I used diffused daylight, with an old computer game box used as defocused background.

The presentation is identical to that for the resolution sample series, so no additional explanations are necessary.

Standard resolution | High resolution mode

[4.1]

[4H.1]

Full frames, blah, blah, blah
Aperture priority (-1.3 EV): 1/60 and 1/50 s at F/4, ISO 1600

[4.2] 1:1 sample from the original

[4H.2] 1:1 sample from the original

[4.3] As above, upsampled to match the HR size

[4H.2] Same as above

The unaltered 1:1 fragments [4.2] and [4H.2] look, to the naked eye at least, quite similar. (Inspecting whole frames shows that the HR image does have some advantage here, but nothing dramatic.) This means they do not differ much in terms of noise per pixel. This is both good news and bad news.

  • Good: because having the same per-pixel noise in an image of higher magnification (or pixel density) means less per-area noise when the image is printed or displayed (such an image needs less magnification).

    The per-area noise amplitude is inversely proportional to linear pixel density, so this means a drop by a factor of 1.6×.

  • Bad: because I was hoping for a 50% reduction in per-pixel noise amplitude due to the fact that each pixel is now constructed out of four independent photosite readings, not one.

    The only explanation I may have here is that what we see is not the raw noise, but noise after postprocessing in raw-to-RGB conversion. Having less raw noise in the HR data, Olympus engineers probably chose to apply less noise filtering, to keep the resolution as high as possible. This makes perfect sense, as the image noise at ISO 1600 (the highest setting allowed in HR) is very acceptable.

Upsampling [4.2] to [4.3] and comparing that to [4H.2] levels the field, making both samples equally magnified. Clearly, the HR version is cleaner. (It is sharper, too, as discussed in Sample Set #1, using the same lens.)

Conclusions:

  • The HR Mode keeps the per-pixel noise at the level similar to (if a bit lower than) that in native resolution. This means reducing the printed/viewed noise amplitude by a factor of 1.6× or to 62% of the original.
  • Noise filtering in HR Mode is, most probably, less aggressive than in the native format, which is possible thanks to reduced amplitude of raw noise. This is to take a better advantage of the resolution increase.

What Difference Does All This Make?

As a concept and a piece of technology, the approach used by Olympus is brilliant; I also find it aesthetically pleasing, just beautiful. It also works, which means the engineers were able to take care of dozens (maybe more) of pesky ifs and buts in the implementation; these often ruin great concepts in practical use.

While it can (and should) be enjoyed as a piece of engineering art, its practical significance for 99% of photographers (and I mean advanced amateurs and professionals here) is negligible.

First of all, we already have more pixel resolution than we need or actually can use for most applications.

Sadly, the general public does not (and never will) understand that once you reach some level, any further increase in pixel count is just a waste of resources and/or a marketing gimmick. They are not aware that an 8 MP camera with a good lens makes better images than a 30 MP one with a crappy piece of optics. Educating the market would be counterproductive for manufacturers: it is much cheaper to make a sensor chip delivering more pixels than a lens with higher optical resolution.

Therefore, the higher resolution will be really useful in some special applications only — when and if it works within a given context.

Secondly, at the current stage the sensor-shift HR technique suffers from some limitations. Two of those stand out.

  • Only the sharpest lenses can fill all those pixels with actually rendered detail. My guess is that 90%, perhaps more, of lenses on the market now do not meet this requirement.
  • The technique uses a number of frames shot over a period of time, be it a few seconds, or a fraction of a second. The image being recorded has to stay identical between the first and the last frame. This limits the applicability to static scenes shot with a static camera; the smallest departures from that will either ruin the picture or, at best, obliterate any gains in resolution.

    Higher, clean ISO settings and some tweaking of the process may, and probably will, make it possible to speed up the eight-frame sequence so that it will become handholdable. My guess would be another year or two. I still remain skeptical, though.

But even if all this does not affect the mentioned 99% of us, for some dedicated enthusiasts (like me and, probably, you) it is a fascinating development, opening new possibilities for learning and experimentation.

See also: More on the E-M1 Mk.II High Res Mode, a sequel to this article.

References

  • Olympus E-M5 II High-Resolution Mode by Dave Etchells and Dave Pardue on Imaging Resource. While the article (being a part of the camera review) does not really explain how the HR Mode works, it presents some interesting comparative samples, setting the results against those from Nikon D810 (35 MP full-frame) and Pentax 645Z (medium format, 51 MP). Highly recommended reading.
  • High Res Photo Mode, a part of the E-M5 II Review by Richard Buttler — an extremely valuable resource explaining how it works.
  • Making the EM5 Mk 2 High Res Mode Sing by Brad Nichol at Steve Huff Photo. Written from a down-to-earth, practitioner's angle, this article shows what may (and probably will) go wrong when you start doing HR photography. While based on author's experience with the E-M5 Mk.II, the article applies to the E-M1 II as well. A must-read before you go there.
  • Olympus Hi Res Mode: Thoughts & Test Image, a post by Shawn Liebling on a DPResource forum. User's impressions, based on Pen F experience.
  • E-M1 Mark II Review Extension by Robin Wong on his blog is largely devoted to this subject. Like everything Robin writes, it is entertaining or educational (usually both).
  • E-M5 Mark II and its Hi-Res mode by Khen Lim on Gary Ayton's Photography Wiki — a commentary, placing this development in industry context; not technical but worth reading.


This page is not sponsored or endorsed by Olympus (or anyone else) and presents solely the views of the author.



Posted 2017/03/27, last updated 2017/05/15 Copyright © 2017 by J. Andrzej Wrotniak