Image super-resolution: Historical overview and future challenges

1.1 Introduction to super-resolution

In most digital imaging applications, high-resolution images or videos are usually desired for later image processing and analysis. The desire for high image resolution stems from two principal application areas: improvement of pictorial information for human interpretation, and helping representation for automatic machine perception. Image resolution describes the details contained in an image: the higher the resolution, the more image detail. The resolution of a digital image can be classified in many different ways: pixel resolution, spatial resolution, spectral resolution, temporal resolution, and radiometric resolution.

In this context, we are mainly interested in spatial resolution. A digital image is made up of small picture elements called pixels. Spatial resolution refers to the pixel density in an image and is measured in pixels per unit area. Fig. 1.1 shows a classic test target used to determine the spatial resolution of an imaging system.

Fig. 1.1. The 1951 USAF resolution test target, a classic test target used to determine the spatial resolution of imaging sensors and imaging systems.

The image spatial resolution is limited first by the imaging sensors or the image acquisition device. A modern image sensor is typically a charge-coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) active-pixel sensor. The sensor elements are typically arranged in a two-dimensional array to capture two-dimensional image signals. The sensor size, or equivalently the number of sensor elements per unit area, determines in the first place the spatial resolution of the image to be captured. The higher the density of the sensors, the higher the spatial resolution the imaging system can achieve. An imaging system with inadequate detectors will generate low-resolution images with blocky effects, due to aliasing from the low spatial sampling frequency. One straightforward way to increase the spatial resolution of an imaging system is to increase the sensor density by reducing the sensor size. However, as the sensor size decreases, the amount of light incident on each sensor also decreases, which degrades image quality through so-called shot noise. Also, the hardware cost of a sensor increases with the sensor density, or correspondingly the image pixel density. Therefore, the hardware limitation on the size of the sensor restricts the spatial resolution of an image that can be captured.
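The trade-off between sensor size and shot noise can be made concrete with a short worked relation. This is a sketch under simplifying assumptions that are not stated in the text: a square sensor element of side $p$, uniform photon flux $\Phi$ (photons per unit area per unit time), exposure time $t$, and noise dominated by photon shot noise, so that the photon count $N$ is Poisson distributed. The symbols $p$, $\Phi$, $t$ and $N$ are introduced here for illustration only.

```latex
% Assumption: N ~ Poisson(\lambda), with mean proportional to the sensor area p^2.
\begin{align}
  \lambda &= \mathbb{E}[N] = \Phi \, p^{2} \, t, &
  \operatorname{Var}[N] &= \lambda, \\
  \mathrm{SNR} &= \frac{\mathbb{E}[N]}{\sqrt{\operatorname{Var}[N]}}
               = \sqrt{\lambda}
               = p \, \sqrt{\Phi \, t}.
\end{align}
```

Under these assumptions the signal-to-noise ratio scales linearly with the side length of the sensor element: halving $p$ quadruples the pixel density but collects a quarter of the light per element and halves the SNR.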

While the image sensors limit the spatial resolution of the image, the image details (high-frequency bands) are also limited by the optics, due to lens blurs (associated with the sensor point spread function (PSF)), lens aberration effects, aperture diffraction, and optical blurring due to motion. Constructing imaging chips and optical components to capture very high-resolution images is prohibitively expensive and not practical in most real applications, e.g., widely used surveillance cameras and cell phone built-in cameras. Besides the cost, the resolution of a surveillance camera is also limited by the camera speed and hardware storage. In some other scenarios, such as satellite imagery, it is difficult to use high-resolution sensors due to physical constraints. Another way to address this problem is to accept the image degradations and use signal processing to post-process the captured images, trading off computational cost against hardware cost. These techniques are specifically referred to as super-resolution (SR) reconstruction.

Super-resolution (SR) refers to techniques that construct high-resolution (HR) images from several observed low-resolution (LR) images, thereby increasing the high-frequency components and removing the degradations caused by the imaging process of the low-resolution camera. The basic idea behind SR is to combine the non-redundant information contained in multiple low-resolution frames to generate a high-resolution image. A technique closely related to SR is single-image interpolation, which can also be used to increase the image size. However, since no additional information is provided, the quality of single-image interpolation is very limited due to the ill-posed nature of the problem, and the lost frequency components cannot be recovered. In the SR setting, however, multiple low-resolution observations are available for reconstruction, making the problem better constrained. The non-redundant information contained in these LR images is typically introduced by subpixel shifts between them. These subpixel shifts may occur due to uncontrolled motions between the imaging system and the scene, e.g., movements of objects, or due to controlled motions, e.g., a satellite imaging system orbiting the earth with a predefined speed and path.
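The role of subpixel shifts can be illustrated with a small numerical sketch. It is a deliberately idealized toy, not a practical SR method: it assumes no blur and no noise, and it assumes the shifts are known exactly and are multiples of 1/factor of an LR pixel, whereas real SR has to estimate the motion. The function names (`simulate_lr_frames`, `interleave`, `nearest_upsample`) are hypothetical and chosen for this example only.

```python
import numpy as np


def simulate_lr_frames(hr, factor):
    """Decimate `hr` once for every HR-pixel offset (dy, dx) in [0, factor).

    An offset of one HR pixel equals a shift of 1/factor pixel on the LR
    grid, so each frame samples a different set of scene points.
    Idealized assumption: no blur and no noise.
    """
    return {
        (dy, dx): hr[dy::factor, dx::factor]
        for dy in range(factor)
        for dx in range(factor)
    }


def interleave(frames, factor):
    """Place every LR frame back on the HR grid at its (known) offset."""
    h, w = frames[(0, 0)].shape
    hr = np.zeros((h * factor, w * factor), dtype=frames[(0, 0)].dtype)
    for (dy, dx), lr in frames.items():
        hr[dy::factor, dx::factor] = lr
    return hr


def nearest_upsample(lr, factor):
    """Single-image baseline: enlarge one LR frame by pixel replication."""
    return np.repeat(np.repeat(lr, factor, axis=0), factor, axis=1)


rng = np.random.default_rng(0)
hr_scene = rng.random((8, 8))           # stand-in for the true HR scene
frames = simulate_lr_frames(hr_scene, factor=2)

sr = interleave(frames, factor=2)                    # uses all four frames
single = nearest_upsample(frames[(0, 0)], factor=2)  # uses one frame only

sr_error = float(np.abs(sr - hr_scene).max())
single_error = float(np.abs(single - hr_scene).max())
print("multi-frame max error: ", sr_error)
print("single-frame max error:", single_error)
```

In this idealized setting the four shifted frames jointly contain every HR sample, so interleaving them recovers the scene exactly, while enlarging a single frame cannot restore the samples that were discarded. If all frames had the same offset, they would be identical and add no information.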

Each low-resolution frame is a decimated, aliased observation of the true scene. SR is possible only if there exist subpixel motions between these low-resolution frames, so that the ill-posed upsampling problem can be better conditioned. Fig. 1.2 shows a simplified diagram describing the basic idea of SR reconstruction. In the imaging process, the camera captures several LR frames, which are downsampled from the HR scene with subpixel shifts between each other. SR construction reverses this process by aligning the LR observations to subpixel accuracy and combining them into an HR image grid (interpolation), thereby overcoming the imaging limitation of the camera.

Fig. 1.2. The basic idea for super-resolution reconstruction from multiple low-resolution frames. Subpixel motion provides the complementary information among the low-resolution frames that makes SR reconstruction possible.

Applications of SR, some of which are described in this book, arise in many areas such as:

1. Surveillance video [20, 55]: frame freeze and zoom on a region of interest (ROI) in video for human perception (e.g., to look at the license plate in the video), and resolution enhancement for automatic target recognition (e.g., to try to recognize a criminal’s face).

2. Remote sensing [29]: several images of the same area are provided, and an improved resolution image can be sought.

3. Medical imaging (CT, MRI, ultrasound, etc.) [59, 70, 47, 60]: several images limited in resolution quality can be acquired, and SR techniques can be applied to enhance the resolution.

4. Video standard conversion, e.g. from NTSC video signal to HDTV signal.

This chapter aims to introduce the SR research area by explaining some basic techniques of SR, giving an overview of the literature, and discussing some challenging issues for future research.

...
