1994 VR Conference Proceedings

Head-Mounted Display as a Low Vision Aid

By: Eli Peli
Schepens Eye Research Institute, Harvard Medical School; New England Eye Center, Tufts University School of Medicine; 20 Staniford Street, Boston, MA 02114; (617) 723-6078 ext. 547

Abstract

The Private Eye (TM), a binary head mounted display, was originally designed to display static images. We have demonstrated that binary high contrast images can improve face recognition by the visually impaired. Recently, Optelec has introduced a modification of the device called Bright Eye (TM) (BE) which enables on-line presentation of video images. We have adapted the BE to present video movies as a testing platform for a mobility aid. The BE was designed for text presentation and therefore a single threshold performed satisfactorily. However, direct feed of a video signal through a single threshold results in poor image details. A more detailed binary image can be obtained by applying bandpass filtering prior to thresholding. The application of the adaptive enhancement algorithm by a DigiVision (TM) device enables such live 2-D processing. I will present a video tape illustrating the differences in image quality with and without the enhancement. To reduce the cost, weight, and power consumption of a portable low vision aid, we have designed a one dimensional processing alternative. Since the processing is applied only across the horizontal dimension, it provides no enhancement of horizontal features in the image. However, in the head mounted camera application the user can easily resolve such details by a slight tilt of the head.


Introduction

Reduced contrast sensitivity is often the result of diseases leading to low vision. This is especially true at high spatial frequencies [Note 24]. Because of lost sensitivity, perception of fine detail is limited. This limitation can lead to an inability to distinguish faces and scenes, and severely limit reading speed. These restrictions are common complaints of low vision patients.

The use of image enhancement to aid perception in the low vision population was first proposed in 1984 [Note 24]. Its use has since been evaluated in several domains, including face perception, motion video perception, and reading. Enhancement of face images has been shown to improve recognition for many low vision observers [Note 22]. Two types of enhancement were implemented in that study: adaptive enhancement [Note 25] which results in high-pass filtered gray tone images and adaptive thresholding leading to a binary, high contrast caricature-like resemblance of the original image. Improvement in face recognition using the two methods was similar. Until recently, however, testing of this technology was off-line and limited to static images.

Current technology allows us to apply the adaptive enhancement algorithm on-line to moving color images [Note 9]. We are now able to directly evaluate the effectiveness of enhancing details in movies and moving text. Recent studies have evaluated the effects of adaptive image enhancement on the appreciation of details in moving scenes and reading rate for text scrolled across a video monitor.

It has been suggested that image enhancement could be implemented in a portable visual aid using a miniature camera and a head-mounted display [Notes 20, 18, 17]. In this paper I describe initial experimentation aimed at modifying the Private Eye (TM) (PE), a commercially available, low cost, head-mounted display for this purpose. The implementation required modifying the display, designed for static computer graphics, to process and present a live video signal from a camera. Since the display is binary, it can be easily adapted using a single threshold to the presentation of text when used as a portable electronic magnifier. For use as a mobility aid, where it will present gray scale images of the environment, we tested two different approaches to binarization, both of which approximate the adaptive thresholding used by Peli et al. [Note 22]. In one approach the signal is processed in 2-D using the DigiVision (TM) adaptive enhancement device followed by a fixed threshold. In the other, a low cost analog filter is applied to the video signal to obtain a one dimensional high-pass filtering before thresholding. The 1-D processing may be sufficient in this application because the user can quickly and easily examine other directions using small head tilts with the head mounted camera.


Perception Of Moving Scenes In Full Color Video

The perception of details in moving scenes is crucial to the ability to follow the story-line of a movie or to identify people and scenes when using a mobility aid. For patients with low vision, this is frequently a difficult task. In a pilot study reported recently [Note 21] we evaluated the effectiveness of image enhancement on the subjective appreciation of both still images and motion video presented on a television monitor. A DigiVision (TM) device [Note 9], which implements the adaptive enhancement algorithm of Peli and Lim [Note 25], provided the enhancement.

Twenty two low vision subjects selected their preferred enhancement and responded to specific questions about details present in scenes from the movie Gigi (1958). Subjects reported that they were able to distinguish between the static images that were unenhanced and those produced using their preferred enhancement settings. All preferred the enhanced version of the still images, while 95% preferred the enhanced motion segments. These subjects reported that the enhancement resulted in darker images with more visible outlines and fine details. For 91% of them, face recognition was easier, 82% reported that it was easier to interpret facial expressions, 90% thought the enhanced image appeared more natural, and 100% found it easier to follow the action. We also found a significant improvement in the identification of details in still scenes with enhancement [t(16) = -3.9, p = .0012].

As mentioned, most subjects described the enhanced video as appearing more natural, which they preferred. This was the case despite the fact that to a normal observer, the enhanced image appeared processed and distorted [Note 23]. Though not definitive, these results lead us to believe that the enhancement algorithm previously found to improve recognition of still images of faces for low vision patients also improves their recognition of details in moving scenes.


Reading Moving Text With Enhancement

It has been hypothesized that the application of spatial filtering to text displayed on a video monitor would increase reading speed [Note 12]. Lawton [Notes 12, 13] reported that her 3 subjects were able to increase their reading rates 2-4 times when a partially individualized enhancement algorithm was applied to statically displayed text. More recently, Lawton et al. [Note 14] have reported substantial increases in reading rate for scrolled text for subjects with moderate visual loss.

Fine et al. [Note 6] tested the effects of the adaptive enhancement algorithm on the reading of scrolled text. In these studies, individual sentences were scrolled from right to left across a video screen, and subjects were asked to read them out loud. Sixty-seven low vision subjects read sentences that were displayed either as white text on a dark background, or as these same characters enhanced to be visually similar to Lawton's [Note 13] published examples. Characters were magnified to subtend 6.23°, and on average, 4.5 characters were visible on the screen. The order of presentation of enhanced or unenhanced text was counterbalanced across subjects.

Fine et al. found a 9.3 word/min (wpm) increase in reading rate with enhancement. This difference was statistically significant (F(1,64) = 14.18, p = 0.0004). This represents an average 13% improvement in reading rate across subjects, which is significantly different from zero (p < 0.006). However, 34% read slower with enhancement. Those reading slower had an average decrease of 22% in their reading rate. The majority of the patients showed an increase in their reading rate with an average of 31%.

For subjects reading text larger than 5 times their acuity threshold in size, enhancement resulted in only a 7.4 wpm increase in reading rate. On the other hand, subjects tested with text nearer their acuity threshold increased their reading rate by 13.5 wpm. Since these subjects read slower without the enhancement, this represented a 29% increase in reading rate for this group. Although the experiments of Fine et al. have shown increased reading rates for contrast enhanced text, the increases are substantially smaller than those reported by Lawton, even for the most severely impaired.


Head-Mounted Reading Aid: The Bright Eye (TM)

Most low vision patients are able to read with increased magnification. Magnification is typically provided with optical aids such as high add glasses and hand held magnifiers. The magnification available with such devices is typically limited to powers of 10-12X. The usefulness of higher optical magnification levels is limited by the reduction of the field of view (number of letters seen), the extremely short working distance, and difficulties with the reduced depth of field. Optical magnifiers also reduce image brightness and contrast [Note 4].

Higher levels of magnification, essential for many patients (as many as half in one study [Note 11]), are usually possible only with CCTV electronic magnifiers. These devices provide the benefit of large magnification levels (up to 60X) combined with a relatively large display allowing more letters to be visible at once. It has been demonstrated that more than 4 letters have to be included in the displayed window for patients to reach their maximum reading rates [Notes 15,16]. The large magnification of CCTV also permits reading in a more comfortable posture, enabling longer, more enjoyable reading. In addition to magnification, CCTVs can aid the visually impaired by allowing an increase in the contrast of the displayed material. In many cases the displayed text can be adjusted to contrast higher than the original. High contrast was shown to be more important for low vision reading than for normal reading [Notes 1,26]. The ability to reverse contrast polarity (white letters on a black background) is judged useful by many patients and in some cases an increase in reading rate can be recorded with reversed polarity [Note 15]. This too is available with CCTV.

The main disadvantages of the CCTV magnifiers are size and cost. The weight and size of these systems necessitate their use at a fixed location. Active users may need separate systems at their place of work or study and at home. Even with multiple systems, use is limited. The price ($3000 and up) leaves them out of reach for many interested users. These disadvantages and market needs may be met in part by the development of portable CCTV systems.


The Display Unit

A portable electronic magnifier system, the Bright Eye (TM) (BE), was recently introduced by Optelec, Inc. of Westford, MA. The system is based on a miniature head mounted monocular display (HMD) device called the Private Eye (TM) (PE), which was developed by Reflection Technology, Inc. of Waltham, MA (described in ref. 18). The display is designed to operate as a monitor on any IBM compatible PC and its aim is to provide a portable, private, inexpensive means of visual communication.

The PE combines semiconductor and electro-mechanical techniques to create a virtual image of a 12 in monochrome monitor in a package of 1.1 x 1.2 x 3.2 in, weighing about 2 ounces. It is designed to be head mounted in front of one eye, with the other eye's view of the environment uninterrupted. The PE provides high resolution (720 (H) x 280 (V) pixels) and a wide field (about 21° x 14°). The displayed pixels are generated by red light emitting diodes (LED) on a black background. The contrast ratio is quoted as 30:1 nominal, and the brightness is 2 foot lamberts nominal. The display is refreshed at 50 frames/sec (non interlaced). The head set is configured to enable use with either the right or the left eye and can be located above, below, or directly in front of the wearer's line of sight.

Image data is sent as bit map graphics from a host computer to the display unit. The bit map information is loaded into a linear array of LEDs. A whole column is illuminated at once for about 6.25 msec. While the image is displayed column by column, the linear array is scanned horizontally by an oscillating mirror. The resulting scan is imaged by a lens system to form a virtual screen at 2 feet from the viewer's eye. A focusing knob allows correction for substantial levels of spherical refractive error, but the display can be easily used with spectacles when needed. The integrated technology used enables production of the device at low cost. Even now, before mass production, engineering development kits, including the display, the head mount, and all the electronic circuitry are being sold for $550.00. The manufacturer anticipates costs in the range of less than $100 in production quantities.


Portable Reading Aid

The PE can display only computer generated still graphics or very slow animations using a small fraction (1/20th) of the field. To operate as a CCTV it had to be modified to display asynchronously live (video rate) images acquired from a CCD camera. The BE (Fig. 1) was designed to accept the input from scanned text, and therefore the use of a single threshold was satisfactory. At high magnification the field of the display (21°) can present 4 letters, each spanning about 5°. This is equivalent to enlarging standard newsprint size text by a factor of 24X. These values are close to the 30X maximal magnification of letters cited by Demer [Note 3] as optimal characteristics for CCTV low vision aids. At this magnification the high resolution of the display will provide a sampling density of close to 150x150, much in excess of the required sampling. Even at moderate magnifications of 8X the sampling rate is higher than is needed for maximum reading rates.


First Generation Bright Eye (TM)

To overcome the slow transfer rate on the computer bus, which limits the possible update rate, special purpose hardware circuitry was developed. The core of this circuitry is a static RAM that replaces the original PE/Host Video Memory. The live video acquired from the camera is binarized using a single fixed threshold. The binarized data are stored in the RAM at rates of 14 MHz. Independent of the writing operation, the PE Display Controller (PEDC) unit retrieves the data when it is ready for the next update at its own transfer rate of 670 Kbytes/sec, or a 5.36 MHz binary pixel rate. With this design the "bottleneck" is the PEDC, which is able to manage as many as 26.58 frames/sec, assuming a full frame size of 720 x 280 bits.
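The throughput figures quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes that "Kbytes" means 1000 bytes and that each binary pixel occupies one bit, and the function names are hypothetical rather than part of the BE design.

```python
# Figures quoted in the text.
PEDC_BYTES_PER_SEC = 670_000   # PEDC transfer rate (assuming 1 Kbyte = 1000 bytes)
DISPLAY_COLUMNS = 720
DISPLAY_ROWS = 280


def pixel_rate_hz(bytes_per_sec):
    """Binary pixel rate, assuming one bit per pixel (eight pixels per byte)."""
    return bytes_per_sec * 8


def max_frame_rate(bytes_per_sec, columns, rows):
    """Upper bound on full-frame updates per second through the PEDC."""
    return pixel_rate_hz(bytes_per_sec) / (columns * rows)


if __name__ == "__main__":
    print(pixel_rate_hz(PEDC_BYTES_PER_SEC) / 1e6, "MHz")
    print(max_frame_rate(PEDC_BYTES_PER_SEC, DISPLAY_COLUMNS, DISPLAY_ROWS), "frames/sec")
```

Under these assumptions the pixel rate comes out at 5.36 MHz and the frame rate at just under 26.6 frames/sec, in line with the values given in the text.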

Figure 1: A low-vision person using the Bright Eye (TM) portable electronic magnifying system. The display is headmounted in this case (a spectacle frame mount is also available). Note the use of the camera mouse to scan across the text and the access provided to non flat reading material. (Photo courtesy of Optelec, Inc.)

The BE has two modes of operation differing in the magnification applied: low magnification and double magnification. In the low magnification, half resolution mode, only one of the fields (even or odd) is displayed. Thus only 240 lines are needed to display a full RS-170 frame of 480 lines. These 240 lines are inserted in the top 240 lines of the 280 possible in the PE. The last 40 lines on the display remain dark. Therefore, to maintain the RS-170 image aspect ratio of 4:3 at the display stage, the line length would need to be changed to 240 x 1.66 x 4/3 = 531 pixels, where 1.66 is the aspect ratio of the PE. The cost of such a correction would be a reduction in the horizontal field actually being displayed. This was deemed wasteful and thus the first 360 pixels in each line are sampled and displayed after replication across the full 720 possible columns. The result is a small widening of the letters (horizontally) by a factor of about 1.35 (720/531), an anamorphic magnification. Such widening of the text provides additional magnification which may aid reading, similar to the use of optical bar magnifiers, where the widening is in the vertical direction. The low magnification mode is advantageous when initial navigation across the page is attempted to find a point of interest or reference, and may be more effective for users with moderate visual loss.
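The aspect-ratio arithmetic in this paragraph can be reproduced as follows. This is a minimal sketch using only the numbers quoted in the text; the function names are hypothetical, and 1.66 is simply the factor the text gives for the PE.

```python
FIELD_LINES = 240        # lines in one RS-170 field
PE_ASPECT = 1.66         # factor quoted in the text for the PE
VIDEO_ASPECT = 4 / 3     # RS-170 image aspect ratio
PE_COLUMNS = 720         # columns available on the display


def corrected_line_length(lines, pe_aspect, video_aspect):
    """Line length (in display pixels) that would preserve the 4:3 image shape."""
    return lines * pe_aspect * video_aspect


def anamorphic_widening(display_columns, corrected_length):
    """Horizontal stretch that results from filling all display columns instead."""
    return display_columns / corrected_length


if __name__ == "__main__":
    length = corrected_line_length(FIELD_LINES, PE_ASPECT, VIDEO_ASPECT)
    print(length)                                   # about 531 pixels
    print(anamorphic_widening(PE_COLUMNS, length))  # between 1.35 and 1.36
```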

The user can switch the BE to double magnification mode using a toggle button on the hand held camera mouse. In the double magnification mode, the pixels are replicated both in the horizontal and vertical directions. This increases the magnified image by a factor of two in both directions. The whole 280 display lines can now be used, and thus the vertical extent of the display is slightly larger in this mode. The text anamorphic magnification ratios are identical in both modes.
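Pixel replication of this kind is straightforward to express in software, although in the BE it is done in hardware. The sketch below is an illustration only: the `replicate` helper is hypothetical, and the image is represented as a plain list of rows of 0/1 values.

```python
def replicate(image, x_factor, y_factor):
    """Repeat every pixel x_factor times across and every row y_factor times down."""
    out = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(x_factor)]
        # Copy the widened row so the output rows are independent lists.
        out.extend(list(wide) for _ in range(y_factor))
    return out


if __name__ == "__main__":
    tile = [[1, 0],
            [0, 1]]
    # Replication in both directions, as in the double magnification mode.
    for row in replicate(tile, 2, 2):
        print(row)
    # Horizontal-only replication: 360 sampled pixels fill 720 columns.
    print(len(replicate([[0] * 360], 2, 1)[0]))
```

Because the same horizontal factor is applied relative to the vertical one in both modes, the shape of the letters (the anamorphic ratio) does not change when the user toggles magnification.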

In addition to a change in magnification, the user can switch from positive video, presenting black letters on a red background, to negative polarity, presenting red text on a black background. It has been shown in a number of studies that many low vision persons prefer the negative polarity in electronic reading aids, and some can increase their reading rate when using light letters on a dark background.


Use of the Color Red

The use of red LEDs in the BE results in a red on black rather than a white on black display. Is the use of red light appropriate for the visually impaired? It is known that the sensitivity of the rods is lower to red than to green light. Many of the potential users of a device such as this are likely to have lost the use of their central retina, and will therefore be dependent on peripheral retina. Rods are more numerous than cones outside the fovea and, at the luminance level of the BE, are likely to control the retinal response. This logic has led many to worry that the red display will be difficult to use for many in its intended user population. However, the fact that the rods are less sensitive to red than to green does not mean that the contrast of a red on black display will be reduced relative to a black on white display. Also, to my knowledge, there is no evidence in the literature indicating that the perceived contrast of a red pattern is lower than that of a green pattern.

Most low vision patients are able to use the device with the red light. Nevertheless, a small number of low vision observers, including two trained professionals using careful observations, have noted some difficulty in using the display. They attributed this difficulty to the red color. One of them was tested with a computer display which was set to display text with red letters on a black background. Despite the relative dimming of the computer display using the red gun only, he was able to comfortably read the red display. The question of threshold and perceived suprathreshold contrast in a red on black display will be tested in my laboratory in the near future.


Second Generation Bright Eye (TM)

In normal operation in an RS-170 (NTSC) environment the display is refreshed and updated at a rate of 60 fields/sec or 30 interlaced frames/sec. In the first generation BE (now on the market) the 50 Hz refresh rate of the PE is maintained but the update rate was reduced to 15 Hz (i.e. only one out of two consecutive odd fields of the camera is displayed). The combination of both temporal limitations, those of the PEDC and the RS-170 rates, results in the slow update rate of 15 Hz. It is known that even a reduction of the update rate to 30 Hz (every odd field) may cause a disturbing artifact when the eyes are tracking a moving object [Note 2]. The artifact results from the visual system's anticipation of the position of a moving target during the non updated refresh frame, and thus perception of the target in a different position results in doubling or even tripling of a perceived line target. This artifact is also associated with reduced perceived contrast of the tracked moving target. Since the reading of text with a hand held camera results in such tracking, the effect is present in the BE and may cause a reduction in the possible reading rate.

The update rate can be increased with the use of a 50 Hz (PAL) camera and additional circuitry. The additional memory permits the system to load one field into one memory while the PEDC is acquiring data from the other. The complete system in effect operates as a dual-ported memory. In this scheme the system is limited only by the update rate of the PEDC. As a result of the better match between the 50 Hz refresh rate and the 50 Hz rate of the camera, every other field can be captured and updated resulting in an increased update rate (up to 25/sec). This improvement was implemented in a prototype now being tested in my laboratory. The new system does reduce the movement artifacts and increases the perceived contrast as expected. It is still a slower update rate than would be required for optimal use by normal observers. The benefit of this change for low vision readers has not yet been determined.
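The update rates of the two generations, and the alternating two-memory scheme, can be sketched as below. This is a software illustration under stated assumptions, not the actual circuit: the class and function names are hypothetical, and the "one field in four" figure follows from showing one of every two odd fields of a 60 fields/sec signal.

```python
def update_rate_hz(fields_per_sec, use_one_field_in):
    """Display update rate when only one camera field in N is shown."""
    return fields_per_sec / use_one_field_in


# First generation: RS-170, 60 fields/sec, one of every two odd fields (1 in 4).
NTSC_UPDATE_HZ = update_rate_hz(60, 4)
# Second generation: PAL camera, 50 fields/sec, every other field (1 in 2).
PAL_UPDATE_HZ = update_rate_hz(50, 2)


class PingPongMemory:
    """Two memory banks: the camera fills one while the display reads the other."""

    def __init__(self):
        self._banks = [None, None]
        self._write_index = 0

    def store_field(self, field):
        """Camera side: write a field, then switch to the other bank."""
        self._banks[self._write_index] = field
        self._write_index ^= 1

    def read_field(self):
        """Display side: return the most recently completed field, if any."""
        return self._banks[self._write_index ^ 1]


if __name__ == "__main__":
    print(NTSC_UPDATE_HZ, PAL_UPDATE_HZ)
    memory = PingPongMemory()
    memory.store_field("field A")
    print(memory.read_field())
```

The point of the two banks is that the reader never sees a half-written field, so the only remaining limit is how fast the display controller can fetch data.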


Head-Mounted Low Vision Mobility Aid

With the development of the necessary technologies, the combination of a head mounted display with image enhancement has been proposed as an improvement to current low vision telescopes. Massof and Rickman [Note 17] have developed a display device specifically designed for this use, and Peli [Notes 18, 19] suggested the possible combination of a PE display with image enhancement. These two systems represent two different approaches to the problem. The first approach is a full VR design using a binocular, wide-field system with two cameras to provide disparate images to both eyes [Note 17]. The images presented to each eye can be magnified and/or enhanced. Patients with binocular vision are expected to maintain their stereoscopic perception. The patient will see only the displayed images, and natural viewing of the environment will be blocked. This design suffers from a number of shortcomings as discussed below.


Limitations of the Full VR Design

A head-mounted, binocular, unit magnification system is similar in design to the night-vision goggles used by the military. Such goggles are being used despite difficulties caused by the small field and distortion of depth perception at short distances, which result in poor perception of spatial layout [Note 8]. The distortions of depth perception are the result of the displacement of the imaging objective lens from the eye's pupil. This effect is prominent only at short distances. Near objects are perceived by the observer to be closer than they really are. In addition, observers' head movements result in perceivable movement of the image leading to loss of visual stability. A special optical design, which can reduce or eliminate these problems for the unit magnification device, has been developed and tested [Note 8].

Optical or electronic magnification is likely to be required in addition to image enhancement in a portable visual aid. Magnification will greatly increase the usefulness of the enhanced images to the patient. At the same time, magnification will complicate the use of a binocular virtual environment device. The binocular disparity will no longer be valuable, and head motion will result in amplified image motion. It is known that image motion of the magnitude anticipated will greatly reduce visual acuity and may limit the display's usefulness. Many low vision patients successfully use optical magnification in the form of spectacle-mounted telescopes. However, these telescopes are almost always bioptic, mounted above the line of sight. Bioptic telescopes are typically used about 10% of the time, primarily to spot objects of interest [Note 10]. Demer et al. [Note 5] demonstrated that dynamic visual acuity is reduced if a 4.0X telescope is used centrally with the peripheral field occluded; however, with peripheral view unobstructed dynamic acuity was significantly better. The design of a virtual environment aid calls for a wide visual field. Current technology enables fields of about 50° for each eye. With magnification of 4.0X, the display will provide an effective field of only 12.5°, similar to Keplerian design bioptic telescopes. Therefore, similar difficulties with a centrally mounted, constantly used device may be anticipated. For these reasons the system designed by Massof and Rickman [Note 17] is switched to a single central camera when magnification is used.
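The effective-field figure follows from dividing the display field by the magnification. A one-line check, with a hypothetical function name and the field taken to be in degrees:

```python
def effective_field_deg(display_field_deg, magnification):
    """Angular extent of the scene that fits in a display of the given field."""
    return display_field_deg / magnification


if __name__ == "__main__":
    print(effective_field_deg(50, 4.0))  # 12.5
```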

Head rotation while wearing telescopic spectacles with the peripheral view occluded was found to be a potent stimulus for motion discomfort [Note 4]. Although discomfort was reduced with adaptation, individual susceptibility to motion sickness may limit the use of full-field magnified devices for some visually impaired patients. The need for two displays, two to three cameras, and additional electronics is likely to increase the cost of such systems beyond the ability of most low vision persons. Even the $500-$1,000 cost of optical bioptic telescopes is too high for many elderly patients. Appearance, field of view, weight, and cost of the visual aid were identified as the most important factors in the utilization of low-vision telescopic aids [Note 7].


The Bioptic Design

As an alternative to the binocular virtual environment aid, I proposed an image enhancement aid implemented as a monocular bioptic device [Note 18]. In my design the HMD is placed above or below the line of sight to be used occasionally in the same way as the bioptic telescope. This device can combine the benefits of magnification with image enhancement without the psychological and functional drawbacks of the limited field virtual environment device. The cost of this implementation can be reduced substantially because only one display is required. The display itself can be of a smaller field than the one required in the virtual environment since the patient maintains his or her natural view of the environment. A larger field is required for safe navigation than is required for periodic investigation of objects of interest in a bioptic mode. A smaller field display device can be implemented in a smaller, lighter, and cosmetically more acceptable aid.

To test this concept I have adapted the BE to presentation of live video. With additional circuitry and controls, the display can be connected either to a VCR or to a head-mounted video camera, thereby allowing us to test the applicability of the device as a mobility aid. However, because the video signal is processed by a single threshold, even with optimal contrast and luminance adjustment, the resulting image is frequently of poor quality. If large portions of the image are dark or light, all of the details in these segments will be missing from the binary image. A more detailed binary image can be obtained by applying a bandpass filter prior to thresholding with a fixed threshold set at mid video range. The resulting image is fairly similar to the binary images obtained using adaptive thresholding [Note 22], which were shown to be beneficial for the visually impaired. The application of the adaptive enhancement algorithm by the DigiVision (TM) enables such live 2-D processing. In addition, control of the background luminance maintains the average luminance relations at low frequencies. Such preprocessing of the video before the binary image is created permits normal observers to watch a movie without the loss of important details (Fig. 2). We have demonstrated the effectiveness of this adaptation, and will present a video tape illustrating the differences in image quality with and without enhancement. The value of this type of enhanced image to the visually impaired has not yet been evaluated.
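The loss of detail under a single threshold, and its recovery when the image is filtered first, can be illustrated with a toy example. The sketch below is not the DigiVision (TM) algorithm: it substitutes a simple local-mean subtraction (a crude high-pass filter) for the bandpass and adaptive enhancement stages, and all function names and pixel values are assumptions chosen for illustration.

```python
def global_threshold(image, level):
    """Binarize with one fixed threshold applied to the raw grey levels."""
    return [[1 if p > level else 0 for p in row] for row in image]


def local_mean(image, radius):
    """Mean grey level in a (2*radius+1)-square window, clipped at the borders."""
    height, width = len(image), len(image[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [image[j][i]
                      for j in range(max(0, y - radius), min(height, y + radius + 1))
                      for i in range(max(0, x - radius), min(width, x + radius + 1))]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


def highpass_then_threshold(image, radius, mid=128):
    """Remove the local mean, re-centre on mid grey, then threshold at mid grey."""
    mean = local_mean(image, radius)
    return [[1 if (p - m + mid) > mid else 0 for p, m in zip(row, mrow)]
            for row, mrow in zip(image, mean)]


if __name__ == "__main__":
    # Left half dark, right half light, each carrying faint vertical stripes.
    image = [[20, 40, 20, 40, 220, 240, 220, 240] for _ in range(3)]
    print(global_threshold(image, 128)[1])      # stripes vanish in both halves
    print(highpass_then_threshold(image, 1)[1])  # stripes survive in both halves
```

In the toy image the single threshold turns the dark half uniformly off and the light half uniformly on, whereas the filtered version keeps a pattern of on and off pixels in each half, which is the behaviour described above.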

Two-dimensional processing, as performed by the DigiVision (TM), is necessary for the processing of images taken from a video tape or broadcast or from a static surveillance camera. However, such processing is necessarily more expensive and consumes more battery power than the simple one dimensional processing applied to the video signal row by row. In the case of video tape imagery, such processing may be insufficient because it processes the image only across vertical features. Thus, image features such as edges that happen to be horizontal are not processed and will not be represented in the final binary image. Such horizontal features are abundant in the environment and are needed for proper perception. In the case of the head mounted camera, however, the user may change the camera orientation very easily using slight head tilts. With such a head tilt, horizontal features become diagonal and are therefore processed, thus becoming visible in the thresholded binary image. We have implemented such analog processing using basic video amplifiers and filters (Fig. 3). The system is now operational in a battery powered belt pack connected to a head mounted camera and a modified BE. The system clearly demonstrates the validity of the one-dimensional processing concept and is now ready for testing with low vision patients.
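The behaviour of row-by-row processing on horizontal and tilted edges can be demonstrated with a small digital stand-in for the analog circuit. This is a sketch under stated assumptions: a moving average plays the role of the low-pass filter, the gain mirrors the contrast control, and the function names and pixel values are hypothetical.

```python
def lowpass_row(row, radius):
    """Moving average along one video line, clipped at the line ends."""
    n = len(row)
    out = []
    for x in range(n):
        window = row[max(0, x - radius):min(n, x + radius + 1)]
        out.append(sum(window) / len(window))
    return out


def process_row(row, radius=1, gain=4.0, mid=128):
    """High-pass one line (raw minus low-pass), amplify, threshold at mid range.

    With the threshold exactly at mid range the gain does not change the
    binary result; it is kept only to mirror the contrast control.
    """
    low = lowpass_row(row, radius)
    return [1 if gain * (p - l) + mid > mid else 0 for p, l in zip(row, low)]


def process_image(image):
    """Apply the 1-D processing independently to every line."""
    return [process_row(row) for row in image]


if __name__ == "__main__":
    # A horizontal edge: every line is uniform, so nothing is detected.
    horizontal = [[50] * 6 for _ in range(3)] + [[200] * 6 for _ in range(3)]
    # The same edge after a tilt: each line now contains a step.
    diagonal = [[200 if x > y else 50 for x in range(6)] for y in range(6)]
    for row in process_image(horizontal):
        print(row)
    print()
    for row in process_image(diagonal):
        print(row)
```

In the first image every output line is blank because no line contains any change in level; in the tilted image each line marks the bright side of the step, so the edge reappears in the binary output.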

Fig. 2. Examples of movie images seen in the modified display without (left) and with (right) prior enhancement with the DigiVision (TM) device.

Fig. 3. Block diagram of the one-dimensional, enhanced head mounted system. The video signal from the camera is separated from the synchronization signals. The raw video is then low-pass filtered and subtracted from the attenuated raw video to obtain a high-pass filtered version. This is amplified by the contrast control and then recombined with the synchronization signals.


Conclusions

We have shown that image enhancement provides significant improvements in the perception of still scenes and moving images displayed on a television monitor. In addition, we have shown that this same technology can increase the reading rate of low vision patients. If image enhancement is effective on a stationary video monitor, its benefits can be extended to a mobility aid using novel VR technology associated devices. The implementation of such devices both in a full VR design [Note 17] and the bioptic design I propose is almost complete. Initial results from patient testing should be forthcoming, indicating whether either of these approaches is useful as a mobility aid for the visually impaired.


Acknowledgments

Supported in part by grant #EY05957 from the National Institutes of Health, the Ford Motor Company Fund, the Massachusetts Lions Eye Research Fund, Inc., Optelec Inc., and DigiVision, Inc. I would also like to thank Frank Rogers and Magda Butnaro for technical contributions to this project. Special thanks to Elisabeth Fine for valuable contributions to the research and the preparation of this manuscript.


References



Reprinted with author(s) permission. Author(s) retain copyright.