Preprints & Reprints


Abstract
In this paper we describe the background to, and our interest in, three dimensional (3-D) imaging systems for improved video-telecommunications services. There have been two main thrusts: for video-telephony and for improving response in tele-operations or remote working situations where depth information is vital. We have built prototype demonstrations that offer enhanced presence and are comfortable to view. These demonstrations have been well received. These systems have become feasible because of the recent rapid improvements in display technologies. Further improvements in 3-D image quality can be anticipated as the 2-D display technologies continue to develop and as the benefit of these applications is transformed to user demand. We anticipate a growing requirement to provide the bandwidth for a range of 3-D imaging applications operating over telecommunications networks.

Keywords
Telepresence, autostereoscopic, telecommunications, video-telephony

Introduction
With few exceptions, present day video communication systems only deliver two dimensional images to users. However, the future unquestionably lies with the delivery of 3-D imagery. After all, we conduct our lives with the benefit of 3-D vision and we could extend this to the field of electronic communications. We have, therefore, developed a number of 3-D systems with the objective of delivering more realistic and natural video communications services. In our studies we have identified a number of options that go some way to delivering these requirements, including spectacles free 3-D imaging and headset based systems.

Unfortunately, all spectacles based 3-D systems require the wearing of dark or coloured spectacles, which dramatically reduce eye contact. Dark spectacles can even make the wearer look doubtful or sinister. Since eye contact with the person at the remote location is one of the key advantages offered by video-telephony and video-conferencing, spectacles based 3-D imaging approaches are not appropriate for these applications. We have developed a prototype spectacles free system which demonstrates the impact of 3-D in a video-telephone application. This demonstration has been assembled using commercially available components, with a liquid crystal display and a lensed sheet (a lenticular sheet) overlay. The views from two remote cameras, equivalent to a left and a right eye view, are displayed.

Spectacles based systems are ideal for applications where eye contact is not necessary. We have developed the concept of a remote 'surrogate head' linked back to a 3-D system, such as a spectacles based display or an immersive headset. The surrogate head has two cameras which collect views of the remote scene. The views are then transmitted over the telecommunications network to the user. This concept could be particularly useful in remote working applications where the visual image, coupled with the depth information, is important.

3-D video-telephony.
The main aim of this work has been to produce a demonstration system for evaluation, so that a judgement could be made about the potential benefit of 3-D for video-telephony. At present most (2-D) video-telephony systems operate over the ISDN (Integrated Services Digital Network) and use coding schemes compatible with the CCITT standard H.261. These use high levels of compression and operate at data rates in multiples of 64kbit/s. Since there is a great deal of similarity between the adjacent views in a 3-D, stereoscopic system, there is the potential for the transmission of a second channel at only a small bandwidth overhead (perhaps 10% of the first channel). As long as the two-view system can be viewed comfortably, and as long as the effect enhances the observed image quality, 3-D has a role in video-telephony.
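As a rough illustration of the bandwidth argument above, the following sketch compares a stereo pair coded with a small overhead for the second view against two independently coded channels. The 64kbit/s unit and the 10% figure come from the text; the function names and the choice of a 128kbit/s base channel are assumptions made only for the example.

```python
import math

ISDN_UNIT_KBITS = 64.0  # H.261 rates are multiples of 64 kbit/s (from the text)


def stereo_rate_kbits(base_units: int, overhead_fraction: float = 0.10) -> float:
    """Total rate when the second view costs a fraction of the first channel."""
    if base_units < 1:
        raise ValueError("base_units must be at least 1")
    if not 0.0 <= overhead_fraction <= 1.0:
        raise ValueError("overhead_fraction must lie in [0, 1]")
    base_rate = base_units * ISDN_UNIT_KBITS
    return base_rate * (1.0 + overhead_fraction)


def units_needed(rate_kbits: float) -> int:
    """Number of whole 64 kbit/s units needed to carry the given rate."""
    return math.ceil(rate_kbits / ISDN_UNIT_KBITS)


if __name__ == "__main__":
    base_units = 2  # hypothetical 128 kbit/s 2-D call
    joint = stereo_rate_kbits(base_units)          # second view at 10% overhead
    independent = stereo_rate_kbits(base_units, 1.0)  # second view coded separately
    print(f"joint coding:       {joint:.1f} kbit/s, {units_needed(joint)} units")
    print(f"independent coding: {independent:.1f} kbit/s, {units_needed(independent)} units")
```

Under these assumptions the jointly coded pair needs one extra 64kbit/s unit rather than a doubling of the capacity; the actual overhead would depend on the coding scheme used for the second view.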

Previous work.
Earlier publications have described the design of the 3-D video-telephony system that we have built. The key steps in the development of this system have been as follows:

The availability of flat panel displays with a suitable pixel arrangement.
Earlier work [1,2] indicated the viability of this approach. Figure 1 illustrates the optical arrangement. Initially, the components (lenticular sheet, LCD panel, and electronics) were assembled and the system demonstrated with a viewing distance of 600mm. Previously we outlined the main flat panel display considerations [3]; e.g. care must be taken to ensure that the colour variation in the sub-pixels (R, G & B) making up each individual pixel is vertical, rather than horizontal. Also, for this panel the non-transmissive areas of each pixel are relatively large, making up one third of the width of a pixel, and if imaged exactly to the viewing plane they result in dark stripes being observed as the user moves his head.


Figure 1. LCD 3-D imaging system

The choice of view determining screen for prototyping.
Lenticular sheets or parallax barriers were initially considered for use in this demonstration system. We first defined the key requirements, including the need for a pitch mismatch between the flat panel display and the 'view determining screen' for good image replay [4]. For prototyping purposes parallax barriers were easier to produce but suffer the disadvantage of reduced light throughput. We found that optimised parallax barriers and commercial lenticular sheet had a similar optical performance, other than in light throughput. We also developed good test and measurement capabilities for displays and view determining screens.
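The pitch mismatch mentioned above can be illustrated with simple similar-triangle geometry. The sketch below assumes a thin view determining screen in air at a gap g in front of the pixel plane, n views per lens (or slit), and a viewer at distance D from the screen; it ignores the refractive index and thickness of a real lenticular sheet. The 600mm viewing distance is taken from the text, while the pixel pitch and the 65mm view separation are hypothetical values chosen for the example.

```python
def gap_mm(pixel_pitch_mm: float, viewing_distance_mm: float,
           view_separation_mm: float) -> float:
    """Gap between pixel plane and screen so adjacent pixels map to adjacent views.

    Similar triangles through one lens centre: pixel_pitch / gap = separation / distance.
    """
    return pixel_pitch_mm * viewing_distance_mm / view_separation_mm


def screen_pitch_mm(n_views: int, pixel_pitch_mm: float,
                    viewing_distance_mm: float, gap: float) -> float:
    """Pitch of the view determining screen, slightly below n_views * pixel_pitch.

    Projecting the screen from the viewing point onto the pixel plane magnifies
    it by (D + g) / D, and the projection must match one group of n pixels.
    """
    return n_views * pixel_pitch_mm * viewing_distance_mm / (viewing_distance_mm + gap)


if __name__ == "__main__":
    D = 600.0   # viewing distance in mm (from the text)
    p = 0.3     # hypothetical horizontal pixel pitch in mm
    e = 65.0    # hypothetical separation of the two views at the viewer, in mm
    g = gap_mm(p, D, e)
    P = screen_pitch_mm(2, p, D, g)
    print(f"gap = {g:.3f} mm, screen pitch = {P:.5f} mm (two pixels = {2 * p:.5f} mm)")
```

The point of the sketch is only that the screen pitch is slightly smaller than the width of one pixel group, so that the views from all parts of the display converge at the intended viewing distance.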

For the video-telephone application, our initial design of lenticular sheet built in an element of defocus on each lens so that the dark stripes would be blurred at the viewing plane. The resulting pattern has been defocused as required but, unfortunately, a darkening is still observable as the observer moves his head [6,7].

  1. The need for a good understanding of the correct set up for cameras and displays so that realistic 3-D scenes are observed - so called 'scene modelling' [5]. This allows comfortable viewing.
  2. Preliminary results with early LCD panels with a limited colour palette (3 bits per colour) show good 3-D. The effect is marred by colour contouring - on skin tones this is particularly distracting making the scenes difficult to view.
  3. Improved panels with more colours have given a much improved performance. Good 3-D and comfortable viewing resulted when panels with 200,000 colours were used. The 3-D effect is observable over a limited viewing zone - despite a significant in-built defocus on the lens elements within the lenticular sheet. This limitation is shown to be due to the structure of the LCD panel and is not due to problems associated with the lenticular sheet. With better flat panels, with less dead space, a wider viewing zone would result.
  4. Preliminary tests with independent codecs on each channel, operating over the ISDN, show good 3-D - even at low data rates. Early concerns that the independent coding of the two channels would introduce artefacts, such as blocks floating in space, were shown to be unfounded. However, to avoid uncomfortable visual effects, care must be taken to ensure the two views remain synchronised. This becomes critical at low data rates.

Recent developments.
The performance of the 3-D video-telephone prototype has been considerably improved. The aim of this activity has been to enhance the visual appearance of the system and to demonstrate the quality that might be expected with improved LCD display technologies. As we have indicated previously, the major limitation with available LCDs is the restricted viewing zone arising from the dead space between the active parts of the pixels. In addition, if the user moves further laterally he is presented with reversed (pseudoscopic) views. At this position the left eye sees a right eye view and vice versa. This is unnatural and can be uncomfortable to view.

The long term solution for both of these issues is to use better displays and intelligent camera systems to monitor the user's position. In the short term we have used a head tracking system that monitors the user's position and provides feedback to move the views and to ensure that the correct views are displayed. This allows the user to move more freely in front of the video-telephone and gives an enhanced visual effect. For our first demonstration we have used an infra-red headtracking system to locate the observer's head position laterally with respect to the screen.

Initially we thought that the best way to overcome the 'limited viewing zone' problem was to move the lenticular sheet with respect to the LCD panel. At most, for the panel we are using, this movement would be of the order of +/- 2mm and this could be achieved relatively easily with piezo-electric transducers or miniature stepper motors. The position of the two views would then change to follow the user. After some preliminary tests we found that a better way to move the views was to rotate the complete LCD and lenticular sheet assembly using a stepper motor so that the correct views are always shown.


Figure 2. Display movement arrangements.
a) Moving the lenticular sheet b) Rotating the whole panel

We have found that users seated in front of a video-telephone do not normally move more than 250mm either side of a central position, or faster than 1.35 metres per second. This corresponds to a 0.384 radian rotation of the panel, at an angular velocity of 0.002 radians per second and is well within the limits of a standard stepper motor. As it is likely that users would try to find the limits of the system, some filtering of head movements is required, so that the motor cannot be overdriven.
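The filtering of head movements described above can be sketched as a clamp on the tracked position followed by a limit on how far the commanded angle may change per update. The 600mm viewing distance and the 250mm lateral limit are taken from the text; the function names, the arctangent mapping from head position to panel angle, and the maximum rate passed to the limiter are assumptions for illustration, not the filter used in the prototype.

```python
import math

VIEWING_DISTANCE_MM = 600.0  # from the text
MAX_OFFSET_MM = 250.0        # from the text


def panel_angle_rad(head_x_mm: float,
                    distance_mm: float = VIEWING_DISTANCE_MM) -> float:
    """Panel rotation that points the views at a head offset laterally by head_x_mm.

    Positions outside the expected range are clamped so that a user probing
    the limits of the system cannot command an excessive rotation.
    """
    x = max(-MAX_OFFSET_MM, min(MAX_OFFSET_MM, head_x_mm))
    return math.atan2(x, distance_mm)


def limit_step(current_rad: float, target_rad: float,
               max_rate_rad_s: float, dt_s: float) -> float:
    """Move towards the target angle without exceeding max_rate_rad_s."""
    max_step = max_rate_rad_s * dt_s
    delta = target_rad - current_rad
    delta = max(-max_step, min(max_step, delta))
    return current_rad + delta


if __name__ == "__main__":
    angle = 0.0
    max_rate = 1.0  # hypothetical motor limit in rad/s
    dt = 0.02       # hypothetical 50 Hz tracker update
    for head_x in (0.0, 100.0, 400.0, 400.0, -50.0):
        target = panel_angle_rad(head_x)
        angle = limit_step(angle, target, max_rate, dt)
        print(f"head {head_x:7.1f} mm -> target {target:+.3f} rad, commanded {angle:+.3f} rad")
```

With this simple geometry the 250mm limit corresponds to a rotation of a little under 0.4 radians, of the same order as the figure quoted above.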

A commercially available infrared headtracker which connects to the serial port of a computer is used to locate the head position of the user. Software interprets this signal and controls the stepper motor: this is driven from the parallel port. Initial tests of the system revealed that a greater torque was needed from the motor for it to move the panel in the required manner. Consequently, a 5:1 ratio gearbox was fitted. Unfortunately, as further experimentation showed, this meant that the motor was now too slow to track fast head movements. This is not a problem in normal operation and will be improved by fitting a better motor in the near future.
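The trade-off introduced by the gearbox can be made concrete with a short calculation. Only the 5:1 ratio comes from the text; the 1.8 degree step angle and the maximum step rate are hypothetical values typical of small stepper motors, and the function names are chosen for the example.

```python
import math


def motor_steps_for_panel_angle(panel_angle_rad: float,
                                step_angle_deg: float = 1.8,
                                gear_ratio: float = 5.0) -> int:
    """Whole motor steps needed to rotate the panel by panel_angle_rad.

    The motor shaft turns gear_ratio times further than the panel.
    """
    motor_angle_deg = math.degrees(panel_angle_rad) * gear_ratio
    return round(motor_angle_deg / step_angle_deg)


def max_panel_rate_rad_s(max_step_rate_hz: float,
                         step_angle_deg: float = 1.8,
                         gear_ratio: float = 5.0) -> float:
    """Fastest panel rotation available at a given maximum motor step rate."""
    return math.radians(step_angle_deg) * max_step_rate_hz / gear_ratio


if __name__ == "__main__":
    step_rate = 400.0  # hypothetical maximum step rate in steps per second
    geared = max_panel_rate_rad_s(step_rate)
    direct = max_panel_rate_rad_s(step_rate, gear_ratio=1.0)
    print(f"direct drive: {direct:.2f} rad/s, with 5:1 gearbox: {geared:.2f} rad/s")
    print(f"steps for 0.1 rad of panel rotation: {motor_steps_for_panel_angle(0.1)}")
```

The gearbox increases the available torque and the angular resolution at the panel by the gear ratio, but reduces the maximum panel speed by the same factor, which is consistent with the slower tracking observed after it was fitted.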

Future work.


The primary requirement from display manufacturers for 3-D video-telephony applications is to produce improved flat panel displays with less dead space between the active parts of the pixels. A number of recent developments in LCD technology would appear to resolve this issue by using an alternative addressing scheme. We have had preliminary discussions with a number of suppliers and would be pleased to broaden our contacts and discussions further. The second requirement is to produce flat displays with more pixels.

In the short term we would like to improve the performance of our system by using an image processing based head tracking approach that offers less user intrusion and greater flexibility. In the longer term we expect that, with such a tracking capability and with better flat panel displays, we will only need to flip adjacent views to give the user the correct, orthoscopic view at all times. More sophisticated image processing techniques could then be used to produce more realistic images by increasing the number of views available to the user; a number of intermediate views between those from the source cameras could be generated (e.g. [8]). These could be displayed as the user moves his head.
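The idea of flipping adjacent views can be sketched as follows. The model assumes an idealised two-view display whose left and right viewing zones alternate across the viewing plane with a width equal to the eye separation, and with no dead space between them; the 65mm figure, the zone convention and the function name are assumptions for illustration only.

```python
import math


def views_need_swapping(head_x_mm: float, zone_width_mm: float = 65.0) -> bool:
    """Return True when the tracked head position lies in a pseudoscopic zone.

    Convention assumed here: with the head centred (x = 0) the left eye sits in
    the left-view zone [-w, 0) and the right eye in the right-view zone [0, w).
    Zones alternate left/right with period 2w across the viewing plane.
    """
    left_eye_x = head_x_mm - zone_width_mm / 2.0
    zone_index = math.floor(left_eye_x / zone_width_mm)
    # Odd zone indices (for example -1) carry the left view in this convention,
    # so an even index means the left eye is looking at a right-eye view.
    return zone_index % 2 == 0


if __name__ == "__main__":
    for x in (0.0, 30.0, 65.0, 130.0, -65.0):
        state = "swap views" if views_need_swapping(x) else "views correct"
        print(f"head at {x:+6.1f} mm: {state}")
```

In practice the decision would also need some hysteresis near the zone boundaries, where an eye straddles two zones, to avoid rapid flipping.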

As has been reported by a number of workers, image quality in both high and low resolution images is enhanced significantly with the addition of depth information. If the second and subsequent views can be efficiently coded and image quality is much enhanced compared with 2-D, then there is a real prospect for 3-D at low data rates for video-telephony.

Surrogate head system.
With the rapid improvements in resolution, size and performance of CCD cameras, coupled with reductions in cost, the transmission of video signals to allow remote operations and technical support over vast distances has now become a reality [9]. These first systems are generally 2-D and are based around a single camera and a lightweight headset linked to the ISDN. Further enhancements to these systems will come with smaller and more lightweight equipment and with increased sophistication. Also, for a more realistic experience, twin camera sensors will provide a stereoscopic view.

There are two scenarios for developments to this system.

The first is for a field support system in which a distant operator can lend support to a technician performing a complex task. Twin cameras, perhaps mounted on the field technician's head, relay images back over the ISDN to a control room where the scene is viewed using a spectacles based stereoscopic display system. Niche market applications already exist for such systems and these have been shown to operate over the ISDN [6]. The main applications appear to be in 'remote handling', in the medical imaging field or for tele-support systems such as CAMNET [9]. These are generally not enhanced presence video-telephony or video-conferencing systems, because of the loss of eye contact. However, they do provide depth information to a distant operator and offer valuable extra cues for difficult handling or conceptual tasks.

The second scenario is for a system which requires the user to wear an immersive display linked to two remote cameras. The idea is to provide a stereoscopic pair of views to the observer which depend on the direction in which he is looking. This is determined by an independent headtracking system. In first experiments, windows on the field of view of two fixed cameras were moved in response to the observer's head motion. This allows the user to look around a remote scene - even though the remote cameras remain fixed. Electronic zoom, pan and tilt are all possible under user control. The movement of a motorised camera platform following the motion of the user's head (and eyes) can increase the sense of presence further. Initial work in this area has shown that high resolution immersive displays are required before a good sense of presence can be established. With improvements in the enabling technologies and in the quality of the long distance links there are good prospects for providing real, remote immersive experiences.
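The moving-window approach described above can be sketched as a mapping from head orientation to a crop rectangle within each fixed camera frame. The frame size, window size, camera field of view and the linear angle-to-pixel mapping are all hypothetical choices for the example; the same window would be applied to both the left and right camera images.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    left: int
    top: int
    width: int
    height: int


def window_for_head_pose(yaw_deg: float, pitch_deg: float,
                         frame_w: int, frame_h: int,
                         win_w: int, win_h: int,
                         hfov_deg: float, vfov_deg: float) -> Window:
    """Crop window whose centre follows the head yaw (pan) and pitch (tilt).

    Angles are mapped linearly to pixels, which is a small-angle approximation.
    The window is clamped so that it never leaves the camera frame.
    """
    px_per_deg_x = frame_w / hfov_deg
    px_per_deg_y = frame_h / vfov_deg
    centre_x = frame_w / 2.0 + yaw_deg * px_per_deg_x
    centre_y = frame_h / 2.0 - pitch_deg * px_per_deg_y  # looking up moves the window up
    left = round(centre_x - win_w / 2.0)
    top = round(centre_y - win_h / 2.0)
    left = max(0, min(frame_w - win_w, left))
    top = max(0, min(frame_h - win_h, top))
    return Window(left, top, win_w, win_h)


if __name__ == "__main__":
    # Hypothetical 720x576 camera frame with a 360x288 window and a 60x45 degree field of view.
    for yaw, pitch in ((0.0, 0.0), (10.0, 0.0), (90.0, 0.0), (0.0, 10.0)):
        print(yaw, pitch, window_for_head_pose(yaw, pitch, 720, 576, 360, 288, 60.0, 45.0))
```

Electronic zoom could be added in the same way by varying the window size; the clamping shows why the range of pan and tilt is limited by the field of view of the fixed cameras.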

Planned enhancements of the system.
Another requirement for the system is the inclusion of high quality sound. Initially we have used a stereo sound system. This allows sounds on either side of the user to be located. We hope to enhance the realism of this system using a digital signal processor system that can position sound sources within a 3-D spatial domain depending on user orientation.
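As a much simplified stand-in for the planned signal processing, the sketch below pans a sound source between the two stereo channels according to its direction relative to the user's head, using the standard constant-power pan law. It is an assumption-laden illustration only: it handles azimuth alone, does not distinguish front from back, and the function name and the 90 degree full-pan angle are hypothetical.

```python
import math


def stereo_gains(source_azimuth_deg: float, head_yaw_deg: float) -> tuple:
    """Left and right channel gains for a source, given the head orientation.

    Azimuths are measured clockwise from straight ahead, so a positive
    relative angle means the source is to the user's right.
    """
    # Wrap the relative angle into [-180, 180).
    relative = (source_azimuth_deg - head_yaw_deg + 180.0) % 360.0 - 180.0
    # Full pan is reached at 90 degrees to either side.
    pan = max(-1.0, min(1.0, relative / 90.0))
    theta = (pan + 1.0) * math.pi / 4.0  # 0 = hard left, pi/2 = hard right
    return math.cos(theta), math.sin(theta)


if __name__ == "__main__":
    for yaw in (0.0, 45.0, 90.0):
        left, right = stereo_gains(90.0, yaw)
        print(f"head yaw {yaw:5.1f} deg: left gain {left:.3f}, right gain {right:.3f}")
```

As the user turns towards the source the gains converge, so the sound appears to stay fixed in the remote scene rather than moving with the head.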

Additionally, collaborative remote experiments with the University of Strathclyde, in Glasgow, who have developed an 'anthropomorphic head', are planned. This head has the same number of degrees of freedom as the human head and can be used in place of fixed cameras. The intention is to interact with the 'anthropomorphic head' across the network - over a distance of some 700km.

This work is described more fully by Fryer et al [10]. Further applications will be described at the conference.

Conclusions
For video-telephony, where a single viewer is reasonably constrained in position by the field of view of the monitoring camera(s), we have shown that good quality, spectacles free, 3-D imaging is achievable now. Our prototype system has been put together using current LCD and lenticular sheet technology and has been coupled to a simple head tracking system. Simpler, cheaper and better quality 3-D displays will come with improvements in flat panel display technology. In addition, with smarter camera systems monitoring the position of the user, unobtrusive head tracking will become viable - enhancing the 3-D experience.

Since there is a great deal of similarity between adjacent views in a stereoscopic system, the prospects of coding the signals efficiently for transmission over a telecommunications network seem good. This could significantly enhance the quality of a video-telephony interaction at only a small bandwidth overhead.

Prospects for immersive or screen based stereoscopic display systems operating over long distances using the ISDN are improving year on year as the enabling technologies improve in quality and drop in price.

Acknowledgements.
The authors are grateful for the support of A Miles, K. Fisher and N. Harvey. Also, we would like to acknowledge the expertise of Dr J. H. A Schmitz (Philips Optics, Eindhoven) in the provision of lenticular sheet to our design.

References
1. Ichinose S; Tetsutani N; Ishibashi M: 'Full stereoscopic video pickup and display technique without special glasses.' Proc SID Vol 30, No 4, p319 (1989).

2. Burner R: 'Autostereoscopic lenticular systems.' Proc IEE Colloquium on Stereoscopic Television. London. October 1992.

3. Sheat D E; Chamberlin G R; Gentry P; Leggatt J S; McCartney D J: '3-D Imaging Systems for Telecommunications Applications.' Proc SPIE Vol 1669, p186. Electronic Imaging Systems and Applications Meeting, San Jose, USA. 1992.

4. McCartney D J; Chamberlin G R; Leggatt J S; Sheat D E; Seal C H: '3-D Electronic still imaging.' 1st Int Festival of 3-D Images. Paris. September 1991. p296-302.

5. McCartney D J; Sheat D E; Chamberlin G R; Seal C H: 'Auto-stereoscopic 3-D imaging systems for telecommunications applications.' IEE Meeting on Stereoscopic Television. London. October 1992.

6. McCartney D J; Chamberlin G R; Sheat D E; Gentry P: 'Telecommunications Opportunities For 3-D Imaging Systems.' Proceedings of the 4th European Workshop on 3-D Television. Rome. October 20-21st 1993. p65-69.

7. Chamberlin G R; Sheat D E; McCartney D J: '3D imaging for video-telephony.' Proc First International Symposium on Three Dimensional Image Communication Technologies. Tokyo. December 1993.

8. Liu J; Skerjanc R: 'Construction of intermediate pictures for a multiview 3-D system.' Stereoscopic Displays and Applications. SPIE Vol 1669 (1992). pp 10-19.

9. P.Cochrane, D Heatley, K Fisher, K Cameron and R Taylor-Hendry: 'CAMNET, the First Telepresence System', Interlink 2000 Towards the Future of Communications, pp 38-41, August 1992.

10. Fryer R et al: 'An experiment in three dimensional telepresence.' Proc Conf on Three dimensional Imaging. The Royal Society of Arts, London. February 2nd & 3rd 1995.