NEW SENSORS FOR CULTURAL HERITAGE METRIC SURVEY: THE ToF CAMERAS

ToF cameras are new instruments based on CCD/CMOS sensors which measure distances instead of radiometry. The resulting point clouds show the same properties (both in terms of accuracy and resolution) as the point clouds acquired by means of traditional LiDAR devices. ToF cameras are cheap instruments (less than 10,000 €) based on real-time, video-rate distance measurements and can represent an interesting alternative to the more expensive LiDAR instruments. In addition, the limited weight and dimensions of ToF cameras reduce some practical problems such as transportation and on-site management. Most of the commercial ToF cameras use the phase-shift method to measure distances. Due to the use of only one modulation wavelength, most of them have a limited range of application (usually about 5 or 10 m). After a brief description of the main characteristics of these instruments, this paper presents and discusses the results of the first experimental applications of ToF cameras in Cultural Heritage 3D metric survey. The possibility of acquiring more than 30 frames/s, together with future developments of these devices such as the use of more than one wavelength to overcome the ambiguity problem, suggests new interesting applications.


INTRODUCTION
The 3D information of an object to be surveyed can basically be acquired in two ways: by using stereo image acquisitions or optical distance measurement techniques. Stereo image acquisition has been known and used for decades in the research community. Its advantage over other range measuring devices such as LiDAR, acoustic or radar sensors is that it achieves high resolution and simultaneous acquisition of the surveyed area without energy emission or moving parts. Still, the major disadvantages are the correspondence problem, the processing time and the need for adequate illumination conditions and textured surfaces in the case of automatic matching procedures.
Optical distance measurement techniques are usually classified into three main categories: triangulation based techniques, interferometry and Time-of-Flight (ToF). The triangulation based technique normally determines an unknown point within a triangle by means of a known optical basis and the related side angles pointing to the unknown point. This widely used principle is split into a wealth of partly different 3D techniques, such as active triangulation with structured illumination and passive triangulation [1]. Interferometry also measures depth by means of the Time-of-Flight. In this case, however, the phase of the optical wave itself is used. This requires coherent mixing and correlation of the wave-front reflected from the object with a reference wave-front. The high accuracies of distance measurements performed with interferometry mainly depend on the coherence length of the light source: interferometry is not suitable for ranges greater than a few centimeters, since the method is based on the evaluation of very short optical wavelengths. Continuous wave and pulse ToF techniques measure the time of flight of the envelope of a modulated optical signal. These techniques usually apply incoherent optical signals. Typical examples of ToF are the optical rangefinders of total stations or classical LiDAR instruments. In the latter case, current laser scanners can acquire hundreds of thousands of points per second, thanks to fast scanning mechanisms. Their measurement range varies to a great extent between instruments: in general from tens of meters up to some kilometers, with an accuracy ranging from less than one millimeter to some tens of centimeters respectively. Nevertheless, the main drawbacks of LiDAR instruments are their high costs and dimensions.
In the last few years a new generation of active sensors has been developed, which can acquire 3D point clouds without any scanning mechanism and from just one point of view at video frame rates. The working principle is the measurement of the ToF of a signal emitted by the device towards the object to be observed, with the advantage of simultaneously measuring the distance information for each pixel of the camera sensor. Many terms have been used in the literature to indicate these devices: Time-of-Flight (ToF) cameras, Range IMaging (RIM) cameras, 3D range imagers, range cameras or a combination of these terms. In the following, the term ToF cameras will mostly be employed, since it is more closely related to the working principle of this recent technology.
Previous works, such as [2,3,4], have already shown the high potential of ToF cameras for metric survey purposes. In [3] it was demonstrated that a measurement accuracy of less than one centimeter can be reached with commercial ToF cameras (e.g. SR-4000 by Mesa Imaging) after distance calibration. That work reports an accuracy evaluation of the SR-4000 camera measurements, with quantitative comparisons with LiDAR data acquired on architectural elements. In [2] an integrated approach based on multi-image matching and 3D point clouds acquired with ToF cameras was reported. Thanks to the proposed approach, 3D object breaklines are automatically extracted, speeding up the modeling phase and the drawing production of the surveyed objects. In [4], an attempt to build a 3D model of the Laocoön-Group Copy at the Museum of Art at Ruhr University Bochum using the PMDCamCube 2.0 camera is reported. Some reflective targets were employed in order to register data acquired from three viewpoints; nevertheless, the systematic distance measurement errors decreased the final 3D point cloud quality.
In this work, first a brief overview of commercial ToF cameras is given, in order to show the pros and cons of the systems available on the market. Then, a comparison between data acquired with two commercial ToF cameras and two LiDAR devices is reported, in order to show the achievable 3D point clouds. Moreover, an approach for metric survey and object modeling using ToF cameras is described. Thanks to the adopted procedure, it is possible to obtain complete 3D point clouds of the surveyed objects, which can be employed for documentation and/or modeling purposes. Finally, some conclusions and future works are reported.

TOF IMAGE SENSORS
There are two main approaches currently employed in ToF camera technology: one measures distance by means of direct measurement of the runtime of a travelled light pulse, using for instance arrays of single-photon avalanche diodes (SPADs) [5,6] or an optical shutter technology [7]; the other uses amplitude modulated light and obtains distance information by measuring the phase difference between a reference signal and the reflected signal [8]. Such a technology is possible because of the miniaturization of semiconductor technology and the evolution of the CCD/CMOS processes that can be implemented independently for each pixel. The result is the possibility to acquire distance measurements for each pixel at high speed and with accuracies down to about one centimeter in the case of phase-shift devices. While RIM cameras based on the phase-shift measurement usually have a working range limited to 10-30 m, ToF cameras based on the direct ToF measurement can measure distances up to 1500 m. Moreover, ToF cameras are usually characterized by low resolution (no more than a few tens of thousands of pixels), small dimensions, costs that are one order of magnitude lower than those of LiDAR instruments and lower power consumption than classical laser scanners. In contrast to stereo based acquisition systems, the depth accuracy is practically independent of textural appearance, but limited to about one centimeter in the best case (current phase-shift commercial ToF cameras).
In the following section, a brief overview of commercial ToF cameras is reported.
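As an illustration of the phase-shift principle, the following minimal Python sketch converts a measured phase difference into a distance and computes the non-ambiguity range for a single modulation frequency. It assumes the standard continuous-wave relations d = c·φ/(4π·f) and d_max = c/(2·f); the function names and the modulation frequencies (15 and 30 MHz) are illustrative assumptions and are not taken from the data sheet of any specific camera.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def non_ambiguity_range(mod_freq_hz):
    """Largest distance measurable without phase wrapping, in meters."""
    return C / (2.0 * mod_freq_hz)


def phase_to_distance(phase_rad, mod_freq_hz):
    """Distance corresponding to a phase difference in [0, 2*pi), in meters."""
    return C * phase_rad / (4.0 * math.pi * mod_freq_hz)


# Hypothetical modulation frequencies, chosen only for illustration.
for freq in (15e6, 30e6):
    print(f"{freq / 1e6:.0f} MHz -> non-ambiguity range "
          f"{non_ambiguity_range(freq):.2f} m")

# A phase difference of pi corresponds to half of the non-ambiguity range.
half_range = phase_to_distance(math.pi, 30e6)
```

Under these assumptions the non-ambiguity range is about 10 m at 15 MHz and about 5 m at 30 MHz. A target beyond this range produces the same phase as a closer one, which is the ambiguity that a second modulation frequency would help to resolve.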

Commercial ToF cameras
The first prototypes of ToF cameras for civil applications have been built since 1999 [8]. After the many improvements in both sensor resolution and accuracy that this technology has undergone in ten years, many commercial ToF cameras are now available on the market. The main differences are related to working principle, sensor resolution and measurement accuracy.
The phase-shift measurement principle is used by several manufacturers of ToF cameras, such as Canesta Inc., MESA Imaging AG and PMDTechnologies GmbH, to mention just the most important ones.
Canesta Inc. [9] provides several models of depth vision sensors differing in pixel resolution, measurement distance, frame rate and field of view. Canesta Inc. distributes sensors with a field of view ranging between 30° and 114°, depending on the nature of the application. Currently, the maximum resolution of Canesta sensors is 320 x 200 pixels (Canesta "Cobra" camera), one of the highest worldwide. Some cameras from Canesta Inc. can also operate under strong sunlight conditions using Canesta's Sunshield™ technology: the pixel has the ability to substantially cancel the effect of ambient light at the expense of a slightly higher noise. In Figure 1 some images of ToF cameras produced by Canesta are reported.
Several models of ToF cameras have been produced by PMDTechnologies GmbH [11] in recent years. The illumination unit is generally formed by one or two arrays of LEDs, one for each side of the camera (Figure 3). PMDTechnologies GmbH provides several models of ToF cameras with different features, all suitable for measurements in daylight as well, since all cameras are equipped with the Suppression of Background Illumination (SBI) technology. Currently, the PMD devices provide sensor resolutions up to 200 x 200 pixels (PMDCamCube 3.0 camera) and a non-ambiguity distance up to 40 m (PMDA2 camera). The field of view of the latest model (PMDCamCube 3.0) is 40° x 40°, but customization for specific applications is possible. Also in this case simultaneous multi-camera measurements are possible, thanks to the possibility to select many different modulation frequencies. Specific models for industrial applications are also delivered, such as the PMDO3 and PMDS3 cameras (Figure 3).
working distance and opening angle, are strongly related to the illumination unit (i.e. its optical power and its illumination characteristics).
One ToF camera called ZCamII [7], based on the optical shutter approach, has been realized by 3DV Systems; it provides NTSC/PAL resolution, with a working range up to 10 m and a field of view up to 40°. Another camera by 3DV Systems, mainly employed for real time gaming, is the ZCam camera: the working range is up to 2.5 m, with centimetric resolution, high frame rate (up to 60 fps) and RGB data with a resolution of 1.3 Mpixel thanks to an auxiliary sensor. In fact, a key feature of ToF cameras by 3DV Systems is that RGB information is also delivered in addition to depth data.
For a complete overview of commercial ToF cameras (working principle, measurement parameters, distance calibration, etc.) refer to [13].
Figure 4: Some ToF cameras by Advanced Scientific Concepts Inc.: Dragoneye 3D Flash LiDAR, Tigereye 3D Flash LiDAR and Portable 3D Flash LiDAR (from left to right).

Distance measurement errors
As with all distance measurement devices, ToF cameras are typically characterized by both random and systematic distance measurement errors. In some cases, the influence of systematic errors has been strongly reduced by the manufacturers, while other camera models still suffer from these error sources, which limits their actual applicability without suitable distance calibrations. According to [8], typical sources of noise in solid state sensors can be subdivided into three classes: photocharge conversion noise, quantization noise and electronic shot noise, also called quantum noise. Electronic shot noise is the dominant noise source and cannot be suppressed.
Typical non-systematic errors in ToF distance measurements are caused by pixel saturation, "internal scattering", "multipath effect", "mixed pixels" and "motion artifacts". Some models of ToF cameras suffer from the so-called "internal scattering" artifacts: their depth measurements are degraded by multiple internal reflections of the received signal occurring between the camera lens and the image sensor. A problem common to all ToF cameras based on phase-shift measurement is the "multipath effect" (or "external superimposition"), especially in the case of concave surfaces: small parts of diffusely reflected light from different surfaces of the object may superimpose on the directly reflected signals on their way back to the camera. Another common problem in data acquired with ToF cameras is represented by the so-called "mixed pixels", also known as "flying pixels" or "jumping edges": they are errant 3D data resulting from the way ToF cameras process multiple returns of the emitted signal. These multiple returns occur when a light beam hits the edge of an object and the beam is split: part of the beam is reflected by the object, while the other part continues and may be reflected by another object beyond. The measured reflected signal therefore contains multiple range returns, and usually the reported range measurement for that particular ray vector is an average of those multiple returns. Finally, when dealing with real time applications or moving objects, the so-called "motion artifacts" could affect the acquired data.
The result is that ToF data are often noisy and characterized by several systematic and random errors, which have to be reduced in order to allow the use of RIM cameras for metric survey purposes.
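To make the "mixed pixel" effect more concrete, the sketch below flags pixels whose range value lies isolated between a closer and a farther surface, i.e. pixels that differ by more than a threshold from both opposite neighbours along a row or a column. This is only a simple jump-edge heuristic written for illustration: it is not the filter used by any specific camera or software, and the function name, the threshold and the sample values are assumptions.

```python
def flag_mixed_pixels(depth, jump=0.3):
    """Return a boolean mask that is True where a pixel differs by more than
    `jump` (meters) from BOTH opposite neighbours along a row or a column.

    `depth` is a range image given as a list of rows. Border pixels are
    never flagged along the direction in which a neighbour is missing.
    """
    rows, cols = len(depth), len(depth[0])
    mask = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            d = depth[r][c]
            horizontal = (0 < c < cols - 1
                          and abs(d - depth[r][c - 1]) > jump
                          and abs(d - depth[r][c + 1]) > jump)
            vertical = (0 < r < rows - 1
                        and abs(d - depth[r - 1][c]) > jump
                        and abs(d - depth[r + 1][c]) > jump)
            mask[r][c] = horizontal or vertical
    return mask


# Made-up example: foreground at 1.0 m, background at 2.0 m, and one column
# of averaged returns at 1.5 m along the object edge.
depth_image = [[1.0, 1.0, 1.5, 2.0, 2.0] for _ in range(3)]
mixed = flag_mixed_pixels(depth_image, jump=0.3)
```

In this example only the middle column is flagged, while the clean foreground and background pixels next to the edge are kept. A real filter would also have to cope with noise and slanted surfaces, which this sketch ignores.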

3D OBJECT METRIC SURVEY
Previous works have already demonstrated the high potential of ToF cameras for metric survey purposes. In [3] an accuracy evaluation of data delivered by the SR-4000 camera was reported. The ToF data were compared with more accurate LiDAR data acquired on an architectural frieze: the results showed that the proposed distance calibration procedure reduces the distance measurement error to less than one centimeter.
In the following, a qualitative comparison between data acquired with two ToF cameras and two laser scanners on the same object is reported, in order to show the performance of RIM cameras for metric survey purposes. Then, some results obtained with the "multi-frame registration algorithm" proposed by [13] are reported as an example of automatic 3D object reconstruction from multiple viewpoints.

ToF versus LiDAR
In order to qualitatively compare data acquired with ToF cameras and data acquired with LiDAR instruments, the architectural frieze of Figure 5 has been surveyed with different instruments. First, the object was surveyed using two ToF cameras, the SR-4000 and the PMDCamCube 3.0; then, the same object was surveyed using two well-known instruments: the Riegl LMS-Z420 LiDAR instrument and the S10 Mensi triangulation based scanner.
In both cases, the RIM cameras were positioned on a photographic tripod, in front of the object, and 30 frames were acquired after a warm-up of forty minutes in order to have a good measurement stability [13]. After that, ToF data were averaged pixel by pixel in order to reduce the measurement noise. In the case of the SR-4000, distance data were corrected with the distance calibration model [13], while no distance calibration is available for the PMDCamCube 3.0 yet. Then, the Mixed Pixel Removal (MPR) filter [13] was applied in order to automatically remove the "mixed pixels", which are errant 3D data resulting from the way ToF cameras process multiple returns of the emitted signal. Data acquired with the Riegl LMS-Z420 laser scanner were filtered with the RiSCAN PRO software, while the Mensi data were manually filtered with the Geomagic Studio 10 software.
The point clouds acquired on the frieze are reported in Figure 5. As one can observe, good results have been obtained with the two ToF cameras, even if the point density is lower than in the case of LiDAR data. In [13] it has been demonstrated that the SR-4000 measurement accuracy on the frieze is a few millimeters after distance calibration, and therefore comparable to the LiDAR accuracy. No accuracy evaluation of the PMDCamCube 3.0 measurements has been performed yet, which is why only a qualitative comparison is reported in this work.
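The pixel-by-pixel averaging step can be sketched as follows. The example assumes that each frame is a range image stored as a list of rows and that the noise of different frames is uncorrelated, in which case the standard deviation of the averaged value decreases with the square root of the number of frames. The function names and the sample values are hypothetical; in practice an array library would normally be used instead of nested lists.

```python
import math


def average_frames(frames):
    """Average equally sized range images (lists of rows) pixel by pixel."""
    n = len(frames)
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[sum(frame[r][c] for frame in frames) / n for c in range(cols)]
            for r in range(rows)]


def noise_after_averaging(single_frame_sigma, n_frames):
    """Standard deviation of the mean of n_frames uncorrelated measurements."""
    return single_frame_sigma / math.sqrt(n_frames)


# Three made-up 2 x 2 range images, in meters.
frames = [
    [[2.00, 2.10], [3.00, 3.20]],
    [[2.02, 2.08], [2.98, 3.22]],
    [[1.98, 2.12], [3.02, 3.18]],
]
mean_image = average_frames(frames)

# With 30 frames, a hypothetical 3 cm single-frame noise drops to about 5.5 mm.
sigma_30 = noise_after_averaging(0.03, 30)
```

Averaging only reduces the random component of the error; systematic distance errors are unaffected, which is why a distance calibration is still applied afterwards.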

Automatic 3D object reconstruction with ToF data
In order to obtain a complete 3D model of the surveyed objects, more than one acquisition viewpoint is usually required with ToF cameras. In fact, their field of view is often limited to about 40°, the working range is often no more than a few tens of meters and foreground objects in the scene can occlude background objects (as in the case of LiDAR acquisitions).
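A short calculation shows why a single viewpoint is rarely sufficient. The sketch below estimates the width of the area covered by one frame and the approximate spacing between neighbouring points, using the 40° field of view and the 200 pixel sensor width quoted above for the PMDCamCube 3.0. The 5 m acquisition distance, the function names and the assumption of a flat object perpendicular to the optical axis are illustrative choices.

```python
import math


def footprint_width(distance_m, fov_deg):
    """Width covered on a plane perpendicular to the optical axis, in meters."""
    return 2.0 * distance_m * math.tan(math.radians(fov_deg) / 2.0)


def pixel_spacing(distance_m, fov_deg, n_pixels):
    """Approximate spacing between neighbouring points on that plane."""
    return footprint_width(distance_m, fov_deg) / n_pixels


# Hypothetical acquisition distance of 5 m.
width = footprint_width(5.0, 40.0)
spacing = pixel_spacing(5.0, 40.0, 200)
print(f"footprint: {width:.2f} m, point spacing: {spacing * 1000:.1f} mm")
```

Under these assumptions a single frame covers about 3.6 m with a point spacing of about 18 mm, so larger objects or denser sampling require several camera positions.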
Since data are acquired from different viewpoints and each point cloud is referred to a local coordinate system fixed to the device, suitable registration procedures have to be adopted in order to register the acquired data.
The approach proposed in this work to acquire data is related to the following scene acquisition conditions [13]: the ToF camera acquisitions are performed from a stable position (i.e. photographic tripod) in order to acquire several frames (e.g. 10-30) of a static scene; in this way, it is possible to average the acquired frames in order to reduce the measurement noise. Moreover, several camera positions are adopted in order to survey the entire object, maintaining an overlap of at least 50% between consecutive camera viewpoints. The choice of acquiring data from few static camera positions is justified by two main reasons: measurement noise reduction thanks to multi-frame acquisition, since the frames acquired from the same position are averaged pixel by pixel; and limitation of the accumulated registration error, since the accumulated registration error inevitably increases with the number of consecutive 3D point clouds to be registered. The integration time is adjusted for each camera position, in order to avoid saturated pixels while maintaining high amplitude values and, therefore, precise distance measurements.
In [13] an algorithm for the automatic registration of ToF point clouds has been proposed. The algorithm, called "multi-frame registration algorithm", automatically performs ToF point cloud registration using data coming only from the ToF device. It exploits both amplitude data and 3D information acquired by ToF cameras. Homologous points between two positions of acquisition are extracted from amplitude images obtained after averaging multiple frames (which is why the method is called "multi-frame registration"). After a robust estimation of the spatial similarity transformation parameters with the Least Median of Squares estimator [14], the spatial similarity transformation between two adjacent point clouds is estimated in a least squares sense and the registration is performed, with estimation of the residuals. The procedure is extended to all positions of acquisition.
In Figure 6, some results of the registration process on data acquired from two positions with the SR-4000 camera on a small area of the Topography laboratory façade of the Politecnico di Torino (Italy) are reported in order to show the potential of the method. As one can observe from Figure 6, the point density increases in the overlap region between the two point clouds. Some problems of multiple reflections, mixed pixels and other outliers occurred in correspondence of the window glasses: the MPR filter [13] removed almost all of them, which is why only some points are still visible in the internal area of the windows. The differences between the z values (depth direction) of the two point clouds in the overlap region have a mean value of about 0.01 m, which is the measurement accuracy of the SR-4000 camera. Therefore, the final 3D point cloud after the registration process has an accuracy of about 0.01 m, which can be suitable for modeling purposes and/or integration with other survey techniques [2].
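The final least squares step of such a registration can be illustrated with a closed-form estimate of the spatial similarity transformation (scale, rotation, translation) between two sets of homologous 3D points. The sketch below uses the SVD-based solution commonly attributed to Umeyama; it is not claimed to be the formulation adopted in [13], it does not include the robust Least Median of Squares step or the extraction of homologous points, and the point coordinates and transformation parameters are made up for the example. It assumes NumPy is available.

```python
import numpy as np


def estimate_similarity(src, dst):
    """Least squares scale, rotation R and translation t such that
    dst ~= scale * R @ src + t for corresponding 3D points (one per row)."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    mu_src, mu_dst = src.mean(axis=0), dst.mean(axis=0)
    src_c, dst_c = src - mu_src, dst - mu_dst
    cov = dst_c.T @ src_c / len(src)           # 3 x 3 cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                         # avoid a reflection
    R = U @ S @ Vt
    scale = np.trace(np.diag(D) @ S) / src_c.var(axis=0).sum()
    t = mu_dst - scale * R @ mu_src
    return scale, R, t


# Made-up homologous points and a made-up transformation for the check.
angle = np.radians(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
s_true, t_true = 1.2, np.array([0.5, -0.2, 0.1])
src = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 1]],
               dtype=float)
dst = s_true * src @ R_true.T + t_true

scale, R, t = estimate_similarity(src, dst)
residuals = np.linalg.norm(dst - (scale * src @ R.T + t), axis=1)
```

With noise-free synthetic points the residuals are at the level of numerical precision. With real ToF data they give an indication of the registration quality, provided that outliers have been rejected beforehand.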

CONCLUSIONS AND FUTURE WORKS
In this paper, an overview of commercial ToF cameras and of their typical systematic and random measurement errors has been reported, in order to show the main characteristics of the available sensors. Then, a qualitative comparison between data acquired with two commercial ToF cameras and two laser scanners on an architectural object has been reported. The results show the high potential of RIM cameras for metric survey purposes in the Cultural Heritage field. ToF cameras are cheap instruments (less than 10,000 €) based on real-time, video-rate distance measurements and can represent an interesting alternative to the more expensive LiDAR instruments for close range applications. In addition, the limited weight and dimensions of ToF cameras reduce some practical problems such as transportation and on-site management, which are typical of LiDAR instruments. Nevertheless, the sensor resolution is still limited and the main problem of phase-shift RIM cameras is the limited working range. Future developments will probably overcome this problem by using more than one modulation frequency.
Finally, some results about automatic point cloud registration for 3D object reconstruction using ToF cameras have been reported in this work. Using suitable registration procedures, it is possible to automatically obtain complete 3D point clouds of the surveyed objects, with an accuracy close to the measurement accuracy of the considered device. Future works will deal with quantitative comparisons between calibrated ToF data and LiDAR data after performing the automatic registration of RIM data acquired from different viewpoints.
Figure 6: Homologous points extracted with SURF [15], which is implemented in the multi-frame registration algorithm, from amplitude images acquired from different positions (left); 3D view of the final point cloud after frame averaging, distance correction, mixed pixel removal and automatic registration with data acquired with the SR-4000 camera (right).
Figure 1: Some ToF cameras produced by Canesta.

Figure 2: Some models of ToF cameras by MESA Imaging AG: SR-2, SR-3000 and SR-4000 (from left to right).

Figure 5: Data acquisition and 3D views of: the SR-4000 point cloud after frame averaging, distance correction and mixed pixel removal (first row), the PMDCamCube 3.0 point cloud after frame averaging and mixed pixel removal (second row), the Mensi point cloud after manual filtering (third row) and the Riegl LMS-Z420 point cloud after filtering with the RiSCAN PRO software (fourth row).