Synthesis of Room Impulse Responses for Variable Source Characteristics

Every acoustic source, e.g. a speaker, a musical instrument or a loudspeaker, generally has a frequency-dependent radiation pattern, which becomes more pronounced at higher frequencies. Room acoustic measurements nowadays only account for omnidirectional source characteristics. This motivates a measurement method that can obtain room impulse responses for such specific radiation patterns by superposing several measurements made with technically well-defined sound sources. We propose a method based on measurements with a dodecahedron loudspeaker array whose 12 channels are driven independently and which is rotated by an automatically controlled turntable. Radiation patterns can be described efficiently in a spherical harmonics representation. The method uses this representation both for the spherical loudspeaker array used in the measurements and for the target radiation pattern used in the synthesis. We show validation results for a deterministic test sound source in a small lecture hall.


Introduction
In order to determine room acoustic parameters, e.g. reverberation time, clarity index or even binaural parameters (IACC), room impulse responses are measured with an omnidirectional sound source, as required by the ISO 3382 standard. These sound sources generally consist of several single loudspeaker chassis placed on a spherical array and excited coherently with exactly the same signal. The impulse responses measured in the room under test entirely describe the linear behavior for the exact combination of sound source position and microphone positions. They can be used afterwards to make the acoustic situation in this particular room audible (the concept of auralization), but they lack the characteristic effect of the source radiation pattern.
Methods were therefore developed to drive the single loudspeaker chassis of these compact spherical loudspeaker arrays with individual signals in order to directly approximate certain radiation patterns of target sound sources [7,8,10]. This directly implies measuring the room impulse responses with approximated radiation patterns, e.g. of a speaker or an instrument, even though only a technical source is present during the measurements. If we are interested in synthesizing the sound of various sources, the measurement time rises with each target source and, of course, all target sources have to be specified and measured in advance. This motivates a novel measurement and synthesis method that allows us to measure universal sets of room impulse responses that can be used to synthesize arbitrary radiation patterns after the measurement has been completed.
The number of available loudspeaker chassis, and therefore the number of different basis radiation patterns, can be artificially increased by using several rotation and tilting angles of the loudspeaker array. The proposed method requires that the acoustic transfer characteristics of the room can be assumed to be linear and time-invariant so that a superposition approach can be used. Hence, the limits within which this assumption is reasonable are studied and discussed.
The proposed synthesis method is based on a description of radiation patterns in the spherical harmonic domain. This enables us to model the radiation pattern of the source to be approximated and that of the spherical loudspeaker array used for the measurements on the same basis.

Method
The proposed method can be divided into two parts, measurement and synthesis, which can also be entirely separated from each other. Both parts use the same calculus, and the inverse problem is also formulated in the same way. The spherical harmonics representation is used throughout.

Measurement of room impulse responses
The core of the measurement consists of well-known impulse response measurements of linear time-invariant (LTI) systems. This assumption holds for most acoustical systems within certain limits. A detailed overview of these methods can be found in [6].
For each loudspeaker chassis of the array, the impulse response h(t) or its frequency representation H(ω) is obtained by using exponentially swept sines (chirps, sweeps) as excitation signals and applying proper deconvolution techniques to the signal recorded by the microphones [2]. The chosen signal is advantageous for the given task because it allows non-linear behavior to be detected. Furthermore, we employ a time-saving approach proposed by Majdak et al. [4] that uses interleaved excitation signals, allowing several loudspeaker chassis to run at the same time while their responses can still be perfectly separated.
For each of the M orientation angles of the loudspeaker array, N impulse responses are measured, one for each loudspeaker chassis, leading to a set of

$$L = M \cdot N \tag{1}$$

impulse responses. Each response corresponds to a different radiation pattern. Figure 1 illustrates the method schematically for an array of three loudspeakers: the impulse responses of each driver are measured in two orientations and are subsequently superposed.
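The sweep measurement and deconvolution described above can be illustrated with a minimal sketch. It assumes a single loudspeaker, a toy two-tap "room" and a regularized spectral division; the function names, the sampling rate and the regularization constant `eps` are illustrative assumptions and are not taken from the measurement software used here. The interleaved multi-loudspeaker excitation is not covered.

```python
import numpy as np

def exponential_sweep(f_start, f_stop, duration, fs):
    """Exponentially swept sine from f_start to f_stop (in Hz)."""
    t = np.arange(int(round(duration * fs))) / fs
    rate = np.log(f_stop / f_start)
    phase = 2.0 * np.pi * f_start * duration / rate * (np.exp(t * rate / duration) - 1.0)
    return np.sin(phase)

def deconvolve(recorded, excitation, n_fft, eps=1e-8):
    """Regularized spectral division: H = Y X* / (|X|^2 + eps * max|X|^2)."""
    X = np.fft.rfft(excitation, n_fft)
    Y = np.fft.rfft(recorded, n_fft)
    power = np.abs(X) ** 2
    H = Y * np.conj(X) / (power + eps * power.max())
    return np.fft.irfft(H, n_fft)

fs = 8000  # hypothetical sampling rate in Hz
sweep = exponential_sweep(50.0, 3500.0, 1.0, fs)

# Toy "room": direct sound at sample 40 and one reflection at sample 200.
h_true = np.zeros(512)
h_true[40] = 1.0
h_true[200] = 0.5

recorded = np.convolve(sweep, h_true)   # simulated microphone signal
n_fft = len(recorded)                   # full linear-convolution length
h_est = deconvolve(recorded, sweep, n_fft)
peak_index = int(np.argmax(np.abs(h_est)))
```

The estimate `h_est` is a band-limited version of `h_true`, because the sweep only excites the range between its start and stop frequencies.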

Synthesis of target responses
In order to approximate the target radiation pattern with the spherical loudspeaker array, complex and frequency-dependent weighting factors $w_l$ are determined. The room impulse response $h_T(t)$ or the transfer function $H_T(\omega)$ of the approximated target radiation pattern is then obtained by superposition of the $L$ measured transfer functions $H_l(\omega)$:

$$H_T(\omega) = \sum_{l=1}^{L} w_l(\omega)\, H_l(\omega) \tag{2}$$

The superposition approach is only applicable if the room can be considered an LTI system. Linearity is in general not problematic for air-borne sound paths in a room at moderate sound pressures, as is the case in standard room acoustic measurements. However, time variances become problematic if the room changes significantly during a measurement session. Time variances in rooms are caused by temperature shifts, changes in humidity, or light winds and air circulation. In order to detect these variances, which lead to errors in the subsequent synthesis, the concept described in section 4.1 is used.
The radiation characteristics of an acoustic source can be described by the directivity factor Γ [5]:

$$\Gamma(\theta,\varphi) = \frac{p(r,\theta,\varphi)}{p(r,\theta_0,\varphi_0)} \tag{3}$$

This gives the complex factor between the sound pressure p at a reference radiation angle (θ₀, φ₀) and the sound pressure in any direction (θ, φ). Here (r, θ, φ) are the radius, the vertical angle and the horizontal angle of the common spherical coordinate system. In general, p and Γ are complex and frequency dependent, but for better readability they are written without frequency subscripts in the following.
Since the directivity can be regarded as a function that depends only on the radiation angle (θ, φ) and that is furthermore square-integrable over the sphere, it can be represented by a set of spherical harmonic coefficients $\Gamma_{n,m}$, as shown by Williams [9]:

$$\Gamma(\theta,\varphi) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} \Gamma_{n,m}\, Y_n^m(\theta,\varphi) \tag{4}$$

where $\Gamma_{n,m}$ are frequency-dependent, complex-valued spherical harmonic coefficients and $Y_n^m$ are spherical harmonic basis functions, which can be defined as:

$$Y_n^m(\theta,\varphi) = \sqrt{\frac{2n+1}{4\pi}\,\frac{(n-m)!}{(n+m)!}}\; P_n^m(\mu)\, e^{\mathrm{i} m \varphi}, \qquad \mu = \cos\theta \tag{5}$$

The indices n and m denote the spatial periodicity of the function $Y_n^m(\theta,\varphi)$. They are called the order $n \in \mathbb{N}_0$ and the degree $m \in \mathbb{Z}$ with $-n \le m \le n$. $P_n^m(\mu)$ is an associated Legendre function of the first kind. A detailed work on the characterization of acoustic sources and radiation patterns with spherical harmonics is given by Zotter [10].
The radiation pattern of a real source has a finite roughness over the surface. Therefore its characterization in the spherical harmonic domain can be limited to a maximum order $N_{max}$, and the spherical harmonic coefficients can be collected in a column vector [10]:

$$\boldsymbol{\Gamma} = \left[\Gamma_{0,0},\ \Gamma_{1,-1},\ \Gamma_{1,0},\ \Gamma_{1,1},\ \ldots,\ \Gamma_{N_{max},N_{max}}\right]^{\mathsf{T}} \tag{6}$$

where $0 \le n \le N_{max}$ and $-n \le m \le n$.
Each of the above-mentioned L impulse responses measured with the spherical loudspeaker array corresponds to a certain source radiation pattern, which can also be written as such a vector $\mathbf{d}_l$.
Let $\boldsymbol{\Gamma}_T$ be the radiation pattern of the target to be synthesized. By analogy with equation (2) we can formulate:

$$\boldsymbol{\Gamma}_T = \sum_{l=1}^{L} w_l\, \mathbf{d}_l \tag{7}$$

The vectors $\mathbf{d}_l$ can be collected in a matrix characterizing the radiation patterns of the entire extended array:

$$\mathbf{D} = \left[\mathbf{d}_1,\ \mathbf{d}_2,\ \ldots,\ \mathbf{d}_L\right] \tag{8}$$

Hence equation (7) can be written in matrix form:

$$\boldsymbol{\Gamma}_T = \mathbf{D}\,\mathbf{w} \tag{9}$$

For the optimum weighting vector $\mathbf{w}$ we formulate

$$\mathbf{w} = \arg\min_{\mathbf{w}'} \left\| \mathbf{D}\,\mathbf{w}' - \boldsymbol{\Gamma}_T \right\|_2^2 \tag{10}$$

leading to an inverse problem that can be solved using the Moore-Penrose pseudoinverse $\mathbf{D}^{+}$ [3]:

$$\mathbf{w} = \mathbf{D}^{+}\, \boldsymbol{\Gamma}_T \tag{11}$$

All quantities in equations (2) and (11) are measured quantities and are therefore subject to noise. In order to suppress the influence of these measurement uncertainties on the synthesis result, the range of possible solutions is limited by Tikhonov regularization [3]:

$$\mathbf{w} = \left(\mathbf{D}^{H}\mathbf{D} + \nu\,\mathbf{I}\right)^{-1} \mathbf{D}^{H}\, \boldsymbol{\Gamma}_T \tag{12}$$

where $\mathbf{I}$ is the unit matrix of dimension L × L and ν is the so-called Tikhonov parameter.
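As a minimal sketch of this superposition, the target transfer function can be computed as a weighted sum of the measured transfer functions in each frequency bin. The array layout (one row per measured response, one column per frequency bin) and the function name `synthesize_target` are assumptions made for illustration only.

```python
import numpy as np

def synthesize_target(H, w):
    """Weighted superposition per frequency bin.

    H: (L, F) measured transfer functions, one row per measurement.
    w: (L, F) complex, frequency-dependent weighting factors.
    Returns the synthesized target transfer function of shape (F,).
    """
    H = np.asarray(H, dtype=complex)
    w = np.asarray(w, dtype=complex)
    if H.shape != w.shape:
        raise ValueError("H and w must have the same shape (L, F)")
    return np.sum(w * H, axis=0)

# Two measured responses, three frequency bins (made-up numbers).
H = np.array([[1 + 0j, 2 + 0j, 0 + 1j],
              [0 + 1j, 1 + 1j, 2 + 0j]])
w = np.array([[0.5, 1.0, 1j],
              [2.0, 0.0, 0.5]])
H_T = synthesize_target(H, w)
```

The room impulse response of the synthesized source would then follow from an inverse Fourier transform of `H_T` over the full frequency axis.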
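A short sketch of how such a coefficient vector can be indexed, assuming the common ordering by ascending order n and, within each order, ascending degree m. The ordering and the helper names are assumptions; the original vector layout is not recoverable from the text.

```python
def num_coefficients(n_max):
    """Number of spherical harmonic coefficients up to order n_max."""
    return (n_max + 1) ** 2

def sh_index(n, m):
    """Position of coefficient (n, m) in the stacked column vector."""
    if n < 0 or abs(m) > n:
        raise ValueError("require n >= 0 and -n <= m <= n")
    return n * n + n + m

def order_degree_pairs(n_max):
    """All (n, m) pairs in the order in which they appear in the vector."""
    return [(n, m) for n in range(n_max + 1) for m in range(-n, n + 1)]

pairs = order_degree_pairs(2)
```

With this layout a pattern limited to order 3 is described by 16 complex coefficients per frequency.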
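To illustrate how turntable orientations enlarge the set of patterns, the sketch below derives the coefficient vectors of rotated chassis from those of the unrotated ones. It assumes that the rotation is about the vertical (z) axis, that the basis functions depend on the horizontal angle as exp(i m φ), and that coefficients are ordered by n and then m; under these assumptions a rotation by α multiplies each coefficient by exp(-i m α). All names and the random test data are hypothetical.

```python
import numpy as np

def degrees_for_order(n_max):
    """Degree m of every coefficient, ordered by n and then m."""
    return np.array([m for n in range(n_max + 1) for m in range(-n, n + 1)])

def rotate_about_z(coeffs, alpha, n_max):
    """Coefficients of a pattern turned by alpha (radians) about the z axis."""
    m = degrees_for_order(n_max)
    return np.asarray(coeffs, dtype=complex) * np.exp(-1j * m * alpha)

def extended_array_patterns(chassis_coeffs, angles, n_max):
    """Stack the patterns of all chassis in all orientations as columns."""
    columns = [rotate_about_z(c, a, n_max) for a in angles for c in chassis_coeffs]
    return np.column_stack(columns)

n_max = 2
rng = np.random.default_rng(0)
# Three hypothetical chassis patterns with 9 coefficients each.
chassis = [rng.standard_normal(9) + 1j * rng.standard_normal(9) for _ in range(3)]
angles = np.deg2rad([0.0, 90.0, 180.0, 270.0])   # four turntable orientations
D = extended_array_patterns(chassis, angles, n_max)
```

Here three chassis and four orientations yield a matrix with twelve columns, one per measured response.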
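The regularized inversion can be sketched as follows for a single frequency. The matrix sizes, the random test data and the values chosen for the Tikhonov parameter are assumptions made purely for illustration; the sketch solves the regularized normal equations with an L × L unit matrix, as described above.

```python
import numpy as np

def tikhonov_weights(D, gamma_t, nu):
    """Solve (D^H D + nu I) w = D^H gamma_t for the weighting vector w."""
    D = np.asarray(D, dtype=complex)
    gamma_t = np.asarray(gamma_t, dtype=complex)
    n_patterns = D.shape[1]                      # L columns, one per measurement
    A = D.conj().T @ D + nu * np.eye(n_patterns)
    return np.linalg.solve(A, D.conj().T @ gamma_t)

rng = np.random.default_rng(0)
K, L = 9, 24   # 9 coefficients (maximum order 2), 24 hypothetical array patterns
D = rng.standard_normal((K, L)) + 1j * rng.standard_normal((K, L))
gamma_t = rng.standard_normal(K) + 1j * rng.standard_normal(K)

w_pinv = np.linalg.pinv(D) @ gamma_t             # unregularized solution
w_small = tikhonov_weights(D, gamma_t, 1e-6)     # nearly unregularized
w_reg = tikhonov_weights(D, gamma_t, 1.0)        # noticeably regularized
```

A larger Tikhonov parameter yields weights with a smaller norm, which limits the amplification of measurement noise at the cost of a less exact match to the target pattern.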

Implementation and instrumentation
The measurement methods were implemented using MATLAB and the ITA-Toolbox. This toolbox is developed at the Institute of Technical Acoustics and provides various tools for acoustic measurements and post-processing. Hence, the calculus for the inverse problem and the synthesis was also implemented in MATLAB.
A complex, calibrated instrumentation setup was used, consisting of the following elements (the numbers correspond to the numbers in Figure 2). Our spherical loudspeaker array is a midtone dodecahedron loudspeaker developed at the Institute of Technical Acoustics. The single loudspeaker chassis can be driven independently, and the radiation pattern of each chassis was measured under free-field conditions in the anechoic chamber with a controlled measurement scan unit. The results were transformed to the spherical harmonics domain. This loudspeaker array was used together with a computerized turntable to allow arbitrary horizontal orientations of the array in the room, as shown in Figure 3. The array was inclined at an angle chosen so that the elevation angles of the single chassis were equally distributed.

The superposition method based on the Moore-Penrose pseudoinverse D+ of the radiation patterns of the array was introduced in section 2.2. The error of this inversion is a good measure of the quality that can be expected from subsequent calculations with reasonable target responses. In the ideal case, the residual matrix, i.e. the difference between the product D D+ and the unit matrix, would be the zero matrix. The energy of its columns εn,m corresponds to the error that arises when synthesizing the individual spherical harmonics Y_n^m. Figure 4 shows this error over frequency. As can be seen, the order of the spherical harmonics that can be reproduced for synthesized target sources rises with frequency, and the synthesis error rises as well. The low number of reproducible orders at low frequencies can be explained by the fact that the single loudspeakers themselves do not have a pronounced directional radiation pattern at low frequencies. The synthesis error is caused by the limited resolution of the vertical orientation angles of the single chassis, as this angle could not be adjusted automatically with the given measurement setup.
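The inversion error per spherical harmonic can be sketched as the column energies of the residual between the product of the pattern matrix with its pseudoinverse and the unit matrix. The toy matrix below is an assumption chosen so that the outcome is easy to interpret: its patterns contain only orders 0 and 1, embedded in a description up to order 2.

```python
import numpy as np

def synthesis_error_per_harmonic(D):
    """Column energies of the residual matrix D D^+ - I."""
    D = np.asarray(D, dtype=complex)
    n_coeffs = D.shape[0]
    residual = D @ np.linalg.pinv(D) - np.eye(n_coeffs)
    return np.sum(np.abs(residual) ** 2, axis=0)

rng = np.random.default_rng(1)
D = np.zeros((9, 12), dtype=complex)   # 9 coefficients, 12 hypothetical patterns
# Only the first four rows (orders 0 and 1) are populated.
D[:4, :] = rng.standard_normal((4, 12)) + 1j * rng.standard_normal((4, 12))
err = synthesis_error_per_harmonic(D)
```

In this toy case the error is practically zero for the four harmonics of orders 0 and 1 and equals one for the five harmonics of order 2, which the array cannot reproduce at all.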

Experimental results
In order to evaluate the proposed method, a comparative measurement was conducted. The room chosen for the measurements was an easily accessible lecture hall in the Institute of Technical Acoustics with a mean reverberation time of approx. 0.9 seconds at mid frequencies. Two main measurements were conducted in this room: one with the spherical loudspeaker array, and the other with a loudspeaker of a certain target radiation pattern, which was also used as the target response for the synthesis. Figure 5 illustrates the measurement setup inside the lecture hall. The upper picture shows the spherical loudspeaker array on the left side, and the bottom picture shows the target loudspeaker in the same position in the room. Additionally, the reference dodecahedron loudspeaker (right side) was used in a fixed position, as can also be seen in the pictures.

Detection of time variances
As mentioned above, measurements with a reference loudspeaker are conducted for each orientation angle of the spherical loudspeaker array. The results of the reference loudspeaker are used for a correlation analysis of the impulse responses in the time domain with a mean impulse response. Figure 6 shows this correlation coefficient. The dotted line marks the time when the room was briefly entered to replace the array with the target loudspeaker.
It is obvious that the acoustic behavior of the room changes over time. At the beginning, after the personnel have left the room, the changes are greater than at the end. This can be explained by the fact that the room needs some time to settle completely after objects have been moved. For the synthesis, measurements are selected from periods in which the time variances are low. In this case we chose 100 measurements as input for the subsequent synthesis.

Fig. 6: Correlation analysis to detect time variances (legend entries: 500 Hz, 1 kHz, 2 kHz, 4 kHz)
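The correlation analysis of this section can be sketched as follows: each reference impulse response is correlated with the mean over all reference responses, and only measurements above a threshold are kept. The broadband formulation, the threshold value and the synthetic test data are assumptions for illustration; the analysis shown in Figure 6 appears to be evaluated per frequency band.

```python
import numpy as np

def correlation_with_mean(irs):
    """Correlation coefficient of each impulse response with the mean response.

    irs: array of shape (number of measurements, number of samples).
    """
    irs = np.asarray(irs, dtype=float)
    mean_ir = irs.mean(axis=0)
    a = irs - irs.mean(axis=1, keepdims=True)   # remove offset of each response
    b = mean_ir - mean_ir.mean()                # remove offset of the mean response
    numerator = a @ b
    denominator = np.sqrt(np.sum(a ** 2, axis=1) * np.sum(b ** 2))
    return numerator / denominator

def select_stable(irs, threshold):
    """Indices of measurements whose correlation reaches the threshold."""
    return np.flatnonzero(correlation_with_mean(irs) >= threshold)

rng = np.random.default_rng(2)
n = 2000
base = rng.standard_normal(n) * np.exp(-np.arange(n) / 300.0)  # decaying noise
irs = np.tile(base, (6, 1))
irs[0] = np.roll(base, 3)      # first measurement: the room has not settled yet
stable = select_stable(irs, 0.9)   # hypothetical threshold
```

In this synthetic example the first measurement is rejected, which mirrors the observation that the room needs some time to settle after it has been entered.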

Results
The upper image in Figure 7 shows the measurement with the real source and the synthesis result in the time domain, and the lower image zooms into the range of the first reflections in the room impulse response. The results look very similar in this representation. The frequency-domain results are plotted in Figure 8. They show good agreement in the range from 300 Hz to 1.5 kHz. The chosen cut-off frequency limits the result at the low end. The deviation above 1.5 kHz grows with frequency, which corresponds to the results shown in Figure 4.

Conclusion
We have proposed a measurement method for a special set of room impulse responses and a synthesis method that, in a post-processing step, yields room impulse responses for arbitrary target radiation patterns. In addition, an approach was introduced to fully separate measurement and synthesis by transforming the measurement results into a universal representation. Since the method assumes linear time-invariant systems, this assumption was studied, and a measure to quantify and monitor time variances was used that is based on measurements with a reference loudspeaker in a fixed position.
The method was validated in a small lecture hall using a 12-channel dodecahedron spherical loudspeaker array with automatically adjustable orientation angles to virtually increase the number of drivers and therefore the number of different radiation patterns. The results obtained by the synthesis were compared to measurements with the source that was also used as the target for the synthesis. With the given measurement setup, the usable frequency range is limited towards higher frequencies at around 3 kHz.
As the main idea of this work was to develop such a measurement method, there are still some limitations to overcome in future work. In order to cover the entire audible frequency range from 20 Hz to 20 kHz, two modifications seem reasonable: the spherical array should be replaced by a high-tone version for the higher frequency range, and the vertical resolution of the spherical array needs to be increased.