Sonic Boon: Head Tracking for 3D Audio Using a GPS-Aided MEMS IMU
Spatial orientation plays a critical role in aviation, especially under conditions of instrument flight rules. The ability to detect the direction of an aircraft’s heading, the ground, an airfield, and approaching aircraft is particularly challenging at night or in stormy weather. This article describes ongoing work to develop a GPS/IMU-based head tracking system to provide 3D audio cues that can help pilots orient themselves in adverse circumstances.
One of the fundamental qualities of the physical world that humans inhabit is spatial dimensionality. Typically, we think of this in terms of three dimensions (3D) — height, width, and depth — and experience it most frequently in visual terms. However, humans are also able to recognize the dimensionality and directionality of sound.
Audio systems have been developed that use headphones to project 3D sound, in which the brain perceives the sounds as coming from a particular direction – up, down, left, right, ahead, behind, or a combination of these. Potential applications for this technology exist in both military and general aviation, such as projecting tower transmissions in the direction of the tower or providing an audio orientation cue for visual flight rule (VFR) pilots who find themselves in emergency zero-visibility conditions.
In order to be effective, 3D audio systems require real-time knowledge of a pilot’s head orientation. This article describes the development and testing of an integrated inertial measurement unit (IMU)/GPS system, developed at the Advanced Navigation Technology (ANT) Center at the Air Force Institute of Technology (AFIT), that determines real-time head orientation for use by a 3D audio system.
The system incorporates a low-cost micro-electro-mechanical system (MEMS) IMU combined with a single-frequency GPS receiver. Real-time data from both of these systems flow to a laptop computer where a real-time Kalman filter was implemented in MATLAB to solve for position, velocity, and attitude. The attitude information was then sent to a 3D audio system for sound direction rendering.
The Air Force Research Lab plans on using the system in a March flight test of 3D audio in a Cirrus aircraft at NASA Langley Research Center. Ten general aviation pilots will take part in the flight test.
Spatial auditory cues to the pilot could help prevent spatial disorientation when the aircraft has been flown into an unusual attitude. Combining 3D audio with information from a traffic alert and collision avoidance system (TCAS) could also generate spatial cues to alert pilots of approaching aircraft and provide a reference for evasive action. This has promise of reducing the number of fatalities due to midair collisions.
Current Head Tracking Techniques
A paper by J. Rolland cited in the “Additional Resources” section at the end of this article summarizes the current techniques for head tracking: time of flight, spatial scan, mechanical linkages, phase-difference sensing, direct field sensing, and inertial sensing.
Time of flight techniques include using ultrasonic or pulsed infrared laser diode measurements.
Spatial scan covers all optical and beam-tracking techniques.
Mechanical linkage uses an assembly of mechanical parts between a fixed reference and the user. Orientation is computed from various linkage angles.
Phase-difference sensing measures the relative phase of an incoming signal and compares it to a signal of the same frequency located on a fixed reference.
Direct field sensing includes tracking techniques using either magnetic or gravitational fields.
Inertial sensing uses inertial measurements from accelerometers and gyroscopes.
All of these techniques except direct field sensing and inertial sensing require the use of measurements to a fixed reference. This approach may work for systems designed for virtual or augmented reality but obviously becomes a problem for the general aviation application. Once again, the goal is to provide orientation of the user’s head with respect to the local-level reference frame. Using a fixed reference inside the cockpit would only provide orientation of the user’s head with respect to the aircraft.
Of course, if the aircraft attitude information with respect to the local-level reference frame was available, then head position relative to the local-level reference frame could be derived using a fixed reference inside the cockpit. Most general aviation aircraft do not have digital attitude information readily available for such use.
Because of our objective to keep the proposed system low-cost and stand-alone, such methods are not practical. Sensors that measure the earth’s magnetic field could potentially be used, but the earth’s magnetic field is not homogeneous. Furthermore, any disturbances in the ambient magnetic field, which are quite likely inside a cockpit, will also cause angular errors in the orientation estimates.
This leaves inertial sensing to accomplish the task.
The system determined orientation by integrating angular rates from the gyros starting from a known initial orientation. Drift compensation was accomplished by using the inclinometer and compass as a “noisy and sloshy but drift-free” measurement of orientation. Foxlin then generated estimates of orientation using a Kalman filter and both sources of orientation.
Foxlin implemented an adaptive algorithm by increasing the estimate of inclinometer measurement noise during periods of slosh. (Slosh refers to the fact that the inclinometer uses a fluid-filled cavity to determine the apparent “down” direction, but that the fluid sloshes in the presence of dynamics, leading to “sloshy” measurements).
Meanwhile, he decreased the estimate of measurement noise at a specified length of time since the last nonzero gyro reading or last change in the inclinometer reading. In this way, the Kalman filter took advantage of the inclinometer and compass measurements when they were the most accurate (with no head motion). This technique would encounter disadvantages in an aviation environment, however, because several phases of flight—including takeoff and coordinated turns—are exposed to sustained constant linear acceleration.
In a paper presented in 2000, Lin Chai and fellow researchers described the use of optical cameras to aid inertial tracking. In their system head-mounted cameras and computer vision techniques located and tracked naturally occurring features in a scene. It could estimate angular orientation, angular rates, as well as translational position, velocity, and acceleration of the camera with respect to an arbitrary reference frame.
The system used two extended Kalman filters, one to estimate the position of up to five points in the scene and the other to estimate the dynamics of the user’s head. Measurements were taken from three types of sensors: gyroscopes, accelerometers, and cameras. However, synthetic inertial sensor data was used because their system did not allow for simultaneous recording of video imagery and inertial sensor data.
Employing this technique as well as other inertial-optical tracking techniques would become more complicated in the aviation environment. Points being tracked by the camera could be inside or outside the cockpit; so, system designers would need to develop an algorithm that distinguished between the two types of points.
The drawback, of course, is that the accuracy of an INS using a MEMS IMU will degrade much more rapidly than an INS using a higher quality IMU. Lacking feedback corrections, the errors in a MEMS-based INS will quickly grow without bounds. We alleviated this problem in our research by estimating the errors in the INS through the use of a Kalman filter and GPS measurements.
GPS Receiver. The system uses a 12-channel C/A-code GPS receiver with an embedded antenna. Position and velocity data were obtained at a 1 Hz rate via an RS-232 serial connection using standard NMEA 0183 ASCII interface protocols. The receiver also provides a one-pulse-per-second (1PPS) output. The rising edge of the pulse is synchronized to the start of each GPS second. This pulse is used in the time synchronization of the IMU and GPS data. The receiver was placed on the side window of the test aircraft.
Inertial Measurement Unit. The MEMS IMU used in this system outputs raw binary sensor data from a triad of accelerometers, gyros, and magnetometers via an RS-232 serial connection. (The magnetometer outputs were ignored when navigating, because they did not generate meaningful measurements inside of the cockpit.) The MEMS IMU can sense angular velocity up to ±450 degrees/second and accelerations up to ±50 meters/second². The device is lightweight at only 35 grams and relatively small, with dimensions of 39 x 54 x 28 mm (W x L x H). The accompanying photo shows the IMU mounted on the headset.
The IMU’s sample frequency can be set between 10 Hz and 512 Hz. We selected a sample rate of 100 Hz for this research in order to provide a reasonable compromise between processing requirements and accuracy. Factory calibration data is provided for orthogonalization, scaling, and offset corrections; however, the manufacturer does not specify gyro drift rates.
Integration Computer. Because this project was a proof-of-concept demonstration, we chose to implement the integration algorithm on a Pentium 4 laptop running Microsoft Windows 2000. All of the navigation software, including the serial input/output and time synchronization, was implemented in MATLAB. Running MATLAB under Windows for a real-time system was not ideal, because Windows would occasionally “take over” the system for short periods of time (preventing any I/O in the process).
This required the software timing algorithms to be robust in the presence of data gaps or delays and necessitated occasional filter resets if a significant data gap occurred. Since the time of the flight tests described in this paper, the algorithms have been ported over to C++ running on a PC-104 embedded computer with a Linux operating system, which has proven to be a much more stable approach.
The position, velocity, and attitude states were initially modeled using a standard Pinson error model implementation as described in the citation by D. H. Titterton and J. L. Weston in the Additional Resources section. Later, we found that all of the higher-order terms in the Pinson error model could be neglected in this case without affecting system performance, because the higher order effects (such as the Schuler oscillation) are completely dominated by the large measurement errors inherent to the low-cost MEMS IMU that was used.
The accelerometer bias and gyro drift states were modeled as a first-order Gauss-Markov process. More details on the filter implementation can be found in the work by J. Joffrion cited in Additional Resources.
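As a rough illustration of the bias and drift model, a first-order Gauss-Markov process can be written in discrete time as shown below. The function names, the time constant, and the steady-state sigma are placeholders chosen for this sketch; the values actually tuned for the filter are not given in this article.

```python
import math
import random

def gauss_markov_discrete(dt, tau, sigma):
    """Return (phi, q): the state transition factor and the driving-noise
    variance of a first-order Gauss-Markov process over a step of dt seconds.

    tau   : correlation time constant in seconds (assumed value)
    sigma : steady-state standard deviation of the process (assumed value)
    """
    phi = math.exp(-dt / tau)
    q = sigma ** 2 * (1.0 - phi ** 2)
    return phi, q

def gauss_markov_step(x, dt, tau, sigma, rng=random):
    """Propagate one sample of the process, e.g. a gyro drift in deg/s."""
    phi, q = gauss_markov_discrete(dt, tau, sigma)
    return phi * x + rng.gauss(0.0, math.sqrt(q))

# Example with made-up numbers: 100 Hz propagation, 100 s correlation time
phi, q = gauss_markov_discrete(dt=0.01, tau=100.0, sigma=0.5)
```

With this choice of q, the variance of the state settles at sigma squared, which is why the model suits a slowly wandering but bounded sensor error.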
The GPS measurements are valid at the beginning of each GPS week second. Because of latencies in the receiver, however, the actual measurement data are not available until approximately 400 milliseconds after the measurement is valid. Two Kalman filter propagation cycles per measurement update period are used to accommodate the delay. At the time the measurement is valid, INS position and velocity are stored.
When the GPS measurement is available, a measurement update is accomplished using the stored INS position and velocity. The error states are then propagated to the current time and estimates of the errors in the INS then become available for feedback corrections (discussed in the following section). After feedback corrections are made, the error states are propagated forward to the next GPS week second to facilitate the next measurement update.
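The store-then-update timing described above can be sketched with a deliberately simplified scalar filter. Everything here is illustrative: the class and method names are hypothetical, the INS position error is modeled as a random walk (so propagation only inflates the variance), and the numbers are invented. The real filter carries the full error state described earlier.

```python
class DelayedUpdateFilter:
    """Scalar error-state filter illustrating latency-tolerant GPS updates."""

    def __init__(self, p0, q, r):
        self.err = 0.0          # estimated INS position error (m)
        self.var = p0           # variance of that estimate (m^2)
        self.q = q              # random-walk noise density (m^2/s), assumed
        self.r = r              # GPS measurement variance (m^2), assumed
        self.stored_ins = None

    def propagate(self, dt):
        """Random-walk propagation: the estimate holds, its variance grows."""
        self.var += self.q * dt

    def latch(self, ins_position):
        """At the GPS second, remember the INS solution for later use."""
        self.stored_ins = ins_position

    def update(self, gps_position):
        """When the late measurement arrives, compare it to the stored INS value."""
        z = self.stored_ins - gps_position        # observed INS error
        k = self.var / (self.var + self.r)        # Kalman gain
        self.err += k * (z - self.err)
        self.var *= (1.0 - k)

    def feedback(self, ins_position):
        """Correct the current INS solution and zero the error estimate."""
        corrected = ins_position - self.err
        self.err = 0.0
        return corrected

f = DelayedUpdateFilter(p0=100.0, q=1.0, r=25.0)
f.latch(1010.0)                  # INS position at the GPS second
f.update(1000.0)                 # GPS data arrives roughly 0.4 s later
f.propagate(0.4)                 # first cycle: bring the errors to "now"
corrected = f.feedback(1012.0)   # reset the INS with the estimated error
f.propagate(0.6)                 # second cycle: on to the next GPS second
```

The two `propagate` calls correspond to the two propagation cycles per measurement period: one covering the receiver latency and one covering the remainder of the second.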
Feedback Corrections. Estimates of the true position, velocity, and attitude as well as accelerometer bias and gyro drift are formed using the output of the INS navigation algorithm and the estimates of the errors in these quantities from the Kalman filter. To minimize drift in the INS, the system uses estimates of the true position, velocity, and attitude to “reset” the INS every time a measurement is available.
The system performed better without resetting the accelerometer bias and gyro drift. Occasionally these states would become unstable in the feedback configuration. To keep the system stable, the algorithm utilizes a combination of feedforward and feedback implementations. States x1 to x9 (the position, velocity, and attitude errors) are feedback terms, while x10 to x15 (the accelerometer bias and gyro drift) are feedforward terms.
Real-Time Software. MATLAB’s serial port interface makes it possible to use MATLAB in a real-time environment for this application. Serial port objects are established for the IMU, GPS receiver, and 3D audio hardware. Communications with each piece of equipment vary, depending on the communications protocol for each device, and event callback functions represent the primary method by which to accomplish specific tasks.
For example, each NMEA ASCII sentence from the GPS receiver terminates with a carriage return followed by a linefeed. To take advantage of this feature, each time MATLAB detects this specific terminator on the serial bus, it executes a callback function. This function reads all current data on the serial bus and checks for specific NMEA sentence headers. It then parses the desired data into a MATLAB structure.
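The parsing step in such a callback might look like the following sketch, written in Python rather than MATLAB for brevity. It handles only the standard GGA sentence; which sentences the project actually parsed is not stated here, and the function names and dictionary keys are invented for the example.

```python
from functools import reduce

def nmea_checksum(body):
    """XOR of all characters between '$' and '*', as two uppercase hex digits."""
    return "%02X" % reduce(lambda acc, ch: acc ^ ord(ch), body, 0)

def to_degrees(value, hemisphere):
    """Convert NMEA (d)ddmm.mmmm (degrees, then decimal minutes) to signed degrees."""
    raw = float(value)
    degrees = int(raw // 100)
    minutes = raw - 100 * degrees
    signed = degrees + minutes / 60.0
    return -signed if hemisphere in ("S", "W") else signed

def parse_gga(sentence):
    """Parse one GGA sentence into a dict; return None if it is unusable."""
    sentence = sentence.strip()                  # drop the trailing <CR><LF>
    if not sentence.startswith("$") or "*" not in sentence:
        return None
    body, checksum = sentence[1:].split("*", 1)
    if nmea_checksum(body) != checksum.upper():
        return None
    fields = body.split(",")
    if not fields[0].endswith("GGA") or len(fields) < 10:
        return None
    try:
        hh, mm, ss = fields[1][0:2], fields[1][2:4], fields[1][4:]
        return {
            "utc_seconds": int(hh) * 3600 + int(mm) * 60 + float(ss),
            "lat_deg": to_degrees(fields[2], fields[3]),
            "lon_deg": to_degrees(fields[4], fields[5]),
            "fix_quality": int(fields[6]),
            "altitude_m": float(fields[9]),
        }
    except ValueError:
        return None                              # empty fields, e.g. no fix yet
```

Rejecting sentences with a bad checksum or empty fields keeps a corrupted serial read from reaching the Kalman filter as a measurement.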
MATLAB integrates the one-pulse-per-second (1PPS) output from the GPS receiver with other system data using its PinStatusFcn function. This callback function is typically used to detect the presence of connected devices or control the flow of data. A user-specified function will execute whenever the status of one of the RS-232 control pins changes.
The pulse output from the GPS receiver is tied to the carrier detect (CD) pin, and the rising edge of the pulse is captured using logic in the PinStatusFcn. The start of GPS week second is determined when the CD pin transitions from low to high. According to the GPS receiver manufacturer, 1PPS accuracy of the receiver is ±1 microsecond.
The IMU outputs data in a continuous binary format with no terminators; so, a subroutine checks for the number of bytes available on the serial bus. Each data packet sent from the IMU consists of 24 bytes. If 24 or more bytes are available on the serial bus, the subroutine searches for the message header, checks for data validity, and stores the data in a temporary software buffer until it can be read into the INS mechanization algorithm.
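A framing routine of this kind can be sketched as follows. The 24-byte packet size comes from the text above, but the header bytes, field layout, and additive checksum are assumptions made up for this example and do not describe the actual device protocol.

```python
import struct

PACKET_SIZE = 24
HEADER = b"\xAA\x55"   # hypothetical 2-byte header, not the real device's

def extract_packets(buffer):
    """Pull complete packets out of a bytearray, leaving any partial tail in place.

    Assumed layout (illustrative only): header (2) | uint16 sample counter (2) |
    nine int16 sensor words (18) | uint16 additive checksum (2).
    """
    packets = []
    while len(buffer) >= PACKET_SIZE:
        start = buffer.find(HEADER)
        if start < 0:
            # No header found: keep the last byte in case a header straddles reads
            del buffer[:-1]
            break
        del buffer[:start]                 # discard bytes ahead of the header
        if len(buffer) < PACKET_SIZE:
            break                          # wait for the rest of the packet
        frame = bytes(buffer[:PACKET_SIZE])
        counter, *words, checksum = struct.unpack(">H9hH", frame[2:])
        if sum(frame[:-2]) & 0xFFFF == checksum:
            packets.append((counter, words))
            del buffer[:PACKET_SIZE]
        else:
            del buffer[:1]                 # false header: resync one byte later
    return packets
```

Because the routine consumes only whole, validated packets, it can be called every time new bytes arrive without losing data that is split across two serial reads.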
In addition, this subroutine time-tags the IMU data’s arrival with GPS week seconds, using a combination of the NMEA data, the 1PPS, and the IMU sample counter. The sample counter is included in the IMU data packet and is incremented every sample period. This 16-bit counter rolls over upon reaching 2^16 (65,536) sample period counts.
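At the 100 Hz rate used here, a 16-bit counter wraps every 655.36 seconds, so the time-tagging logic has to unwrap it. The sketch below shows one way to do that; the class and method names are hypothetical, and it assumes the most recent IMU sample is treated as coincident with the 1PPS edge.

```python
COUNTER_MODULUS = 1 << 16    # 16-bit sample counter
SAMPLE_RATE_HZ = 100.0       # rate selected in this work

class SampleClock:
    """Turn wrapped IMU sample counts into GPS week seconds."""

    def __init__(self):
        self.last_raw = None
        self.total = 0          # unwrapped sample count
        self.ref_total = None   # unwrapped count latched at a 1PPS edge
        self.ref_time = None    # GPS week second of that edge

    def unwrap(self, raw):
        if self.last_raw is not None:
            # The modular difference is correct as long as fewer than
            # 65,536 samples elapse between consecutive packets
            self.total += (raw - self.last_raw) % COUNTER_MODULUS
        self.last_raw = raw
        return self.total

    def latch_pps(self, gps_week_second):
        """Call on the 1PPS rising edge, with the second taken from NMEA data."""
        self.ref_total = self.total
        self.ref_time = gps_week_second

    def time_tag(self, raw):
        """Return the GPS week second assigned to the sample with this count."""
        total = self.unwrap(raw)
        return self.ref_time + (total - self.ref_total) / SAMPLE_RATE_HZ

clk = SampleClock()
clk.unwrap(65530)          # last sample before the pulse
clk.latch_pps(1000.0)      # 1PPS edge at GPS week second 1000
t = clk.time_tag(4)        # counter has wrapped: 10 samples later
```

Counting samples between pulses, rather than trusting the arrival time of each packet, keeps the time tags insensitive to serial port and operating system latency.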
The timing scheme that we’ve described here is not ideal, as it is subject to timing variations due to the varying latencies involved with the serial ports and the MATLAB program running under the Windows 2000 operating system. Such variation normally would not significantly affect performance because the timing variations are only on the order of milliseconds, and the performance requirements (a few degrees with no noticeable latency) are not all that stringent. However, at times the operating system takes over for a second or two, which causes the head-tracking system to have to reinitialize.
Additionally, the system would occasionally (every 10-30 minutes) hang, which is common when using MATLAB for serial I/O. Because the serial ports are used, it is doubtful that timing accuracies better than 1 ms can be obtained. (This problem was subsequently mitigated, but not completely removed, by porting the system to Linux/C++, which enabled the system to run indefinitely without hanging. The variations in timing of the system, according to the computer CPU, are more consistently on the order of a few milliseconds than on the Windows platform.)
The on-board GPS-aided inertial navigation reference (GAINR) system used as a truth reference generated time space positioning information (TSPI) data from an embedded GPS/INS (EGI) containing a digital laser gyro and keyed SAASM-based C/A-P(Y)-code receiver in post-processing mode. According to its manufacturer, the GAINR one sigma accuracies are specified to be 0.8 foot for position, 0.01 feet/second for velocity, and 0.05 degree for attitude.
The head tracker laptop, 3D audio laptop, pan-and-tilt unit, and IMU were mounted to a plate on top of the existing data acquisition system rack. (The 3D-audio/head tracker system involved the two laptops and the small IMU mounted on the top; the rest of the rack held the truth reference system.) The pan-and-tilt simulated head movement in a measurable way (e.g., rotate the IMU a known number of degrees).
Unfortunately, the pan-and-tilt’s actuator proved to be incompatible with aircraft power; so, it could not be used in this evaluation. Precise location of all equipment was determined through the use of laser surveying equipment. A lever-arm correction was not applied for the head tracker, since the GPS antenna was within one meter of the IMU (well within the GPS position measurement accuracy).
Performance Evaluation of Head Tracker
Subsequently, we evaluated the head tracker using these updated Kalman filter parameters and collecting head-tracker data as well as TSPI GAINR-system data during a second dedicated flight. For this test, the head tracker system was again firmly fixed to the aircraft body frame so that the head tracker solution could be compared to the reference system solution.
The results, which we will discuss shortly, come from a 24-minute section of the flight flown at an altitude of approximately 12,000 feet.
Because they represent the primary output of the system, attitude results will be presented first. Figure 4 shows both the TSPI (true) attitude and the head tracker filter-estimated attitude. In a broad sense, the head tracker system was able to accurately determine the attitude throughout this test. A plot of the error in filter-estimated attitude (relative to the TSPI attitude), expressed in local-level axes, is shown in Figure 5. The dotted lines show the filter-computed 1σ covariance values. (To view figures and tables, download the PDF version of this article, above.)
The east and north attitude errors were generally within 1-2 degrees (with occasional spikes probably due to timing irregularities stemming from the MATLAB/Windows latency issues). In contrast, the down (i.e., azimuth) tilt error was significantly larger, both in the filter-computed covariance and the actual error. This results directly from a lack of observability of azimuth error when the aircraft is not accelerating in a horizontal direction.
The most common way for an INS/GPS system of this quality to detect and correct for attitude errors is to effectively correlate the acceleration sensed by GPS (obtained from a position and velocity history) with the acceleration sensed by the IMU accelerometers. In the case of the east and north tilt errors, a downward acceleration of approximately 1G always exists; so, any misalignment along these axes is interpreted as an incorrect horizontal acceleration.
When the same acceleration is not seen by the GPS system, the filter realizes that a tilt error has occurred and corrects for it. In contrast, when the aircraft flies straight and level, there is no horizontal acceleration; so, the filter has no way to detect misalignments about the vertical axis. Note that this comparison between GPS and IMU acceleration is done implicitly by the Kalman filter mechanization, not by separately computing and then comparing two different acceleration profiles.
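A back-of-the-envelope calculation makes the observability argument concrete. The sketch below uses nominal gravity and illustrative numbers only; the function names are invented for this example, and the second function considers just the cross-track component of the discrepancy.

```python
import math

G = 9.81  # m/s^2, nominal gravity

def apparent_horizontal_accel(tilt_error_deg):
    """Horizontal acceleration falsely sensed when level is misjudged by this angle."""
    return G * math.sin(math.radians(tilt_error_deg))

def heading_error_accel(heading_error_deg, true_horizontal_accel):
    """Cross-track acceleration discrepancy caused by a heading error.

    A heading error rotates the sensed horizontal acceleration vector, so the
    discrepancy scales with the true horizontal acceleration and vanishes in
    straight and level flight.
    """
    return true_horizontal_accel * math.sin(math.radians(heading_error_deg))

tilt_signal = apparent_horizontal_accel(1.0)      # about 0.17 m/s^2, always present
level_signal = heading_error_accel(15.0, 0.0)     # exactly zero: unobservable
turn_signal = heading_error_accel(15.0, 3.0)      # about 0.78 m/s^2 during a turn
```

A one-degree tilt error produces a persistent signal that the filter can compare against GPS, whereas even a 15-degree azimuth error produces nothing until the aircraft accelerates horizontally.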
From minutes 3-8, the aircraft was flying straight and level, with minimal horizontal acceleration. Not surprisingly, the down tilt error grew during that time period to a worst-case value of approximately -15 degrees. Once the aircraft turned at the 8-minute point, the azimuth estimate reconverged to within a degree or two of the true value.
These results, while expected, do highlight one of the potential difficulties of using a MEMS IMU integrated with GPS: During long periods of straight and level flight, the system may be prone to drifting in azimuth. We made two attempts to mitigate this effect. First, we considered the use of a 3-axis magnetometer that is also part of the MEMS IMU.
Initial testing indicated that, on a pilot’s moving head in the middle of a metal aircraft cockpit, the magnetometer outputs could not provide any meaningful information about head orientation. A second attempt, which was successful, used a heading derived from the GPS-based velocity vector as an additional attitude measurement to constrain the azimuth drift.
Figure 6 shows, using a dotted line, the error after applying this heading measurement correction and reveals a significant improvement in azimuth accuracy.
Two things should be noted about the GPS velocity-based heading approach. First, it assumes that the velocity vector and the heading are the same. Depending upon the wind magnitude and direction relative to the aircraft velocity vector, however, an aircraft may not always be pointed exactly in the direction of travel (an effect known as “crabbing”).
As a result, unless the wind effect is known, this GPS-derived approach could result in a heading bias (although it would keep the heading error from growing unbounded). Secondly, the GPS velocity-based approach would not work when the IMU is moving relative to the aircraft airframe (as in the 3D audio case); so, it was not used for the remainder of the testing, leaving the azimuth drift uncorrected during phases of straight level flight.
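The velocity-based heading and the size of the crab-induced bias can be sketched as follows. The function names and the wind and airspeed figures are illustrative assumptions, and the crab angle uses the standard wind-triangle relation rather than anything specific to this system.

```python
import math

def track_angle_deg(v_north, v_east):
    """Ground track in degrees clockwise from north, from GPS velocity components."""
    return math.degrees(math.atan2(v_east, v_north)) % 360.0

def crab_angle_deg(true_airspeed, wind_speed, wind_from_deg, track_deg):
    """Heading minus track for a given wind (wind-triangle relation).

    Positive when the wind comes from the right of track, so the nose
    points to the right of the direction of travel.
    """
    crosswind = wind_speed * math.sin(math.radians(wind_from_deg - track_deg))
    return math.degrees(math.asin(crosswind / true_airspeed))

track = track_angle_deg(v_north=0.0, v_east=50.0)             # heading due east
bias = crab_angle_deg(true_airspeed=80.0, wind_speed=10.0,
                      wind_from_deg=180.0, track_deg=track)   # wind from the right
```

In this made-up case, a 10 m/s direct crosswind at 80 m/s airspeed gives a crab angle of roughly 7 degrees, which is the bias a velocity-derived heading would carry if the wind were unknown. Note also that the track angle becomes ill-defined at very low ground speed.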
Figures 7 and 8 show head tracker position and velocity errors, and the filter-computed 1σ covariance values. These plots reveal errors significantly outside of the ±1σ bounds, particularly during periods of dynamics. This most likely stems from a residual timing error within the system. (Possibly, the same timing errors caused some of the spikes in the head tracker attitude errors as well). Further system refinement (including porting to a better real-time operating system) would probably reduce these errors.
The test conductor initiated a set of azimuth/elevation angle sound cues, which were presented randomly to the pilot from uniformly distributed locations. Twelve discrete azimuths (1 to 12 o’clock) and three discrete elevations (low, medium, and high) were possible. The azimuth of the sound cue was generated with reference to the current aircraft heading. At the completion of each aural presentation, the pilot responded with the perceived direction of the sound (e.g., 3 o’clock low). The test conductor recorded the pilot’s response and the commanded sound position.
These tests were performed in two modes: (1) the 3D audio system coupled to aircraft attitude using GAINR data and (2) the 3D audio system coupled to head attitude using head-tracker data. When the 3D audio system is coupled to the GAINR system, the direction of sound depends on aircraft orientation. When the 3D audio system is coupled to the head tracker, the direction of sound depends on head orientation.
In mode 1, when the head tracker is not used, 3D audio cues therefore remain “fixed” to the user’s orientation. For example, assuming that the aircraft to which the GAINR is fixed does not change course, if a cue is presented directly in front of the user and he turns his head 90 degrees to the right, the cue will still sound as though it is coming from in front of him (i.e., in the direction he is looking).
In mode 2, the 3D audio system is coupled to the head tracker, and the locational origin of sounds remains spatially fixed. Imagine the same user facing north, and a cue is presented directly in front of him. When the user turns his head 90 degrees to the right, the cue still sounds as if it is coming from the north, that is, from his left side.
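The difference between the two modes comes down to which yaw angle is subtracted from the cue direction before rendering. The sketch below considers azimuth only (pitch and roll are ignored), and all function names are hypothetical.

```python
def wrap_180(angle_deg):
    """Wrap an angle to the interval [-180, 180)."""
    return (angle_deg + 180.0) % 360.0 - 180.0

def clock_to_azimuth(clock_hour, aircraft_heading_deg):
    """Convert a clock position (12 = dead ahead of the aircraft) to an
    azimuth in degrees clockwise from north."""
    return (aircraft_heading_deg + 30.0 * (clock_hour % 12)) % 360.0

def rendered_azimuth(cue_azimuth_deg, reference_yaw_deg):
    """Direction in which to render the cue, in degrees to the right (+) or
    left (-) of the reference direction.

    Mode 1: reference_yaw_deg is the aircraft heading (GAINR-coupled).
    Mode 2: reference_yaw_deg is the head yaw from the head tracker.
    """
    return wrap_180(cue_azimuth_deg - reference_yaw_deg)

cue = clock_to_azimuth(12, aircraft_heading_deg=0.0)     # cue due north
mode1 = rendered_azimuth(cue, reference_yaw_deg=0.0)     # aircraft still heads north
mode2 = rendered_azimuth(cue, reference_yaw_deg=90.0)    # head turned 90 degrees right
```

With the head turned 90 degrees to the right, mode 1 still renders the cue straight ahead, while mode 2 renders it 90 degrees to the left, matching the two examples in the text.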
The 3D audio system had difficulty generating discernible elevation cues, and correct elevation responses were infrequent using both configurations. Only 40 percent of the GAINR-coupled elevation angle responses were correct, both on the ground and in the air.
Correct head tracker-coupled elevation responses were 42 percent on the ground and 46 percent in the air. Neither of these results is significant, because low, medium, and high are the only possibilities to choose from, and a user is statistically likely to guess the correct response 33 percent of the time without additional information from the 3D audio system.
Azimuth localization is a different story. When the 3D audio system was generated based upon the GAINR-computed aircraft attitude, only 40 percent of the azimuth angle responses were correct both on the ground and in the air. The GAINR-coupled system produced ambiguous responses to sound cues from forward and aft azimuths. Cues from a forward azimuth (e.g., 11 o’clock) were difficult to distinguish from cues from an aft azimuth (e.g., 7 o’clock). Left and right azimuths were easily discerned.
In contrast, with the system coupled to the head tracker, reported azimuth accuracy was significantly better. Around 56 percent of the azimuth angle responses were correct on the ground and 72 percent in the air. The better performance in the air probably results from the larger number of horizontal accelerations during flight, which means that the azimuth angle estimate from the system will be more accurate. On the ground, the system will tend to drift in azimuth.
This large improvement when using the head tracker probably stems from the way that humans resolve forward and aft ambiguities in sound. When hearing a tone with the head tracker, the pilot could slightly turn their head, and the 3D audio system would adjust the sound accordingly.
This “dither” feedback enables a human to distinguish between a sound coming from behind and a sound coming from ahead. The GAINR-coupled system did not change the sound when the pilot turned their head; so, this fore-aft ambiguity could not be effectively resolved.
The head tracker-coupled system eliminated azimuth ambiguities, greatly improving the azimuth performance of the 3D audio system. These results show that the heading estimates are accurate enough to provide real benefits to the 3D audio system. Even if head-tracker heading error is 10 degrees, this error is small when compared to the 180-degree azimuth ambiguity the user could experience with no head tracker.
Manufacturers
This project used the following equipment: H-764G-TSPI, digital laser gyro, SAASM-based C/A-P(Y)-code receiver, from Honeywell Defense & Space Electronics Systems, Clearwater, Florida; GPS 35 receiver, GARMIN International, Olathe, Kansas; MT9-B MEMS-IMU, Xsens, Enschede, The Netherlands; Pentium 4 laptop computer from Dell Computer Corporation, Round Rock, Texas; laser tracker from Faro Technologies Inc., Lake Mary, Florida.
Captain Jacque Joffrion recently graduated from the Air Force Institute of Technology where he earned an MSEE degree, specializing in guidance, navigation, and control. Joffrion also has a BSEE from the U.S. Air Force Academy, and he is a graduate of the U.S. Air Force Test Pilot School. He is currently serving as a B-1B experimental test pilot and has over 1,000 hours of flight time in military aircraft.
John Raquet currently serves as an associate professor of electrical engineering at the Air Force Institute of Technology, where he is also the director of the Advanced Navigation Technology Center. He has been working in navigation-related research for more than 15 years.
Douglas S. Brungart is currently serving as senior computer engineer and as technical advisor in the Battlespace Acoustics Branch of the Human Effectiveness Directorate in the Air Force Research Laboratory. He has been active in 3D Audio research for more than 15 years.
Copyright © 2017 Gibbons Media & Research LLC, all rights reserved.