Kim, Prinzel, Kaber, Alexander, Stelzer, Kaufmann and Veil (2011) conducted a study to assess the influence of head-up display (HUD) configuration on perceptions of display clutter, workload and flight performance. This article summarises and reanalyses the results to provide more usable findings.
Perception of Multidimensional Measure of Display Clutter
The present study achieved the objectives of developing a new multidimensional measure of display clutter and investigating the relationships among perceived and objective measures of clutter, workload, and flight performance.
|Table 1: Interaction among Measurements|
|Measurements||Pilot Flight Experience||Workload Level||Display Configuration||Flight Performance||Flight Segment||Overall Clutter Ratings|
|Pilot Flight Experience||—||Marginally Significant||Significant||Not Significant||Not Significant||Significant|
|Workload Level||—||—||Marginally Significant||Not Mentioned||Not Mentioned||Significant|
|Display Configuration||—||—||—||Significant||Marginally Significant||Significant|
|Flight Performance||—||—||—||—||Significant||Not Mentioned|
|Flight Segment||—||—||—||—||—||Not Mentioned|
|Overall Clutter Ratings||—||—||—||—||—||—|
Table 1 shows the significance of the interactions among pilot flight experience, workload level, HUD configuration, flight performance, flight segment and pilot perceptions of overall display clutter. These results indicate that clutter is a real property of displays that may lead to human factors problems.
|Table 2: Analysis Results for Hypotheses|
|Variables Compared||Test Statistic||Result||Hypothesis|
|Calculated clutter scores & Overall perceived clutter ratings||R=0.77, P<0.0001||Highly Significant||H1|
|Workload ratings by NASA-TLX scores & Pilot experience||F(2,45) =2.929, P =0.064||Marginally Significant||H2|
|Workload ratings by NASA-TLX scores & Display configuration||F(2,45)=7.911, P =0.001||Significant||H3 & 4|
|Pilot experience & Workload level||F(2,45) =3.023, P =0.059||Marginally Significant||H5|
|Basic visual display properties & Calculated clutter scores||R²=0.33, P<0.0001 (low workload); R²=0.18, P<0.0001 (high workload)||Significant||H6|
|Calculated clutter scores & Performance measures (LOC control and G/S control)||R²=0.037, P=0.018 (vertical), and R²=0.033, P=0.005 (horizontal deviation) RMSEs||Significant||H7|
|The log-transformed RMSE responses & Workload ratings by NASA-TLX scores||R²=0.024, P=0.024 (vertical), and R²=0.01, P=0.068 (horizontal deviation)||Significant||H7|
The researchers proposed seven hypotheses. Table 2 shows that most of the results are significant and support the hypotheses. The exception is H2: the researchers describe it as marginally significant, but its p-value (0.064) exceeds the 0.05 level of significance, so the result is not significant.
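The distinction between "significant" and "marginally significant" can be made explicit with a small sketch. The 0.05 threshold comes from the footnote; the upper bound of 0.10 for the "marginal" band and the function name `classify_p_value` are assumptions made here for illustration, not definitions from the study.

```python
def classify_p_value(p, alpha=0.05, marginal_upper=0.10):
    """Label a p-value against the significance level alpha.

    marginal_upper is an assumed cut-off for "marginally significant";
    the source does not state one.
    """
    if p < alpha:
        return "significant"
    if p < marginal_upper:
        return "marginally significant"
    return "not significant"


# p-values as reported in Table 2 for the ANOVA-based hypotheses
table2_p_values = {"H2": 0.064, "H3 & 4": 0.001, "H5": 0.059}

labels = {name: classify_p_value(p) for name, p in table2_p_values.items()}
for name, label in labels.items():
    print(f"{name}: p = {table2_p_values[name]} -> {label}")
```

Setting `marginal_upper` equal to `alpha` removes the marginal band, which reproduces the stricter reading used in this article.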
Table 2 shows that calculated clutter scores were highly correlated with overall perceived clutter ratings. This supported the hypothesis (H1) that the new multidimensional measure of clutter would have internal consistency. In agreement with H5, Table 2 shows that high-experience pilots were less sensitive to the workload manipulation according to NASA-TLX scores. To address H6, multiple linear regression models of clutter scores based on HUD visual properties were developed. Software-based analysis of HUD images yielded visual property results that were predictive of clutter scores. Table 2 therefore indicates that pilot perceptions of clutter in new HUD designs can be projected based, in part, on low-level display characteristics. Finally, Table 2 indicates that, as expected by H7, both normalized clutter and NASA-TLX scores were significant predictors of pilot performance (vertical and horizontal deviation measures) in the various segments of the landing approach.
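The H1 result rests on a Pearson correlation between calculated clutter scores and overall perceived clutter ratings. The following is a minimal sketch of that coefficient. The data values are invented for illustration and are not taken from the study, and `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length (at least 2 values)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Illustrative (made-up) values, one pair per observation
calculated_scores = [4.0, 7.5, 9.0, 12.5, 15.0, 17.5]
perceived_ratings = [5, 6, 11, 12, 14, 19]

r = pearson_r(calculated_scores, perceived_ratings)
print(f"r = {r:.2f}")
```

A value of r close to 1, such as the R=0.77 reported in Table 2, means that observations with higher calculated scores also tended to receive higher overall ratings.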
|Table 3: Clutter ratings by HUD configuration and pilot experience|
|HUD configuration||Low Clutter||Medium Clutter||High Clutter|
Contrary to H2, Table 3 indicates that high-experience pilots were more sensitive to display clutter and were more accurate and consistent in judging the occurrence of clutter (imagery obscuring or confusing other information). This also suggests that flight experience may support pilots in extracting relevant information from displays and in judging when information is extraneous (i.e., clutter).
|Table 4: NASA-TLX scores of workload ratings by HUD configuration and pilot experience|
|Pilot Experience||Low Experience||Medium Experience||High Experience|
Table 4 shows that negative effects of both low-clutter and high-clutter displays were found across workload and performance measures. This indicates that some optimal amount of HUD information may exist, balancing information overload against support for flight path control. This was predicted by H3 and H4.
This was an exploratory study into the effects of clutter on flight experience as perceived by commercial airline pilots.
- 18 current commercial airline pilots with no prior HUD experience.
- The sample comprised male pilots (n=16) and female pilots (n=2), aged 23 to 51 years (mean = 40.4 years), with total flight hours from 1500 to 20,900 (mean = 8947.8 h).
Criterion (dependent) variable
- Pilot rankings of the various dimensions of clutter for describing HUDs were collected after pilot training on the IFD and flight scenario.
- Subjective ratings of overall perceived display clutter on a scale from 1 = “low” to 20 = “high” and ratings on the underlying dimensions of clutter were collected at the close of each flight segment.
- Clutter scores were calculated by rank-weighted sums of ratings across the six clutter sub-scales (redundancy, colorfulness, salience, dynamics, variability and density).
- Basic visual properties of HUDs (contrast, occlusion, display density and luminance) were calculated to predict the calculated clutter scores resulting from the multidimensional subjective measure.
- Pilot ratings of workload were recorded by using NASA-TLX.
- Pilot performance was recorded in each segment.
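The rank-weighted sum used for the clutter score can be sketched as follows. The source states only that scores were rank-weighted sums of ratings across the six sub-scales. The weighting scheme below (rank 1 is the most important dimension, weights proportional to reversed rank and normalised to sum to 1) is an assumption for illustration, as are the function names and the example ranks and ratings.

```python
SUBSCALES = (
    "redundancy", "colorfulness", "salience",
    "dynamics", "variability", "density",
)


def rank_weights(ranks):
    """Turn ranks (1 = most important, assumed) into weights that sum to 1."""
    n = len(ranks)
    if sorted(ranks.values()) != list(range(1, n + 1)):
        raise ValueError("ranks must be a permutation of 1..n")
    total = n * (n + 1) / 2
    return {name: (n + 1 - rank) / total for name, rank in ranks.items()}


def clutter_score(ranks, ratings):
    """Rank-weighted sum of the sub-scale ratings."""
    weights = rank_weights(ranks)
    return sum(weights[name] * ratings[name] for name in SUBSCALES)


# Hypothetical pilot: ranks collected once, ratings collected per segment
example_ranks = {
    "density": 1, "salience": 2, "redundancy": 3,
    "dynamics": 4, "variability": 5, "colorfulness": 6,
}
example_ratings = {
    "redundancy": 8, "colorfulness": 4, "salience": 14,
    "dynamics": 10, "variability": 6, "density": 16,
}

score = clutter_score(example_ranks, example_ratings)
print(f"clutter score = {score:.2f}")
```

Because the weights sum to 1 in this sketch, the score stays on the same scale as the sub-scale ratings, and a pilot who gives every sub-scale the same rating gets that rating back as the score.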
Predictor (independent) variable
- Three experience groups (‘Low’, ‘Medium’ and ‘High’) were formed based on pilot total flight experience.
- Three HUD configuration sets (‘high clutter’, ‘medium clutter’, and ‘low clutter’) were presented. Three target displays were selected to represent unique HUD feature sets within each group for a total of nine test displays.
- Two levels of flight workload (‘High workload’ under crosswind condition and ‘Low workload’ under no wind condition) were used.
- Three legs (phases) of flight were separated.
Data were collected from an experiment comprising a total of 108 trials across all pilots, yielding 324 observations on perceived workload, ratings of the dimensions of clutter, and overall perceived clutter.
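The reported totals are consistent with the design described above, as the short check below shows. It assumes that each pilot flew one trial for every combination of HUD configuration set and workload level, which the source does not state explicitly, so the per-pilot figure is an inference.

```python
pilots = 18
hud_configuration_sets = 3  # low, medium and high clutter
workload_levels = 2         # low (no wind) and high (crosswind)
flight_segments = 3         # ratings collected at the close of each segment

# Assumed design: one trial per configuration set x workload level per pilot
trials_per_pilot = hud_configuration_sets * workload_levels
total_trials = pilots * trials_per_pilot
total_observations = total_trials * flight_segments

print(total_trials, total_observations)
```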
1. For H1, Correlation analyses (Pearson coefficients) were conducted to identify whether pilot ratings on the underlying dimensions of clutter were consistent with overall perceived clutter ratings.
2. A series of repeated measures ANOVAs was conducted to assess the effects of pilot experience level (low, medium, or high), HUD configuration (low, medium, or high clutter), flight task workload level (low, high), and flight segment (1–3) on the overall clutter ratings, NASA-TLX scores, and a subset of the flight performance measures.
3. Post hoc tests were conducted to assess the HUD configuration effect.
4. Regression analysis (t-tests) showed that all model parameters were significant predictors (P<0.05) of clutter score, except for occlusion under the high flight workload condition.
5. Regression models (multiple linear, step-wise and best-fit regression models) were developed to predict the flight performance measures from display clutter and TLX scores. The scores were converted to standardized z-scores. For each regression model, graphical analysis and diagnostic tests were conducted on the residuals to assess the normality assumption.
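The regression step in item 5 can be illustrated with a simplified sketch: standardise the predictor to z-scores, log-transform the RMSE response, fit a least-squares line, and keep the residuals for diagnostic checks. This uses a single predictor, whereas the study used multiple, step-wise and best-fit models with both clutter and TLX scores. The data values and helper names are invented for illustration.

```python
import math


def z_scores(values):
    """Standardise values using the sample mean and sample SD (n - 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return [(v - mean) / sd for v in values]


def rmse(deviations):
    """Root mean square error of a series of path deviations."""
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (intercept, slope, r_squared, residuals).
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
    ss_res = sum(e * e for e in residuals)
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return intercept, slope, 1 - ss_res / ss_tot, residuals


# Made-up data: one clutter score and one deviation trace per trial
clutter_scores = [6.0, 8.5, 11.0, 13.5, 16.0]
trial_deviations = [
    [0.4, -0.6, 0.5],
    [0.8, -0.7, 0.9],
    [1.0, -1.2, 0.9],
    [1.5, -1.1, 1.4],
    [1.6, -1.9, 2.0],
]

z_clutter = z_scores(clutter_scores)
log_rmse = [math.log(rmse(d)) for d in trial_deviations]
intercept, slope, r_squared, residuals = fit_line(z_clutter, log_rmse)
print(f"slope = {slope:.3f}, R^2 = {r_squared:.3f}")
```

The returned residuals are what a normality check, such as a Q-Q plot or a formal test, would be applied to. The R² values in Table 2 (0.037 and lower) show that in the study itself these predictors explained only a small share of the variance in the deviation measures.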
- For evaluation of a range of aviation system display concepts beyond SVS and EVS HUDs.
- For other researchers working on pilot performance measures in other flight simulation studies.
- For airlines as a basis for new avionics display certification and systems acquisitions.
- For evaluating air traffic management support display technologies for the occurrence of clutter and to assess the reliability of the measurement outcomes.
- Sang-Hwan Kim, Lawrence J. Prinzel, David B. Kaber, Amy L. Alexander, Emily M. Stelzer, Karl Kaufmann, and Theo Veil (2011). //Multidimensional Measure of Display Clutter and Pilot Performance for Advanced Head-up Display.// Aviat Space Environ Med 2011, 82:1–10.
- Sang-Hwan Kim, Karl Kaufmann, and Simon Hsiang (2008). //Perceived Clutter in Advanced Cockpit Displays: Measurement and Modeling with Experienced Pilots.// Aviat Space Environ Med 2008, 79:1–12.
Footnote 1: A p-value of 0.05 is used as the level of significance in this research.