Mental Workload Assessment

 

What is mental workload?

Generic measures of Mental workload

Factors that impact upon operator ‘mental workload’?

Approaches to workload assessment

Measuring mental workload

Classic task analysis

‘Quasi computational Metrics’

A note on the mental workload literature

Illustrative Bibliography

 


 

1.                              What is mental workload?

 

‘Mental workload’ is a way of describing the mental stress and strain of being busy at work. Excessive ‘mental workload’ often leads to slips, mistakes, misunderstandings, omissions and other errors. ‘Mental workload’ is important in the operation of safety-critical systems, where there are many ‘mental’ tasks. These include:

 

*      Active and passive vigilance tasks.

 

*      Problem recognition and diagnosis.

 

*      Formulation and implementation of plans of action.

 

*      Prioritisation of plans of action.

 

*      Complex, extensive and integrated multi-faceted communications by four different modes of communication.

 

*      Remembering to do things.

 

*      Making prompt decisions based on the integration of experience and an understanding of current situations.

 

*      Coping with unexpected events.

 

 

The mental workload of the operators of complex systems is an idea that has been thought about for many years. It is possible to have too low a mental workload, so that mistakes are caused by boredom or inattention because target events and noteworthy occurrences are so infrequent that they are simply not expected. However, it is high mental workload that more commonly springs to mind when one considers the subject.

 

Some examples illustrate different aspects of the notion of ‘high mental workload’:-

 

*      A pilot dealing with an emergency with too many things to do.

 

*      A general practitioner with too many patients to see in too short a space of time.

 

*      A design team with a cumbersome quality procedure that must be adhered to, (though it hinders rather than helps), and a set of impossible deadlines and project based cost pressures.

 

Each of these cases is different, yet a little thought will show that in each case mental workload ‘pressure’ can lead to poor decision-making, planning and performance. Often the consequences of this poor performance may not be apparent immediately, though the person or group may be aware that things are not right. Immediate organisational needs push towards quick, short-term solutions to reduce the discomfort associated with excessive mental workload. The problem with high mental workload is that it leads to mistakes, errors, stress and the development of sub-optimal coping strategies.

 

Work-based situations do not exist in isolation. Training and development, management style and the tools and procedures given to an operator to do ‘the job’ can either make workload acceptable that otherwise would not be so, or make an otherwise acceptable situation unacceptable[1]. Mental workload is not simply about how easy an interface is to use, but about how well operators are supported in doing the mental aspects of their task as they go through their work on a day-to-day basis. It is not simply about the task ‘event rate’ or ‘information rate’ that operators have to deal with. It also concerns the causes and consequences of those events. If events within a task are wrapped in uncertainty and unpredictability, the workload will be higher than it might otherwise have been. Task difficulty and training are involved: novices have a much higher mental workload than skilled operators. Planning and monitoring may suffer not just because operators are busy, but also because they are preoccupied. Mental workload sounds simple, but it is actually very complex.

 

Defining mental workload is not as easy as one might think; a working definition might be ‘the real and perceived increase in task difficulty caused by any factors that impair decision-making, planning, reasoning and other mental tasks concerned with the job in hand’. A touch tautological, but that is often the case with high-level concepts in cognitive ergonomics. ‘Human Error’ and ‘Situational Awareness’ are two other constructs that are clearly important but equally difficult to define. Each of these three ideas is difficult to define and also resists attempts to measure it. Attempts at measurement imply that there must be some relative or absolute scale, when in fact, in situations where error, workload or situational awareness are important, what they are is defined precisely by the context of the operations and the details of the system that is being used. Lose the context and you lose the essence of the construct. This is why generic definitions of mental workload beg context. Mental workload is a synonym for ‘overall cognitive task difficulty’. We cannot understand what this means unless we supply details of each task and the context in which it is carried out.

 

 

 

1.1                         Generic measures of Mental workload

 

It would be very convenient if there were a measure of mental workload that was in some way generic or transferable from one context to another. The development cost of such an instrument would be ‘one off’ and it would be possible to make a wealth of comparisons across different sectors and settings.  Unfortunately things are not so simple. The fact that there are common features between two different tasks does not mean that they contain the same mental workload.

 

Imagine the workload of the pilot of a plane flying either between London and Brussels (45 minutes) or between London and Moscow (3 hours). Even in the same airline and the same type of plane, the mental workloads are not equivalent. The three-hour flight to Moscow is a much more relaxed affair in the air, with more ground-based difficulties on arrival. The Brussels flight is almost over before it has begun, leaving the pilots with a different work profile and a different workload problem. The point is a simple one. The workload experienced by the pilot of a plane is dependent on the context in which they are operating and is not simply a feature of the interface.

 

If the assessment of workload is dependent on context, then it becomes harder for a generic measure to be credible unless it takes account of context in some way. The context and the system operators thus become a core part of an iterative evaluation of mental workload.

 

2.                              Factors that impact upon operator ‘mental workload’?

 

Many things can affect the ‘mental workload’ of an operator in a control room, cockpit or similar setting. The ‘operator interface’ of a safety-critical system is only one important factor that can either increase or decrease operator ‘mental workload’. Some other important factors are:-

 

*      Skill levels of operators and how well they are trained.

 

*      Operating procedures.

 

*      Operating conditions.

 

*      Station staffing levels and staff competence.

 

*      How tasks are allocated between different people and between people and automated systems.

 

*      Organisational expectations, such as the need to ‘keep open’ may also have an impact upon operator ‘mental workload’.

 

Mental workload in a control room, cockpit or similar setting is not just a feature of the design of that workplace, but a product of the operators’ skills and competence, the means they have to control the station and the situations with which they are confronted. Operator mental workload should be reviewed as part of the station management process on a regular basis, and certainly if changes are made to any of the factors listed immediately above.

 

 

3.                              Approaches to workload assessment

3.1                         Measuring mental workload

 

There is no all-encompassing way of assessing operator ‘mental workload’. Methods that have been used include:-

 

*      Mapping, over a time base, every detailed interaction and spoken thought of operators while doing a certain task.

 

We can observe people doing their work, analyse their activities, ask them questions about their perceptions of their mental workload, and construct a detailed analysis to see how busy they are doing certain key tasks at key times.

 

*      Assessing the impairment of a second, distracter task as workload on a main task increases.

 

Relatively simple tasks, such as the perceptual-motor actions needed to fly a plane or to run certain aspects of industrial systems, have been simulated in controlled situations and workload assessed by giving operators other tasks to perform and increasing the multiplicity and difficulty of these tasks. As main task performance is impaired, the increases in secondary tasks allegedly provide the assessment of workload.

 

*      Measuring certain supposed physiological correlates of high ‘mental workload’.

 

A variety of physiological responses, such as pupil dilation, frequency of change of gaze, heart rate, galvanic skin response and so on, have also been suggested. They lack validity and their relationship to task performance is questionable.

 

*      Measuring task performance over a ‘busy’ time.

 

We can see how well operators do their job and infer high workload from poor performance. This is far too simplistic and is only taken seriously by those in search of the holy grail of a simple, generic methodology that, unfortunately, does not exist.

 

*      Subjective ratings, questionnaires and checklists of various descriptions.

 

We can ask people how they think their workload is. This is not wholly reliable unless you train people to know what they are doing. If you just ask them outside of any context and without any development or appreciation of what it is they are doing, they may produce contradictory or unhelpful responses.

 

Very often these methods are aimed at ‘single user’ situations and not at ‘multiple-user’, ‘multiple-event’ control rooms. The control room is a notoriously difficult environment in which to assess ‘mental workload’. This is because in a control room, workload is not related to any particular event but to a number of different events occurring at the same time. These assessment methods are acceptable in some ways but also have the potential to be quite damaging. They can have variable construct validity and they apply badly to industrial, control room and other environments where ‘open ended’ thinking is often required. They also do not take account of the most important aspect of workload measurement: the context in which the work is being carried out, including the skills base of operators and other station staff and organisational factors.

 

3.2                         Classic task analysis

 

This involves classic task analysis, mainly in dealing with separate situations like fires and evacuations. Full data on operator actions is plotted and analysed by way of time lines, link analysis and similar methods of representation.

 

*      Advantages

 

This methodology is precise and anchored in sets of operator actions conducted in predictable circumstances. The analyses provided a valuable usability audit of the interface, concluding that the interfaces were, as a rule, usable.

 

*      Disadvantages

 

These analyses tend to concentrate on low-level task items, and don’t deal effectively with higher cognitive aspects of the task.
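As a rough illustration of the time-line representation described in this section, the sketch below counts how many operator actions overlap at each moment of a scenario. It is a minimal sketch only: the `Action` record, the function names and the sample actions are all hypothetical and are not taken from the study. In keeping with the disadvantage noted above, it captures only low-level overlap of actions and says nothing about the cognitive difficulty or the context of those actions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """One operator action on the time line (illustrative fields only)."""
    label: str
    start: float  # seconds from the start of the scenario
    end: float    # seconds from the start of the scenario


def concurrency_profile(actions):
    """Return (time, number of overlapping actions) at every change point."""
    events = []
    for action in actions:
        events.append((action.start, 1))   # action begins
        events.append((action.end, -1))    # action finishes
    # Sort by time; at equal times an end (-1) sorts before a start (+1),
    # so back-to-back actions are not counted as overlapping.
    events.sort(key=lambda e: (e[0], e[1]))

    profile = []
    active = 0
    for time, delta in events:
        active += delta
        if profile and profile[-1][0] == time:
            profile[-1] = (time, active)   # merge changes at the same instant
        else:
            profile.append((time, active))
    return profile


def peak_concurrency(actions):
    """Largest number of actions in progress at the same time."""
    return max((count for _, count in concurrency_profile(actions)), default=0)


# Hypothetical scenario: labels and timings are invented for illustration.
scenario = [
    Action("acknowledge alarm", 0, 20),
    Action("radio call", 10, 40),
    Action("public address announcement", 30, 50),
    Action("log entry", 50, 60),
]

print(concurrency_profile(scenario))
print(peak_concurrency(scenario))  # -> 2
```

A profile of this kind shows where actions pile up on the time line, but it treats every action as equal, which is exactly the limitation noted under ‘Disadvantages’.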

 

3.3                         ‘Quasi computational Metrics’

 

‘Quasi computational Metrics’ (sic) are essentially an attempt to find a unitary and to some extent generic measurement method for assessing mental workload. The NASA mental workload tool is a good example, although there have been many others. Generally they tend to be created for a particular situation and then, somewhat in the way of cultural and personality inventories, developed in either an industry-specific or trans-functional form that can, by dint of questionnaire, checklist, task measurement or physiological measurement, give some idea of workload. They may have some precise questions and rating scales, or they may ask semi-directed questions and then attempt to derive some form of score.

 

Such a brief description hardly does them justice, but they do tend to share one characteristic. If you give them to operators of complex systems and ask for comments, the operators don’t quite see how they will be useful. This is because context specificity is essential to any meaningful workload analysis. Any attempt to make a generic tool begins life in a particular context, but as it becomes generic it is bound to lose the contextual analysis of situations, which is the thing that gave it credibility in the first place. Put succinctly, these metrics are rather like a doctor relying on a thermometer as the main means of making a clinical diagnosis, while paying scant attention to a patient’s circumstances, history and symptoms.

 

*      Advantages

 

They are valuable inasmuch as they give pause for analysis, reflection and thought that might otherwise not take place. They are also useful as a means of developing different ways of thinking about mental work and, if used judiciously and with other methods, of producing a useful view of the likely mental workload problems of a situation.

 

*      Disadvantages

 

They design out the context or minimise its importance, when that context is cardinal to workload considerations. They also work better on tasks that are inherently more predictable, even though they are complex, such as the role of a pilot. They are poorly suited to control room operations, are difficult to use in team-based working and need to be used with care, because the moment a numeric value of some kind is attached to an interface, the accompanying caveats will tend to be forgotten.

 

4.                              Mental workload literature

 

4.1                         A note on the mental workload literature

 

The literature on mental workload is very large and has been growing, like a reef, for more than thirty years. It tends to concentrate upon vigilance and motor aspects of tasks such as driving and flying planes. There have been many summaries of methodologies and these are reflected in the brief account elsewhere in this report. There have been many learned accounts of the subject, by Lisanne Bainbridge, Neville Moray and Christopher Wickens, to name a few. Boff, Kaufman & Thomas edited the ‘Handbook of Perception and Human Performance’, in which O’Donnell & Eggemeier (1986) gave the state of the workload assessment ‘art’ at that point in time. More recently, Schvaneveldt, Gomez and Reid summarise the literature well, in terms of both the internal reliability and the validity of approaches to assessing mental workload. Using tracking and tone counting tasks, they compare subjective methods for assessing mental workload and find that subjective measures discriminate badly, probably because the tasks are too homogeneous, but they find hope for the future. They suggest that physiological measures might be useful, a common theme. This study is quoted as it is an Adobe Acrobat download that is a reasonable example of much of the competent work that is found in the area of workload assessment.

 

Also typical in its approach is the EU MEGATAQ[2] project (1999) which notes that

 

Cognitive workload refers either to the objective workload imposed by the task (e.g. pacing) or to the subjective rating of the operator with regard to the demands of the task. In most cognitive workload theories workload refers to the information processing capacity of the operator, whereas it sometimes also encompasses emotional demands. Mental effort is a key concept in cognitive workload. It plays a critical role when the task can only be performed under cognitive control, or when the state is not optimal for the task. Tasks to be performed under cognitive control are requiring mental effort when:

*      The input-output relations are inconsistent or variable.

*      Processing modules are used that have limited capacity.

*      Resources have to be divided between task components.

*      New skills have to be acquired.

The overall objective of workload-related research is to seek optimisation of the work task, i.e. minimal stress and fatigue due to work. In the design or implementation of newly built telematic applications there is a potential danger of changed cognitive workload due to task performance.

Major question in the assessment of cognitive workload:

*      What is the average overall workload?

*      What is the magnitude of the workload during peak periods?

*      What is the reserve capacity of task performance?

Mental effort can be determined in various ways:

*      Questionnaires measuring the subjective experience of mental effort.

*      Questionnaires measuring fatigue experienced and mood changes.

*      Psycho-physiological indices: pupil dilation; changes in heart rate variability.

And that:

 

The NASA-TLX (1988) is a questionnaire comprising six scales that measure different aspects of mental workload. The user completes the questionnaire after a recognisable task has been carried out. The user evaluates: the level of demand the task imposed upon him and the level of frustration the task imposed upon him. The TLX contains six 20-points scales on: Mental demand, Physical demand, temporal demand, Performance, Effort and Frustration. The questionnaire can be presented in a computerised version, or in paper and pencil version. The individual will need approximately 5 minutes to complete the questionnaire.

These two passages give a good illustration of the dominant approach that has been taken to assessing mental workload. Cognitive (or Mental) workload is only thought about in simple terms, in terms of tasks that are laboratory based analogues of simple tasks. The NASA-TLX is a widely used questionnaire and needs a relatively simple, discrete task as the substrate for its completion. In neither case do they admit into their considerations the complexity of control room situations.
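To make concrete what kind of number such a questionnaire yields, the sketch below averages six ratings into a single score. It is a minimal sketch under stated assumptions, not the official scoring procedure: it assumes each scale is rated from 0 to 20, in line with the ‘20-points scales’ of the quoted passage, and it uses the simple unweighted mean often called ‘Raw TLX’. The published NASA-TLX also includes a weighting step based on pairwise comparisons of the scales, which the passage does not mention and which is omitted here. The function name, the scale keys and the sample ratings are hypothetical.

```python
# Scale names follow the six scales listed in the quoted passage.
SCALES = (
    "mental_demand",
    "physical_demand",
    "temporal_demand",
    "performance",
    "effort",
    "frustration",
)


def raw_tlx(ratings, scale_max=20):
    """Unweighted mean of the six ratings (assumed range: 0..scale_max)."""
    missing = [name for name in SCALES if name not in ratings]
    if missing:
        raise ValueError(f"missing ratings for: {', '.join(missing)}")
    for name in SCALES:
        if not 0 <= ratings[name] <= scale_max:
            raise ValueError(f"{name} must lie between 0 and {scale_max}")
    return sum(ratings[name] for name in SCALES) / len(SCALES)


# Hypothetical ratings given by one operator after one discrete task.
example = {
    "mental_demand": 14,
    "physical_demand": 4,
    "temporal_demand": 16,
    "performance": 8,
    "effort": 13,
    "frustration": 11,
}

print(raw_tlx(example))  # -> 11.0
```

The single figure that results carries no record of the task, the operator's skill or the circumstances in which the rating was made.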

 

The timbre of this ‘dominant approach’ can also be illustrated very simply by looking at de Waard (1996), an on-line version of a PhD thesis examining driver mental workload. The reference list shows the rich, extensive, but curiously self-contained approach to mental workload assessment. It would be possible to continue such a review of the workload assessment literature for some pages, but it would serve no purpose here. There are, though, some key observations as far as the current study is concerned.

 

First, these methods of analysis do not capture what happens in a busy control room in any shape or form. They simply do not ring true to the reality of control room operation. They have very limited validity in control room settings and lead to concentrating on very precise details of the interface and, of course, on very simple aspects of any task. Tasks are often taken from the cognitive psychology literature (dichotic listening, tracking, tone counting and so on), or are simple emulations of real world tasks that superficially represent some part of a real world task but lose the real world complexity. They have some face validity, but poor construct validity. They do not lend themselves to assessing workload on a complex interface that does not yet exist, for projected but as yet unspecified tasks.

 

Second, by and large, these methods assume there is a task that has been conducted and a dynamic system to be evaluated. Conducting workload assessment on very small operational tasks in the existing arrangements at King’s Cross will not help assess mental workload using the new interface, although this type of analysis made an input into the development of the scenario based knowledge elicitation methodology employed in this study.

 

Finally, there is a surprising amount of consensus in the literature that the notion of mental workload is a high-level construct that should be used to ‘gather in’ good practice in systems design and operations. The difficulty is in incorporating such ideas into a useful way of assessing mental workload. The reason for this difficulty seems to be twofold:

 

*      There is the tendency to overestimate how easily ‘generic’ approaches to workload assessment can be transferred from one situation to another. As discussed elsewhere in this report, workload assessment is actually dependent on the context of the work for which mental workload is being assessed. The context is part of the task. Thus a methodology for assessing mental workload in a tracking task or some other easily constrained cognitive task is fine for this setting, but generalises very poorly to real world settings in practical terms.

 

*      Coupled with the first point, people seem to make what Gilbert Ryle in the 1940s described as a category mistake. Ryle used the example that a University is not a collection of buildings, but the totality of the staff, students and their scholarly works and aspirations. If you concentrate on the buildings at the expense of the culture and activities of the organisation, you run the risk of making mistakes in the way in which you run your university. Mental workload assessment is similar. Mental workload is not a collection of individual tasks and performance; it is a linking construct that allows context-specific features of a job to be sensibly organised with a view to making sure that overall system performance is acceptable. In other words, there is no one way to assess mental workload. Every situation needs its own assessment and its own model of mental workload, a point that Schvaneveldt, Gomez and Reid actually make in their paper, despite the methodologies that they propose.

 

 

4.2                         Illustrative Bibliography

 

Bainbridge, L. (1978) Forgotten alternatives in skill and workload. Ergonomics, 21, 169-185. Simultaneously published as: (1977)
Changes in cognitive processes with the development of skill, and the implications for mental workload.

Bainbridge, L. (1974) Problems in the assessment of mental load. Le Travail Humain, 37 (2), 279-302.
The adaptation of cognitive processes to task demands and mental capacity: some reasons for the lack of correlation between objective and subjective mental workload.

Bainbridge, L. (1989) Development of skill, reduction of workload. In Bainbridge, L. and Ruiz Quintanilla, S.A. (eds.), Developing Skills with Information Technology, Wiley, pp. 87-116.
Described by title.

Moray, N., Johansson, J., Pew, R., Rasmussen, J., Sanders, A.F., & Wickens, C.D. (1979) Mental Workload, Its Theory and Measurement. New York, Plenum Press.

O’Donnell, R.D., & Eggemeier, F.T. (1986) Workload assessment methodology. In Boff, K., Kaufman, L., & Thomas, J. (eds.), Handbook of Perception and Human Performance, Vol. II (pp. 42-1 to 42-49). New York, Wiley.

 

Schvaneveldt, R.W., Gomez, R.L., & Reid, G.B. Recent unpublished report from New Mexico and Arizona State Universities and Armstrong Laboratories, Wright Patterson AFB. http://interlinkinc.net/Roger/Papers/Workload.pdf

 

de Waard, D. (1996) The Measurement of Drivers' Mental Workload. ISBN 90-6807-308-7. Paperback, 198 pages. Published by the Traffic Research Centre (now Centre for Environmental and Traffic Psychology), University of Groningen. http://www.home.zon.be/waard2/mwl.htm

 

 

 

 



[1] An example of the latter is the imposition of batch processing rules onto a continuous process on a production line, where operators had to track individual items and batches as they moved down the production line, a very difficult task that disrupted all other activities.

[2] MEGATAQ - Methods and Guidelines for the Assessment of Telematics Applications Quality - sets out to help you answer your questions of evaluation. MEGATAQ is a project of the Telematics Engineering sector of the Telematics Application Programme of the European Commission, DGXIII.