“Mental activity that places significant demands on working memory is like mental juggling.”
H. Lee Swanson
‘Cognitive load theory’ (CLT) is a theory about instruction that may help you to understand why the reading progress of some students is slower than expected. CLT was developed by John Sweller, Emeritus Professor of Educational Psychology at the University of New South Wales, Australia, and is based on what we know about two kinds of memory (working memory and long-term memory) and two kinds of knowledge (primary and secondary). It sounds a bit heavy (pardon the pun!), but it is relatively easy to understand and has important implications for the effective teaching of reading skills.
For many years we have known that instruction is most effective when it is designed with the limitations of the learner’s working memory capacity in mind. If we undermine or overload working memory, the likelihood of learning will decrease. This blog explains CLT and its application to reading instruction, particularly phonics instruction in the early years.
The processing of information has three main parts: sensory memory, working memory and long-term memory. If attention is paid to sensory input, for example, sounds (such as phonemes), or visual images, such as graphemes (letters), it is transferred from sensory memory to short-term memory.
Short-term memory is housed in the prefrontal cortex of the brain where small amounts of information can be stored for up to 30 seconds.
If that information is rehearsed or repeated often enough, it will be transferred into long-term memory in the hippocampus, an area deeper in the brain.
For example, when you are given a message to pass on but don’t have a piece of paper on which to write it at hand, you probably repeat it over and over to yourself until you find paper or the person to whom the message must be given.
Short-term memory is a part of working memory. Repeating a sequence of letters straight after you have heard it requires only short-term memory. Blending the sounds represented by those letters into a word requires working memory. All novel information is processed by working memory. Working memory works with, and is able to manipulate, that information.
Working memory is an executive function system (a set of cognitive processes necessary for the control of behaviour) that controls attention, use of cognitive strategies and retrieval of information from long-term memory. All the components of working memory are in place by age 4.
Working memory requires conscious effort and is fragile. Almost all information in working memory is lost after about twenty seconds, unless it is constantly rehearsed (think of our example of remembering a message). If the information is lost, it cannot be retrieved. It must be inputted again. Students with a learning disability often have low working memory.
According to Baddeley’s working memory model, there are three main components to working memory. One is a ‘phonological loop’ which processes speech and other sounds. Another is a ‘visuo-spatial sketchpad’ that processes text and other visual information. The third is a ‘central executive’ that directs attention and coordinates the other two.
Working memory is limited in both capacity and duration. Capacity normally increases steadily from the age of 4 until the age of about 14. The average 5-6-year-old can hold and manipulate about 2 distinct items of information in working memory. Capacity peaks in young adulthood, with a 13-15-year-old able to hold, on average, 5 items in working memory. Most adults have a capacity of 7, plus or minus 2.
If we want to retain new information permanently, it must be transferred from short-term, working memory to long-term memory, which is virtually limitless – very large amounts of information can be stored in long-term memory permanently. When learning to read, decoding an unfamiliar word requires a significant amount of conscious processing and working memory. Reading a familiar word already in the mental lexicon (in long-term memory) does not.
Learned information is stored in cognitive structures called ‘schemas’. These provide a system for organising and storing elements of information, according to how they will be used. We construct new schemas in working memory so they can be integrated into existing knowledge in long-term memory. We make schemas of everything, with all the bits of relevant information grouped together.
With sufficient rehearsal and application, these schemas can become automated, for example, the schema of ‘driving a car’ or ‘reading’. Automated schemas are not consciously processed in working memory so automaticity is an important goal in reading instruction.
‘Cognitive load’ is the amount of information processing required to complete a learning task. Look at this ‘simple’ sentence presented to a 4-year-old:
In reading this sentence, on top of knowing each sound to match the letters, the child also has to:
That’s quite a lot of information to deal with!
The load can affect the ability to process new information and to construct knowledge in long-term memory. A high cognitive load puts pressure on working memory, making information more difficult to attend to, rehearse and remember. Because working memory is fragile, with a limited capacity and duration, transfer of information from short-term to long-term memory is essential for learning to occur. Consequently, instructional design and methods should support working memory, minimise the load placed on it and facilitate the transfer of information to long-term memory.
CLT distinguishes between three types of load on the cognitive system – intrinsic, extrinsic (called ‘extraneous’ in Sweller’s own writing) and germane.
Intrinsic load refers to the load imposed by the inherent complexity of what is to be learned – the ‘what I have to be able to do’ of the task – as well as the knowledge of the learner.
Extrinsic load refers to the load created by the way in which information is presented and the instructional procedure which must be followed – the ‘conditions under which I do it’, such as the complexity of instructions, the pace of instruction, time constraints or distractions in the environment.
Germane load refers to the effort or working memory required to process information and create schemas. It is linked to the intrinsic load – the more complex the task, or the more novel the information, the higher the germane load.
Primary knowledge is knowledge we acquire unconsciously, for example, the ability to listen to, understand and speak the language of the home. It is knowledge that we have evolved to acquire without instruction. Our brains are hardwired for listening and speaking. Outside the classroom, most of what we learn is primary, for example, general problem solving.
Secondary knowledge is acquired only with conscious effort and expert teaching, and is thus often learned in the classroom. Our brains are not hardwired for reading and writing (these are human creations), so the ability to read and write with phonics requires secondary knowledge – it is rarely ‘discovered’; it must be explicitly taught and actively learned. Reading and writing are processes that place a significant intrinsic load on the working memory of the learner because they require conscious effort.
‘Expert’ readers process multiple elements in parallel in working memory during a reading task. Dr. Hollis Scarborough identifies eight ‘strands’ or elements in skilled reading and how they come together in her Reading Rope model. In the following video, Dr. Scarborough explains this model of understanding the reading process.
According to educational psychologist John Sweller, information that has been organised and stored in long-term memory has different characteristics from the same information before it was stored, when it was still being held in short-term memory. Words are processed by the skilled reader unconsciously as a single element retrieved from long-term memory. Essentially, most words are ‘sight words’ for an adult reader, decoded so quickly that they’re recognised almost automatically.
Most children begin school as ‘novice’ readers, with little or no word recognition or literacy knowledge, and with background knowledge and other language skills dependent largely on the amount of oral language experience in the home. Most of what they are learning is unfamiliar, so the intrinsic load of Scarborough’s 8-strand reading process is high. Remember, at that age, their working memory capacity is only 2-3 items.
Children starting school typically rely heavily on visuo-spatial working memory due to their lack of phonological awareness. The phonological loop has to be ‘switched on’ and the children explicitly taught the sound structure of words, so that they can construct schemas for the concepts of word, rhyme, syllable, sound etc. They then have to construct schemas for the squiggles on a page that represent sounds.
“Novices need to use thinking skills; experts use knowledge.”
Sweller et al., 2011, p. 21.
The novice reader has to process the written word in working memory as multiple, interacting elements because the written word has not yet been stored as a single element in long-term memory. In decoding unknown words, letter-sound correspondences must be recalled to convert the letters in a word into sounds. Each sound has to be held in working memory while the next is identified. The sounds must all be held in working memory while they are blended to form a word.
Each word has to be held in working memory while the next is sounded out (these are our ‘balls in the air’). Those words must then be combined into phrases and sentences. In his 1999 publication, ‘Cognitive research can inform reading education’, Charles Perfetti said that the cognitive load of word decoding is a key barrier to comprehension, creating a ‘reading bottleneck’. When a student can decode fluently, the cognitive load is minimised, making it easier for the student to process what has been read, retain vocabulary, link ideas etc. Check out this little girl’s ‘reading bottleneck’ at 25 seconds where the cognitive load of decoding is so great she forgets the meaning of ‘sun’!
In order to reduce working memory load and facilitate transfer of secondary information to long-term memory, instruction provided by teachers should be explicit and direct. In explicit teaching the teacher shows the student what to do and how to do it, rather than expecting the student to discover or construct knowledge for himself. Have a look at the features that make a great explicit phonics lesson.
“The teacher decides the learning intentions and success criteria for a lesson, makes these transparent to the students, demonstrates them by modelling, evaluates if students understand what they have been told by checking for understanding, and retells them what they have been told by tying it all together with closure.”
Hattie 2009, p. 206.
The cognitive load resulting from a complex task can be reduced by breaking it down into smaller, simpler components. Novices, in particular, benefit from having material presented in this way, with plenty of opportunity to practise after each step. The individual elements of information should be taught in sequence and in isolation before all of the elements and their interactions are presented together.
Learners should start with work on simple learning tasks. The more expertise they acquire, the more complex the tasks they will be able to work on.
Effective working memory capacity can be increased by using both visual and auditory working memory rather than either processor alone; consequently, instruction should be multisensory. Movement enhances working memory, so it should be incorporated where possible, e.g. tapping syllables on your lap or moving a counter into a phoneme frame for each phoneme heard.
CLT emphasises the importance of using worked examples that show learners how to carry out new tasks. A worked example is essentially a step-by-step demonstration that reduces a complex process to single actions. This reduces the intrinsic cognitive load.
The ‘I do, we do, you do’ gradual release model of instruction provides the guided practice that reduces working memory load and facilitates transfer of information to long-term memory.
It appears that working memory capacity depletes after heavy use and recovers after rest. Therefore, learning that is spaced, with rest periods between learning episodes, is more effective than the same learning time massed without rest periods.
When a learner is presented with two simultaneous instances of the same type of stimulus, the two compete for attention and the extrinsic cognitive load is increased. To avoid the split-attention effect, teachers should allow students to focus on a single visual or auditory source of information at any given time.
What can we do as teachers to help students manage cognitive load?
Prior to instruction:
During instruction:
Phonics Hero’s click-and-go Phonics Lessons help you put all of these methods into practice. Our no-prep Phonics Lessons are designed to lessen the cognitive load of phonics learning on students and also lessen the cognitive load of preparing a great phonics lesson on teachers! Take a look at this tour of a blending lesson to see it in action:
Many of these instructional recommendations derived from Cognitive Load Theory are just good, common-sense teaching practice. And yet, I still see teachers using implicit and whole language teaching in reading instruction in classrooms filled with distractions. I encourage you to trial at least one of the recommendations discussed in this blog and monitor the impact on the reading performance of your students. You can do more in-depth reading on working memory via the links below:
Cognitive Load Theory: Research That Teachers Really Need to Understand, from the New South Wales Department of Education