Cognitive Load Theory and Reading Instruction

img of teddy bear juggling 3 balls, labelled 'space', 'time', and 'effort'

“Mental activity that places significant demands on working memory is like mental juggling.”

H. Lee Swanson

What is Cognitive Load Theory?

‘Cognitive load theory’ (CLT) is a theory about instruction that may help you to understand why the reading progress of some students is slower than expected. CLT was developed by John Sweller, Emeritus Professor of Educational Psychology at the University of New South Wales, Australia, and is based on what we know about two kinds of memory (working memory and long-term memory) and two kinds of knowledge (primary and secondary). It sounds a bit heavy (pardon the pun!), but it is relatively easy to understand and has important implications for the effective teaching of reading skills.

For many years we have known that instruction is most effective when it is designed with the limitations of the learner’s working memory capacity in mind. If we undermine or overload working memory, the likelihood of learning will decrease. This blog explains CLT and its application to reading instruction, particularly phonics instruction in the early years.

From short-term to long-term memory

The processing of information has three main parts: sensory memory, working memory and long-term memory. If attention is paid to sensory input, such as sounds (e.g. phonemes) or visual images (e.g. graphemes, or letters), that input is transferred from sensory memory to short-term memory.

executive-control-processes and cognitive load
Information processing (adapted from an image used on this blog from Kiddie Korner).

Short-term memory is housed in the prefrontal cortex of the brain where small amounts of information can be stored for up to 30 seconds.

If that information is rehearsed or repeated often enough, it will be transferred into long-term memory, a process that depends on the hippocampus, an area deeper in the brain.

For example, when you are given a message to pass on but don’t have a piece of paper at hand on which to write it, you probably repeat it over and over to yourself until you find paper or the person to whom the message must be given.

Working memory

Short-term memory is a part of working memory. Repeating a sequence of letters straight after you have heard it requires only short-term memory. Blending the sounds represented by those letters into a word requires working memory. All novel information is processed by working memory. Working memory works with, and is able to manipulate, that information.

Look at all the effort working memory takes on in a novel task!

Working memory is an executive function system (a set of cognitive processes necessary for the control of behaviour) that controls attention, use of cognitive strategies and retrieval of information from long-term memory. All the components of working memory are in place by age 4.

Working memory requires conscious effort and is fragile. Almost all information in working memory is lost after about twenty seconds, unless it is constantly rehearsed (think of our example of remembering a message). If the information is lost, it cannot be retrieved. It must be inputted again. Students with a learning disability often have low working memory.

According to Baddeley’s working memory model, there are three main components to working memory. One is a ‘phonological loop’ which processes speech and other sounds. Another is a ‘visuo-spatial sketchpad’ that processes text and other visual information. The third is a ‘central executive’ that directs attention and coordinates the other two.

diagram of long term memory and cognitive load
Baddeley’s model of working memory (adapted from this image).

Working memory is limited, in both capacity and duration. Capacity normally increases steadily from the age of 4 until the age of about 14. The average 5-6 year old can hold and manipulate about 2 distinct items of information in working memory. A 13-15 year old can hold, on average, 5 items. Capacity peaks in young adulthood: most adults have a capacity of 7, plus or minus 2.

Long-term memory

If we want to retain new information permanently, it must be transferred from short-term, working memory to long-term memory, which is virtually limitless – very large amounts of information can be stored in long-term memory permanently. When learning to read, decoding an unfamiliar word requires a significant amount of conscious processing and working memory. Reading a familiar word already in the mental lexicon (in long-term memory) does not.

Learned information is stored in cognitive structures called ‘schemas’. These provide a system for organising and storing elements of information, according to how they will be used. We construct new schemas in working memory so they can be integrated into existing knowledge in long-term memory. We make schemas of everything, with all the bits of relevant information grouped together.

With sufficient rehearsal and application, these schemas can become automated, for example, the schema of ‘driving a car’ or ‘reading’. Automated schemas are not consciously processed in working memory so automaticity is an important goal in reading instruction.

YouTube video
This short video explains schema theory in conjunction with CLT.

Cognitive Load Theory (CLT) and Reading Instruction

‘Cognitive load’ is the amount of information processing required to complete a learning task. Look at this ‘simple’ sentence presented to a 4 year old:


In reading this sentence, on top of knowing each sound to match the letters, the child also has to:

  1. Recognise the capital ‘D’ as a way to represent the sound /d/.
  2. Deal with the letter-sound correspondences in the tricky word ‘the’ – a voiced /th/ and a schwa sound.
  3. Work out that two letters can represent a sound, i.e. the ‘ck’.
  4. Know what to do with a full stop.

That’s quite a lot of information to deal with!

The load can affect the ability to process new information and to construct knowledge in long-term memory. A high cognitive load puts pressure on working memory, making information more difficult to attend to, rehearse and remember. Because working memory is fragile, with a limited capacity and duration, transfer of information from short-term to long-term memory is essential for learning to occur. Consequently, instructional design and methods should support working memory, minimise the load placed on it and facilitate the transfer of information to long-term memory.

The 3 types of cognitive load

graphic of a woman carrying three spheres, labelled 'intrinsic', 'germane', and 'extrinsic'

CLT distinguishes between three types of cognitive load on the cognitive system – intrinsic, extrinsic and germane.

Intrinsic load refers to the load imposed by the inherent complexity of what is to be learned – the ‘what I have to be able to do’ of the task. It also depends on the prior knowledge of the learner.

Extrinsic load (called ‘extraneous’ load in much of the CLT literature) refers to the load created by the way in which information is presented and the instructional procedure which must be followed – the ‘conditions under which I do it’, such as the complexity of instructions, the pace of instruction, time constraints or distractions in the environment.

Germane load refers to the effort or working memory required to process information and create schemas. It is linked to the intrinsic load – the more complex the task, or the more novel the information, the higher the germane load.

Primary and secondary knowledge

Primary knowledge is knowledge we acquire unconsciously, for example, the ability to listen to, understand and speak the language of the home. It is knowledge that we have evolved to acquire without instruction. Our brains are hardwired for listening and speaking. Outside the classroom, most of what we learn is primary, for example, general problem solving.

Secondary knowledge is acquired only with conscious effort and expert teaching and thus often learned in the classroom. Our brains are not hardwired for reading and writing (these are human creations), therefore the ability to read and write with phonics requires secondary knowledge – it is rarely ‘discovered’; it must be explicitly taught and actively learned. Reading and writing are processes that place a significant intrinsic load on the working memory of the learner because they require conscious effort.

The intrinsic load of the reading process

‘Expert’ readers process multiple elements in parallel in working memory during a reading task. Dr. Hollis Scarborough identifies eight ‘strands’ or elements in skilled reading and how they come together in her Reading Rope model. In the following video, Dr. Scarborough explains this model of understanding the reading process.

According to educational psychologist John Sweller, information that has been organised and stored in long-term memory has different characteristics from the same information before it was stored there. Words are processed by the skilled reader unconsciously as a single element retrieved from long-term memory. Essentially, most words are ‘sight words’ for an adult reader, decoded so quickly that they’re recognised almost automatically.

Most children begin school as ‘novice’ readers, with little or no word recognition or literacy knowledge, and with background knowledge and other language skills dependent largely on the amount of oral language experience in the home. Most of what they are learning is unfamiliar, so the intrinsic load of Scarborough’s 8-strand reading process is high. Remember, at that age, their working memory capacity is only 2-3 items.

Children starting school typically rely heavily on visuo-spatial working memory due to their lack of phonological awareness. The phonological loop has to be ‘switched on’ and the children explicitly taught the sound structure of words, so that they can construct schemas for the concepts of word, rhyme, syllable, sound etc. They then have to construct schemas for the squiggles on a page that represent sounds.

“Novices need to use thinking skills; experts use knowledge.”

Sweller et al., 2011, p. 21.

The novice reader has to process the written word in working memory as multiple, interacting elements because the written word has not yet been stored as a single element in long-term memory. In decoding unknown words, letter-sound correspondences must be recalled to convert the letters in a word into sounds. Each sound has to be held in working memory while the next is identified. The sounds must all be held in working memory while they are blended to form a word.

Each word has to be held in working memory while the next is sounded out (these are our ‘balls in the air’). Those words must then be combined into phrases and sentences. In his 1999 publication, ‘Cognitive research can inform reading education’, Charles Perfetti said that the cognitive load of word decoding is a key barrier to comprehension, creating a ‘reading bottleneck’. When a student can decode fluently, the cognitive load is minimised, making it easier for the student to process what has been read, retain vocabulary, link ideas etc. Check out this little girl’s ‘reading bottleneck’ at 25 seconds where the cognitive load of decoding is so great she forgets the meaning of ‘sun’!

YouTube video
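The difference between novice and fluent decoding described in this section can be sketched as a toy count of the ‘balls in the air’. This is only an illustration, not a measure of real cognitive load: it assumes each sound of an unknown word counts as one item, each word already read counts as one item, and a known word is retrieved as a single item. The function names and the sample sentence are hypothetical.

```python
def word_load(graphemes, known_words):
    """Items needed to read one word (toy assumption).

    A word already in the mental lexicon counts as a single item;
    an unknown word needs one item per sound held while blending.
    """
    word = "".join(graphemes)
    return 1 if word in known_words else len(graphemes)


def peak_sentence_load(sentence, known_words):
    """Largest number of items held at any point while reading a sentence."""
    peak = 0
    for position, graphemes in enumerate(sentence):
        # Words already read are each held as one item,
        # on top of whatever the current word requires.
        load = position + word_load(graphemes, known_words)
        peak = max(peak, load)
    return peak


# Hypothetical sentence, already split into graphemes by the teacher.
sentence = [["s", "u", "n"], ["i", "s"], ["h", "o", "t"]]

novice = peak_sentence_load(sentence, known_words=set())
fluent = peak_sentence_load(sentence, known_words={"sun", "is", "hot"})
print(novice, fluent)  # prints: 5 3
```

Even on this rough count, the novice’s peak sits well above the 2-3 items mentioned earlier for children starting school, while fluent word recognition brings it down.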

Some instructional design methods drawn from CLT

Explicit teaching

In order to reduce working memory load and facilitate transfer of secondary information to long-term memory, instruction provided by teachers should be explicit and direct. In explicit teaching the teacher shows the student what to do and how to do it, rather than expecting the student to discover or construct knowledge for himself. Have a look at the features that make a great explicit phonics lesson.

“The teacher decides the learning intentions and success criteria for a lesson, makes these transparent to the students, demonstrates them by modelling, evaluates if students understand what they have been told by checking for understanding, and retells them what they have been told by tying it all together with closure.”

Hattie 2009, p. 206.

Instructional sequencing

The cognitive load resulting from a complex task can be reduced by breaking it down into smaller, simpler components. Novices, in particular, benefit from having material presented in this way, with plenty of opportunity to practise after each step. The individual elements of information should be taught in sequence and in isolation before all of the elements and their interactions are presented.

Learners should start with work on simple learning tasks. The more expertise they acquire, the more complex the tasks they will be able to work on.

Multisensory instruction

Effective working memory capacity can be increased by using both visual and auditory working memory rather than either processor alone; consequently, instruction should be multisensory. Movement enhances working memory, so this should be incorporated where possible, e.g. tapping syllables on your lap or moving a counter into a phoneme frame for each phoneme heard.

phoneme frame with sound buttons
A phoneme frame in action.

Worked examples

CLT emphasises the importance of using worked examples that show learners how to carry out new tasks. A worked example is essentially a step-by-step demonstration that reduces a complex process to single actions. This reduces the intrinsic cognitive load.

Guided practice and fading

The ‘I do, we do, you do’ gradual release model of instruction provides the guided practice that reduces working memory load and facilitates transfer of information to long-term memory.


It appears that working memory capacity depletes after heavy use and recovers after rest. Therefore, learning that is spaced, with rest periods between learning episodes, is superior to the same learning time massed without rest periods.

Minimising the split-attention effect

When a learner is presented with two simultaneous instances of the same type of stimulus, the two compete for attention and the extrinsic cognitive load is increased. To avoid the split-attention effect, teachers should allow students to focus on a single visual or auditory source of information at any given time.

Putting the instructional methods into practice

What can we do as teachers to help students manage cognitive load?

Prior to instruction:

  1. Pre-test to determine the student’s current knowledge and skill level. In the Phonics Hero Teacher Account you will find many free assessment materials.
  2. Evaluate the intrinsic and germane loads of reading a particular text and try to reduce the load e.g. by identifying letter-sound correspondences, tricky words and vocabulary that could be unfamiliar to a student. The best decodable readers list new correspondences, words containing these and tricky words on the first or last pages of the text for you.
  3. Reduce or remove any possible extraneous load on working memory such as lengthy/complex explanations or instructions, time restrictions, auditory or visual distractions in the classroom.

During instruction:

  1. Begin instruction at a point of familiarity, activating prior knowledge. For example, revise previously learned letter-sound correspondences or words with a game. Phonics Hero supplies teachers with free access to their games for use on the interactive whiteboard with a group of students.
  2. Pre-teach:

    • Any letter-sound correspondences in a text that are likely to be unfamiliar. Teach the blending of these to make the regular words that will be found in the text. Too much novel information can overload working memory. Pre-teaching prerequisites for a task aids the establishment of schemas that extend working memory.
    • Any ‘tricky’ irregular words. Teach these explicitly, drawing attention to any regular letter-sound correspondences. Phonics Hero teaches how to read and spell ‘camera words’ in Steps 4 and 5.
      This camera word spelling game emphasises the known letter-sound correspondence and highlights the irregular bits for practice.

  3. Make sure that students have sufficient background knowledge to access the meaning of the text. It is easier for them to process information if they can link it to existing schemas. Talk about key concepts, new vocabulary, etc.
  4. Move systematically from the simple to the complex to support schema formation. The structure of Phonics Hero helps teachers to teach the logic of the alphabetic code. Students learn that one letter can represent a sound (e.g. ‘c’ can represent /k/) before learning that two or more letters can represent a sound (e.g. ‘ck’ can represent /k/). The most common representations are taught before the less common (e.g. ‘q’ can represent /k/).
  5. Use explicit instruction to teach new content and skills. Synthetic phonics programs do not ask students to read words they have not been explicitly taught. It is not appropriate to embed phonics instruction in text reading or to expect a student to use multiple cues such as context or pictures to decode text.
  6. Model/demonstrate strategies e.g. correct pronunciation of sounds, blending or segmenting a word into syllables.
  7. Use a consistent lesson structure so that the learner can attend to the intrinsic load of the content: review; statement of learning intentions and success criteria; I do, we do, you do; application; reflection. Download the free lesson plan template.
  8. Make learning multi-sensory to allow use of both visual-spatial and verbal working memory. Phonics Hero is multi-modal.
  9. Provide worked examples and problems with partial solutions e.g. sound buttons or dashes to indicate the number of sounds in a word or a longer dash for a digraph or trigraph.
  10. Present oral and visual information together, to facilitate use of both visual and verbal working memory. The best electronic books highlight each word as it is read to the student.
  11. Cut out non-essential information e.g. decorative graphics. We need to direct student attention to the information they need. The images on each screen of a Phonics Hero game are kept simple.
  12. Provide multiple practice opportunities in order to develop automaticity.

    • The games of Phonics Hero are an ideal tool for repeated practice as each game is unique, maintaining student enjoyment during repeated use of knowledge and skills.
    • Students need the opportunity to practise applying information to new contexts once they have learned it. Application is a crucial step in phonics instruction that sometimes gets forgotten. The goal of reading is comprehension, so once the child can decode words, he or she should be applying that skill to decodable readers.
      Decodable readers and caption reading, such as the Sentence Reading games in Step 6 of Phonics Hero, reduce cognitive load.

    • Encourage repeated reading as this helps the student to develop automaticity and fluency, reducing the load on working memory and freeing it up for comprehension. For example, take a look at teacher Stephanie Brighton’s blog on how she reduced the cognitive load of home readers in the early stages of learning to read.
  13. Encourage students to visualise what they have read (drawing on paper or in their mind’s eye). This encourages them to draw information out of their long-term memory and review what they have learnt/read.
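Item 9 in the list above mentions sound buttons and dashes as partial solutions. As a rough sketch of how such a prompt could be generated for a worksheet or slide, the snippet below prints a dot under each single-letter grapheme and a dash as wide as the grapheme under each digraph or trigraph. It assumes the teacher has already split the word into graphemes; the function names and the marking convention are illustrative choices, not part of any particular program.

```python
def sound_buttons(graphemes):
    """Return the line of marks to show under a word.

    A single letter gets a dot (sound button); a digraph or
    trigraph gets a dash as wide as the grapheme itself.
    """
    marks = ["." if len(g) == 1 else "_" * len(g) for g in graphemes]
    return " ".join(marks)


def worked_example(graphemes):
    """Return the spaced-out word with its sound buttons on the line below."""
    return " ".join(graphemes) + "\n" + sound_buttons(graphemes)


# The word is segmented by the teacher, not by the code.
print(worked_example(["d", "u", "ck"]))
# d u ck
# . . __
```

The number of marks tells the student how many sounds to expect, and the longer dash flags that ‘ck’ is one sound, so less has to be worked out in working memory.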

Superhero School and Cognitive Load Theory

Phonics Hero’s Superhero School helps you put all of these methods into practice. Superhero School is designed to lessen the cognitive load of phonics learning on students and also lessen the cognitive load of preparing a great phonics lesson on teachers! Take a look at this tour of a blending lesson to see it in action:

YouTube video

Start a Trial of Superhero School with a Teacher Account

From one teacher to another…

Many of these instructional recommendations derived from Cognitive Load Theory are just good, common-sense teaching practice. And yet, I still see teachers using implicit and whole language teaching in reading instruction in classrooms filled with distractions. I encourage you to trial at least one of the recommendations discussed in this blog and monitor the impact on the reading performance of your students. You can do more in-depth reading on working memory and cognitive load via the links below:

Sweller, J., van Merriënboer, J., & Paas, F. (2019). Cognitive architecture and instructional design: 20 years later. Educational Psychology Review, 31, 261-292.

Cognitive Load Theory: Research That Teachers Really Need to Understand, from the New South Wales Department of Education

Author: Shirley Houston

With a Masters degree in Special Education, Shirley has been teaching children and training teachers in Australia for over 30 years. Working with children with learning difficulties, Shirley champions the importance of teaching phonics systematically and to mastery in mainstream classrooms. If you are interested in Shirley’s help as a literacy trainer for your school, drop the team an email on
