30 Design Strategies and Tactics from 40 Years of Investigation

Appendix: Further information and examples

Hugh Burkhardt and Daniel Pead

WYTIWYG: what you test is what you get

We have never met a teacher who did not think that they need to focus on the types of performance that are tested in the high-stakes tests on which their students, and increasingly they themselves, will be judged.

But for a long time, test providers denied the responsibility that this fact implies: to design examinations that are balanced across the learning goals set out in the 'intended curriculum' for the school system they serve. There were various excuses. “We don't test that but, of course, all good teachers do it.” “We only test a sample of performances, but they correlate well with broader measures.” The result is unbalanced teaching and learning, neglecting important types of performance that are not included in the task types on the exam.

Recently, only 35 years since Hugh Burkhardt coined WYTIWYG[1], it has become widely accepted, in principle, by those responsible for commissioning and designing high-stakes tests. However, the attempts to design high-quality balanced tests that reflect the broader objectives in mathematics education for the 21st century have usually been distorted by other pressures, including: lack of in-house experience in designing richer tasks, fear of the backlash to any change in a sensitive area, cost, and a view that assessment time is, in any case, not productive. Since politicians need only simple numbers from a plausible source for accountability purposes, the balance of the tests is not a priority – whatever the effect on classrooms.

The two cases described below exemplify relatively successful attempts to improve the balance of established high-stakes assessment. Recently, there have been some modest signs of progress, but there is a long way to go to meet the standards set out by the powerful working group of ISDDE (ISDDE 2012).

[1] A play on WYSIWYG – ‘What You See is What You Get’ – a new feature in word processors in the 1980s.

Balanced Assessment in Mathematics

2009 BAM Grade 6 Test

The approach here was to offer a supplementary test that would assess some key aspects of performance that were not covered in narrow short-item multiple-choice state tests. The tests were designed at the Shell Centre, in work led by Rita Crust, and developed in US classrooms in three states. The products were 40-minute tests, each including 5 tasks, for each of Grades 3 through 10.

To cover the wide range of performance expected on each task, the team developed a design tactic, “the exponential ramp”, in which successive parts of the task move, within the same context, from simple concrete examples to increasingly complex and abstract models. See, for example, the second task, Truffles, in this Grade 6 test, which goes from simple arithmetic (just doubling) through interpreting a graphical representation to formulating a general rule.

BAM 2009 Grade 6 Test extract: pages 1 and 3–10 of the paper

Known as “The MARS Tests” (MARS 2000–2010), they were used for over 10 years, principally in California by schools working with the Silicon Valley Mathematics Initiative (MAC). Teachers were trained each year to score the tests, a process carried through in a Saturday workshop. Research showed that the MARS tests were a better measure of mathematical performance than the state tests (Ridgway et al. 2000, Carroll et al. 2001). Each year, MAC developed a professional development program around the tasks in the previous year's tests.

The tasks in the MARS tests, which we would now describe[2] as “Apprentice Tasks”, were a substantial step from short multiple-choice items (“Novice Tasks”) towards really valid tests, which should have unstructured problems in a form in which they might naturally arrive (“Expert Tasks”). The latter remain rare in high-stakes assessment worldwide.

A wider selection of these tasks can be found on the Inside Mathematics website.

Testing Strategic Skills

This project (see Gradual Change) was explicitly focused on using WYTIWYG to encourage improvement of the curriculum in mathematics classrooms. The following two very different tasks give some flavour of what was involved, as do their scoring schemes. They are from the first TSS module, Problems with Patterns and Numbers.

Examples from Problems with Patterns and Numbers (TSS Examples 1 to 4)

Figure 1b: TSS – Problems with Patterns and Numbers

References

ISDDE (2012) Black, P., Burkhardt, H., Daro, P., Jones, I., Lappan, G., Pead, D., Stephens, M. High-stakes Examinations to Support Policy. Educational Designer, 2(5).
Retrieved from: http://www.educationaldesigner.org/ed/volume2/issue5/article16

MARS: Crust, R., Burkhardt, H. and the MARS team (2000–2010). Balanced Assessment in Mathematics, annual tests for Grades 3 through 10. 2001–2004 Monterey, CA: CTB/McGraw-Hill; 2005–2010 East Lansing, MI: MARS.

Ridgway, J., Crust, R., Burkhardt, H., Wilcox, S., Fisher, L., and Foster, D. (2000). MARS Report on the 2000 Tests. Mathematics Assessment Collaborative, San Jose, CA. p 120.

Caroll, C., Ridgway, J., Pead, D., Crust, R., and McCusker, S. (2001). MARS/Renaissance Student Performance Assessment: 2001 tests. Renaissance Project, Ventura, CA. pp 67.