Operationalizing Qualitative Data as Mathematical Objects: Ideas for the Next Generation of Educational Assessment

Organising Qualitative Data

Qualitative learning processes rest on assumptions that differ fundamentally from those behind the data interpretation and processing we find in empirical research. Still, there are good reasons to operationalise qualitative events, whether to provide structured, formative feedback to students or to track their learning progress in a more organised manner.

Traditional grading assumes that students provide correct answers drawn from a finite sample space of predefined outcomes, as we find them in standardised tests and exams. The higher the proportion of correct answers, the higher the final grade. This closed approach, which can be described in terms of mathematical set theory, does not take the qualitative ‘grounded’ dimensions of learning into consideration, such as how students solve problems, interpret context, make sense of different approaches and learning outcomes, derive personal meaning from their learning experience or reflectively evaluate their strengths and weaknesses during the learning process.

The Physical Grounding of Learning

In a truly learner-centred pedagogy, assessment should be driven primarily by qualitative outcomes, since every young person we work with learns from a prior physical grounding. This includes their age and developmental level, cultural setting, socio-economic background, language competencies, social support networks and so forth. Once we agree that learning processes should take prior conditions into account, we need to establish individual baseline profiles (inclusive of motivational profiles) prior to graded assignments in order to normalise measurement. There is little point in assessing learning progress without measuring normalised before and after states. Still, I do not know of any school that measures prior learning conditions at the level of individual courses and classes rather than only at the institutional level (such as in aptitude tests, if at all).

One option for measuring students’ prior abilities is to use non-graded finger exercises or brief initial test runs as a pre-instructional assessment. Such an approach also allows students to become aware of their own progress when they benchmark their personal before and after outcomes at a later stage. In contrast to a ‘correct answer’ schema, learning processes are by nature multidimensional and open constructs. Let’s assume we want to operationalise a highly complex scenario in which we assess students’ social, language, cognitive and metacognitive competencies, each further differentiated into analytical, creative and pragmatic competencies. For simplicity, let’s assume a ladder of 1 to 10 (the ‘ladder of power’ function, here y = x) ranging linearly from minimal (1) to very high competency (10).
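The scenario above can be sketched as a small data structure. The following is a minimal sketch in R, not a definitive implementation: all scores are hypothetical values for a single student, and the object names (baseline, later, progress) are illustrative assumptions. It records four competency domains by three facets on the 1 to 10 ladder, once as an ungraded baseline and once at a later stage.

```r
# Hypothetical scores for one student on the 1-10 ladder (illustrative only).
domains <- c("social", "language", "cognitive", "metacognitive")
facets  <- c("analytical", "creative", "pragmatic")

# Baseline from an ungraded pre-instructional exercise: one row per domain.
baseline <- matrix(
  c(4, 5, 3,
    6, 4, 5,
    5, 5, 4,
    3, 4, 2),
  nrow = 4, byrow = TRUE,
  dimnames = list(domains, facets)
)

# A later measurement of the same student on the same ladder.
later <- matrix(
  c(6, 6, 5,
    7, 4, 6,
    7, 6, 6,
    5, 5, 3),
  nrow = 4, byrow = TRUE,
  dimnames = list(domains, facets)
)

# Progress is the cell-wise difference between the after and before states.
progress <- later - baseline
print(progress)

# Average gain per domain across the three facets.
domain_gain <- rowMeans(progress)
print(round(domain_gain, 2))
```

Because both measurements share the same layout, the before and after comparison is a single subtraction, and each cell remains traceable to a domain and a facet.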

Mathematical Operationalisation

As an example of how to operationalise social behaviour, we may use the PBL model. Among the typical questions asked of team members at the end of a tutorial class are ‘(a) How have I been doing as a team member, (b) How have I been doing as a researcher and (c) How have I been doing as a problem solver?’. The objective is to obtain feedback on the social roles that students have taken on during their project. Once students arrive at an informed self-assessment, the mathematical structure could look like a matrix storing these three dimensions, as in the table below:

Student    As a team member    As a researcher    As a problem-solver
Sean       2                   10                 3
Lin        9                   3                  7
Jenny      7                   4                  8
Wu         7                   8                  8

The advantages of a more differentiated account are obvious. Not only can we determine in which areas students need support, but patterns also emerge that give lecturers valuable feedback. In the example above, Lin and Jenny are enthusiastic about working in a team and consolidating data as problem-solvers, but both show weaknesses in research. Sean is the opposite. He is a brilliant researcher but appears to have issues blending in with the rest of the group.
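The reading of the table given above can be reproduced with a few matrix operations. This is a minimal sketch in R using the scores from the table; the column names and the support threshold of 5 are assumptions chosen for illustration.

```r
# Self-assessment scores from the table above (scale 1-10).
roles <- matrix(
  c(2, 10, 3,
    9,  3, 7,
    7,  4, 8,
    7,  8, 8),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("Sean", "Lin", "Jenny", "Wu"),
    c("team_member", "researcher", "problem_solver")
  )
)

# For each student, the role with the lowest self-assessment
# (on ties, the first such column is reported).
weakest_role <- colnames(roles)[apply(roles, 1, which.min)]
names(weakest_role) <- rownames(roles)
print(weakest_role)

# Class-level view: mean score per role.
role_means <- colMeans(roles)
print(role_means)

# Students below a hypothetical support threshold of 5 in research.
needs_research_support <- rownames(roles)[roles[, "researcher"] < 5]
print(needs_research_support)
```

Row-wise queries answer questions about individual students, while column-wise queries give the lecturer a picture of the class as a whole.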

But what makes a good team member in the first place? We can set out criteria that make sense in the specific context, such as the ability (a) to relate to someone else’s arguments and to formulate ideas, (b) to participate actively in discussions or (c) to work as an effective tutor for the team.

Such a structure could be expressed, for example, as a named vector containing these elements (here in R syntax):

Vector_social <- c(a = 8, b = 7, c = 9)

To store multidimensional objects, vectors can be assembled into matrices, and matrices can be compiled into arrays. A huge advantage of such a mathematical approach is that the qualitative measurements are preserved and can be retrieved at any time. An external assessor could, for example, ask, “Which feedback has been provided to this particular student, since her lack of competencies as a tutor in earlier rounds obviously propagated to a lack of competencies as a problem-solver towards the end of the semester?” Storing measurements in more complex mathematical objects allows us to ask smarter and more helpful questions to support students’ learning. Obviously, this only makes sense if comprehensive criterion-based assessment tools such as scoring rubrics are in place as translating frameworks between classroom observation and mathematical expression.
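The progression from vectors to matrices to arrays can be sketched as follows. This is a minimal sketch in R with hypothetical scores for two students and two assessment rounds; the criteria a, b and c are the team-member criteria listed earlier, and all object names are illustrative assumptions.

```r
# Hypothetical criterion scores (a, b, c as defined earlier) for two students.
sean_round1  <- c(a = 3, b = 4, c = 2)
jenny_round1 <- c(a = 7, b = 8, c = 6)

# Vectors are bound row-wise into a matrix: students x criteria.
round1 <- rbind(Sean = sean_round1, Jenny = jenny_round1)

# A second (again hypothetical) assessment round with the same layout.
round2 <- rbind(
  Sean  = c(a = 4, b = 5, c = 2),
  Jenny = c(a = 8, b = 8, c = 7)
)

# Matrices are stacked into an array: students x criteria x rounds.
social <- array(
  c(round1, round2),
  dim = c(2, 3, 2),
  dimnames = list(rownames(round1), colnames(round1), c("round1", "round2"))
)

# One student's history on one criterion, e.g. Sean as a tutor (criterion c).
sean_tutor_history <- social["Sean", "c", ]
print(sean_tutor_history)
```

Indexing by name keeps every measurement traceable to a student, a criterion and a point in time, which is what makes the kind of question posed by the external assessor answerable.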

Jonsson and Svingby (2007) concluded regarding rubrics that “Student understanding of criteria, feedback, possibilities of self- and peer assessment are other positive experiences reported (…) it has been concluded that rubrics seem to have the potential of promoting learning and/or improve instruction. The main reason for this potential lies in the fact that rubrics make expectations and criteria explicit, which also facilitates feedback and self-assessment.”

Professional Ethics

One of the main advantages of formulating assessments as mathematical objects is that matrices and arrays can be mapped via rubrics to student feedback. In addition, the long-term study progress of individual students can be monitored accordingly. Most importantly, such a procedure would also allow for the optimisation of student feedback, since the efficacy of different feedback formats would be revealed in the collected data sets. Obviously, such a step requires the digitisation of assessment and feedback processes and also requires more time from teachers to attend to their students’ learning as compared to traditional grading. However, the effort required to capture measures on multiple dimensions does not go beyond what an Excel spreadsheet can handle.
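The mapping from stored scores to feedback via a rubric can be sketched as a simple lookup. This is a minimal sketch in R: the band boundaries, band labels, feedback texts and the function name rubric_feedback are all hypothetical, and a real rubric would be written for the specific course context. The scores are the researcher self-assessments from the table earlier in the text.

```r
# Hypothetical rubric for the 'researcher' role: score bands on the 1-10
# ladder mapped to feedback texts. Bands and wording are assumptions.
rubric_feedback <- function(score) {
  # Bands are (0,3], (3,6], (6,8] and (8,10].
  band <- as.character(cut(
    score,
    breaks = c(0, 3, 6, 8, 10),
    labels = c("emerging", "developing", "proficient", "advanced")
  ))
  texts <- c(
    emerging   = "Practise locating and summarising one source per session.",
    developing = "Compare at least two sources before drawing conclusions.",
    proficient = "Document your search strategy so that peers can follow it.",
    advanced   = "Coach a team member through your research approach."
  )
  data.frame(
    score = unname(score),
    band = band,
    feedback = unname(texts[band]),
    row.names = names(score),
    stringsAsFactors = FALSE
  )
}

# Researcher self-assessments from the table earlier in the text.
researcher_scores <- c(Sean = 10, Lin = 3, Jenny = 4, Wu = 8)
feedback <- rubric_feedback(researcher_scores)
print(feedback[, c("score", "band")])
```

Storing the band alongside the score makes it possible to check later which feedback a student received at which level.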

Besides, there is the ethical imperative to empower learners. We should never discourage students with miserable grades, in particular those students who require the most support and understanding. What is stopping us from designing new systems in which measuring abilities goes hand in hand with assisting students’ learning? One of the main points of learning, after all, is to get better at it.

(1) Assessing multiple dimensions of learning


(2) Relating dimensions of learning


Illustration (1): How do we assess students in the future? In this mockup that I created, four virtual students are measured on the variables of social skills (marked in orange), research skills (marked in blue) and metacognitive (= cognitive regulation) competencies (marked in green). Each assessment (data point) is associated with a piece of constructive feedback. In the simulation above, Jenny and Sean (1st and 3rd from the top) progress steadily. Liu (2nd from the top) advances, but rather unsteadily, and may require additional exercises to firm up skills. Wu (bottom) progresses at a high level but declines sharply in the middle of the course. Illustration (2): We see the development of social behaviour in tandem with research project assessments. The first measure is a non-graded pre-instructional measure to establish students’ baseline. The narratives of students’ learning journeys evolve. What happened to Wu between April and July? How come Liu performs lower in 3 out of 5 measures as compared to the pre-instructional baseline exercise?



Jonsson, A. & Svingby, G. (2007). The use of scoring rubrics: Reliability, validity and educational consequences. Educational Research Review, 2, 130–144.