This chapter describes how scoring can be defined in CBA ItemBuilder Project Files. Scoring is always defined at the task level. Accordingly, before scoring can be implemented in a newly created CBA ItemBuilder project file, a Task must be configured (see section 3.6 for details on defining tasks). A second prerequisite for defining the scoring of Tasks is to define names for all required components. That means that
User Defined Ids (see subsection 3.7.4 for details) are required for the components used to gather responses. Human-readable
User Defined Ids are suggested so that the meaning of particular identifiers is easy to remember when using them, for instance, in the syntax definition of scoring rules. Since the definition of scoring rules is based on the
User Defined Ids, systematically defined and easily readable IDs simplify the creation and validation of scoring rules for item authors.
The definition of explicit scoring rules is not mandatory for the use of CBA ItemBuilder items, but it provides the most flexible way to combine evidence into scores. Three alternatives exist: 1) The so-called component state, that is, the values selected or entered for all input elements, can be stored automatically by the test assembly and deployment software (see chapter 7). 2) The components can be linked to FSM variables (applies to version 10.0), and the last value of each FSM variable when a task is exited can be used. 3) Log data can also contain all changes of component values, and it is often possible to infer the final response from the collected log events (see section 1.6, and Kroehne and Goldhammer (2018) for response-completeness of log data).
Motivation: Explicit automatic scoring of items can be necessary at runtime when scoring results that incorporate logical rules are required for adaptive test assemblies (branched testing, multi-stage testing, or adaptive testing, see section 2.7.4), for the different feedback purposes (see section 2.9), or to monitor the test-taking process. Scoring of CBA ItemBuilder Tasks is evaluated at task switches (i.e., when the assessment changes from one task to another). Task switches can be triggered either from within the Task (using Runtime Commands, see section 3.12) or from outside (by the deployment software, for instance, because of a global timeout, see section 7.2.8). Having the scoring definition implemented within the assessment components created with the CBA ItemBuilder can also simplify data post-processing workflows (see section 8.6) and the sharing of items (for instance, as Open Educational Resources, see section 8.7.4). Hence, automatic scoring provides essential advantages over the alternatives mentioned above: it standardizes the scoring procedures and gives immediate access to the scored results.
Implementing the scoring rules within the CBA ItemBuilder project files comes with a second advantage. The scoring definition becomes independent from the deployment software (see chapter 7) and the approach used for data post-processing (see section 8.4.2). Scoring embedded in CBA ItemBuilder projects can be tested already during item development and will be available after distributing items (as CBA ItemBuilder projects). An essential tool for checking the scoring in CBA ItemBuilder items is the so-called Scoring Debugger, as described already in section 1.5. The Scoring Debugger can be used to inspect the scoring live during Preview.
The core component for the definition of the automatic scoring is the design of syntax conditions, which can be evaluated based on inputs (i.e., Component State of elements) and operators (i.e., incorporating the states in the internal finite-state machine(s) and the visited pages).
User Defined Ids: The link between components and the syntax is provided by the
User Defined Ids. Using the main menu
Project > Edit all user defined IDs, all named components of a CBA ItemBuilder Project File can be displayed. As shown in Figure 5.1, for a simple multiple-choice item, various components of type
Checkbox can be defined, all of them with a unique
Figure 5.2 shows a simple item to illustrate the scoring of a multiple-choice item. The item was created by adding an
HTMLTextField (see subsection 3.8.2) for the instruction and four
Checkboxes (see subsection 3.9.3). A simple scoring of this item might distinguish the conditions Correct (Jaguar and Panda are selected, and nothing else) and Wrong (Jaguar and Panda are not both selected, or additional wrong options are selected). If a test-taker never selected any
Checkbox at all, this might be called a Missing response (see section 5.3.11 for details).
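The three outcomes described for this item can be sketched in Python. This is only an illustration of the logic, not CBA ItemBuilder scoring syntax. It also simplifies the Missing case: it inspects only the final selection, whereas the text speaks of a test-taker who never selected anything. The distractor name `Lion` in the usage comment is hypothetical.

```python
def score_multiple_choice(selected, correct=frozenset({"Jaguar", "Panda"})):
    """Return 'Correct', 'Wrong', or 'Missing' for a set of selected options."""
    selected = frozenset(selected)
    if not selected:
        # Simplification: an empty final selection is treated as Missing.
        return "Missing"
    if selected == correct:
        # Exactly the correct options and nothing else.
        return "Correct"
    # Incomplete selection or at least one additional wrong option.
    return "Wrong"


# Usage (hypothetical distractor name "Lion"):
# score_multiple_choice({"Jaguar", "Panda", "Lion"}) returns "Wrong"
```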
The CBA ItemBuilder allows scoring conditions to be defined using these
UserDefinedIds. The conditions are labeled as Hits (i.e., Hit-conditions) and Misses (i.e., Miss-conditions) and combine
UserDefinedIds of components and, if necessary, additional functional operators (see appendix B.2 for all operators) using logical operators. Each condition represents a nominal condition that is to be differentiated when computing the value of a scoring variable (called Class). The item contains one Task (defined in the Task-Editor, see subsection 3.6). The Task-Editor is also used to define the scoring (see Figure 5.3).
Class: Scoring is defined using nominal Hit-conditions. Each condition corresponds to a potential value of a variable. To specify the relationship of values to variables, Hit-conditions (i.e., values) are assigned to Classes (i.e., Classes are equivalent to variables in the final data set), as shown in Figure 5.4. The variable
Score can have the three potential values
Name: Each hit- and miss-condition requires a unique name, that is defined in the Task-Editor (shown in Figure 5.3). This name represents the nominal value of the variable, if the corresponding condition is met (i.e., if the hit is active).
Condition Syntax: Each Hit-condition is defined by providing a scoring syntax. The example item shown in Figure 5.2 contains three different Hits, and the syntax for the conditions is shown in the editor for Conditions in Figure 5.5. The first condition for the hit
Correct combines the
User Defined Id's of the four
Checkboxes with logical operators (using the CBA ItemBuilder specific bracketing of expressions, see section 4.1.3).
However, the syntax for scoring rules (see section 5.3) also allows using scoring operators, for instance, to incorporate information from the dynamic part of CBA ItemBuilder tasks. Operators are illustrated in Figure 5.5 in the syntax for the hit-condition for
Wrong, which evaluates to true if the item’s current state is
Answered. Note that the state
Answered is implemented using a simple finite-state machine (see section 4.4 for details).
Use first active hit per class (applies to all tasks). (see right part of Figure 5.5).
Scoring Debug Window: The Scoring Debug Window (already introduced in section 1.5) can be used to explore the scoring of the example item shown in Figure 5.2, as shown in Figure 5.6. The Scoring Debug Window can be requested during item development in the Preview and is also available in the examples embedded in the online version of this book.
Multiple Classes: If components are to be used only in a particular way to form outcome variables, defining scoring constraints may be more onerous than strictly necessary (see section 5.2 for an alternative). The full potential of CBA ItemBuilder scoring unfolds in the use cases when different summaries of answers to variables are to be used. This is illustrated in Figure 5.7 for a simple Likert-style item. Assume that two variables should be created: One variable containing the response (Class: Response) and one indicating agreement or disagreement [Class: Style, as used, for instance, in models to investigate response style; cf. Böckenholt and Meiser (2017) and others].
By introducing the layer of hit definitions (described in detail in section 5.3), the CBA ItemBuilder allows the creation of flexible scoring for responses that can combine multiple components and can result in multiple variables (i.e., classes). Use cases not only include questionnaires (as the example in Figure 5.7) but can also be found for cognitive assessment, e.g., when both the raw response and the automatically scored response (correct vs. incorrect) are to be stored or when dichotomous and polytomous scoring is to be considered. Use cases for explicit scoring with multiple classes also arise if, for example, time measures are included in the scoring of responses.
Result-Text: Hit- and miss-conditions can be used to define evidence in terms of categorical values, which can then be assigned to classes to be used as outcome variables. However, entered text and numbers are also required as result variables. To copy text responses to result variables, the CBA ItemBuilder provides the
result_text()-operator. The underlying idea is that each class (i.e., variable) can provide a Result-Text in addition to the name of the active hit. The condition defines which particular value is used as Result-Text. The (first) active hit-/miss-condition of a class defines which text is copied into the Result-Text.
The item shown in Figure 5.8 illustrates the use of the Result-Text. The first class (
Var1) is used for question 1: Class
Var1 has only one hit-condition with the syntax
result_text(input1). This condition is always true, and whatever is entered in the
SingleLineInputField with the User-Defined Id
input1 is copied to the Result-Text for
Var1. For question 2, the class
Var2 is used with two hit conditions. When a text is entered into the
InputField with the User-Defined Id
input2 (i.e., the text is not empty checked with the condition
matches(input2,"")), the value is copied to the Result-Text using the operator
result_text(input2) in the condition
Q2_Text. When nothing is entered, the Result-Text is filled with the string
Missing (see hit-condition
Q2_Missing). The class
Var3 is used for question 3. The class contains either the selected option (A or B) in the Result-Text (see hit-conditions
Q3_B). If neither
Other is selected, the string
Missing is copied to the Result-Text (see hit-condition
Q3_Missing). Two hit-conditions are defined that deal with conditions that
Other is selected. If no text is entered into the
SingleLineInputField with the User-Defined Id
input3, the text
Other: Not Specified is copied to the Result-Text (see hit-condition
Q3_OtherNotSpecified). If a text is entered, the Result-Text is filled with the string
Other: followed by the provided text. This is achieved by using an argument list for the
result_text()-operator (see 4.1.5). Note that
Var3 will not contain the text entered into
B is selected. This issue is addressed by defining
Var4 that contains the text entered in
input3, even if
Other is not selected.
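The logic described for class Var2 can be sketched in Python as a pair of hit name and Result-Text. This is a minimal sketch: the function name `score_var2` is hypothetical, and the actual behavior is defined by the hit-conditions Q2_Text and Q2_Missing in the Task-Editor.

```python
def score_var2(input2: str):
    """Return (active hit, Result-Text) for question 2."""
    if input2 != "":
        # Text was entered: the entered text becomes the Result-Text.
        return ("Q2_Text", input2)
    # Nothing was entered: the fixed string "Missing" is used instead.
    return ("Q2_Missing", "Missing")
```

The sketch returns exactly one hit per call, which mirrors the requirement that one hit is active for each class.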
Using items provided by Toplak, West, and Stanovich (2014) the item in Figure 5.9 illustrates scoring using variables. In this example, the finite-state machine updates variable values, designed to allow immediate feedback (correct response, intuitive incorrect responses, and any other wrong response) and to compute the total score for all seven items of the Cognitive Reflection Test.
Checkboxes) and groups (e.g.,
FrameSelectGroups) using FSM Variables (instead of so-called Hit-/Mis-conditions).
An important part of possible scoring rules are input elements, i.e., components for the design of items that have a value (i.e., are either selected or unselected). To refer to the value of a component in a Hit- or Miss-condition, it is sufficient to include the
UserDefinedId of the respective component into the condition-syntax. For instance, for a
checkbox with the
UserDefinedId: myCheckbox, the string literal
myCheckbox is interpreted as
TRUE if the
checkbox is selected, when the syntax is evaluated. If the
checkbox is not selected, the string literal
myCheckbox is interpreted with the value
CheckBox - components, the selected/unselected state of
RadioButton - components, the toggle state of Buttons in toggling mode and the selected/unselected-state of
ComboBoxItem in a
ComboBox can be used to define Hit- or Mis- conditions by simply referring to the
UserDefinedId of the component.
The item scoring mechanism implemented in CBA ItemBuilder goes beyond simple mapping of Scoring Conditions (i.e., hit- and miss conditions) to component states. This is enabled by providing the possibility to formulate conditions as arbitrary combinations of statements using a so-called Domain Specific Language (DSL, i.e., by using a specific syntax).
UserDefinedIds of the components to logical expressions, the following logical operators can be used:
A and B:
A or B:
Flexible combinations of conditions are possible with the basic operators
not. Use the Scoring Debug Window (
Ctrl / Strg + S, see section 1.5) to explore the hit conditions in the item shown in Figure 5.10.
Notice the specific bracketing in the last hit condition shown in Figure 5.10:
(((A and B) and C) and D). This condition illustrates that combining multiple Boolean expressions requires brackets so that the statement can be decomposed into pairs:
(A and B) and
C, and finally
((A and B) and C) and
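The pairwise bracketing can be generated mechanically. The following Python sketch (the helper `bracket_pairs` is hypothetical and not part of the CBA ItemBuilder) folds a list of operands from left to right so that every bracket contains exactly two parts.

```python
from functools import reduce


def bracket_pairs(operands, operator="and"):
    """Combine operands left to right so each bracket holds exactly two parts."""
    if not operands:
        raise ValueError("at least one operand is required")
    # reduce() nests the expression built so far as the left part of the next pair.
    return reduce(lambda left, right: f"({left} {operator} {right})", operands)


# bracket_pairs(["A", "B", "C", "D"]) returns "(((A and B) and C) and D)"
```

Such a helper can be useful when long conditions are assembled from lists of User Defined Ids outside the editor and then pasted into the condition syntax.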
For a number of scoring tasks, simply checking the Boolean value of components is not sufficient. Therefore, the CBA ItemBuilder provides functions in the scoring syntax (so-called operators), which can be used within the scoring syntax to take into account properties of the current task for the formulation of hit and miss conditions.
By default (i.e., when not configured differently), Hit- and Miss-conditions are evaluated independently. If variables are created, i.e., hits are assigned to classes, a central condition must be met: at any time, precisely one hit must be active for each class. This condition follows directly from using hits as (categorical) values for variables. Consequently, hit conditions within a class must always be formulated in such a way that they are mutually exclusive. To support checking this condition, the CBA ItemBuilder's Preview of tasks provides the Scoring Debug Window, which shows a red exclamation mark once multiple hits are active for a class (see Figure 5.10).
However, a powerful alternative is to activate the sequential evaluation of scoring conditions in the Task-Editor by selecting the checkbox
Use first active hit/miss per class (applies to all tasks). If this option is activated, the evaluation is performed sequentially, starting with the first hit condition of a class. Only if the first hit is not
true, the second hit is evaluated. Accordingly, a last hit (when no other conditions evaluate to
true) can be added, for instance, to simplify missing value coding (see the item shown in Figure 5.2 as an example).
Use first active hit/miss per class (applies to all tasks) is not activated.
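The difference between independent and sequential evaluation can be sketched in Python. This is a hedged illustration of the principle only: the function `evaluate_class`, the representation of hits as (name, condition) pairs, and the example conditions are assumptions, not the CBA ItemBuilder implementation.

```python
def evaluate_class(hits, state, use_first_active=True):
    """Return the names of the active hits of one class.

    hits: ordered list of (name, condition) pairs; condition is a callable.
    """
    active = []
    for name, condition in hits:
        if condition(state):
            active.append(name)
            if use_first_active:
                break  # sequential mode: later conditions are not evaluated
    return active


# Example class with a catch-all last hit for missing responses.
score_hits = [
    ("Correct", lambda selected: selected == {"Jaguar", "Panda"}),
    ("Wrong", lambda selected: len(selected) > 0),
    ("Missing", lambda selected: True),
]
```

With sequential evaluation the conditions do not have to be mutually exclusive, because only the first matching hit counts. Without it, the same conditions yield several active hits for a correct response, which is the situation the Scoring Debug Window flags.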
Text responses can be automatically scored inside the CBA ItemBuilder using keywords or patterns. The provided
matches()-operator takes two arguments: The
UserDefinedId of the component used to collect the text response (see section 3.9.1) and a regular expression (see section 6.1 for details).
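Scoring a text response with regular expressions can be sketched in Python. The sketch assumes that a pattern must match the complete response. Whether the CBA ItemBuilder operator matches the whole text or only a part of it is not stated here, and its regular expression dialect may differ from Python's. The example question and the patterns are hypothetical.

```python
import re


def matches(response: str, pattern: str) -> bool:
    """Assumption: the pattern has to match the complete response."""
    return re.fullmatch(pattern, response) is not None


def score_capital(response: str) -> str:
    """Score a hypothetical question asking for the capital of Germany."""
    if matches(response, r"\s*[Bb]erlin\s*"):
        return "Correct"
    if matches(response, r"\s*"):
        # Empty or whitespace-only input.
        return "Missing"
    return "Wrong"
```

Two short patterns are combined by ordinary program logic here, instead of one complex regular expression that covers all cases.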
Note that the logical operators
not can be combined with several
matches()-operators and other conditions. Hence, there is no need to formulate too complex regular expressions since multiple expressions can be combined using multiple
The value of FSM-Variables can be used within scoring rules (i.e., Hit- and Mis-conditions). This is achieved using the
An example of using the
variable_in()-operator is provided in Figure 5.12 for the scoring of a Drag-and-Drop response format, implemented using FSM Variables (see section 4.2.6 for the implementation of Drag-and-Drop).
visited_all_values_of_variable()-operator can be used to check whether a variable has taken one or more concrete values in the course of test-taking (see Figure 5.13 for an example):
As described in section 4.2.6, the CBA ItemBuilder supports free drag and drop. The
panel_position_range()-operator can be used to score the position of drag-and-drop elements (see Figure 5.14 for an example):
[CheckNonMembers], XStart, XEnd, YStart, YEnd,
panel_position_range(Container, Center, Component, Component, ...)
The operator evaluates to
true if the (X,Y) positions of all given
Components in the given
Container are within the range given by
YEnd relative to the container’s (X,Y) position. If the flag
CheckNonMembers is not given or set to
true, the operator only evaluates to
true if the (X,Y)-positions of all other components in the given
Container are outside the given range. The upper left corner of the component is used as (X,Y)-position of a
Component if the flag
Center is not provided as
As an alternative to the position of drag-and-drop elements, the distance between elements can also be scored. The
panel_distance_range()-operator returns true if the mutual distances of all given
Components in the given
Container are within the given range between
panel_distance_range(Container, MaxDistance, Center, Component, Component, ...)
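The logic of the two operators can be sketched in Python under stated assumptions. The function signatures differ from the operator syntax, positions are passed as a dictionary of (X,Y) coordinates relative to the container, range boundaries are treated as inclusive, and distances are Euclidean. None of these details are confirmed by the text.

```python
from itertools import combinations
from math import hypot


def panel_position_range(positions, members, x_start, x_end, y_start, y_end,
                         check_non_members=True):
    """positions: dict mapping component name to (x, y) relative to the container."""
    def inside(pos):
        x, y = pos
        return x_start <= x <= x_end and y_start <= y <= y_end

    # All given components must lie within the range.
    if not all(inside(positions[name]) for name in members):
        return False
    if check_non_members:
        # All other components in the container must lie outside the range.
        return not any(inside(pos) for name, pos in positions.items()
                       if name not in members)
    return True


def panel_distance_range(positions, members, max_distance):
    """True if every pair of the given components is at most max_distance apart."""
    points = [positions[name] for name in members]
    return all(hypot(ax - bx, ay - by) <= max_distance
               for (ax, ay), (bx, by) in combinations(points, 2))


example_positions = {"card1": (10, 10), "card2": (30, 20), "card3": (200, 200)}
```

For instance, `panel_position_range(example_positions, ["card1"], 0, 50, 0, 50)` is false in this sketch because `card2` is a non-member inside the range, while the same call with `check_non_members=False` is true.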
The CBA ItemBuilder provides various operators to incorporate events and the number of interactions into scoring conditions.
Number of Events: The number of events that have been raised during the execution of the current task can be used in scoring conditions. The CBA ItemBuilder considers an event to be raised even if it did not trigger a transition, and the count includes events raised by the
If only the number of specific events should be counted, the
raised_nb_events()-operator can be used:
An even more advanced version of the
raised_nb_events()-operator exists, which can be used to count how often one or multiple events were raised while the item was in a particular state:
Indicators for Events: In addition to the operators that count the events (of a particular type / within states), operators exist to check if an event was triggered. These operators evaluate to
true (instead of returning the frequencies). The
raised_all_events(EventA, EventB) returns
true if all events listed in the set of events (e.g.,
EventB) were raised:
Again, a more advanced version of the
raised_nb_events_in_state()-operator exists, which can be used to check if one or multiple events were raised while the item was in a particular state:
The following item shown in Figure 5.15 illustrates the use of the event-related operators.
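The counting and indicator logic of the event-related operators can be sketched in Python. The sketch assumes a log of (event, state) pairs. The Python function names mirror the operator names for readability, but the argument order and the log format are hypothetical.

```python
def raised_nb_events(log, *events):
    """Count how often any of the given events was raised."""
    return sum(1 for event, _state in log if event in events)


def raised_nb_events_in_state(log, state, *events):
    """Count the given events raised while the item was in the given state."""
    return sum(1 for event, current in log
               if current == state and event in events)


def raised_all_events(log, *events):
    """True if each of the given events was raised at least once."""
    raised = {event for event, _state in log}
    return all(event in raised for event in events)


# Hypothetical log: each entry is (event name, state in which it was raised).
example_log = [
    ("ButtonClick", "Start"),
    ("ButtonClick", "Answered"),
    ("Timeout", "Answered"),
]
```

The counting functions return frequencies, while the indicator function returns a Boolean value, which corresponds to the distinction made between the two groups of operators.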
Number of State Visits: The
visited_nb_states()-operator returns the number of visits for a set of states during the execution of the current task:
visited_nb_states(State, State, ...)
Indicators State: The
true if the last state of the finite-state machine is one of the given states in the
is_last_state()-operator refers to the last state of the finite-state machine, the
visited_all_states()-operator can be used to check if all states listed in the
SetOfStates were visited during the execution of the current task:
Number of Interactions: A simple generic operator is provided that counts the number of user-interactions within the current task:
Note that this operator counts the total number of interactions within the running task. Counting specific interactions in FSM variables is possible using the finite-state machine (see section 4.4).
Elapsed Time: Another generic operator is provided that measures the elapsed time in the current task:
elapsedTime() counts the total time in the current task. Measuring more specific time intervals is possible using finite-state machines (see section 4.4.6 and the example provided in Figure 4.60).
Tree Components: The scoring of response formats created using components of type
TreeChildArea (see section 3.9.9) is supported with the following operators:
- The operator
current_node() can be used to check if in a particular
RegularExpression matches the node path ID of the current node:
- The operator
exists_nodes() returns the number of nodes in the given
Tree whose node path ID matches at least one of the given
RegularExpressions (each node counts once only):
exists_nodes(Tree, RegularExpression, RegularExpression, ...)
- The operator
visited_nodes() returns the number of visited nodes in the given
Tree whose node path ID matches at least one of the given
RegularExpressions (each node counts once only):
visited_nodes(Tree, RegularExpression, RegularExpression, ...)
- The operator
matches_nodes() returns the number of nodes in the given
Tree whose node path ID matches the
NodeIdPattern and whose column values match the specified
ColumnPatterns. The first
ColumnPattern corresponds to the node name, the second
ColumnPattern to the first additional column, etc. (each node counts once only).
matches_nodes(Tree, NodeIdPattern, ColumnPattern, ColumnPattern, ...)
Pages: An operator
current_page() is provided to check if a specified
Page is currently displayed (or is displayed within the specified
For browser-components that support the bookmark function (see section 3.13.2) the following operator can be used to check if a page was bookmarked:
Spread Sheets: The following operator scores the value (or the result of the computed formula) entered in a spreadsheet table with a given
UserDefinedId (see section 3.9.8) as an integer value:
integer_value(UserDefinedId, RoundingMode, Default)
RoundingMode can take the values
half_down. If the text content is empty or does not represent a number, the
Default value is returned.
To score the entered formula (instead of the value), the
matches()-operator provides the additional argument
Selector. If the value
formula is requested, the operator evaluates the formula text of a spreadsheet table cell (instead of the formula value):
matches(Component, RegularExpression, Selector)
Highlighting: The following operators are provided to score the response format of multiple text highlighting (see section 3.8.3):
highlighted(RichText, RichText, ...)
complete(Selection, Selection, ...)
partial(Selection, Selection, ...)
PageAreas (see section 4.1.4) can be used to embed existing pages as content when designing pages. The CBA ItemBuilder allows that identical content can be re-used multiple times in different PageAreas on a single page, as illustrated in Figure 5.16. It is therefore generally necessary to add the
UserDefinedId of the PageArea to all references to components displayed within PageAreas.
As shown in Figure 5.8, the CBA ItemBuilder integrates the handling of numerical and string responses into the Scoring Rules (i.e., Hit- and Miss-conditions) using the
result_text()-operator. For each class, the active hit is determined first. If the option
Use first active hit/miss per class (applies to all tasks) is activated (see section 5.3.3), this is the first condition within a class that applies. Otherwise item authors need to make sure that all conditions are mutually exclusive within each class. If the active hit contains a
result_text()-operator, numerical or text input is provided as Result-Text.
The following example shows how to define scoring for single-choice and multiple-choice items, including hits for not reached items and omitted responses. For this purpose, a variable is defined in the finite-state machine that counts how often a page was visited.
Items without a response on a page that was not visited are coded as not reached (NR); missing responses on visited pages are coded as omitted response (OR). For more details, see the item shown in Figure 5.17 and use the Scoring Debug Window (as described in section 1.5).
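The coding rule for missing responses can be sketched in Python. In the item itself, the number of page visits is counted by an FSM variable. The function `code_response` and its arguments are hypothetical names for illustration.

```python
def code_response(response, page_visits):
    """Return the response, or a missing code if no response was given.

    response: the given answer, or None if nothing was selected.
    page_visits: how often the page containing the item was visited.
    """
    if response is not None:
        return response
    # No response: distinguish by whether the page was ever shown.
    return "NR" if page_visits == 0 else "OR"
```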
The CBA ItemBuilder runtime will create some selected variables automatically:
- reactionTime: Time (in milliseconds) between the start of the task execution and the first user interaction.
- execTime: Time (in milliseconds) since the start of this task execution.
- nbInteractions: Number of user interactions since start of the current task execution.
Deployment software for CBA ItemBuilder tasks (see chapter 7) can use identical tasks multiple times and can allow tasks to be revisited. For that purpose, the runtime also computes the following cumulative variables:
- reactionTimeTotal: Accumulated time (in milliseconds) between the start of the task execution and the first user interactions in previous executions of the task (excluding the last execution, that can be found in the variable reactionTime).
- execTimeTotal: Accumulated time (in milliseconds) in previous executions of the task (excluding the last execution, that can be found in the variable execTime).
- nbInteractionsTotal: Accumulated number of user interactions in previous executions of the task (excluding the last execution, that can be found in the variable nbInteractions).
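Because the cumulative variables exclude the last execution, a value across all executions of a task is the sum of the cumulative variable and its counterpart for the last execution. A minimal Python sketch, assuming the result data is available as a dictionary (the function name and the output keys with the suffix `_all` are hypothetical):

```python
def totals_across_executions(result):
    """Add the last execution to the accumulated previous executions."""
    pairs = [
        ("reactionTime", "reactionTimeTotal"),
        ("execTime", "execTimeTotal"),
        ("nbInteractions", "nbInteractionsTotal"),
    ]
    return {f"{last}_all": result[total] + result[last]
            for last, total in pairs}
```

For a task that was executed only once, the cumulative variables would be expected to be zero, so the sums equal the values of the single execution.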
If only dichotomous scoring is required for the complete Task, the CBA ItemBuilder implements a simple approach.
- MinHits: For each Task, it can be defined how many Hit-conditions must be fulfilled so that the task is scored as
- Weight: Each Hit- and Miss-condition is assigned a Weight.
- Class: Each Hit- and Miss-condition is assigned to a Class, and each Class contains either Hit-conditions or Miss-conditions.
Each task provides the following results:
- result: Overall result (\(1\) if at least the number of hits defined in the property MinHits is active, \(0\) otherwise).
- nb_Hits: Number of (active) hits.
- Hit_weight: Total weight of hits.
- nb_Misses: Number of (active) misses.
- Miss_weight: Total weight of misses.
- credit_Class: Name of the Class with the highest value. The value is computed as the sum of weight for all active Hits (in classes with Hits) or all active Misses (in classes with Misses).
- credit_weight: Weight of the class with the highest class weight.
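How these results follow from the active conditions can be sketched in Python. The data structure (a list of dictionaries with the keys `kind`, `weight`, `class`, and `active`) is an assumption for illustration. The handling of ties between classes is not specified in the text, and this sketch simply takes the first class with the highest weight.

```python
def task_results(conditions, min_hits):
    """Compute the task-level results from a list of scoring conditions."""
    active = [c for c in conditions if c["active"]]
    hits = [c for c in active if c["kind"] == "hit"]
    misses = [c for c in active if c["kind"] == "miss"]

    # Sum the weights of active conditions per class; each class holds
    # either hits or misses, so one sum per class is sufficient.
    class_weights = {}
    for c in conditions:
        class_weights.setdefault(c["class"], 0)
        if c["active"]:
            class_weights[c["class"]] += c["weight"]
    # Assumption: ties are resolved in favor of the first class encountered.
    credit_class = max(class_weights, key=class_weights.get) if class_weights else None

    return {
        "result": 1 if len(hits) >= min_hits else 0,
        "nb_Hits": len(hits),
        "Hit_weight": sum(c["weight"] for c in hits),
        "nb_Misses": len(misses),
        "Miss_weight": sum(c["weight"] for c in misses),
        "credit_Class": credit_class,
        "credit_weight": class_weights.get(credit_class),
    }
```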
As a summary the following list describes the typical workflow that is required for implementing automatic scoring in the CBA ItemBuilder:
Prepare the implementation of scoring by defining explicit User Defined IDs (see 3.7.4) for all components that should be used for scoring. It is not possible to define scoring using the automatically generated User Defined IDs that start with a
Define a task as an entry point for the CBA ItemBuilder project. Since the scoring definition is done per task, a task must always be defined first (see section 3.6 for details).
When tasks are defined, define Classes. Define one Class for each variable that should be included in the result data for a particular Task. For all newly created items, activate the option
Use first active hit per class (applies to all classes).
Define Hit-Conditions, that evaluate to
true if the required conditions are fulfilled (see section 5.3.2). Order the Hit-conditions and add a default condition with the hit-syntax
true as the last condition. This will ensure that each class has one active hit (see section 5.3.3). For a usual workflow, Miss-conditions are not needed, and neither are Weights.
Extract string information using the
result_text()-operator, if necessary. While hits serve as values of categorical variables, the Result-Text can be used to capture numerical or text responses.
Assign hits to classes. Classes fulfill the function of variables in the scoring of CBA ItemBuilder projects. Besides the unique name of the class (variable name), a description of the class can be entered in the Class Comment (variable description).
Decide how to handle missing responses and implement, if necessary, additional Hit-Conditions for omitted responses and not reached questions (see section 5.3.11).
Test the scoring implementation using the Scoring Debug Window. If the option
Use first active hit per class (applies to all classes) was not activated, make sure that exactly one hit (or miss) is active for each class at any point in time (see 8.4.2).