Combine weighted scores into a measures of student learning rating
By assigning a weight to each score from the multiple measures in educator evaluations, districts signal which measures they consider more valuable, better aligned with learning goals, or more appropriate for measuring educator impact; assigning equal weights signals that all results count the same. After each measure of student learning is scaled (e.g., on a 0-3 scale), the next step is to assign a final weight to each measure and apply an approach for calculating a total score for measures of student learning. Districts may wish to assign preliminary weights to each measure as it is selected at the beginning of the school year. Districts are encouraged to continuously evaluate the impact of weighting decisions and make revisions as needed.
For example, districts may want to assign more weight to:
- Outcomes from measures deemed to be of higher technical quality
- Outcomes reflecting collective efforts from a team of teachers (note that the statute and rules do not specify a minimum weight for either individual or collective attribution measures but do suggest that each must have a “measurable influence”)
- Outcomes from measures deemed by district stakeholders to have higher value for teachers
Although districts can decide how to weight the scores from each of the multiple measures, the weighting percentages they select must sum to 100 percent. Multiplying the score earned on each measure by its assigned weight yields the weighted score for that measure. The composite score in this example represents a compensatory approach, which was selected as a design choice to ensure that each measure included in an educator’s body of evidence can have a measurable influence on the measures of student learning score. Table 4 illustrates how districts may consider distributing the weights assigned to each score for their teachers, and how a single index score is computed.
Table 4: Weighting and combining scores example (refer to Tables 2 and 3 for possible scores)
|Measures/Results from Colorado Growth Model and Student Learning Objectives||Score Earned||Weight Assigned||Weighted Score|
|TCAP Reading MGP (collective school)||2||.15||.3|
|TCAP Writing MGP (collective school)||2||.15||.3|
|SLO 1 Results (collective grade-level reading)||2||.35||.70|
|SLO 2 Results (individual teacher)||1||.35||.35|
In this example, the district is assumed to have agreed to attribute Colorado Growth Model results from reading and writing (total of six points possible) to all teachers in the school. In addition, all teachers have two measures based on targets, yielding two scores (total of six points possible) for attainment of expected targets. The first column in Table 4 lists each measure that is included. The second column reflects the rating earned - Much Less than Expected (zero points), Less than Expected (one point), Expected (two points) and More than Expected (three points) - by a hypothetical teacher for whom all these measures are relevant to his/her goals. To assign weights, a district can allocate a smaller or larger percentage to each measure, provided that the weights across all measures sum to 1 (or 100 percent), as shown in the third column. In this example, the district has decided that each SLO result and the combined set of TCAP growth results should carry about the same weight: each SLO result has a weight of .35, and the two TCAP growth scores together have a total weight of .30. The fourth column shows the weighted scores, computed by multiplying the score earned for each measure (column 2) by its assigned weight (column 3). Summing the weighted scores (.3 + .3 + .70 + .35) gives a raw score for measures of student learning of 1.65.
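The arithmetic behind Table 4 can be sketched in a few lines of Python. This is only an illustration of the compensatory calculation described above, not the model system's own tool: the measure names, scores, and weights are copied from Table 4, while the function name `composite_score` and the validation checks are assumptions added for the example.

```python
import math

# (measure, score earned on the 0-3 scale, weight assigned), copied from Table 4
measures = [
    ("TCAP Reading MGP (collective school)", 2, 0.15),
    ("TCAP Writing MGP (collective school)", 2, 0.15),
    ("SLO 1 Results (collective grade-level reading)", 2, 0.35),
    ("SLO 2 Results (individual teacher)", 1, 0.35),
]


def composite_score(measures):
    """Return the sum of score * weight (hypothetical helper, not a CDE function)."""
    total_weight = sum(weight for _, _, weight in measures)
    # Weights must sum to 1 (100 percent); isclose tolerates floating-point error.
    if not math.isclose(total_weight, 1.0):
        raise ValueError(f"weights sum to {total_weight}, expected 1.0")
    for name, score, _ in measures:
        if not 0 <= score <= 3:
            raise ValueError(f"score for {name!r} is outside the 0-3 scale")
    return sum(score * weight for _, score, weight in measures)


raw_composite = composite_score(measures)
print(round(raw_composite, 2))  # raw composite for the Table 4 example
```

Changing a weight in `measures` and rerunning the sketch is a quick way to see how a weighting decision moves the raw composite.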
In the Colorado State Model Evaluation System, each educator also earns a professional practices rating based on their performance on the model system rubric (combining the other five Quality Standards). The ratings for professional practices and student learning can be combined only if they are on the same scale, so the model system converts both the professional practices rating and the measures of student learning rating to a 0-540 scale. The number 540 was selected based on the number of elements included in the professional practices Quality Standards for teachers: there are 27 elements across five standards, and 540 is divisible by 27, so if a district weights the elements equally, each has an equal number of points possible without decimals. To combine the professional practices score with the measures of student learning score so that each represents 50 percent of an educator’s evaluation, the measures of student learning score is converted to an index score on the 540-point scale as described below. For information on the approach and method used in the model system, download the appropriate document:
- Determining a Final Educator Effectiveness Rating for Teachers
- Determining a Final Educator Effectiveness Rating for Principals/Assistant Principals
Calculating a measures of student learning score
The sum of all weighted scores (1.65) in Table 4 represents the composite student learning score earned by a teacher. Table 5 is used to translate the composite score into a qualitative judgment about student learning for a given teacher. The cut points in Table 5 for raw composite scores are based on scores of 0 for much lower than expected, 1 for lower than expected, 2 for expected, and 3 for higher than expected. Rounding a composite score to the nearest whole number places it in one of the four categories shown. Fractional composite scores arise when a teacher has multiple assessment results that are weighted and combined.
Table 5: Cut points for composite measures of student learning scores
|Composite Rating||Much Lower than Expected||Lower than Expected||Expected||Higher than Expected|
|Total Raw Composite||0.0 to 0.49||0.50 to 1.49||1.50 to 2.49||2.50 to 3.0|
In Figure 2 the raw composite score of 1.65 is converted to a measures of student learning score between zero and 540. The measure of student learning score will be added to an educator’s professional practices score in order to determine an overall effectiveness rating.
Figure 2: Illustration of calculating a student learning score
Table 6 describes the method for converting the measures of student learning raw composite score into a measure of student learning score. Note: the model system Excel rubrics will do this math for users.
Table 6: Rules for converting a Measure of Student Learning raw score to the 540 point scale
|Measures of Student Learning Raw Composite Score||Formula for computing a Measures of Learning Score|
|Much Lower than Expected (0 <= score < .5)||(score – .0) * 270|
|Lower than Expected (.5 <= score < 1.5)||(score – .5) * 135 + 135|
|Expected growth (1.5 <= score < 2.5)||(score – 1.5) * 135 + 270|
|Higher than Expected growth (2.5 <= score <= 3.0)||(score – 2.5) * 270 + 405|
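The conversion rules in Table 6 amount to a piecewise linear function, which the following Python sketch makes explicit. It is a minimal illustration rather than the model system's Excel logic: the function name `to_540_scale` is hypothetical, the band labels follow Table 6, and the sketch assumes a raw score of exactly 0 belongs to the lowest band.

```python
def to_540_scale(raw):
    """Map a 0-3 raw composite to (band label, score on the 0-540 scale).

    Hypothetical helper that follows the four formulas in Table 6.
    """
    if not 0 <= raw <= 3:
        raise ValueError("raw composite must be between 0 and 3")
    if raw < 0.5:
        return "Much Lower than Expected", (raw - 0.0) * 270
    if raw < 1.5:
        return "Lower than Expected", (raw - 0.5) * 135 + 135
    if raw < 2.5:
        return "Expected", (raw - 1.5) * 135 + 270
    return "Higher than Expected", (raw - 2.5) * 270 + 405


band, score_540 = to_540_scale(1.65)  # raw composite from Table 4
print(band, round(score_540, 2))
```

For the Table 4 composite of 1.65, the third formula applies: (1.65 – 1.5) * 135 + 270 = 290.25, which falls in the Expected band. The four formulas meet at the cut points (135, 270, and 405), so the converted score rises continuously from 0 to 540.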
Note that an overly high weight or percentage attributed to collective attribution measures may decrease the ability at the school or district level to recognize high-performing teachers (who may be held back by the average) and to identify struggling teachers (who may be propped up by the average). Therefore, it is imperative that districts understand the importance of finding the right balance between weighting the measures that reflect teacher-level and collective attribution results.
To help districts visualize the impact that weighting has on the overall student learning score in the educator evaluation, CDE has developed a Measures of Student Learning Tool for use by personnel evaluation committees and educators. Districts can use the tool to explore the impact that varied weights assigned to different measures can have on an overall rating and use the tool to get feedback on weighting decisions.
Although this guidance describes compensatory approaches, which allow strong performance on some measures to compensate for weaker performance on others, districts may consider other approaches. For example, a district may want to use a conjunctive approach that requires a minimum threshold to be met on every measure before assigning a “passing” or “meeting” rating for teachers on student outcomes. A district electing to use a conjunctive approach is stating that it believes teachers should demonstrate a minimum level of proficiency on each measure being considered before earning a rating of “meeting expected student learning.” Using a conjunctive methodology would indicate high confidence in each measure’s technical validity and appropriateness for attributing the results to an individual teacher.
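The difference between the two approaches can be seen by applying both to the same set of scores. The Python sketch below is purely illustrative: the function names are hypothetical, the conjunctive minimum of 2 ("Expected") on every measure is an assumed threshold that a district would set for itself, and the compensatory cut of 1.5 is the lower bound of the Expected range used in this guidance. The scores and weights are those of the Table 4 example.

```python
def compensatory_meets(scores, weights, cut=1.5):
    """True if the weighted composite reaches the cut (1.5 = start of the Expected range)."""
    composite = sum(score * weight for score, weight in zip(scores, weights))
    return composite >= cut


def conjunctive_meets(scores, minimum=2):
    """True only if every measure reaches the minimum (assumed threshold, set by the district)."""
    return all(score >= minimum for score in scores)


# Scores and weights from the Table 4 example
scores = [2, 2, 2, 1]
weights = [0.15, 0.15, 0.35, 0.35]

print(compensatory_meets(scores, weights))  # composite of 1.65 reaches the 1.5 cut
print(conjunctive_meets(scores))            # the score of 1 on SLO 2 is below the assumed minimum
```

Under these assumptions the same teacher meets expectations with the compensatory rule but not with the conjunctive rule, because the single score of 1 can no longer be offset by the other measures.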
Tools/resources for completing Step 6:
- Measures of Student Learning Tool: This Microsoft Excel tool is one sample approach designed to help Colorado educators input the measures that will be used in their evaluations, see the impact of the weighting decisions for each measure, input the desired learning targets that are expected as a result of their instruction, and synthesize the evidence from multiple measures into one score that will be used in educator evaluation. It includes the requirements of S.B. 10-191 and the rationale for decisions made, and it creates sample graphics for various groups of teachers.