Date of Publication: December 2000 CYFERNet For Professionals

Section 7: Measuring Outcomes

The Importance of Measurement

Reliability and Validity

Accurate and precise measurement of program goals is a key part of establishing evidence of a program's effectiveness. To a great extent, accuracy is determined by the validity and reliability of the measures used. Reliability is the consistency of a measure: the extent to which repeated use of the same measure yields the same results. Validity is the ability of a measure to represent the concept of interest. It is a complex concept that is mainly concerned with three key questions about the properties of a measure:

  • Does the measure seem to make sense? (face validity)
  • Does it agree with other related measures? (criterion validity)
  • Does it predict the outcome of interest? (predictive validity)
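
Reliability, as defined above, can be illustrated with a test-retest check: the same measure is given twice to the same participants and the two sets of scores are correlated. The sketch below is a minimal illustration, not a prescribed procedure. It assumes a Pearson correlation is an acceptable index of repeatability, and the function name `pearson_r` and all scores are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of the deviations.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cov / sqrt(ss_x * ss_y)


# Hypothetical scores for the same five participants on two occasions.
time_1 = [12, 18, 9, 22, 15]
time_2 = [13, 17, 10, 21, 16]

test_retest_r = pearson_r(time_1, time_2)
print(round(test_retest_r, 3))
```

A value close to 1 would suggest that participants keep roughly the same relative standing across the two administrations. How high the value needs to be is a judgment that depends on the measure and the interval between administrations.
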

Technical Tip 7.1: Quantitative Measurement

Measurement is a means of determining the value or level of some phenomenon in a quantifiable way. For example, many measures of families and family characteristics, such as family violence, take the form of scales derived by adding up a number of possible behaviors. Typically, these are continuous scales that have a range of possible values. Other measures are dichotomous, meaning they have only two possible values, as when a behavior is either present or absent. For statistical purposes, the presence of a behavior is assigned the value "1" and its absence the value "0". For example, a family in which any wife abuse had occurred would get a score of "1" on this variable; if no wife abuse had occurred, the score would be "0".
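
The two kinds of scoring described in this tip can be sketched in a few lines. This is a minimal illustration under stated assumptions: the item responses are hypothetical frequency ratings, and the function names `scale_score` and `dichotomize` are invented for the example rather than taken from any particular instrument.

```python
def scale_score(item_responses):
    """Scale score formed by adding up the responses to each item."""
    return sum(item_responses)


def dichotomize(behavior_occurred):
    """Code presence of a behavior as 1 and absence as 0."""
    return 1 if behavior_occurred else 0


# Hypothetical ratings for five behaviors (0 = never occurred).
responses = [0, 2, 1, 0, 3]

total = scale_score(responses)  # sum across the five items
any_occurrence = dichotomize(any(r > 0 for r in responses))

print(total, any_occurrence)
```

The summed score keeps information about how much of the behavior was reported, while the dichotomous variable records only whether any occurred.
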

Target Population

Results obtained with standardized measures have greater acceptability because such measures have undergone extensive testing across different population groups to determine their validity and reliability (Mika, 1996). The extent to which a particular measure is appropriate to the program's population is also important. Manuals for measures generally describe the types of populations on which the measure has been tested. This can be an important criterion in selecting a valid instrument for the program under evaluation, because norm-referenced measures allow the target population's scores to be compared with those of a similar reference group. For example, an instrument tested or normed only on college undergraduate psychology majors may not produce reliable results if the program's target population consists mainly of teenage mothers who have not attended college. Similarly, an instrument normed on a lower socio-economic group of urban unwed mothers may not be appropriate for the general military population served by USAF family violence prevention programs.
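
One common way to make a norm-referenced comparison is to express a participant's raw score as a standard (z) score relative to the reference group. The sketch below assumes the instrument's manual reports a mean and standard deviation for the reference group. The norm values shown are hypothetical, and the function name `z_score` is invented for the example.

```python
def z_score(raw_score, norm_mean, norm_sd):
    """Distance of a raw score from the reference-group mean, in SD units."""
    if norm_sd <= 0:
        raise ValueError("the reference-group standard deviation must be positive")
    return (raw_score - norm_mean) / norm_sd


# Hypothetical norms; real values would come from the instrument's manual.
NORM_MEAN = 50.0
NORM_SD = 10.0

participant_z = z_score(65, NORM_MEAN, NORM_SD)
print(participant_z)
```

The comparison is only as meaningful as the match between the reference group and the program's target population, which is the point of the examples above.
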

Acceptability

The integrity of measurement is also influenced by the manner in which the measures are completed and collected, the attitude of staff toward the measures, and the willingness of clients to complete them. Thus, factors other than the scientific properties of a measure need to be considered, including the following:

  • Time to administer
  • Clarity of directions
  • Language of instrument
  • Sensitivity of questions
  • Scoring ease

If problems with any of the above are present (e.g., data are lost due to high rates of refusal), then the validity and reliability of the measurement results are in question. The quality assurance procedures in place for programs should include periodic monitoring of how measurements are conducted and of how program participants and staff regard the measurement process.

Enhancing Outcome Measurement

Qualitative data are also important in enhancing the evaluator's understanding of program outcomes (see also Sections 5 and 6 for discussions of qualitative data). Qualitative data can add another dimension to program evaluation by allowing participants to elaborate on their experiences and on the process of change. Mika (1996) provides an example of how qualitative and quantitative measures can complement one another:

Example 7.1: Use of Qualitative Measures in an Evaluation of a Parenting Program:

"The quantitative measure includes a standardized measure of the parenting skills taught in the program. The qualitative measure asks participants to describe how the program has helped them become better parents, what they continue to have difficulties with, and what they have learned in the program that they use now at home with their children." (Mika, 1996, p.62)

Operational Definition

The ability of a measure to mirror the program goal is of primary importance. The more precisely a goal is stated (its operational definition), the easier it is to identify a measure that fits the goal well. In addition, one needs to determine which dimension of family functioning is being assessed. An elaborated logic model can assist the evaluator in conceptualizing linkages between program goals, activities, and outcomes (see Section 2, subsection "Specifying The Prevention Mechanisms") and in working through the job of converting goals to outcomes. For example, one goal of the USAF Family Advocacy Programs is to enhance family functioning. In turn, good family functioning could be operationalized as a reduction in parental stress and enhanced parent-infant attachment.
