What is Measurement?
Measurement is simply the process of assigning numbers to characteristics according to rules.
What are the rules in psychological measurement?
- The construct (the idea or characteristic that explains patterns in behavior) must be defined. In other words, you must define what you are measuring. Examples of constructs include intelligence, anxiety, working memory, attention, and social communication. You may be thinking that you can’t “see” anxiety or intelligence. This is true. What you can see are behaviors, responses, and performances. A construct helps us make sense of those observable behaviors.
- Clinical Example: We cannot “see” working memory, but we can observe how many digits an individual is able to repeat or how long they can hold information briefly.
- The task must be standardized. If it is not, the resulting numbers do not carry the same meaning and we cannot reasonably interpret them. Standardization means that everyone must be given:
- the same instructions
- the same materials
- the same time limits
- the same scoring criteria
- Responses must be scored systematically: the same performance must receive the same score regardless of who does the scoring. This requires clear criteria for:
- what counts as correct
- what counts as incorrect
- basal and discontinuation rules
- Basal Rule: This is the point in an assessment below which we assume an examinee would answer all easier items correctly. Because test items are arranged from easier to harder, establishing a basal lets us avoid unnecessary testing. For example, if an assessment defines the basal as three consecutive correct items and an individual answers items 10, 11, and 12 correctly, we assume they would also answer items 1-9 correctly and give them credit accordingly.
- Discontinuation Rule: This is sometimes called the “ceiling”. It is the point in an assessment where we stop administering items because the examinee is consistently missing them. If someone demonstrates repeated difficulty, we assume harder items would also be answered incorrectly. Why do we have a ceiling? If an individual misses several items in a row, continuing is unlikely to provide additional meaningful information and may increase the individual’s frustration. For example, if an assessment defines the ceiling as four consecutive misses and the individual misses items 12, 13, 14, and 15, testing stops because it is assumed that more difficult items would also be missed.
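The basal and discontinuation rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `raw_score`, the run lengths (three correct for a basal, four misses for a ceiling), and the response pattern are hypothetical, and real assessments define their own rules in the test manual.

```python
def raw_score(responses, basal_run=3, ceiling_run=4):
    """Apply simplified basal and ceiling rules to a list of responses.

    responses: (item_number, is_correct) pairs in administration order.
    Returns (raw_score, basal_item, ceiling_item).
    The run lengths are assumptions for illustration, not a published rule.
    """
    administered = []
    basal_item = None     # first item of the run that establishes the basal
    ceiling_item = None   # last item of the run that ends testing
    correct_streak = 0
    miss_streak = 0

    for item, is_correct in responses:
        administered.append((item, is_correct))
        if is_correct:
            correct_streak += 1
            miss_streak = 0
        else:
            miss_streak += 1
            correct_streak = 0

        if basal_item is None and correct_streak == basal_run:
            basal_item = item - basal_run + 1
        if miss_streak == ceiling_run:
            ceiling_item = item
            break  # discontinue: harder items are assumed to be missed

    if basal_item is None:
        # No basal established: count only what was actually answered correctly.
        return sum(ok for _, ok in administered), None, ceiling_item

    credited = basal_item - 1  # every easier item below the basal gets credit
    earned = sum(ok for item, ok in administered if item >= basal_item)
    return credited + earned, basal_item, ceiling_item


# Hypothetical administration that starts at item 10.
example = [(10, True), (11, True), (12, True), (13, True), (14, False),
           (15, True), (16, False), (17, False), (18, False), (19, False),
           (20, True)]  # item 20 is never reached
print(raw_score(example))
```

In this made-up case the basal is established at item 10, so items 1-9 are credited, and testing stops after the fourth consecutive miss at item 19.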
- Raw scores must be interpreted using norms. A raw score is simply the number of items an individual gets correct. By themselves, these numbers are meaningless. If an individual answers 24 items correctly on a subtest, what does that information really tell us? How many questions were there? Who are we comparing the individual to (that is, what is the population)? At what age? This is where norms come in. Norms are data collected from a large group of people that allow us to compare one person’s performance to others. Norms are important because:
- Norms provide context or meaning to scores. For example, they allow us to answer whether a score is typical for someone that age.
- Norms turn numbers into information. They allow us to determine if a score is typical, above average, below average, or how far from average it is. With norms, we can identify patterns.
- Norms allow for comparisons with other people who are a similar age or share other characteristics. Assessment is not just about describing performance. It is about determining whether performance is “typical” or “atypical”. This can only be determined through comparison. This creates clinical consistency and reduces bias.
- Norms help us to identify an individual’s strengths and weaknesses by helping us see what stands out about their performance.
- Norms prevent clinicians from over- or underinterpreting raw numbers.
- Norms provide a reference point for clinical decision making because they create consistency. This guides diagnoses, eligibility decisions, and intervention planning.
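To make the role of norms concrete, here is a minimal sketch of how the same raw score of 24 can mean different things at different ages. Everything in it is an assumption for illustration: the norm values are invented, the scores are assumed to be roughly normally distributed in each age group, and the standard-score scale (mean 100, standard deviation 15) is one common convention rather than a rule for every test.

```python
import math

# Hypothetical norms (made-up values, not from any real test).
# Each age maps to the mean and standard deviation of raw scores
# in an imagined normative sample.
HYPOTHETICAL_NORMS = {
    8: {"mean": 20.0, "sd": 4.0},
    12: {"mean": 28.0, "sd": 4.0},
}


def interpret_raw_score(raw, age):
    """Compare a raw score with the hypothetical norms for one age group."""
    norm = HYPOTHETICAL_NORMS[age]
    # z-score: how many standard deviations the raw score is from the mean.
    z = (raw - norm["mean"]) / norm["sd"]
    # Standard score on an assumed scale with mean 100 and SD 15.
    standard_score = 100 + 15 * z
    # Percentile rank, assuming a normal distribution of scores.
    percentile = 50 * (1 + math.erf(z / math.sqrt(2)))
    return {
        "z": z,
        "standard_score": standard_score,
        "percentile": round(percentile, 1),
    }


for age in (8, 12):
    print(age, interpret_raw_score(24, age))
```

With these invented norms, a raw score of 24 is above average for the 8-year-old group and below average for the 12-year-old group, which is the kind of context a raw score alone cannot provide.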
- Measurement assumes order. This means that, in most cases, higher scores reflect more of the construct. For example, if someone scores higher on a vocabulary test than the norms for their population and age, we assume that they have stronger vocabulary knowledge.
- Measurement includes error (variation in human performance). In psychological assessment, no score is perfectly precise. In other words, a test score is an estimate and not a perfect reflection. Error exists because psychological traits are not fixed mechanical qualities! There is fluctuation even under standardized conditions. No two performances are identical. Measurement error is not a flaw; it is a feature of measuring human behavior.
- Example: An individual may earn slightly different scores on an assessment on different days. Did their knowledge or ability change? That is unlikely. Small differences like the following are more likely, and they are what introduce measurement error:
- they were tired the next day
- they hesitated longer
- they were cold
- they were hungry
- they misunderstood an item
- they were less confident
- they were less emotionally regulated
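The idea that a score is an estimate can be shown with a toy example in which each day's score is a stable level plus a small amount of error. All of the numbers are hypothetical, and in real testing the underlying level is never known directly. It is fixed here only to show how scores can move while ability stays the same.

```python
import statistics

# Hypothetical stable level of the construct (unknown in real testing).
TRUE_SCORE = 100

# Made-up day-to-day error from fatigue, hunger, hesitation, and so on.
daily_error = {"Mon": -2, "Tue": 1, "Wed": 0, "Thu": 3, "Fri": -2}

# Each observed score is the stable level plus that day's error.
observed = {day: TRUE_SCORE + err for day, err in daily_error.items()}

average = statistics.mean(list(observed.values()))
spread = max(observed.values()) - min(observed.values())

print(observed)
print("average:", average, "range:", spread)
```

The observed scores differ from day to day even though the underlying level never changes, which is why a single score is best read as an estimate.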
