Much Ado About Nothing?
Posted by Robert Kaestner on 4/30/2020 3:58:39 PM
This article presents an empirical analysis of the effect of heat on PSAT test scores. It is a well-executed article that aptly applies the best-practices toolkit.
However, is it interesting? Economically meaningful?
The main result of the article is that a 1 degree increase in average annual (school year) temperature, which is a large change representing a 1 standard deviation change in temperature, is associated with a 0.002 change in the standard deviation of PSAT test scores. Similarly, an additional day above 90 degrees (one-third of a standard deviation change) is associated with a 0.0005 change in the standard deviation of PSAT test scores.
My reading of this evidence is that temperature (really, deviations from average temperature) has virtually no effect on PSAT test scores, despite the statistical significance of the estimate. The estimates are tiny. For example, a 6 standard deviation change (-3 to 3) in temperature would change PSAT test scores by 0.012 standard deviations. Even an extremely rare (never observed?) temperature change would barely budge PSAT test scores.
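The arithmetic behind that claim is easy to check. The sketch below simply scales the two reported coefficients linearly; the linear extrapolation, the function name, and the 18-day figure (6 standard deviations at roughly 3 hot days per standard deviation) are my own assumptions for illustration, not numbers taken from the article.

```python
# Coefficients as quoted in this comment (in PSAT standard deviations).
EFFECT_PER_SD_TEMP = 0.002   # per 1 SD (about 1 degree) of school-year temperature
EFFECT_PER_HOT_DAY = 0.0005  # per additional day above 90 degrees


def implied_score_change(effect_per_unit, units):
    """Linear extrapolation: effect per unit times number of units (an assumption)."""
    return effect_per_unit * units


# A swing from -3 SD to +3 SD in temperature is a 6 SD change.
six_sd_change = implied_score_change(EFFECT_PER_SD_TEMP, 6)

# One hot day is about one-third of an SD, so 6 SD is roughly 18 hot days.
eighteen_hot_days = implied_score_change(EFFECT_PER_HOT_DAY, 18)

print(f"6 SD temperature swing: {six_sd_change:.4f} SD of PSAT scores")
print(f"18 extra hot days:      {eighteen_hot_days:.4f} SD of PSAT scores")
```

Under either measure, the implied change stays near one-hundredth of a standard deviation of test scores.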
Of course, and I truly believe this, a null finding, which is how I interpret this estimate, is interesting if the alternative hypothesis is plausible. The article motivates the scientific plausibility of the research question as follows:
"Even moderately elevated temperatures can impair decision-making and cause substantial discomfort, and short-term impacts of heat on cognition have been extensively documented in laboratory settings (Mackworth 1946; Seppänen, Fisk, and Lei 2006). Hot classrooms may thus reduce the effectiveness of instructional time through physiological impacts on both students and teachers, making it harder for both to focus and accomplish a given set of learning tasks. In cases of extreme heat, schools may close or dismiss students early, directly reducing the amount of instructional time."
So two citations are sufficient? Here is part of an abstract from one of the studies in Seppänen, Fisk, and Lei 2006 (a review article that includes 4 classroom setting studies, although nothing like the current setting and analysis):
(From Allen et al. 1978) "65 male college students learned and later recalled a paired associate list (word-number pairs) in 1 of 5 air (dry bulb) temperatures (52, 62, 72, 82, or 92°F), with wet bulb temperature held constant. They learned and recalled best at 72°F, with performance declining at successively lower and higher air temperatures. In a 2nd experiment with 85 Ss, dry and wet bulb temperatures were varied from 52 to 82°F, in 5° increments, with relative humidity held constant. Other male students learned equally well in these effective temperatures."
This study points to an optimal temperature, with performance declining as temperature moves away from it in either direction; this is not consistent with the results reported here (or with the hypothesis).
A summary of a related article not cited (Wyon et al. 1981, Studies in Environmental Science) is as follows:
"This chapter discusses the effects of moderate heat stress on mental performance. A systematic relationship between heat stress and mental performance has been demonstrated at intervals of only 1 °C, despite the low numbers of subjects exposed to each temperature. Moderate heat stress, only a few degrees centigrade above the optimum, has a marked effect on mental performance when temperatures rise slowly. Tasks demanding concentration and clear thinking are adversely affected, but memory and reutilization can be improved by temperatures up to 26 °C, declining rapidly thereafter. In hot weather, concentrated work should, therefore, be carried out early in the day. Memory and tasks requiring an increased breadth of attention can with advantage be postponed until the temperature has risen, provided that it does not rise above 27 °C. The positive and negative effects of these moderate levels of heat stress are likely to be greater for men than for women."
Again, these results are not consistent with the hypothesis.
And from a meta-analysis not cited (Wargocki et al. 2019, Building and Environment):
"The relationship derived in the analysis shows that the performance of psychological tests and school tasks can be expected to increase on average by 20% if classroom temperatures are lowered from 30 °C to 20 °C and that the temperature for optimal performance is lower than 22 °C. The relationship is valid only for temperate climates."
It is interesting to note that the relationship reported in this study is similar to the estimate in the Park et al. article: a 1 degree Fahrenheit change is associated with a 1% change in cognitive test performance.
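The roughly 1% per degree Fahrenheit figure follows from the quoted 20% gain over a 10 °C drop, if one is willing to spread that gain evenly across the interval. That even spread is my simplifying assumption (the meta-analysis itself reports a fitted relationship), and the names below are illustrative only.

```python
def celsius_interval_to_fahrenheit(delta_c):
    """Convert a temperature difference (not a level) from Celsius to Fahrenheit."""
    return delta_c * 9.0 / 5.0


# Quoted in the meta-analysis abstract: 20% gain when going from 30 C to 20 C.
total_gain_pct = 20.0
delta_c = 30.0 - 20.0

delta_f = celsius_interval_to_fahrenheit(delta_c)  # 10 C is an 18 F interval
pct_per_degree_f = total_gain_pct / delta_f        # assumes the gain is spread evenly

print(f"{delta_c:.0f} C interval = {delta_f:.0f} F interval")
print(f"Implied change: about {pct_per_degree_f:.2f}% per degree F")
```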
The other citation is:
Mackworth, N.H. 1946. “Effects of Heat on Wireless Operators.” British Journal of Industrial Medicine 3 (3): 143–58
The relevance of this 1946 study is not obvious.
The upshot is that the authors of this article did not thoroughly assess the scientific plausibility of the research question. They justified the plausibility of the alternative hypothesis (heat matters) in a cursory fashion and relied on the econometric toolkit to deliver a "credible" answer. If they had done due diligence on the scientific plausibility, they might have expected the null result they found, or at least had a more informed road map for analysis.