Is the difference-in-differences method the most appropriate way to evaluate policy effects when experiments are not feasible?

In this blog post, we examine the reliability and limitations of the difference-in-differences method, which is widely used to evaluate policy effects in situations where experiments are difficult to conduct.

 

In economics, there are many instances where the effects of a policy must be evaluated to facilitate evidence-based policy discussions. This process is crucial for solving social and economic problems and demonstrating the validity of policies. In particular, it is essential to clearly determine whether the introduction of an economic policy or social program has actually yielded positive effects or caused unintended side effects. Evaluating the effect of a policy involves comparing the outcomes after the policy was implemented with the outcomes that would have occurred had the policy not been implemented. This comparison serves as essential information for policymakers, informing future policy design and ultimately contributing to the enhancement of overall social welfare.
However, since hypothetical outcomes cannot be observed, the effect of an event is evaluated by comparing the outcomes of a treatment group—composed of samples that experienced the event—with those of a control group—composed of samples that did not experience the event. The composition of the control and treatment groups is a critical factor determining the accuracy of the evaluation. If the two groups differ in factors other than the event itself, these differences can distort the evaluation results. Therefore, the key to this process is to form two groups for which there is no reason for the results to differ other than the event itself. For example, when evaluating the effect of an event on wages, the groups should be formed such that, in the absence of the event, the average wages of the treatment group and the comparison group would necessarily be the same. To achieve this, an experimental design in which samples are randomly assigned to the two groups is ideal. However, this method is often inapplicable when dealing with human subjects or social issues.
Due to these difficulties, quasi-experimental methods are frequently used in situations where experimental methods are not feasible. Among them, the difference-in-differences (DID) method is one of the most widely used. It evaluates the effect of an event by subtracting the change observed in the comparison group from the change observed in the treatment group. This evaluation rests on the parallel trends assumption, which posits that, even in the absence of the event, the treatment group would have changed by the same amount as the comparison group. If this assumption holds, it is not necessary for the pre-event conditions of the two groups to be the same on average.
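The mechanics are simple enough to sketch in a few lines. The wage figures below are invented purely for illustration; the estimator itself is just the treatment group's change minus the comparison group's change:

```python
def did_estimate(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """DID effect: (change in treatment group) minus (change in comparison group)."""
    return (treat_post - treat_pre) - (ctrl_post - ctrl_pre)

# Hypothetical average wages before and after a policy:
# the treatment group rises from 100 to 115, the comparison group from 98 to 105.
effect = did_estimate(treat_pre=100, treat_post=115, ctrl_pre=98, ctrl_post=105)
print(effect)  # (115 - 100) - (105 - 98) = 8
```

Under parallel trends, the comparison group's change (7) stands in for what the treatment group would have experienced anyway, so the remaining 8 is attributed to the policy.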
The usefulness of the difference-in-differences method is recognized not only in economics but also across the social sciences. Historically, it is known to have been first used by John Snow in 1854. He focused on residents of the same area of London who were supplied with water by two different water companies. Of the two companies, which had originally drawn from the same water source, only one changed its source, and residents were unaware of which company supplied their water. Snow compared changes in cholera mortality rates before and after the change between residents whose water source had changed and those whose source had not, and concluded that cholera was transmitted through water rather than through the air. This demonstrates that the difference-in-differences method can serve as a powerful tool not only in economic analysis but also in other fields such as public health. In economics, the method was first used in the 1910s to assess the effects of introducing minimum wage laws.
However, when using the difference-in-differences method, it is essential to verify whether the underlying parallel trends assumption is satisfied. If it is not, applying the method will lead to an incorrect assessment of the treatment effect. For example, when evaluating the employment-increasing effect of a worker training program, the parallel trends assumption will not hold if the proportion of workers in industries experiencing a sharp decline in jobs is higher in the treatment group than in the control group. Nor does simply using the treatment group itself, observed in an earlier pre-event period, as the comparison group guarantee that the assumption is met, even though this maximizes the statistical similarity of the samples between groups. For outcomes sensitive to economic fluctuations, such as employment, the simultaneity of the changes being compared may matter more for satisfying this assumption than the statistical similarity of the samples.
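One common diagnostic is to compare the two groups' changes over periods before the event, where no treatment effect should exist. A minimal sketch of such a pre-trend check, using invented pre-event outcome series:

```python
def pretrend_gap(treat_series, ctrl_series):
    """Difference between the groups' changes over the pre-event period.
    A value near zero is consistent with parallel trends; a large gap
    warns that the DID estimate may be biased."""
    treat_change = treat_series[-1] - treat_series[0]
    ctrl_change = ctrl_series[-1] - ctrl_series[0]
    return treat_change - ctrl_change

# Hypothetical outcomes for three pre-event periods (t-2, t-1, t0):
treat_pre_trend = [95, 97, 100]
ctrl_pre_trend = [93, 95, 98]

print(pretrend_gap(treat_pre_trend, ctrl_pre_trend))  # (100 - 95) - (98 - 93) = 0
```

A gap of zero over the pre-event window does not prove the assumption holds after the event, but a clearly nonzero gap is strong evidence against it.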
To make applications of the difference-in-differences method more reliable, researchers should construct multiple comparison groups and verify that the evaluation results obtained from each are consistent. Furthermore, constructing comparison groups that closely resemble the treatment group across various characteristics reduces the likelihood that the parallel trends assumption is violated. These practices are particularly important in social science research, where experimental methods are difficult to apply.
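The multiple-comparison-group check amounts to repeating the same calculation against each candidate group and inspecting the spread of the estimates. A sketch with hypothetical region names and invented pre/post outcomes:

```python
# Hypothetical pre/post outcomes for one treatment group and
# several candidate comparison groups (all numbers invented).
treat_pre, treat_post = 100, 115
comparison_groups = {
    "region_A": (98, 105),
    "region_B": (101, 109),
    "region_C": (97, 103),
}

# One DID estimate per comparison group.
estimates = {
    name: (treat_post - treat_pre) - (post - pre)
    for name, (pre, post) in comparison_groups.items()
}
print(estimates)  # {'region_A': 8, 'region_B': 7, 'region_C': 9}
```

Here the estimates cluster between 7 and 9, which supports the evaluation; widely scattered estimates would instead suggest that at least some comparison groups violate parallel trends.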
The difference-in-differences method is a powerful analytical tool that can be utilized in various fields, including not only policy impact evaluation but also the assessment of corporate management strategies and the analysis of educational program effectiveness. However, before applying it, it is important to carefully examine the validity of the parallel trends assumption and, if necessary, use it in conjunction with other complementary methods.

 

About the author

Writer

I'm a "Cat Detective." I help reunite lost cats with their families.
I recharge over a cup of café latte, enjoy walking and traveling, and expand my thoughts through writing. By observing the world closely and following my intellectual curiosity as a blog writer, I hope my words can offer help and comfort to others.