Can the journal impact factor truly evaluate the value of scientific research?

In this blog post, we examine the problems with scientific research evaluation methods centered on the journal impact factor and consider whether they accurately reflect the true value of research.


In the scientific community, researchers’ competence and the significance of their papers are commonly evaluated using the “journal impact factor.” In recent years, however, criticism of this practice has been growing. In May 2013, over a hundred science and technology researchers gathered in San Francisco to announce the “San Francisco Declaration on Research Assessment,” and tens of thousands of researchers have since joined the movement. They point out several critical flaws in evaluation based on the journal impact factor. In this article, I will examine those specific criticisms and argue that the impact factor cannot serve as a standard for evaluating scientific research.
The journal impact factor is a numerical measure of a journal’s influence. It was originally designed as a metric for librarians, who needed to weigh the relative importance of candidate journals when deciding which ones the library should subscribe to. The calculation is simple. Suppose a journal titled *Science and Technology* published a total of 20 papers over the past two years, and those papers were cited a combined 200 times; its impact factor is then 200/20, or 10. In other words, an impact factor of 10 means that papers published in *Science and Technology* over the past two years were cited 10 times each on average. The journal impact factor is thus a number that quantitatively expresses the importance of a journal.
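As a quick illustration, here is a minimal sketch of that arithmetic in Python. All figures are the hypothetical ones from the example above, and the calculation is the simplified two-year average described in this post rather than the exact definition used by citation databases.

```python
def impact_factor(total_citations: int, total_papers: int) -> float:
    """Simplified impact factor: average citations per paper
    across a journal's publications over the past two years."""
    return total_citations / total_papers

# Hypothetical figures from the example: 20 papers published in
# "Science and Technology" over two years, cited 200 times in total.
print(impact_factor(total_citations=200, total_papers=20))  # 10.0
```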
The problem, however, is that the journal impact factor is often applied directly to evaluate the importance of individual papers: the journal’s impact factor simply becomes the influence score of every paper it publishes. For example, if *Science and Technology* has an impact factor of 10 and *Monthly Engineering* has an impact factor of 90, then every paper published in *Science and Technology* is scored at 10 points, while every paper in *Monthly Engineering* is scored at 90 points. These scores are naturally passed on to the papers’ authors as well. Consequently, Researcher A, who published in *Science and Technology*, earns 10 points, while Researcher B, who published in *Monthly Engineering*, earns 90 points. Regardless of the researchers’ competence or the originality of their papers, an 80-point gap opens up between A and B.
The first problem with the journal impact factor is the “statistical trap.” Because the impact factor is an average, citation counts can vary widely even among papers published in the same journal, and the influence of an individual paper is not simply proportional to the impact factor of the journal that carried it. For example, the journal that published A’s paper has an impact factor of 10, but A’s paper itself may have been cited dozens or even hundreds of times; the average may have fallen to 10 simply because the other papers in *Science and Technology* were cited very little. Conversely, B’s paper may have been cited only once or twice, yet it enjoys a spillover effect: *Monthly Engineering*’s impact factor jumped to 90 because its other papers were cited very frequently. In such a situation, it is meaningless to compare A’s and B’s research by journal impact factor; it is more reasonable to compare the citation counts of the two papers directly and conclude that A’s paper has had far greater impact. According to the San Francisco Declaration, roughly 25% of the papers in a journal account for 90% of its total citations. In other words, if the journal impact factor is applied directly to individual papers, some papers will inevitably be undervalued while others benefit unfairly.
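To make the trap concrete, here is a small sketch with invented per-paper citation counts. Every number below is hypothetical, chosen only so that the averages match the impact factors of 10 and 90 in the example above.

```python
from statistics import mean

# "Science and Technology": A's paper (first entry) has 150 citations,
# but many low-cited papers pull the journal average down to 10.
science_and_technology = [150, 2, 1, 3, 0, 2, 1, 4, 2, 1,
                          2, 3, 1, 0, 2, 5, 7, 8, 4, 2]

# "Monthly Engineering": B's paper (first entry) has only 2 citations,
# but a few blockbuster papers push the journal average up to 90.
monthly_engineering = [2, 600, 500, 400, 200, 98] + [0] * 14

print(mean(science_and_technology))   # 10  -> journal impact factor
print(mean(monthly_engineering))      # 90  -> journal impact factor
print(science_and_technology[0])      # 150 -> A's actual citation count
print(monthly_engineering[0])         # 2   -> B's actual citation count
```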
Second, evaluation based on the journal impact factor fails to reflect the distinctive characteristics of each discipline. In fields such as medicine or biology, once a theory is proposed, numerous experiments are conducted to verify it, and a single paper routinely triggers follow-up studies and clinical trials; the impact factors of biology and medical journals are therefore inevitably higher than those in other natural sciences or in engineering. In pure mathematics, by contrast, a single paper is typically self-contained and requires no follow-up research or experiments, so mathematics papers tend to gather few citations and the impact factors of pure mathematics journals are inevitably lower. Likewise, highly specialized fields with few researchers produce relatively few citations, while journals in large, active fields naturally accumulate more. Evaluation by journal impact factor is thus limited in that it cannot account for fundamental differences rooted in the nature of each discipline.
The third issue is an adverse side effect of the journal impact factor: the concentration of papers in a small number of popular journals. Researchers preparing a submission naturally hope to see their work published in world-renowned journals such as *Cell*, *Nature*, and *Science*, because these journals have the highest impact factors. If this concentration becomes excessive, getting published may come to matter more to researchers than the research itself. A culture that recognizes only papers in top-tier journals has quietly become a global phenomenon. If this “luxury brand mentality,” which undervalues the diligent research process and rewards only visible results, persists, it could distort the very essence of science. Furthermore, some journals encourage “self-citation,” that is, citations of papers published in their own pages, in order to inflate their impact factors. Evaluation based on journal impact factors is thus giving rise to unreasonable and unethical competition.
Of course, journal impact factors are not without their merits. The reason they are widely used is their ability to facilitate quick and convenient evaluation of researchers. The editors of each journal serve as “expert evaluators.” They swiftly identify important and noteworthy research amidst the flood of research outputs. In today’s rapidly changing and expanding scientific community, this is an undeniable benefit.
However, as we have seen, critical flaws lurk behind this convenience. The first is the statistical trap: the journal impact factor can diverge sharply from the number of citations an individual paper actually receives. The second is that the impact factor fails to account for the characteristics of individual disciplines; in some fields, citation counts bear little relation to a paper’s influence. The final issue is that this evaluation method fosters wasteful competition within the scientific community, for instance by concentrating publications in a handful of prominent journals.
To overcome these problems and keep science progressing properly, the scientific community is now searching for new evaluation criteria. The simplest remedy is to factor the citation counts of individual papers and researchers into the evaluation, which directly addresses the statistical trap. Another approach uses an adjusted index that reflects the characteristics of each discipline: a journal’s impact factor is divided by the average citation count of the top 20% of journals in its field, normalizing the figure against the field’s own baseline. A more fundamental alternative is to revitalize peer review and develop qualitative evaluation criteria rather than relying on quantitative metrics such as impact factors or citation counts. What the scientific community needs now is self-reflection and open communication aimed at devising rational evaluation methods.
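Here is a minimal sketch of that field-normalized adjustment, under the assumption that a journal’s impact factor can stand in for its average citation count; the two fields and all the numbers below are invented for illustration.

```python
def adjusted_index(journal_if: float, field_ifs: list[float]) -> float:
    """Divide a journal's impact factor by the average impact factor
    of the top 20% of journals in its field (at least one journal)."""
    top = sorted(field_ifs, reverse=True)
    top_20_percent = top[:max(1, len(top) // 5)]
    return journal_if / (sum(top_20_percent) / len(top_20_percent))

# Hypothetical fields: medicine has high raw impact factors,
# pure mathematics much lower ones.
medicine_ifs = [60, 50, 40, 30, 20, 10, 8, 5, 3, 2]
math_ifs = [3.0, 2.5, 2.0, 1.5, 1.0, 0.8, 0.5, 0.4, 0.3, 0.2]

# After normalization, a strong journal in each field scores comparably,
# even though their raw impact factors differ by a factor of 20.
print(adjusted_index(50, medicine_ifs))  # 50 / mean([60, 50]) ~= 0.91
print(adjusted_index(2.5, math_ifs))     # 2.5 / mean([3.0, 2.5]) ~= 0.91
```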


About the author


I'm a "Cat Detective." I help reunite lost cats with their families.
I recharge over a cup of café latte, enjoy walking and traveling, and expand my thoughts through writing. By observing the world closely and following my intellectual curiosity as a blog writer, I hope my words can offer help and comfort to others.