Friday, May 20, 2011

You're just a number: introduction to the h-index

Measuring a single scientist's output has always been problematic. Why? First, for the statistics to be reliable, the scientist has to produce a considerable publication output and get cited. That takes time. Second, measures like research productivity, number of publications and citations don't always correlate. Measuring the output of journals and universities has been far more reliable than measuring that of one person.

Proposed by physicist Jorge Hirsch in 2005, the h-index offers an attractive way of quantifying one's scientific output as a single number. The index is defined as:

“A scientist has index h if h of his or her Np papers have at least h citations each and the other (Np − h) papers have ≤ h citations each” (Hirsch, 2005).

So, if ten of a scientist's papers have each been cited at least ten times, and none of her other papers has more than ten citations, her h-index is ten. A zero h-index, on the other hand, says that the scientist may have published papers, but none of them has been cited yet.
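The definition can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `h_index` and the citation counts in `papers` are invented for the example. It ranks the papers from most to least cited and finds the last rank at which the paper still has at least as many citations as its rank.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank  # the paper at this rank still "supports" an index of `rank`
        else:
            break     # every later paper has even fewer citations
    return h


# Hypothetical citation counts for twelve papers (made up for illustration).
papers = [25, 18, 12, 10, 10, 10, 10, 10, 10, 10, 3, 1]
print(h_index(papers))     # 10
print(h_index([0, 0, 0]))  # 0: published, but not yet cited
```

Note that the two barely cited papers at the end of the list do not change the result, and neither would a single paper with thousands of citations.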

The h-index is attractive because it takes into account both the number of publications and the number of citations. It isn't fazed by "one hit wonders", but favors a body of work in which each component has at least a certain impact (citations).

Problems and disadvantages

Which database to use? Different databases cover different journals, conferences, etc. Web of Science, for example, has better coverage of STEM than of the humanities, which tend to publish books rather than papers. Using Google Scholar will likely inflate the h-index.

Which field are you in? Larger fields mean a larger potential for citations, resulting in a higher h-index.

You aren't a number! (Or at least, not just *one* number). Reducing scientists to a single number ignores other factors, such as their teaching skills and ability to collaborate. Can an entire career really be described as a single number?

The age factor: The older a scientist gets, the longer she has had to publish and get cited. Younger scientists are at a disadvantage with the h-index.

Relevance: Since the h-index doesn't decrease, it can't tell whether a scientist is still active and/or whether her work is still relevant to others in her field.

Since the h-index is a single number, scientists with the same h-index can have very different numbers of papers and citations. In the following table, scientist A and scientist B have the same h-index, but scientist A has far more citations in the overall raw calculation.
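The same point can be shown with a short Python sketch. The citation counts below are invented for illustration and are not taken from any real scientist or from the table; the helper `h_index` is likewise a hypothetical name for a straightforward reading of Hirsch's definition.

```python
def h_index(citations):
    """Count the ranks at which a paper has at least as many citations as its rank."""
    ranked = sorted(citations, reverse=True)
    # Because the list is sorted in descending order, the condition holds for
    # ranks 1..h and fails for every rank after that, so the count equals h.
    return sum(1 for rank, cites in enumerate(ranked, start=1) if cites >= rank)


# Hypothetical citation counts per paper (assumed values).
scientist_a = [300, 150, 90, 40, 5, 2, 1]
scientist_b = [6, 6, 5, 5, 5, 1]

for name, papers in (("A", scientist_a), ("B", scientist_b)):
    print(name, "h-index:", h_index(papers), "total citations:", sum(papers))
# A h-index: 5 total citations: 588
# B h-index: 5 total citations: 28
```

Both invented profiles score five, even though one has roughly twenty times the citations of the other.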

Because of the h-index's many problems, proposing corrections to it, or coming up with other indices altogether, is the new official sport of bibliometricians. The new indices are supposed to offer a better way to make decisions about promotions and grants, but despite all the efforts, it seems that the way to the promised tenure will continue to be paved with peer evaluation.

Bornmann, L., & Daniel, H. (2007). What do we know about the h-index? Journal of the American Society for Information Science and Technology, 58 (9), 1381-1385. DOI: 10.1002/asi.20609

Bornmann, L., & Daniel, H. (2008). The state of h index research. Is the h index the ideal way to measure research performance? EMBO reports, 10 (1), 2-6. DOI: 10.1038/embor.2008.233

Hirsch, J. (2005). An index to quantify an individual's scientific research output. Proceedings of the National Academy of Sciences, 102 (46), 16569-16572. DOI: 10.1073/pnas.0507655102


  1. Another way to look at actual impact is to use the "Outcomes and Output" framework. This separates having a real effect on society from mere activity.
