iThenticate recently released an infographic headed “iThenticate Plagiarism Graphic” which, they claim, “illustrates the growing problem of plagiarism and other forms of misconduct in research, as well as the types of damages that are incurred by misconduct.” The page is an “extension” of a recent iThenticate study “True Costs of Research Misconduct.”
The factoids are startling, as of course they are meant to be, and unlike some of the iThenticate group’s earlier infographics, the sources behind the claims are documented. One reason for documenting sources is, of course, to allow the interested reader to follow up, to read more.
Two factoids in the infographic caught my eye. In a section headed “Types of damage from deceptive research” we see:
CAPITAL COSTS : $525,000 : cost of a single investigation into research malpractice in U.S.
$110 MILLION : total cost of investigations into research misconduct in U.S. in 2010.
Both statements are attributed to a PLOS Medicine paper, The Costs and Underappreciated Consequences of Research Misconduct: A Case Study by Michalek et al.
Michalek and his team attempted to quantify the costs, hidden and actual, in investigating a charge of academic misconduct. The hidden costs can be incalculable, including loss of reputation for the sponsors of the research, for the institution at which the misconduct took place including possible loss of future grants, for the journal publishing the research, for those trying to follow up and use the findings of the research, for people directly affected by poorly and wrongly researched data (including medical patients), and more.
Actual and calculable costs can include time taken by those involved in the investigations, stationery and material costs, correspondence costs, consultancy and legal fees, storage costs and more. Michalek and his team produce a formula which attempts to enable calculation of the direct and actual costs.
To illustrate their work, they use a real (but unidentified) case study:
An allegation of research misconduct … made against a senior scientist for enhancing and fabricating images and data contained in a federal grant application.
They make a number of assumptions about the time involved and the various costs, and they admit to it. Technological aspects of the investigation “cost an estimated $10,000.00.” The Investigation Committee held ten meetings to review the evidence, interviews were held, they spent “well over 100 hours in meetings … and an estimated 700 hours outside of committee.” Time had to be spent reviewing the miscreant’s earlier work.
Michalek and team “estimate that the direct cost of this case approached $525,000.” The paper suggests figures for the indirect and hidden costs as well.
In their conclusion, Michalek and team
… conservatively estimate that if one were to apply our observed costs to all of the allegations of misconduct reported in the United States to the ORI (n = 217 cases) in their last reporting year, the direct costs would exceed $110 million.
As noted, these investigators make a number of assumptions in their basic calculation, assumptions they admit: note how many times they use the word “estimate”. They then assume that this case and its costs are typical, that this one case study provides an average cost, and that if one multiplies the cost of this case by the 217 allegations of misconduct, the total cost of the investigations is $113,925,000. And yes, this is in excess of $110 million.
That is a lot of assumptions, a lot of IFs: that the case is typical and representative, that the cost of this one case is typical, and that all those allegations of misconduct were upheld.
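The extrapolation is simple enough to check for oneself. The following is only a rough sketch in Python: the two input figures come from Michalek et al., while the function name and the alternative cost shares in the loop are my own hypothetical illustrations, not anything the paper proposes.

```python
# Figures quoted from Michalek et al. (2010).
COST_OF_CASE_STUDY = 525_000   # estimated direct cost of the single case, in USD
ALLEGATIONS_REPORTED = 217     # allegations reported to the ORI in one year


def extrapolate(cost_per_case, cases):
    """Total cost if every case is assumed to cost exactly the same."""
    return cost_per_case * cases


headline = extrapolate(COST_OF_CASE_STUDY, ALLEGATIONS_REPORTED)
print(f"Case-study cost applied to every allegation: ${headline:,}")

# HYPOTHETICAL shares, chosen only to show how far the total moves
# if the one case study is not a typical case.
for share in (0.25, 0.5, 1.0):
    assumed_average = int(COST_OF_CASE_STUDY * share)
    total = extrapolate(assumed_average, ALLEGATIONS_REPORTED)
    print(f"Average of ${assumed_average:,} per case gives ${total:,}")
```

The first line printed reproduces the $113,925,000 figure. The loop simply shows that the total scales directly with whatever average cost one assumes, which is why so much hangs on a single case being typical.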
But my quibble is not with Michalek and co, but with iThenticate.
iThenticate appears to make the further assumption – or perhaps just hopes that the reader of the infographic will assume – that all cases of plagiarism in academic research likewise cost an average $525,000. Michalek’s case, remember, was a matter of fabrication of data and results, not of plagiarism.
And then, iThenticate takes Michalek’s assumptions and estimates and extrapolation, and reports them as fact. Michalek’s estimate
“We conservatively estimate that if one were to apply our observed costs to all of the allegations of misconduct reported in the United States to the ORI (n = 217 cases) in their last reporting year, the direct costs would exceed $110 million.”
becomes iThenticate’s fact
“$110 MILLION : total cost of investigations into research misconduct in U.S. in 2010.”
Somehow, this seems similar to a tactic used by the sensational press: take an isolated incident or instance (true or not), claim that it is typical, and then cry, Something must be done.
The iThenticate infographic finishes with graphics and figures which show
(1) that more and more publishers are using some form of text-matching software (and iThenticate claims to have the largest database against which to check), and
(2) that iThenticate checks 2.3 million manuscripts a year, and that in the previous 18 months, iThenticate identified 10+ million matches.
I am not sure what point iThenticate is trying to make with the first of these graphics. As for the second, the time periods are different, making calculation difficult, but this suggests an average of well below five matches per manuscript. (Fewer than five matches per manuscript? Is that all?)
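For anyone who wants to put the two time periods on the same footing, here is a back-of-envelope sketch in Python. It rests on two assumptions of mine: that manuscripts were checked at a steady 2.3 million a year throughout the 18 months, and that “10+ million” is taken at its lower bound of 10 million. The function name is hypothetical.

```python
# Figures as given in the iThenticate infographic.
MANUSCRIPTS_PER_YEAR = 2_300_000
MATCHES_IN_PERIOD = 10_000_000   # "10+ million": ASSUMED lower bound
PERIOD_MONTHS = 18


def matches_per_manuscript(matches, manuscripts_per_year, months):
    """Average matches per manuscript, ASSUMING a steady checking rate."""
    manuscripts_in_period = manuscripts_per_year * months / 12
    return matches / manuscripts_in_period


average = matches_per_manuscript(
    MATCHES_IN_PERIOD, MANUSCRIPTS_PER_YEAR, PERIOD_MONTHS
)
print(f"About {average:.1f} matches per manuscript")  # About 2.9 matches per manuscript
```

On those assumptions the average comes out at roughly three matches per manuscript. If the true number of matches is well above 10 million, the average rises accordingly.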
Given that a match does not necessarily indicate or prove plagiarism, these particular factoids seem somewhat … inconclusive? Something does not follow. Non sequitur, throughout. Sleight-of-hand. But then, perhaps, that’s just typical… ?
iThenticate (2012). Costs of Research Misconduct: iThenticate Plagiarism Infographic. iParadigms. Retrieved 20 May 2013 from http://www.ithenticate.com/research-misconduct-infographic/
Michalek AM, Hutson AD, Wicher CP, Trump DL (2010) The Costs and Underappreciated Consequences of Research Misconduct: A Case Study. PLoS Med 7(8): e1000318. doi:10.1371/journal.pmed.1000318.