2 min read

Checking Citations

If you conduct peer reviews of manuscripts for publication, you will do the author(s) a great favor by double checking their citations.

  • Is every work cited in the body of their text listed in the references section?
  • Is every work listed in the references section cited in the body of the text?

Checking for one-to-one correspondence manually is time consuming. That’s why it’s so common to find mismatches.

You can make the job a bit easier using the xCitations() function of the jvamisc package.

library(jvamisc)
xCitations
## function (file = NULL, txt = NULL) 
## {
##     if (is.null(txt)) 
##         txt <- readLines(file)
##     x1 <- unlist(str_extract_all(txt, pattern = "\\(.*?\\)"))
##     x <- unlist(str_extract_all(x1, "[A-Z].*?[0-9]{4}"))
##     return(sort(unique(x)))
## }
## <environment: namespace:jvamisc>

I based the function on code posted by Kay Cichini on 26 Mar 2012 on theBioBucket. I wanted something simple to serve my purposes, so I built the function to take either a text file path or object as input, and the function returns a sorted vector of the unique citations. This makes it easy for me to (yes, this part is still manual) compare them to the works listed in the references.

x <- paste("Yarmouth (1977) said something.",
 "Evidence of x (Barber and Jones 1991),",
 "y (House et al. 1982),",
 "and z (Smith 1990; Folger and Penn 2000).")
xCitations(txt=x)
## [1] "Barber and Jones 1991" "Folger and Penn 2000"  "House et al. 1982"    
## [4] "Smith 1990"

Note that this function does not capture citations in which the author is mentioned directly in the text and only the year is in parentheses, Yarmouth (1977) in the example.

Room for improvement:

  • My current approach is to copy and paste the contents of a manuscript into a text file (removing all images), then use xCitations() on the text file.
    • This works well, but it would be an improvement to read the manuscript document directly into R (typically a Word or PDF file).
  • My current approach is to manually compare the returned vector of citations with the list of references in the manuscript.
    • Fully automating this process would be an improvement.