How to use CoCites — the basics

A radically different search method

The default method of searching the scientific literature is entering keywords in a literature database. The quality and relevance of the keywords determine the quality and relevance of the results. Entering a few best-guess keywords will return relevant articles, but they are likely not all there is and may not be a ‘representative’ selection. Researchers who don’t want to miss any relevant studies write extensive search queries for multiple literature databases that may catch all relevant articles, along with many irrelevant ones. This strategy, the one researchers prefer, is rather inefficient.

CoCites searches the scientific literature radically differently. The tool finds related articles using co-citations. The start of the search isn’t a set of keywords, but a set of one or more articles that is exactly what the search should find more of.

Searching co-citations and citations for one or more articles

The co-citations are the articles that are cited together with any of the articles in the query set, by experts on the topic. A co-citation search can only find articles that have been cited, so it falls short on finding articles with few citations and newly published articles. Because these articles themselves cite multiple articles on the same topic, we can find them by tracking the citations to all articles in the query set. A citation search will find these articles when the query set contains enough articles about the specific topic.
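To make the idea concrete, a co-citation count can be sketched in a few lines of Python. This only illustrates the principle and is not how CoCites is implemented: the function name `co_citation_counts`, the article IDs, and the toy reference lists are all made up. The sketch assumes that each citing article counts once, no matter how many query articles it cites, and it leaves the query articles themselves out of the results.

```python
from collections import Counter


def co_citation_counts(query_ids, reference_lists):
    """Count how often each article shares a reference list with a query article.

    reference_lists maps a citing article's ID to the IDs it cites.
    (Hypothetical data structure, chosen for this illustration.)
    """
    query = set(query_ids)
    counts = Counter()
    for refs in reference_lists.values():
        refs = set(refs)
        if refs & query:  # this citing article cites at least one query article
            counts.update(refs - query)  # everything else on its list is co-cited
    return counts


# Toy data: three citing articles and their reference lists.
reference_lists = {
    "citing_1": {"Q", "A", "B"},
    "citing_2": {"Q", "A"},
    "citing_3": {"A", "C"},  # does not cite Q, so it is ignored
}

counts = co_citation_counts(["Q"], reference_lists)
print(counts.most_common())  # [('A', 2), ('B', 1)]
```

Article C never appears in the results because it is not cited together with the query article, even though it is cited together with A.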

This blog summarizes how to perform the co-citation and the citation searches using CoCites for one article and for a set of articles.

Finding the co-citations of a single article using the browser extension

The CoCites browser extension (www.cocites.com) embeds the CoCites search button in PubMed and Google Scholar. This button is blue or grey. The blue button, with the article’s number of citations, is an active link that opens in CoCites; the grey button is inactive, meaning the article is not (yet) in the citation database.

CoCites buttons in PubMed

The co-citation button shows the list of co-cited articles on the CoCites website. This is what you see there:

Search results in CoCites

1. The query or seed article. This example article has 23 citations.

2. The number of co-cited articles. CoCites can retrieve all co-cited articles or only those that are co-cited more than once. The latter is the default. In our pilot study, we found that 80% of the articles in the search results are co-cited only once and that these articles are unlikely to be relevant to the topic of interest.

3. The search results. These are the co-cited articles. The title links to the article information in PubMed for more details about the article, including the abstract.

4. a. Times cited. The number of citations for each article in the results. The number is a hyperlink that starts a co-citation search for the specific article.

b. Times co-cited. The number of times the article appears in the same reference list as the query article. The number is a link to the list of articles that cite both.

c. Similarity. This is what we’ve termed the relative co-citation index. It is the number of co-citations (6) divided by the lower of the two articles’ citation counts (23 vs 25), so the calculation is 6/min(23,25) = 6/23 ≈ 0.26. This index is informative when the query article is highly cited, as its top-ranked search results tend to be highly cited too. When these articles are not on the same niche topic, and are merely co-cited frequently because they are highly cited in the same field, the similarity score is lower. See this example about equipoise and the ethics of clinical research. The co-cited articles with high similarity scores are on equipoise; the others are not.
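The similarity calculation is simple enough to write out as a small function. This is a minimal sketch of the formula described above; the function name `similarity` and the handling of articles without citations are our own assumptions for the illustration.

```python
def similarity(times_co_cited, citations_a, citations_b):
    """Relative co-citation index: co-citations / min(citations of the two articles)."""
    lowest = min(citations_a, citations_b)
    if lowest == 0:
        # Assumption: an article without citations cannot be co-cited.
        return 0.0
    return times_co_cited / lowest


# The example from the text: 6 co-citations, 23 and 25 citations.
score = similarity(6, 23, 25)
print(f"{score:.2f}")  # 0.26
```

Dividing by the lower citation count means the score can reach 1 when the less-cited article is always cited together with the other one.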

If you have registered on the CoCites website, then the following options appear as well:

5. Export the search results. The results can be exported to a data file (Excel) or to citation managers such as EndNote, Zotero, Mendeley, and others.

6. Building query sets. The co-citation search can be performed for one article or for many on a similar topic. Relevant articles can be selected and added to the query set. That’s next.

Co-citation and citation searches using query sets

CoCites can search co-citations for one article or for many. When multiple articles on the same topic are combined in a query set, the co-citation search finds articles that are frequently co-cited with any of the articles in the set. A query set of multiple articles is more likely to find recent articles on the same topic. Researchers who publish a new randomized trial may cite one or more previous trials, but not all and not necessarily the same few. Building a query set is like using synonyms in keyword searches.

The citation search tracks the citations and references of all articles in the query set and ranks them in descending order of appearance. This search is ideal for finding articles that are too new to have been cited, and it works better when the query set contains more articles. As a rule of thumb we use a minimum of 25, but that will vary by field and topic. The validation study on the performance of CoCites provides some support for this number, but more research is needed.
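The ranking logic of a citation search can be sketched as follows. Again, this is an illustration under our own assumptions, not the CoCites implementation: `citation_search`, `cited_by`, `references`, and the toy IDs are hypothetical, ties are broken alphabetically only to keep the output stable, and the query articles themselves are left out of the ranking.

```python
from collections import Counter


def citation_search(query_ids, cited_by, references):
    """Rank articles by how many query articles they cite or are cited by.

    cited_by maps a query article to the articles that cite it;
    references maps a query article to the articles it cites.
    """
    query = set(query_ids)
    counts = Counter()
    for qid in query:
        linked = set(cited_by.get(qid, ())) | set(references.get(qid, ()))
        counts.update(linked - query)
    # Descending order of appearance; alphabetical order for ties.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Toy data for a query set of three articles.
cited_by = {"Q1": ["N1", "N2"], "Q2": ["N1"], "Q3": ["N1", "N2"]}
references = {"Q1": ["R1"], "Q2": ["R1", "Q1"], "Q3": []}

ranking = citation_search(["Q1", "Q2", "Q3"], cited_by, references)
print(ranking)  # [('N1', 3), ('N2', 2), ('R1', 2)]
```

N1 ranks first because it cites all three query articles, which is how a brand-new article without any citations of its own can still surface.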

These are the options for the co-citation and citation searches:

1. Limiting the citing articles. In co-citation searching, the co-cited articles of highly-cited query articles may dominate the search results. This may be okay when the study is exactly what the search should find more of, such as a famous randomized clinical trial, but it may make the search inefficient otherwise. Instead of excluding the article from the query set, you can limit the search to its most recent citations, e.g., a similar number as the other articles in the set. It is recommended to limit the citing articles per query article, not overall, so that the citations of one query article do not dominate the search results. The co-citation search is restricted to 1,000 citing articles (which is far more than you want to search).

2. Only retrieve articles (co-)cited > 1. By default, CoCites does not return articles that are (co-)cited once. About 80% of the co-citations are co-cited only once and these are rarely relevant for the specific topic. Instead of screening these articles for relevance, it is recommended to add more articles to the query set and re-run the search. Articles that were previously co-cited once are now flagged as newly found.

3. Only retrieve articles published before/after. We can specify a year to only retrieve older or newer articles. This option is handy for updating systematic reviews and meta-analyses where the interest is in finding articles published after the previous review date.

4. Flag or only show new articles. When the search is repeated, CoCites flags articles that are new compared to the last search. New articles are those that were not in the previous results or were co-cited only once. The new search can flag the results of a previous co-citation search, citation search, or both. All articles are retrieved and available for export, but only the new ones will be flagged or shown.

5. Flag or only show articles not in the query set. By default, CoCites flags articles that are not in the query set.

6. Filter by similarity score. As mentioned above, highly-cited articles are frequently cited together with other highly-cited articles in the same field, without really being on the same topic. These articles will have a low similarity score. The results can be filtered to exclude articles with a low similarity score.

7. View citing articles. If you perform a new co-citation search, this button shows the articles that are citing the articles in the query set. Their reference lists are used to retrieve the co-cited articles.
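Several of the options above are plain filters on the result list, which the following sketch tries to show. Everything in it is hypothetical: the `Result` fields, the function names `apply_options` and `flag_new`, and the example values are made up for illustration, and we assume that "published after" means strictly later than the given year.

```python
from dataclasses import dataclass


@dataclass
class Result:
    article_id: str
    times_co_cited: int
    year: int
    similarity: float


def apply_options(results, min_co_cited=2, published_after=None, min_similarity=0.0):
    """Keep results that pass the (co-)cited > 1, year, and similarity filters."""
    kept = []
    for r in results:
        if r.times_co_cited < min_co_cited:
            continue  # option 2: drop articles (co-)cited only once
        if published_after is not None and r.year <= published_after:
            continue  # option 3: only newer articles
        if r.similarity < min_similarity:
            continue  # option 6: drop low similarity scores
        kept.append(r)
    return kept


def flag_new(results, previous_ids):
    """Option 4: mark articles that were not in the previous (filtered) results."""
    return {r.article_id: r.article_id not in previous_ids for r in results}


results = [
    Result("A", 3, 2019, 0.40),
    Result("B", 1, 2021, 0.50),  # co-cited once
    Result("C", 4, 2015, 0.05),  # too old
    Result("D", 2, 2022, 0.30),
]

kept = apply_options(results, published_after=2016, min_similarity=0.1)
print([r.article_id for r in kept])  # ['A', 'D']
print(flag_new(kept, previous_ids={"A"}))  # {'A': False, 'D': True}
```

Because articles co-cited only once are filtered out before flagging, an article that moves from one co-citation to two between searches shows up as new, which matches the behavior described in option 4.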

Miscellaneous tips and comments

Query sets, broad and narrow. Researchers who work on a systematic review may build a query set of eligible studies. These can include studies that will be excluded from the review for methodological reasons, since they are still on the same topic. Such narrow query sets yield narrow results; broad ones may return a wide variety of studies. You get the latter when your query set includes reviews, perspectives, and methods papers.

More query articles isn’t necessarily better. The key to building a good query set is to think about what researchers may cite if they publish on the specific topic of your interest. Think of key and classic papers. Avoid generic articles about diagnostic criteria, study protocols, and statistical methods. These are likely cited in papers on your topic, but they are also cited in papers that have little to do with it.

Screen articles (co-)cited once? Our research (here, here, and coming soon) shows that relevant articles tend to be (co-)cited more than once. Screening the rest is inefficient. A preferred alternative is to screen the search results for relevant articles, add these to the query set, and perform the search again. Relevant articles that were (co-)cited once may rank higher when more query articles are added.

The database. CoCites uses the iCite database of NIH and is so far only available for health research. Let us know about other open access citation databases in which you would like to use citation searching.

This blog will be updated regularly. If you have suggestions or questions about CoCites, email cecile.janssens[@]cocites.com. For the latest news, follow me on Twitter.

Professor of epidemiology | Emory University, Atlanta USA | Writes about (genetic) prediction, critical thinking, evidence, and lack thereof.
