Are AI Research Tools Trustworthy?
Curious about research tools with artificial intelligence built-in? Wondering if they are trustworthy?
The short answer is that it depends. The structure of the tool, the scope of the data it’s pulling from, and its connectivity with other tools together determine a tool’s usefulness and credibility.
And while there are dozens of these tools making the rounds in research circles, I'll outline and share my thoughts on just two here, both of which are research article citation-mining tools.
Citation-Mining Tools
Presenting and linking the citation relationships between research articles isn’t totally new; you may have noticed the “cited by” links on Google Scholar or the related articles featured on PubMed article pages. Newer tools like Research Rabbit, Elicit, and Scite package these relationships in more visually appealing interfaces, and typically add other features like dashboards, tracking, and plug-ins.
What makes these tools different from older citation-based tools is their use of a kind of AI called retrieval-augmented generation (RAG). Unlike large language models, which function like auto-complete and cannot fact-check, RAG pulls out the snippets of an article surrounding a citation, allowing users to visually scan actual excerpts of articles where another source is cited and mentioned. As for which articles Scite is accessing, preliminary anecdotal testing suggests its article database is similar to the major academic databases: “Its coverage and search results are very comparable to those of Web of Science and Scopus, the other two cited reference search tools most common in academic library collections, and also Google Scholar” (Fry 2023).
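To make that distinction a bit more concrete, here is a minimal, hypothetical sketch (in Python) of the retrieval step such a tool performs: it looks up stored citation-context snippets relevant to a query and surfaces them verbatim rather than generating text from scratch. The snippet corpus, article names, and crude word-overlap scoring below are invented purely for illustration; real tools index millions of passages and use far more sophisticated ranking.

```python
# Illustrative sketch of the retrieval step behind a RAG-style citation tool.
# All data and scoring here are toy placeholders, not how Scite actually works.
from collections import Counter

# Toy "index" of citation-context snippets: the sentence surrounding a
# citation in some hypothetical citing article.
CITATION_SNIPPETS = [
    {"citing": "Smith 2021", "cited": "Doe 2019",
     "text": "Consistent with Doe (2019), we observed improved recall in the intervention group."},
    {"citing": "Lee 2022", "cited": "Doe 2019",
     "text": "Unlike Doe (2019), our sample showed no effect of dosage on retention."},
    {"citing": "Patel 2023", "cited": "Kim 2020",
     "text": "We follow the survey protocol described by Kim (2020)."},
]

def score(query: str, text: str) -> int:
    """Crude relevance score: count of shared lowercase words."""
    q, t = Counter(query.lower().split()), Counter(text.lower().split())
    return sum((q & t).values())

def retrieve(query: str, k: int = 2):
    """Return the k snippets most relevant to the query."""
    ranked = sorted(CITATION_SNIPPETS, key=lambda s: score(query, s["text"]), reverse=True)
    return ranked[:k]

if __name__ == "__main__":
    # In a full RAG pipeline these retrieved excerpts would be handed to a
    # language model as context; here we just print them, since the checkable
    # excerpts themselves are the part readers can actually verify.
    for s in retrieve("effect of the intervention reported by Doe 2019"):
        print(f'{s["citing"]} citing {s["cited"]}: {s["text"]}')
```

The point of the sketch is simply that the retrieved excerpts exist independently of the model and can be read and checked by the user, which is what separates this approach from a purely generative chat answer.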
Some tools, like Scite, additionally code these citation relationships as “mentioning”, “contrasting”, or “supporting”, though this functionality seems untrustworthy: a recent study found the AI’s coding of citation relationships to be less accurate than human coding (Bakker et al. 2023).
Scite (to which UND subscribes, thanks to the Chester Fritz Library!) also includes a companion AI “assistant” chat, which uses a large language model alongside RAG to provide “summaries” of literature and bodies of research. Chat-based research tools are typically murkier than those that just pull and structure citations, and in my own testing of Scite, the AI assistant gave mixed results. Being presented with a single-page summary of the current state of research on a topic is appealing, especially when you can automatically re-format it as a table. However, I was disconcerted by how the tool reframes user queries as assertions, which it then supports with what appear to be cherry-picked sources. Experienced researchers may be able to pick the useful nuggets out of these summaries, but I worry that those less familiar with a specific body of research, or with the formal conventions of scholarly communication, may take the AI’s summaries as confirmation of their hypotheses without looking further for other sources.
Research Rabbit is a much simpler tool than Scite or Elicit: it mainly visualizes citation and author relationships, allowing the user to burrow endlessly into the intersections of research studies. It does offer dashboard features for saving articles and searches, and (at least as of this writing) it contains no LLM-based chat features.
These citation-based research tools may be particularly useful for research into interdisciplinary topics, where keyword searching alone can be thorny, especially for beginning researchers. Research Rabbit in particular offers a simplified tool without problematic LLMs, and could be used to support instruction in the social nature of scholarly research writing.
I believe these tools qualify as “useful”; however, using them well requires critical engagement with, and a basic understanding of, the different AI models at their base (LLMs and RAG), so that users can evaluate the tools’ outputs and make informed decisions about which information is trustworthy and which is not. Whether these tools are trustworthy depends on what we expect them to do, and whether our expectations are reasonable.
If you’re still on the fence, check out the recording of the recent SMHS TLAS workshop, “Leveraging Artificial Intelligence for Research and Educational Scholarship”, featuring Instructional Designer Andrea Guthridge and Research & Education Librarian Devon Olson discussing the above tools and others.
Summaries
Scite
- Scite is a UND-supported, subscription-based citation-mining research tool
- Uses retrieval-augmented generation (RAG) AI alongside large language model (LLM) AI
- AI assistant is less trustworthy than citation search feature
- Generally less user-friendly for beginning researchers
- Offers plug-ins for browsers, Zotero, etc.
- Can export results in various formats (RIS, CSV, etc.)
- Data set pulls from millions of articles with DOIs via agreements with many major publishers, as well as harvesting from open databases like Unpaywall and PubMed (see the sketch after this list)
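For the curious, here is a rough, hypothetical illustration of what harvesting metadata from an open database can look like, using Unpaywall’s public REST API. This is not Scite’s actual pipeline; the DOI and email are placeholders, and field names should be checked against Unpaywall’s current documentation before relying on them.

```python
# Illustrative sketch only: fetch open-access metadata for one DOI from the
# Unpaywall public API. Placeholder DOI and email; not Scite's pipeline.
import requests

def lookup_doi(doi: str, email: str) -> dict:
    """Return Unpaywall's JSON record for a single DOI."""
    url = f"https://api.unpaywall.org/v2/{doi}"
    resp = requests.get(url, params={"email": email}, timeout=10)
    resp.raise_for_status()
    return resp.json()

if __name__ == "__main__":
    record = lookup_doi("10.1000/example-doi", "you@example.edu")  # placeholders
    print(record.get("title"))
    print("Open access?", record.get("is_oa"))
```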
Take-home
Surfacing citation excerpts for readers to scan quickly is a new and possibly very useful feature, while other aspects of the tool, like the LLM-based chat assistant, may artificially limit and misrepresent a body of research, which could be especially confusing and harmful for beginning researchers.
Research Rabbit
- Research Rabbit is a free tool
- Appealing visualizations of citation relationships
- Offers dashboards
- States that it pulls articles from PubMed and Semantic Scholar
- Can export collections of papers in BibTeX or RIS format
Take-home
Generally as useful as Google Scholar; could be helpful to teach beginning researchers about the social, networked nature of research communications, especially given its appealing, graphic interface.