09/20/2024

How Are AI Chatbots Changing Scientific Publishing?

17:28 minutes

A human hand writing on a paper, with a robot arm writing alongside it.
Made with elements from Canva and Shutterstock.

Since ChatGPT was released to the public almost two years ago, generative AI chatbots have had many impacts on our society: They played a large role in the recent Hollywood strikes, energy usage is spiking because of them, and they’re having a chilling effect on various writing-related industries.

But they’re also affecting the world of research papers and scientific publishing. They do offer some benefits, like making technical research papers easier to read, which could make research more accessible to the public and also greatly aid non-English-speaking researchers.

But AI chatbots also raise a host of new issues. Researchers estimate that a significant number of papers from the last couple of years were at least partially written by AI, and others suspect that chatbots are supercharging the production of fake research papers, which has led to thousands of paper retractions across major journals in recent years. Major scientific journals are struggling with how to set guidelines for generative AI use in research papers, given that so-called AI-writing detectors are not as accurate as they were once thought to be.

So what does the future of scientific publishing look like in a world where AI chatbots are a reality? And how does that affect the level of trust that the public has in science?

Ira Flatow sits down with Dr. Jessamy Bagenal, senior executive editor at The Lancet and adjunct professor at University of North Carolina at Chapel Hill, to talk about how generative AI is changing the way scientific papers are written, how it’s fueling the fake-paper industry, and how she thinks publishers should adjust their submission guidelines in response.



Segment Guests

Jessamy Bagenal

Dr. Jessamy Bagenal is senior executive editor at The Lancet and an adjunct professor at the University of North Carolina, Chapel Hill. She’s based in London, UK.

Segment Transcript

IRA FLATOW: This is Science Friday. I’m Ira Flatow. Since its debut almost two years ago, ChatGPT, along with other generative AI chatbots, has changed how we think about the role artificial intelligence plays in all walks of life, right? Just think about it. They played a huge role in last year’s Hollywood strikes. Teachers report more students using them to write essays. And they suck up a lot of electricity, which is prompting AI companies to find cheaper sources.

But there’s another part of our society where their effects are now coming into clearer view. I’m talking about research papers and scientific publishing. According to a researcher at University College London, approximately 1% of scientific articles published in 2023 might have used generative AI, meaning chatbots helped write the research papers. This might be appealing to some, but chatbots also pose existential threats to the industry.

So how are scientific journals navigating this new environment? Here to talk about the effects these chatbots are having on scientific publishing is my guest, Dr. Jessamy Bagenal, senior executive editor at The Lancet, physician, and adjunct professor at the University of North Carolina at Chapel Hill. Welcome to Science Friday.

JESSAMY BAGENAL: Hi. Thanks so much for having me on.

IRA FLATOW: You’re welcome. OK, tell us, where does this start for you? Tell us about the first time you saw an example or an article that showed the impact that AI chatbots could have on scientific publishing.

JESSAMY BAGENAL: Well, it’s been a couple of years now. But obviously, it’s a rapidly moving field. And I’ve come at it from an editor’s point of view, but also from a clinician’s point of view in how we think about evidence, and how we think about knowledge. I followed the story very closely after its initial launch. And I think some of the things that I found most striking were those original small studies where, for example, trained researchers would look at abstracts that had been written by a researcher, and abstracts that had been generated by a generative AI large language model. And for the most part, they couldn’t tell the difference.

I think that study came about perhaps, you know, within the first six months of ChatGPT first being sort of released onto the market. And it was very clear that if experienced researchers aren’t able to tell the difference between a generative AI abstract and one that has been written by their colleagues, then this is a really big problem for us.

IRA FLATOW: Is that because they lack effective tools to detect AI-generated content?

JESSAMY BAGENAL: I mean, that’s right to an extent. We don’t have any effective tools that will reliably and sensitively pick up when generative AI has been used. But I was sitting on a panel recently– this is a huge discussion within the field. And in fact, my colleague, who’s our deputy editor, made a joke the other day that– we’d had an agenda item at a meeting for generative AI, and she said, we always know that we need at least 45 minutes to discuss anything about generative AI.

Because editors, researchers, we’re very alive to this topic in thinking about the best way that we can use it all the time. But a colleague of mine on this panel was saying that we’re in the business of text. We’re in the business of language. And now we have this amazing tool which can generate language. But there’s actually no part of our value chain that might not be disrupted by this innovation. Peer review is done, for the most part, through the written word. Articles are still written in a way that hasn’t changed for a very long time.

All of this is based on text, on language. And so there’s no part of our work that could not be disrupted by this new tool.

IRA FLATOW: Interesting. Now, when you say disruption– and I hear you speaking about this as being negative.

JESSAMY BAGENAL: No. I don’t think it’s negative.

IRA FLATOW: No, you don’t?

JESSAMY BAGENAL: I don’t think it’s just negative. But it has to be thoughtfully and sensitively implemented. And that’s challenging, because it’s a very rapidly moving field. And we’re all just getting up to speed with how people are using it and what appropriate use looks like. So for example, at The Lancet, we implemented a new tick box about six months ago where we ask authors at the submission stage, have you used generative AI in any part of this study? And if they tick yes, then we ask them how.

And then in our editorial manager, we have a little sort of red A which appears to alert editors to the fact that generative AI has been used in some way in this manuscript. And then we’re able to follow external, widespread policies on how generative AI should be acknowledged in a manuscript. But these things are changing all the time. And I think there’s a huge opportunity for generative AI to be a great positive influence on scientific publishing.

But there are also dangers. And so it has to be very carefully thought about.
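To make that disclosure workflow concrete, here is a minimal sketch of how a submission system might record the tick box, the author’s explanation, and the editor-facing flag. The `Submission` class and its field names are hypothetical illustrations, not The Lancet’s or its editorial manager software’s actual implementation.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Submission:
    """Hypothetical manuscript submission record (illustration only)."""
    title: str
    used_generative_ai: bool = False          # the tick box at the submission stage
    ai_use_description: Optional[str] = None  # free text: "how was it used?"

    def editor_flags(self) -> list[str]:
        """Alerts surfaced to handling editors, analogous to the 'red A' described above."""
        if self.used_generative_ai:
            return ["AI use declared: " + (self.ai_use_description or "no detail given")]
        return []

# Example: an author declares that a chatbot was used for grammar only.
paper = Submission(
    title="Example manuscript",
    used_generative_ai=True,
    ai_use_description="Grammar and spelling assistance only",
)
print(paper.editor_flags())
```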

IRA FLATOW: I understand that. I have been reading research papers for decades, and I’m always struck by how poorly some of them are written. Not that the data is bad. And I’m thinking, hey, if you unleash AI on this, maybe you can get a better-written narrative going here. Would that be some positive way it could be used?

JESSAMY BAGENAL: Definitely, that’s one positive. And I think from an inclusion and diversity point of view, we still transmit so much knowledge through the English language, which excludes an enormous number of people because English is not their first language. We’re very lucky at The Lancet to have a very large team of internal assistant editors who edit everything into Lancet style. But that’s not the same across the scientific publishing field. In many journals, there isn’t that internal expertise.

And so actually, if you submit something which isn’t written particularly well, then of course, that will impact the likelihood of whether that editor decides to send it out or not. Because they might have problems understanding what the actual research is saying. So I think there’s an enormous opportunity there to make it more inclusive– make scientific publishing more inclusive and fairer. And I think also when we’re thinking about people who might be neurodivergent.

I’ve got lots of clinician friends who are dyslexic, and actually being able to use large language models to help them structure sentences and articles is a very efficient way for them to articulate themselves and their ideas in what most people consider a sort of socially acceptable manner.

IRA FLATOW: Yeah, because most people think of AI as cheating. But you’re not talking about that here. And you’re pointing out the positive aspects. And when you talk to scientists who do use generative AI, what do they say about using, like, ChatGPT to help write their research papers?

JESSAMY BAGENAL: I think scientists and clinicians across the world are, for the most part, doing amazing work under incredibly stressful situations. And they’re often overloaded with work. Their to-do lists are extraordinarily long. And so having ChatGPT as an efficiency tool can allow them to put together an article very quickly, or might allow them to write a cover letter in a more compelling manner.

And from our point of view, from a scientific publisher’s point of view, generative AI might be able to make our submission process easier for authors, and allow us, as editors, to interact with them in a kind of more slick and easy fashion. That type of efficiency could have real benefits to patients, and to people’s lives, and to scientific progress.

IRA FLATOW: I know that since ChatGPT came out, the major journals have provided some guidelines and policies to researchers about the use of generative AI in papers. But it also seems like a pretty messy landscape right now. I mean, are these guidelines standard across all the journals and research papers?

JESSAMY BAGENAL: Well, we obviously have external bodies which bring together a number of different journals. So for instance, the ICMJE, which sort of brings together lots of medical editors and journals. They have published guidance. And so any journals that sort of sign on to them also tend to take on some of their guidance for generative AI. And equally, organizations like COPE, which help with editorial guidance for journals, also have their own sort of set of guidelines.

So I think it’s right that there are these external benchmarking places which are releasing loose guidance. But obviously, each journal is different. Each journal has a different topic. They have different article types. They have different things that they’re trying to do with those journals. And let’s not forget that journals are actually very human endeavors. They are how we, as humans, interpret scientific progress. And how we put it into context. And what that means for either patients, or for science, and for that field.

And so I also think that it’s right that each journal should be very clear on how they want generative AI to be used. So for example, at The Lancet, we have a section which includes commentaries, correspondence, perspectives, the Art of Medicine. And this is an area of our journal which really requires human interpretation. And so we’re in the process, at the moment, of thinking about the fact that we would like to limit the use of generative AI for this section.

Because we feel passionately about human ingenuity, and putting things into context, and being able to see what’s new– not just trawling through what’s on the internet and putting together what sounds good about a particular topic, but actually expertise, experience, and vision. We are thinking about limiting the use of generative AI in that section to just using it for English grammar and spelling, so that we’re not excluding people who don’t speak English.

IRA FLATOW: Right. Interesting point. Of course, the elephant in the room here is paper mills. And I’m not talking about factories that make paper. Can you explain what those are, and why they’re such a big issue?

JESSAMY BAGENAL: Yeah. So paper mills are sort of nefarious organizations that essentially have understood the scientific publishing landscape and are gaming it, and selling authorship for manuscripts that often are filled with nonsense. And so you may all have heard of mass retractions– of different publishers having to retract articles that essentially were not based in any scientific fact, and were not really science, but often complete nonsense. And they’re a huge problem.

They’re a problem for publishers. But in the wider context, they’re a problem for science because this really breaks down the trust.

IRA FLATOW: We’re talking about phony papers here, right?

JESSAMY BAGENAL: We’re talking about phony papers. So it could literally be a manuscript about complete nonsense where the results are fabricated, the context is fabricated. And authorship is sold to academics for these papers. And so they’ve kind of got into the editorial process by perhaps having guest editors. They’ve manipulated the peer review process. And they’re an enormous problem for the scientific publishing world.

And so there’s a real question there as to, in the context of generative AI, how as editors do we make sure that what we’re reading is real?

IRA FLATOW: Yeah. And this may sound like a crazy question, but why not use– if they’re writing it with ChatGPT or AI, why not use ChatGPT or AI to find them, to weed out some of these papers?

JESSAMY BAGENAL: That’s exactly right. This is a big data problem. And I think Elsevier, which is the company that owns The Lancet, is putting an enormous amount of resources and effort into thinking about research misconduct and research integrity in the context of big data. How can we use some of these patterns across many, many different papers to be able to pick out what’s real and what’s not real?

But in the larger context, in some ways, generative AI will sort of turbocharge that. Because you’re able to very quickly put together a nonsense manuscript that looks and sounds like it should be published, but actually might be about nothing. On the other hand, the paper mill business model is based on people paying for authorship. And actually, if people at home on their own can put together this type of paper, why would they pay a paper mill to do it for them? They might just do it themselves.

So I think this is a huge problem. And one that I know a lot of people are thinking about very seriously.
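As one concrete illustration of the kind of cross-paper pattern analysis mentioned above: paper-mill output often reuses templated wording, so unusually similar text across unrelated submissions is one signal an integrity team might screen for. The sketch below is an assumed, simplified approach using off-the-shelf TF-IDF similarity; it is not Elsevier’s or The Lancet’s actual tooling, and the manuscript snippets are invented.

```python
# A minimal sketch of one possible "big data" integrity signal: flagging pairs of
# submissions whose text is suspiciously similar (templated paper-mill output).
# Illustrative assumption only; not Elsevier's or The Lancet's actual tooling.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

manuscripts = {
    "ms-001": "The role of factor A in tumor progression was assessed in 40 patients ...",
    "ms-002": "The role of factor B in tumor progression was assessed in 40 patients ...",
    "ms-003": "We surveyed 1,200 nurses about burnout during night shifts ...",
}

ids = list(manuscripts)
tfidf = TfidfVectorizer().fit_transform(manuscripts.values())
similarity = cosine_similarity(tfidf)  # pairwise cosine similarity matrix

THRESHOLD = 0.8  # arbitrary cutoff, chosen only for illustration
for i in range(len(ids)):
    for j in range(i + 1, len(ids)):
        if similarity[i, j] > THRESHOLD:
            print(f"Flag for manual review: {ids[i]} / {ids[j]} (similarity {similarity[i, j]:.2f})")
```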

IRA FLATOW: This is Science Friday from WNYC Studios. And some of the potential solutions here, can you offer any?

JESSAMY BAGENAL: I mean, I think they lie in sort of big and small changes. And for the most part, they’re probably going to be pretty costly and difficult. I think a major step is to recognize that over the past decade, two decades, we’ve had a major trend towards the open science movement, which nobody can disagree with from an ethical or moral standpoint. We all want science to be accessible and available to everybody.

But in reality, what that’s meant, from a scientific publishing business model point of view, is that authors pay to get their articles published open access. And so there has been a focus on quantity over quality. And I think that that’s rapidly adjusting and changing. And many scientific publishers are changing the way that they’re thinking about that. So we need some better business models. We need some other ways of thinking about open access in the context of generative AI.

And then I think another major step is thinking about the environment within which we all work. There is a serious problem with academic environments, which often reward the quantity that an academic has published over the quality of what they’ve published.

IRA FLATOW: Publish or perish.

JESSAMY BAGENAL: Yeah, exactly. Publish or perish. So there’s an incentive there to publish, publish, publish, regardless of whether it might constitute a bit of research waste in terms of, does this question really need to be answered? Has it already been answered? But then on the other end of the scale, there is an incentive there to try and get things published which aren’t necessarily adding to human health or to scientific progress.

IRA FLATOW: Dr. Bagenal, you write about the steps necessary to take, and where you think this might be going. How well is this progressing? Or how well are we getting toward the goals that you talk about?

JESSAMY BAGENAL: I think when there’s been any huge innovation in technology, there’s always a bit of a policy gap between people trying to catch up with what’s happened and create policies and ways of working which will adapt to this huge new innovation. And that’s certainly what we’re seeing now. It’s been a couple of years since ChatGPT. And only really now, I think, are journals and editors really getting up to speed with the types of things that we might need.

But also, because large language models are incredible tools with the ability to improve all the time– and we’ve seen that with the versions that have already come out. Each time, there’s an improvement in how they are performing– we need to be very flexible and adaptable to those different changes, because we might start seeing more hallucinations. There was a very interesting paper in Nature a couple of months ago about the fact that large language models are– when they come to the end of what’s already been published, what’s already on the internet, how do they get new data?

And what happens if you use synthetic data? And actually, for the most part, it looked like those models almost completely fell apart. They stopped being able to work. So there are lots of issues that are going to become clear over the coming months that we’ll need to be very alive to and be able to adapt to. But at the moment, I think we are– certainly at The Lancet, and I know many other journals– spending an awful lot of time thinking about this.

We are implementing practical, tangible policies which are meant to be able to improve the process for authors. But also to make the content that we publish very high quality and very useful and usable for our readers.

IRA FLATOW: So you have to keep up with it. As ChatGPT gets better, you have to–

JESSAMY BAGENAL: We have to get better. Exactly. Exactly. We must.

IRA FLATOW: Yeah, very interesting stuff. We’re going to keep track of all of this. Thank you very much for taking time to be with us today, Dr. Bagenal.

JESSAMY BAGENAL: No problem. It was lovely to chat to you.

IRA FLATOW: Dr. Jessamy Bagenal, senior executive editor at The Lancet, and adjunct professor at the University of North Carolina at Chapel Hill.

Copyright © 2024 Science Friday Initiative. All rights reserved. Science Friday transcripts are produced on a tight deadline by 3Play Media. Fidelity to the original aired/published audio or video file might vary, and text might be updated or amended in the future. For the authoritative record of Science Friday’s programming, please visit the original aired/published recording. For terms of use and more information, visit our policies pages at http://www.sciencefriday.com/about/policies/

Meet the Producers and Host

About D Peterschmidt

D Peterschmidt is a producer, host of the podcast Universe of Art, and composes music for Science Friday’s podcasts. Their D&D character is a clumsy bard named Chip Chap Chopman.

About Ira Flatow

Ira Flatow is the host and executive producer of Science Friday. His green thumb has revived many an office plant at death’s door.
