03/21/2025

‘Delving’ Into The ‘Realm’ Of AI Word Choice

12:16 minutes

A computer with melting pixels on its screen. Behind it is unreadable text with highlighted words "delve" "realm" "intricate" and "underscore."
Image made with elements from Shutterstock and Canva.

Several years ago, some eagle-eyed readers of scientific papers noticed an unusual trend—an increase in the number of abstracts using certain words. The terms, including “delve,” “realm,” “evolving landscape,” and more, were suddenly appearing more often than they used to.

Researchers analyzed the abstracts and compared them to abstracts written just a few years earlier, before AI chatbots built on large language models became widely available. They concluded that AI-written abstracts were more likely than ordinary human writing to use words from a list of around 20 favorites. The question was, why? If the models were trained on conventional writing, how did a preference for words such as “delve” creep in?

Host Flora Lichtman talks with Dr. Tom Juzek and Dr. Zina Ward of Florida State University, who set out to try to understand the origins of some of AI’s favorite words.


Further Reading


Our AI For STEM Education collection includes an activity to help kids investigate simple chatbots and gain insights into the inner workings of large language models as well as machine learning and artificial intelligence. Resources at the bottom of the activity cover the history of chatbots, ethical considerations, and limitations of AI.

Segment Guests

Tom Juzek

Dr. Tom Juzek is an assistant professor of computational linguistics in the Department of Modern Languages and Linguistics at Florida State University in Tallahassee, Florida.

Zina Ward

Dr. Zina Ward is an assistant professor in the Department of Philosophy at Florida State University in Tallahassee, Florida.

Segment Transcript

FLORA LICHTMAN: This is Science Friday. I’m Flora Lichtman. Later in the hour, new cosmology results raise more questions about dark energy. But first, you know how when you read something you can sometimes get a picture of the person who wrote it?

Like, if I start talking about touching base to synergize after we circle back, first of all, please call for an intervention, but second of all, you might reasonably suspect that someone from the C-suite got ahold of this script. Or if I talk about giving 110% or winners don’t quit, you might clock me as a sports person.

Well, in recent years, people started noticing that AI chatbots’ large language models had their own linguistic tells, certain words that seemed to crop up a lot more often in AI sentences than in human English, words like “delve” and “realm.” And it wasn’t subtle. It was a big, sudden shift in word usage.

Joining me now to talk about that are two researchers who decided to delve into the realm of AI linguistics. Tom Juzek is an assistant professor of computational linguistics in the Department of Modern Languages and Linguistics, and Zina Ward is an assistant professor in the Department of Philosophy, both at Florida State University in Tallahassee. Hello to you both. Welcome to Science Friday.

ZINA WARD: Hi. Thanks for having us.

TOM JUZEK: Yes, thanks for having us.

FLORA LICHTMAN: Tom, I have to know more about AI vernacular. What are some of the words that AI loves to overuse?

TOM JUZEK: Some of the words that we found– and it’s quite a list– is the more obvious ones, like “delve,” “garnered,” “realm,” “an evolving landscape.”

FLORA LICHTMAN: “Evolving landscape” is one of them?

TOM JUZEK: I’ve seen that a lot, so I think that’s a candidate. And the more subtle ones are something like “underscore,” “emphasize,” and “showcasing.”

FLORA LICHTMAN: How did you figure this out?

TOM JUZEK: It started with a discussion on social media, that people noticed that certain words, especially “delve,” really suddenly popped up everywhere, especially in scientific writing and education. And so people started to look into these more systematically. But then what interests us from the beginning was the why is that happening?

FLORA LICHTMAN: Why does this happen?

ZINA WARD: So our hypothesis, which we investigated with a couple of different experiments, is that the models prefer these words because they were trained on human feedback where the humans exhibited a preference for those words. So that’s the thing we wanted to test, the idea we wanted to test.

And in short, what we did is ran a couple of online studies where we asked participants to judge the quality of abstracts with and without those words, and we found in the most recent run of the study that participants do exhibit a very slight preference for abstracts with words like “showcase” and “underscore” and “comprehend.”

And so the models are trained to produce responses that human evaluators like. The models learn from human feedback in a certain stage of their training. And so if these human evaluators are exhibiting that preference, the models are going to pick up on that and reproduce those words more frequently. So that’s what we think is happening.

FLORA LICHTMAN: That’s so fascinating. To me, “delve” and “evolving landscape” really feel like words I might hear in a Silicon Valley conference pod or whatever. Is that a coincidence? Is that who’s training these models?

ZINA WARD: Well, we don’t know for sure– tech companies don’t publish the details of their training procedures– but it seems likely that the workers in general are employees in the Global South. The tech companies often outsource that kind of labor to countries where wages are lower.

And so those workers are seeing potential outputs from the models and providing their feedback on which ones are higher quality, and then the models are being trained to reproduce those sorts of responses. So it’s really probably the preferences of these employees in the Global South that are filtering through and shaping the lexical choices of these large language models.

FLORA LICHTMAN: Yeah, that’s fascinating. Tom, we’ve seen linguistic fads before. I’m old enough to remember the meteoric rise of “touch base.” How is what you observed here different?

TOM JUZEK: Rather rapid language change is also nothing new. “Touch base–” I looked up some Millennial-speak– “awesome,” “cool–” what they exhibit when you look at the data over the time is that they have a period of several years of pretty steep rise, almost like an S-curve. They start to rise. Then they have this period of really quickly gaining traction, and then they plateau out.

Sometimes we do see a sudden spike, which is when a real world event occurred that motivates that spike, concretely SARS or Omicron. Now what is different, the now-observed changes, is that there is the sudden spike, almost like a step function, with an abrupt breakthrough. But there’s no real-world motivation why it should be “delve” or “intricate” or “nuanced,” and that is different.

FLORA LICHTMAN: You mean like it almost feels random?

TOM JUZEK: Yeah. It’s like someone pulled a switch, and here’s “delve.” Here’s “intricate.”

FLORA LICHTMAN: That seems only possible using AI because humans maybe don’t change as fast.

TOM JUZEK: Yes, and that is one of the areas where still a lot of research is directed to, the question, do these words really sit in our language systems now? Is there “delve” or “intricate” in the abstract of a scientific article or a student essay because people have this in their language system and this is what they put down? Or is it actually tool usage, AI usage?

And so far, this is under research. It’s a hypothesis one way or the other, and I think what we will be seeing sooner or later is where people look at spontaneous spoken language. And I think chances are they will find some traces of these AI words having entered the human language system.

FLORA LICHTMAN: Well, I’m really interested in this. We talk a lot about us training AI, but is AI training us? Is there evidence that these AIs are changing the way that we talk and we write and we think?

TOM JUZEK: Yeah, the state of the research is that this is an hypothesis. And the key here is to look at spoken language. Most research has focused on written language. And when it comes to spoken language, I think it’s fair to say we do not see these abrupt changes that we’ve seen in written texts.

But now there is first research coming out. There’s an analysis of YouTube talks, which is still scripted spoken language, where there has been an uptake of these AI-induced words. And there is research in the making– I have a student working on this– that when it comes to semispontaneous spoken language, conversational language, there seems to be an uptick, but that seems to be more mixed.

FLORA LICHTMAN: That, to me, seems really profound.

ZINA WARD: Absolutely. And it’s interesting, too, that to the extent that these models are changing our language and to the extent that their preferences about words are shaped by these workers in the Global South that we’ve talked about, it’s a really interesting inversion of the usual way that linguistic change happens.

So linguists and historians of language have studied over time how changes in language in the Western industrialized world trickle outward, especially in this sort of globalized culture and economy, to everywhere, to other English-speaking parts of the world if we’re looking at English in particular.

And what’s interesting about these linguistic changes is that if our language really is being shaped by LLMs and LLMs’ language is in turn being shaped by employees of tech companies in the Global South, there’s a kind of a reversal of the usual direction of influence.

TOM JUZEK: And if I may come in on this, there are two sides to this. One is today’s produced data will become tomorrow’s training data for these models. So there’s a chance that we will see a loop and all this accelerating.

And then the other side– there is what one could call the creep-in factor. We observe AI influences human language to quite some degree, but it might well be that when we reflect on it we might come to the conclusion this is not something we actually want. And so some of that language that now enters our language system does so beyond the level of perception. And items like “underscore,” “multifaceted,” or “necessitate–” they are less discussed.

But, of course, that links to the discourse of a more concerning version of this creep-in factor, which is, say, undesired political beliefs that then gradually seep into our belief system. And I think this is why the research to try and identify where this model behavior comes from matters to a good degree.

FLORA LICHTMAN: Can these words be used as a diagnostic, like as a way to tell that something was written by AI?

ZINA WARD: Not with high confidence. As you probably know, AI detection is a– really, it’s a cat-and-mouse game, and it’s really quite difficult. So we’re, I think, a little bit pessimistic about the potential to use this for diagnostics, but it certainly raises eyebrows, I think, for us when we see these words. It raises questions, even if it doesn’t provide a conclusive answer.

FLORA LICHTMAN: Zina, you’re a philosopher. Why does this interest you?

ZINA WARD: I’m a philosopher of science, and so these models just fascinate me. I think, like many people, I’m amazed by how well they work, and it’s all the more amazing given that we don’t really understand why they work as well as they do.

And so it’s also a kind of intellectual challenge trying to figure out how they work because it’s not like you can just pop the hood and look at the word choice module. They’re really complex systems, so you have to be quite creative in how you probe them to try to figure out what they’re doing. So yeah, for me that’s why it’s of interest. It’s a complex system that really requires some ingenuity to understand.

FLORA LICHTMAN: And Tom, what about you?

TOM JUZEK: From a linguistic perspective, what we’re seeing right now is almost unprecedented language change. The shifts in word usage that we’re observing over such a short period of time– that’s really remarkable. I say, “almost,” because we’ve seen in history– we’ve seen technology influencing language. I’m thinking like the printing press, the internet, social media.

But what we’re observing right now is quite something. We could be entering a period of rapid language change. So what we’re seeing is not just a year or two a few dozen words, but really this continues for an extended period of time. This is one possibility.

The other possibility is that now we had two or three years of rapid language change, and it will kind of fade out. And we don’t know yet what is going to happen. We’ll find out. But there is a chance that we will see considerable language change over an extended period of time.

FLORA LICHTMAN: I want to thank you both for joining me today to talk about this.

ZINA WARD: Thanks for having us.

TOM JUZEK: Thank you so much.

FLORA LICHTMAN: Tom Juzek is an assistant professor of computational linguistics in the Department of Modern Languages and Linguistics, and Zina Ward is an assistant professor in the Department of Philosophy, both at Florida State University in Tallahassee.

Copyright © 2025 Science Friday Initiative. All rights reserved. Science Friday transcripts are produced on a tight deadline by 3Play Media. Fidelity to the original aired/published audio or video file might vary, and text might be updated or amended in the future. For the authoritative record of Science Friday’s programming, please visit the original aired/published recording. For terms of use and more information, visit our policies pages at http://www.sciencefriday.com/about/policies/

Meet the Producers and Host

About Charles Bergquist

As Science Friday’s director and senior producer, Charles Bergquist channels the chaos of a live production studio into something sounding like a radio program. Favorite topics include planetary sciences, chemistry, materials, and shiny things with blinking lights.

About Flora Lichtman

Flora Lichtman is a host of Science Friday. In a previous life, she lived on a research ship where aperitivi were served on the top deck, hoisted there via pulley by the ship’s chef.
