SciFri Findings is a series that explores how we understand the impact of science journalism, media and programming on our audiences.
Getting feedback from those we serve has always been integral to Science Friday’s objective of making science more accessible. Audience research is no different. I am always curious to know: What value does our work provide our audiences? Where can we make things better? How can we have deeper engagement and impact? In June 2023 we launched an audience survey across multiple platforms (radio, social media, newsletters, donors, etc.). The survey was informed by in-depth interviews on radio programming conducted in Fall 2022. I waited excitedly as the survey went out into the world, hoping 200-300 people would care enough to complete it.
Imagine our joy and surprise as the numbers started trickling and then monsooning in: 1, then 100, 500, 2500, all the way up to over 6800! Our Director of Audience, Ariel Zych, and I started playing with the data. We wanted to use ChatGPT to theme and summarize some of the qualitative data, so we copied some open responses into ChatGPT and started noticing something odd here and there in the data we were pasting over.
There, in our own raw data, were a damning number of clearly AI-generated responses, shamelessly self-disclosing in answers to open preference questions with lines like “As an AI language model, I do not have personal preferences…” Scrolling further through the data, it became clear… AI bots had struck our study, HARD! We looked at each other and couldn’t help but laugh at the irony. Here we share some lessons learned (and some we had forgotten) after we wiped away our tears and started cleanup.
Tips for Your Next Online Survey
Use survey software that has a CAPTCHA: CAPTCHAs (“Completely Automated Public Turing test to tell Computers and Humans Apart”) are challenge questions we have all likely seen. They differentiate human respondents from bots. Many online survey software companies provide CAPTCHA options, but only with paid subscriptions.
No CAPTCHA? Trap ’em: You may not have the budget for licensing survey tools with a CAPTCHA feature. Trap questions are an alternative to a CAPTCHA that can provide some coverage against bots. They are typically used to identify respondents who are not paying attention to survey questions (e.g. someone choosing “Strongly Agree” for all questions). A trap question can take many forms, including a question asking respondents to identify an object in an embedded picture or a prompt to type specific words into a text box. Once the data is collected, you can filter out any respondents with incorrect answers. Trap questions protect not only against bots but also against bad actors, such as trolls with an agenda or people who don’t actually know the product or program but want to receive the cash incentive. By including a small number of trap questions, you can help ensure your target audiences are the ones providing you with good data, and eliminate the rest.
We incorporated a trap question during the design phase of our audience survey. Participants were asked to identify Science Friday’s host, Ira Flatow. Answer choices included only other male science journalists and communicators, so that every option was plausible and fewer bots and bad actors would end up in the data. We used this type of trap question because we wanted to survey existing audiences who should know the host, not new audiences. This one step eliminated almost 20% (N=1357) of our sample!
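The filtering step described above can be sketched in a few lines of Python. This is only an illustration, not the script we used: the field name `host_name`, the answer key, and the sample rows are all hypothetical, and a real survey export would need its own column names.

```python
# Hypothetical field name and answer key; adapt both to your survey export.
TRAP_FIELD = "host_name"
CORRECT_ANSWER = "Ira Flatow"


def split_by_trap(responses, field=TRAP_FIELD, correct=CORRECT_ANSWER):
    """Split response dicts into (passed, failed) based on the trap question."""
    passed, failed = [], []
    for row in responses:
        # Treat a missing answer as wrong; ignore case and stray whitespace.
        answer = (row.get(field) or "").strip().casefold()
        if answer == correct.casefold():
            passed.append(row)
        else:
            failed.append(row)
    return passed, failed


# Made-up rows for illustration only.
sample = [
    {"id": 1, "host_name": "Ira Flatow"},
    {"id": 2, "host_name": "Some Other Host"},
    {"id": 3, "host_name": " ira flatow "},
    {"id": 4, "host_name": None},
]

passed, failed = split_by_trap(sample)
print([row["id"] for row in passed])  # [1, 3]
print([row["id"] for row in failed])  # [2, 4]
```

Keeping the failed rows in their own list, rather than deleting them, makes it easy to report how much of the sample the trap question removed.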
More money, more bots, more problems: Cash is king in the survey world. Participants are often rewarded with cash or gift cards for each completed survey. Even the chance to win a lottery incentive has been shown to increase response rates for online research. We chose to provide a $50 e-gift card lottery incentive to balance the length of the survey and motivate more audience members to complete it. Money is great, but more money also gives bot creators, bad actors, and trolls a stronger incentive to participate for the cash alone. We quickly realized $50 was a lot to offer for a ~12-13 min survey. It made me think: How can I make sure to value the participants’ time while still getting the information I need? Next time, we will consider lowering the amount of our cash incentive. Perhaps it could have been limited to $25 instead? If this didn’t yield enough participants, maybe a second recruitment wave would be in order? In the future, particularly for audience surveys, we might consider offering other things of value instead, such as merchandise or free event tickets. Non-cash offers might reduce the number of people interested only in being paid for survey completion. They can also provide value by giving participants tangible materials and/or deeper engagement with your organization.
Segment audiences: Whenever feasible, use different UTM or referral links for different recruitment pathways for your surveys. We used different links for each platform (e.g. Twitter, Newsletters, Donors) to understand where traffic was coming from, to look for differences in preferences between audiences, and to capture the possible universe size for our sample. More than half of our respondents came from Facebook, a much higher share than we usually see for surveys. Generally, our radio audiences are the largest source of referrals, so seeing so many respondents come from Facebook was a red flag. Additionally, segmenting audiences can help identify strange patterns in the data. For example, if you have previously surveyed audiences, you may already have demographic data to check against new data. If you know your organization primarily serves older adults and you see that your survey consists of only young participants, the data may be compromised. Consider whether the cause could be the topic of the survey, the recruitment, or potential bots.
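One way to check for the kind of referral red flag described above is to compare each source's share of responses with the share you would normally expect. The sketch below is a rough illustration under stated assumptions: the `source` field, the expected shares, the 20-point tolerance, and the sample rows are all made up, not our actual figures.

```python
from collections import Counter


def source_shares(responses, field="source"):
    """Return each referral source's share of all responses (0.0 to 1.0)."""
    counts = Counter(row.get(field) or "unknown" for row in responses)
    total = sum(counts.values())
    return {src: n / total for src, n in counts.items()} if total else {}


def flag_sources(observed, expected, tolerance=0.20):
    """List sources whose observed share exceeds the expected share
    by more than `tolerance` (an arbitrary threshold for illustration)."""
    return sorted(
        src for src, share in observed.items()
        if share - expected.get(src, 0.0) > tolerance
    )


# Made-up data: 10 responses tagged by the referral link they arrived through.
sample = (
    [{"source": "facebook"}] * 6
    + [{"source": "radio"}] * 3
    + [{"source": "newsletter"}]
)
# Made-up baseline shares, e.g. taken from earlier surveys.
expected = {"radio": 0.5, "facebook": 0.1, "newsletter": 0.4}

observed = source_shares(sample)
print(flag_sources(observed, expected))  # ['facebook']
```

A flagged source is a prompt to look closer, not proof of bots; a survey topic or a well-timed post can also shift where respondents come from.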
Cleaning Up The Data
After a few laughs and tears, I had the task of figuring out how exactly to clean up the jumbled mess of data we had. With a filtered dataset (thanks, trap question!), I started cleaning the data using the following checks:
Impossible timestamps: Responses submitted within the same second as each other were removed. Many of the most suspicious responses were submitted with nearly the same timestamp late at night (12-3 am) or early in the morning (4-7 am), which are unlikely times for our US-based audiences to complete surveys.
Obvious AI language: The survey included a number of open-ended questions. Any responses containing obviously AI-generated language (“As an AI language model, I do not have personal preferences…”) were removed.
Non-human-sounding responses: Some of our open-ended questions asked why participants preferred certain broadcast formats. We eliminated any responses that didn’t sound authentic to an audience voice. For example, “Live call can increase the audience’s sense of participation and loyalty…” It is doubtful that an audience member would be discussing loyalty.
Human-sounding, but identical, open responses: Some responses repeated often. These included phrases like “It can create memorable moments for both the host and the audience” and “Maintained the authenticity of the program”. It is highly unlikely that multiple individual respondents would use the exact same phrasing.
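Three of the checks above (same-second timestamps, obvious AI phrasing, and identical open responses) are mechanical enough to sketch in code; judging whether a response "sounds human" still needs a person reading it. The sketch below is an illustration rather than the process we actually ran: the field names `submitted_at` and `open_text`, the phrase list, the `min_repeats` and `min_words` thresholds, and the sample rows are all assumptions.

```python
from collections import Counter
from datetime import datetime

# Assumed phrase list; extend it with whatever giveaways appear in your data.
AI_PHRASES = ("as an ai language model",)


def normalize(text):
    """Lowercase and collapse whitespace so near-identical answers match."""
    return " ".join((text or "").casefold().split())


def to_second(stamp):
    """Parse an ISO 8601 timestamp and truncate it to whole seconds."""
    return datetime.fromisoformat(stamp).replace(microsecond=0)


def flag_responses(responses, min_repeats=2, min_words=5):
    """Return {response id: set of reasons} for suspicious responses."""
    stamp_counts = Counter(to_second(r["submitted_at"]) for r in responses)
    text_counts = Counter(normalize(r["open_text"]) for r in responses)
    flags = {}
    for r in responses:
        reasons = set()
        if stamp_counts[to_second(r["submitted_at"])] > 1:
            reasons.add("same_second")
        text = normalize(r["open_text"])
        if any(phrase in text for phrase in AI_PHRASES):
            reasons.add("ai_phrase")
        # Skip very short answers ("yes", "n/a"), which repeat innocently.
        if len(text.split()) >= min_words and text_counts[text] >= min_repeats:
            reasons.add("repeated_text")
        if reasons:
            flags[r["id"]] = reasons
    return flags


# Made-up rows; timestamps are invented for illustration.
sample = [
    {"id": 1, "submitted_at": "2023-06-10T02:14:07",
     "open_text": "As an AI language model, I do not have personal preferences."},
    {"id": 2, "submitted_at": "2023-06-10T02:14:07",
     "open_text": "Maintained the authenticity of the program"},
    {"id": 3, "submitted_at": "2023-06-10T15:30:00",
     "open_text": "maintained the authenticity of the program"},
    {"id": 4, "submitted_at": "2023-06-11T09:05:42",
     "open_text": "I like hearing callers ask questions I had not thought of."},
]

flags = flag_responses(sample)
for rid in sorted(flags):
    print(rid, sorted(flags[rid]))
# 1 ['ai_phrase', 'same_second']
# 2 ['repeated_text', 'same_second']
# 3 ['repeated_text']
```

Flagging with reasons, instead of deleting outright, leaves room for a human review pass before anything is removed from the sample.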
Designing audience-centered content is an inherently inclusive process. Audience surveys are an opportunity to listen to the needs and concerns of our audiences, and they are just one tool we use to gather audience feedback at Science Friday. When all the cleaning was said and done, we were still left with 1200+ survey participants in our sample! This was far more than the 200-300 we initially anticipated. As online research continues to grow, so does the potential for AI bots. I am appreciative of having discovered new ways to improve my practice, even if it cost me hours of work and some new gray hairs.
Nahima Ahmed was Science Friday’s Manager of Impact Strategy. She is a researcher who loves to cook curry and discuss identity, and she helped the team understand how stories can shape audiences’ access to and interest in science.