Comparing AI Detection Tools: One Instructor’s Experience (Continued)

Comparison of different programs that claim to detect AI-generated text by Dr. Ellie Andrews, Anthropology.

I have decided that the article is too long to send in its entirety in an MTI newsletter. You can access the whole article here.

Why you can’t find Turnitin’s AI Writing Detection tool

by Joseph Brown, Director of Academic Integrity, TILT

April 26, 2023

What happened?

You may have noticed colleagues at other institutions discussing Turnitin’s new AI Writing Detection tool. However, when you access your Turnitin similarity reports in Canvas, you won’t find it. Why? 

In the weeks leading up to Turnitin’s release of the new tool, I joined colleagues from other offices to discuss the tool and its capabilities. As those discussions progressed, concerns arose that led the group to recommend pausing the rollout of Turnitin’s product. It is important to note that CSU joined other Unizin institutions in asking to be placed on Turnitin’s “suppression list.” 

The concerns were: 

  • A continued lack of specifics and details about how Turnitin arrives at its evaluation of student work. This is the so-called “black box” aspect of the service. Unlike the “similarity report,” faculty and student users cannot click through and see a one-to-one comparison of the text in question. 
  • The inability of institutions or instructors to turn off the AI detection score in Feedback Studio. Every essay would produce a score, whether the instructor wanted it or not. 
  • We were being forced to operate on Turnitin’s timeline, not CSU’s. Our campus needs more preparation, information, and education before rolling out a tool with such a potential impact on our courses, student learning, and academic misconduct investigations.

What happens next?

Moving forward, the plan in place is for a faculty-led group to review the tool, study the information provided by Turnitin, and consider the big questions about using third-party detection tools. Ultimately, that group will make a recommendation to our university leadership regarding whether or not CSU will adopt it and, if so, what the timeline will look like. 

As the Director of the Academic Integrity Program, I hear from faculty often and I know that AI writing is posing a serious challenge to our courses. At the same time, I know that our faculty have an appreciation for the challenges posed by an institutionalized detection tool that we don’t fully understand. As more information becomes available, I commit to sharing it with you. 

As always, feel free to call or email me with your questions or otherwise let me know how I can assist you.

Tools You Can Use Right Now

by Joseph Brown, Director of Academic Integrity, TILT

February 9, 2023

The very first question faculty members ask me when they learn about ChatGPT is whether there are any tools available that can reveal if a text was created by an AI engine. A week ago, I released this video on GPTZero. In just one week, this area has seen significant innovation and change. Not only has GPTZero undergone a notable streamlining and redesign, but OpenAI, the company behind ChatGPT, has released a detection tool of its own. 

You can access them here:

  • OpenAI “Text Classifier” (LINK)
  • GPTZero (LINK)

However, these programs have serious limitations, and I want to share what I know right now (2nd week of February) about how those limitations will impact how you’re able to assess student work. 

Are they easy to use? 

Mostly, yes, and they are getting easier and more streamlined every day. You can compare the way GPTZero looked in the videos I did last week to what it looks like now and it’s remarkable. 

Will they tell me if a student’s work was generated by Artificial Intelligence? 

It’s complicated. Neither of the most well-known programs will say definitively if writing is or is not AI-generated. They use slippery terms like “Likely.”

OpenAI’s “Text Classifier” (with its notably non-judgmental appellation) will choose a level from a rubric-like spectrum of classifications: “very unlikely, unlikely, unclear if it is, possibly, or likely AI-generated.” 

  • Both programs warn users that false positives are common. 
  • Both programs warn against using these tools as the sole determinant when deciding if a student’s work is their own.

You can read what OpenAI has to say about their tool and its limitations here (LINK).  

Do these programs explain how they work? In other words, do they share how they determine if writing is AI-generated? 

No, not entirely, or (in OpenAI’s version) not at all. This is the most frustrating aspect of using them right now. We simply don’t know their criteria and rationale. OpenAI said recently that its classifier correctly identified only 26% of AI-written text as “likely AI-written” and incorrectly flagged human-written text as AI-written 9% of the time. It’s worth noting they haven’t shared how they arrive at these results. In addition, the low identification rates are frightening when you consider that OpenAI is trying to identify text generated by its own engine.
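To make those two published rates concrete, here is a rough back-of-the-envelope calculation of my own (the base rates are hypothetical, not anything OpenAI has published): if the classifier catches 26% of AI-written text and falsely flags 9% of human-written text, then how trustworthy a flag is depends heavily on how common AI-written submissions actually are.

```python
# Illustration only: the 26% detection rate and 9% false-positive rate are
# OpenAI's published figures; the base rates below are hypothetical.

def flag_precision(sensitivity, false_positive_rate, base_rate):
    """Probability that a flagged text is truly AI-written (Bayes' rule)."""
    true_flags = sensitivity * base_rate                 # AI text, correctly flagged
    false_flags = false_positive_rate * (1 - base_rate)  # human text, wrongly flagged
    return true_flags / (true_flags + false_flags)

# Hypothetical base rates: what fraction of submissions are AI-written?
for base_rate in (0.05, 0.10, 0.50):
    p = flag_precision(0.26, 0.09, base_rate)
    print(f"If {base_rate:.0%} of essays were AI-written, "
          f"a flag would be correct about {p:.0%} of the time.")
```

Under these assumptions, unless AI-written work were very widespread, a large share of flags would land on honest student writing, which is exactly why both vendors warn against relying on these scores alone.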

The Academic Integrity Program recommends that you use these results only in the context of a broader evaluation process when you suspect that a student’s work may not be their own. We also recommend treating these results only as the starting point for a conversation with the student. You can access more information and tips about having that kind of conversation here (LINK). 

It’s important to remember that this is a rapidly developing issue. These tools may look and act remarkably different in the coming days and weeks. Keep checking back with the “AI-and-AI” hub here (LINK) at the TILT website for the most up-to-date information and, as always, feel free to contact me at 970-491-2898. 

I hope you find these posts helpful for your teaching. As always, I appreciate your questions, comments, and feedback on this and other teaching-related topics. I am still very interested in hearing from those who have been wrestling with the use of generative AI in software coding assignments.

Cheers, Paul