Can Artificial Intelligence Plagiarize?

By Erin Garrett

University of Mississippi

Since the launch of ChatGPT in November 2022, the online tool has gained a record-breaking 100 million active users. Its technology, which automatically generates text for its users based on prompts, is highly sophisticated. But are there ethical concerns?

Thai Le. Photo by Thomas Graning/Ole Miss Digital Imaging Services

A University of Mississippi professor has co-authored a paper, led by collaborators at Penn State University, showing that artificial intelligence-driven language models, possibly including ChatGPT, are guilty of plagiarism – in more ways than one.

“My co-authors and I started to think, if people use this technology to write essays, grant proposals, patent applications, we need to care about possibilities for plagiarism,” said Thai Le, assistant professor of computer and information science in the School of Engineering. “We decided to investigate whether these models display plagiarism behaviors.”

The study, which is the first of its kind, evaluated OpenAI’s GPT-2, a precursor to ChatGPT’s current technology. The researchers tested three separate criteria for plagiarism: direct copying of content, paraphrasing and copying ideas from text without proper attribution.

To do this, they created a method to automatically detect plagiarism and tested it against GPT-2’s training data, which is “memorized” in part and reproduced by the technology. Much of this data, which is publicly available online, is scraped from the internet without informing content owners. 

By comparing 210,000 generated texts to the 8 million GPT-2 pre-training documents, the team found evidence of all three types of plagiarism in the language models they tested. Their paper explains that GPT-2 can “exploit and reuse words, sentences and even core ideas in the generated texts.” 
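For readers curious what such a comparison can look like in practice, here is a minimal Python sketch of the simplest of the three checks, direct copying. It is not the researchers' method, which the article does not describe in detail. It assumes that a run of five consecutive identical words counts as a match, and the function names `word_ngrams` and `verbatim_overlap` are hypothetical. Paraphrasing and idea copying would need more than exact word matching.

```python
def word_ngrams(text, n):
    """Return the set of n-word sequences in a text (lowercased, split on whitespace)."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def verbatim_overlap(generated, training_docs, n=5):
    """Map each training document index to the n-grams it shares with the generated text.

    Assumption for illustration: n=5 is an arbitrary threshold, not a value from the study.
    """
    generated_ngrams = word_ngrams(generated, n)
    hits = {}
    for index, doc in enumerate(training_docs):
        shared = generated_ngrams & word_ngrams(doc, n)
        if shared:
            hits[index] = shared
    return hits


# Toy data, invented for this example only.
training_docs = [
    "the quick brown fox jumps over the lazy dog",
    "a completely unrelated sentence about the weather today",
]
generated = "yesterday the quick brown fox jumps over a fence"
hits = verbatim_overlap(generated, training_docs)
```

In this toy case, only the first training document shares five-word sequences with the generated sentence. A comparison at the scale described in the article would need indexing rather than a document-by-document loop.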

Furthermore, the team hypothesizes that the larger the model size and associated training data, the greater the possibility of plagiarism.

“People pursue large language models because the larger the model gets, generation abilities increase,” said Jooyoung Lee, first author and an information sciences and technology doctoral student at Penn State. “At the same time, they are jeopardizing the originality and creativity of the content within the training corpus. This is an important finding.”

The scientists believe that this automatic plagiarism detection method could be applied to later versions of OpenAI technology, such as those used by ChatGPT.

The research team will present their findings at the 2023 ACM Web Conference, set for April 30-May 4 in Austin, Texas.

Robert Cummings, associate professor of writing and rhetoric at Ole Miss, has given advice to higher education professionals about ChatGPT’s implications in the classroom. A collaborator with Le in other AI-related research, Cummings suggests that users should be pragmatic when referencing material gained from language models.

“We have to be careful about what ideas are ours and what are borrowed,” Cummings said. “Pre-ChatGPT, I’d Google something as part of my research, and it would be sourced. If I was looking for general knowledge, I’d consult Wikipedia.

“Now, it’s important to designate what came from ChatGPT and put it off to the side as unsourced ideas.”

Le acknowledges the importance of finding solutions to these ethical issues, whether that be on the user side or on the side of scientific advancement.

“There are many important philosophical questions related to this technology,” he said. “Computer science researchers will continue to think of ways to improve these language models to change the way they generate text in such a way that they would not plagiarize.”

This material is based upon work supported by the National Science Foundation under Grant Nos. 1934782 and 2114824.



© COPYRIGHT 2024 BY HT MEDIA LLC. ALL RIGHTS RESERVED. HOTTYTODDY.COM IS AN INDEPENDENT DIGITAL ENTITY NOT AFFILIATED WITH THE UNIVERSITY OF MISSISSIPPI.