
A Princeton student built an app which can detect if ChatGPT wrote an essay to combat AI-based plagiarism

CycloneTM

Summary

 Princeton student Edward Tian has developed GPTZero, an app that can detect if an essay was written by AI or a human. GPTZero analyzes text based on its "perplexity and burstiness" - how complicated it is and how randomly it is written - to determine if it was written by an AI like ChatGPT. The app's popularity caused its website to crash due to high traffic, but it is still available to use on Tian's Streamlit page. 
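
For anyone wondering what "perplexity and burstiness" could look like in practice, here is a minimal sketch. It is not GPTZero's code: the articles don't describe the implementation, so everything below is an assumption. It uses a toy unigram word model with add-one smoothing (a real detector would presumably score text with a large language model instead), treats perplexity as how surprising the words are to that model, and treats burstiness as how much the perplexity varies from sentence to sentence. All function names are made up for illustration.

```python
import math
import re
from collections import Counter
from statistics import pstdev


def tokenize(text):
    # Lowercase word tokens; punctuation is dropped.
    return re.findall(r"[a-z']+", text.lower())


def build_unigram_model(reference_text):
    # Toy stand-in for a language model: word frequencies with add-one smoothing.
    counts = Counter(tokenize(reference_text))
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot shared by all unseen words

    def prob(word):
        return (counts[word] + 1) / (total + vocab)

    return prob


def perplexity(text, prob):
    # exp of the average negative log-probability per word.
    words = tokenize(text)
    if not words:
        return float("nan")
    log_sum = sum(math.log(prob(w)) for w in words)
    return math.exp(-log_sum / len(words))


def burstiness(text, prob):
    # Assumed definition: spread of per-sentence perplexity.
    sentences = [s for s in re.split(r"[.!?]+", text) if tokenize(s)]
    scores = [perplexity(s, prob) for s in sentences]
    return pstdev(scores) if len(scores) > 1 else 0.0
```

With this toy model, text made of words the model has seen scores a lower perplexity than text full of unseen words, and a passage that mixes both kinds of sentence gets a non-zero burstiness.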

 

Quotes

Quote

"GPTZero scores text on its 'perplexity and burstiness' – referring to how complicated it is and how randomly it is written."

Quote

"Are high school teachers going to want students using ChatGPT to write their history essays? Likely not," tweeted Edward Tian, the developer of GPTZero.

 

My thoughts:

 

 It's interesting to see efforts being made to combat AI plagiarism, especially as AI writing tools like ChatGPT become more popular. It will be interesting to see how GPTZero's accuracy holds up in real-world testing, and whether it becomes a widely-used tool in education and other sectors. 

 

Sources

 Student Built App to Detect If ChatGPT Wrote Essays to Fight Plagiarism (businessinsider.com)

Did ChatGPT Write That? College Kid Creates AI Essay Detector (gizmodo.com)


It's the start of an arms race between AI plagiarism (kinda? I mean, it's probably plagiarizing someone) and AI plagiarism detection.

I wonder how accurately we can measure GPTZero's performance since, to my understanding, ChatGPT is a bit of a moving target output-wise.


Yeah I'm sure the likes of Turnitin are already building their own solutions like this, and the arms race cycle shall continue as before.

CPU: i7 4790k, RAM: 16GB DDR3, GPU: GTX 1060 6GB


me who used chatgpt for my assignment 🤡

 

anyway this can be combated with quillbot tbf

|:Insert something funny:|

-----------------

*******

#


I'd love to see if you can show ChatGPT the app in question, and ask it to write in a way that circumvents it 🤔

CPU: Ryzen 9 5900 Cooler: EVGA CLC280 Motherboard: Gigabyte B550i Pro AX RAM: Kingston Hyper X 32GB 3200mhz

Storage: WD 750 SE 500GB, WD 730 SE 1TB GPU: EVGA RTX 3070 Ti PSU: Corsair SF750 Case: Streacom DA2

Monitor: LG 27GL83B Mouse: Razer Basilisk V2 Keyboard: G.Skill KM780 Cherry MX Red Speakers: Mackie CR5BT

 

MiniPC - Sold for $100 Profit

Spoiler

CPU: Intel i3 4160 Cooler: Integrated Motherboard: Integrated

RAM: G.Skill RipJaws 16GB DDR3 Storage: Transcend MSA370 128GB GPU: Intel 4400 Graphics

PSU: Integrated Case: Shuttle XPC Slim

Monitor: LG 29WK500 Mouse: G.Skill MX780 Keyboard: G.Skill KM780 Cherry MX Red

 

Budget Rig 1 - Sold For $750 Profit

Spoiler

CPU: Intel i5 7600k Cooler: CryOrig H7 Motherboard: MSI Z270 M5

RAM: Crucial LPX 16GB DDR4 Storage: Intel S3510 800GB GPU: Nvidia GTX 980

PSU: Corsair CX650M Case: EVGA DG73

Monitor: LG 29WK500 Mouse: G.Skill MX780 Keyboard: G.Skill KM780 Cherry MX Red

 

OG Gaming Rig - Gone

Spoiler

 

CPU: Intel i5 4690k Cooler: Corsair H100i V2 Motherboard: MSI Z97i AC ITX

RAM: Crucial Ballistix 16GB DDR3 Storage: Kingston Fury 240GB GPU: Asus Strix GTX 970

PSU: Thermaltake TR2 Case: Phanteks Enthoo Evolv ITX

Monitor: Dell P2214H x2 Mouse: Logitech MX Master Keyboard: G.Skill KM780 Cherry MX Red

 

 


4 minutes ago, dizmo said:

I'd love to see if you can show ChatGPT the app in question, and ask it to write in a way that circumvents it 🤔

ChatGPT does not connect to the internet; the newest material it knows is from 2021. You can, however, give it a text and tell it to write in that style.


Just now, adarw said:

ChatGPT does not connect to the internet; the newest material it knows is from 2021. You can, however, give it a text and tell it to write in that style.

You can give it code though, no? So couldn't you just give it the code for the written app? 


4 minutes ago, dizmo said:

You can give it code though, no? So couldn't you just give it the code for the written app? 

Interesting, did he open source it? 


2 hours ago, C0stanza said:

It's the start of an arms race between AI plagiarism (kinda? I mean, it's probably plagiarizing someone) and AI plagiarism detection.

I wonder how accurately we can measure GPTZero's performance since, to my understanding, ChatGPT is a bit of a moving target output-wise.

Yeah, it's going to get to the point of literal impossibility. The false positive rate will outweigh the usefulness of any detection tool, since ChatGPT will essentially write essays like a human being, and everyone writes differently anyway. However, this does set us up for an interesting Turing test of sorts (the original of which has already been beaten by Google).


I wonder how far you could get by just having ChatGPT write a crap load of essays for a given prompt, training a classifier on those essays, and using that to detect generated ones. If the classifier is built at the time the assignment is given, it could work. I guess the kids could change the prompt, and then the tool would have to account for that. Plus it's always gonna have a steady number of false positives. Might just be fighting the inevitable.
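
A rough sketch of that idea, assuming you already have a pile of generated and human-written essays for the same prompt. The classifier here is a hand-rolled multinomial Naive Bayes over word counts, picked only because it fits in a few lines; nothing in the thread says which classifier would actually be used. The training strings are made-up placeholders, not real essays, and the class and label names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Minimal multinomial Naive Bayes with add-one smoothing."""

    def fit(self, texts, labels):
        self.word_counts = {}             # label -> Counter of words
        self.doc_counts = Counter(labels)
        self.vocab = set()
        for text, label in zip(texts, labels):
            words = tokenize(text)
            self.word_counts.setdefault(label, Counter()).update(words)
            self.vocab.update(words)
        return self

    def predict(self, text):
        words = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for w in words:
                score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Placeholder training data (invented for illustration only).
texts = [
    "in conclusion the war was a pivotal moment in history",
    "overall it is important to note the war was pivotal",
    "honestly i think the war dragged on way too long",
    "my grandpa told me stories about the war",
]
labels = ["generated", "generated", "human", "human"]
clf = NaiveBayes().fit(texts, labels)
print(clf.predict("it is important to note that in conclusion"))
```

With only four toy samples this proves nothing about real-world accuracy, and the false positive problem mentioned above still applies to any classifier like this.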


So, the next step is to use AI-aided reverse engineering on the app to recover its code, then feed that to another AI to develop countermeasures, so we can train AI to write essays the app doesn't notice.

 

Or, once the arms race has gone on for a while, someone notices the same thing as with general essay plagiarism, and it just becomes another percentage meter to keep an eye on while writing essays that will be checked for it. At least in some schools in Finland, theses and other more important student works are run through a program that checks them for plagiarism. It doesn't directly tell whether the paper is a copy of something else, but it gives a percentage based on the use of the same words in the same order as known texts, the use of complex words in the exact same order, and other parameters that would be a sign of plagiarism. From that percentage the teacher/checker can decide whether to manually check the paper for plagiarism.
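
The "same words in the same order" part of such a percentage meter can be sketched as a word n-gram overlap check. This is only an illustration under assumptions: I don't know how the actual checkers compute their score, and the function names and the choice of 3-word sequences are made up.

```python
import re


def ngrams(text, n=3):
    # Set of all runs of n consecutive words in the text.
    words = re.findall(r"[a-z']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_percentage(submission, known_texts, n=3):
    # Share of the submission's n-grams that also appear in any known text.
    sub = ngrams(submission, n)
    if not sub:
        return 0.0
    known = set()
    for text in known_texts:
        known |= ngrams(text, n)
    return 100.0 * len(sub & known) / len(sub)
```

A high percentage would not prove plagiarism by itself; as described above, it only tells the checker which papers are worth a manual look.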

 

Also, how do you plagiarize something that isn't a natural person who could hold copyright? It's kind of the same as being accused of stealing from an AI, which legally cannot own anything.

The word you are looking for is "cheat": cheating by creating an essay with AI.


As per my comments on the "Stability AI" fork Kickstarter, 100% of machine-learning output is detectable, for various reasons that amount to shortcuts taken to make the training faster.

For text, it's the random, or "incoherent", connection between topics. For art, it's various things like hands, eyes and feet in the output. For TTS and voice conversion, it's the lack of dynamic range, because, just like the AI art, it's an "average" taken for each type of sound, and the more training data put in, the worse it is. If you want to deepfake a specific person, you need to train it entirely on that person, and nobody else.

And this is the problem with AI plagiarism: the presentation of "AI generated" material as though it were written, drawn or performed by a human.

And to be honest, if you can't tell whether something is AI generated, you probably should not be in a position (e.g. teaching, law, politics) to accept feedback on blind faith. All it takes is an AI "stuffing the inbox" of a politician with unique emails and voicemails to create its own "AI-driven astroturfing".

Discovering when emails have been written by GPT is something that should be kicked over to Microsoft and Google, so their email software can mark emails as "likely AI written".


5 hours ago, dizmo said:

I'd love to see if you can show ChatGPT the app in question, and ask it to write in a way that circumvents it 🤔

I wonder, if you took something made with ChatGPT and modified it a little bit, would it still be detected by this app? Personally I always thought essays were stupid, so I don't see the problem with people using ChatGPT, as tests are what really matter tbh. Essays always felt like busy work to me, especially if you already knew the subject well and it wasn't an English class. With an English class I can see the thought process of learning how to write, but I would rather essays be exclusive to English class tbh. Honestly, most of my essays were hardly my own work when I was in school, as they were peer reviewed, and by the time I was done making edits each was basically a totally different piece of work.


Because false positives in automatic detection have never been a thing...

🌲🌲🌲

 

 

 

◒ ◒ 


Me who used ChatGPT to translate 2 guides I had to send to a client, which are now part of the official documentation of the company I work for...

One day I will be able to play Monster Hunter Frontier in French/Italian/English on my PC, it's just a matter of time... 4 5 6 7 8 9 years later: It's finally coming!!!

Phones: iPhone 4S/SE | LG V10 | Lumia 920 | Samsung S24 Ultra

Laptops: Macbook Pro 15" (mid-2012) | Compaq Presario V6000

Other: Steam Deck

<>EVs are bad, they kill the planet and remove freedoms too some/<>


14 hours ago, dizmo said:

You can give it code though, no? So couldn't you just give it the code for the written app? 

That's not really how it works. You can't train the AI by just inputting something through the final model.

Just giving the AI the source code would be like writing "I want you to be better at spell checking" in a Microsoft Word document. The text will be processed by MS Word, but it will not alter the code that makes up the program.

Same with this. Pasting the source code into ChatGPT would run the text through ChatGPT, but it would not alter the code of ChatGPT.
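
To make the distinction concrete, here is a toy sketch. It is obviously nothing like ChatGPT's real internals; the class, its "weights" and both methods are hypothetical and exist only to show that running input through a finished model reads its parameters, while only a separate training step writes them.

```python
class TinyModel:
    def __init__(self):
        # Parameters fixed at the end of training (made-up values).
        self.weights = {"hello": 0.5, "world": 0.25}

    def respond(self, prompt):
        # Inference: looks weights up, never changes them.
        return sum(self.weights.get(w, 0.0) for w in prompt.lower().split())

    def train_step(self, word, delta):
        # Training: the only place the weights are modified.
        self.weights[word] = self.weights.get(word, 0.0) + delta


model = TinyModel()
before = dict(model.weights)
model.respond("here is the detector source code, please get better at evading it")
unchanged_after_prompt = model.weights == before   # prompt left the weights alone
model.train_step("hello", 0.5)
changed_after_training = model.weights != before   # training did change them
```

No matter what text goes into `respond`, the weights stay the same; they only change when `train_step` is called, which an end user pasting text into a chat box never gets to do.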

 

 

 

8 hours ago, mr moose said:

So students can now fail because of an app and have no way to actually prove they wrote it themselves?

It's already happening in the art world.

 

The scariest part about this story is that no matter what a student does, they can't prove that they wrote it themselves. An artist might be able to provide the Photoshop files, but in the case of written essays there is no way of actually checking if the words came from the student's mind. Even if we were to log keystrokes, there is no way of checking if the student came up with everything themselves, or if they just copied the AI-generated text word for word by hand into a Word document.

 

 

I think the best way of dealing with this is doing what my sister does with her students. All assignments that decide their grades are written by hand in the classroom. They still get assignments that they can do on their computers, but those assignments have no impact on which grades they get. 

Apparently cheating in schools, at least in Sweden, has gotten really bad in the last 6 or so years. It didn't use to be a big problem, but nowadays people pay others to write their assignments for them, so this was my sister's standard practice even before ChatGPT came out.

Edited by LogicalDrm
Removed part of quote.

51 minutes ago, LAwLz said:

I think the best way of dealing with this is doing what my sister does with her students. All assignments that decide their grades are written by hand in the classroom.

Idk how the rest of the world has handled this until now, but in Germany it's always been this way. Everything that is graded happens in the classroom.

Maybe I missed some changes introduced by covid and the introduction of digital teaching in general, which all happened after I finished school myself.

If someone did not use reason to reach their conclusion in the first place, you cannot use reason to convince them otherwise.


1 hour ago, LAwLz said:

 

I think the best way of dealing with this is doing what my sister does with her students. All assignments that decide their grades are written by hand in the classroom. They still get assignments that they can do on their computers, but those assignments have no impact on which grades they get. 

Apparently cheating in schools, at least in Sweden, has gotten really bad in the last 6 or so years. It didn't use to be a big problem, but nowadays people pay others to write their assignments for them, so this was my sister's standard practice even before ChatGPT came out.

I wouldn't say "hand-written" but rather anything that counts for the class grade has to be done "in class" like it should be. To hell with homework assignments when you can just let the computer do it for you. That has NEVER helped anyone understand the subject and most people abhor doing it. If you want to study, read more.

 

If you can't do it "right here, right now", then you don't understand it at a level you could ever use it in the future either.

 

But let's not get too far ahead, most people will not use even half the stuff they were told they "will have to use" in school, when things like doing taxes, cooking and how to wash your clothes would have been far more useful to everyone, yet you're lucky if the school even requires you to learn about food safety.

 


1 hour ago, LAwLz said:

I think the best way of dealing with this is doing what my sister does with her students. All assignments that decide their grades are written by hand in the classroom. They still get assignments that they can do on their computers, but those assignments have no impact on which grades they get. 

Apparently cheating in schools, at least in Sweden, has gotten really bad in the last 6 or so years. It didn't use to be a big problem, but nowadays people pay others to write their assignments for them, so this was my sister's standard practice even before ChatGPT came out.

I am fine with graded work being written in the classroom, but by hand? No. As someone who has problems with shitty handwriting, just no. Well, math is fine by hand, but writing long texts, no.

Just use PCs with one of those test modes and no internet.

“Remember to look up at the stars and not down at your feet. Try to make sense of what you see and wonder about what makes the universe exist. Be curious. And however difficult life may seem, there is always something you can do and succeed at. 
It matters that you don't just give up.”

-Stephen Hawking


42 minutes ago, Kisai said:

I wouldn't say "hand-written" but rather anything that counts for the class grade has to be done "in class" like it should be. To hell with homework assignments when you can just let the computer do it for you. That has NEVER helped anyone understand the subject and most people abhor doing it. If you want to study, read more.

 

If you can't do it "right here, right now", then you don't understand it at a level you could ever use it in the future either.

 

But let's not get too far ahead, most people will not use even half the stuff they were told they "will have to use" in school, when things like doing taxes, cooking and how to wash your clothes would have been far more useful to everyone, yet you're lucky if the school even requires you to learn about food safety.

 

Just use PC without internet/with test mode.

Or just keep an eye on what they are doing.

People have bad handwriting and have valid reasons for it.

 

Personally, I learn a lot more from doing tasks than I do from just reading on my own, no matter if it's at school or on my own.

 

Graded stuff should be done at school, but I am also against just having one big test at the end of the year; have more things that are done at school and graded.


1 minute ago, Mihle said:

Just use PC without internet/with test mode.

That's probably too much IT overhead, turning the internet on and off for devices the user might have on them. Instead, the school might schedule an "exam day" where they block all internet traffic from the school, and individual students are required to turn in their mobile devices, smart watches, etc. at the beginning of the day or not bring them; failure to do so gets them a 0.

 

1 minute ago, Mihle said:

Or just keep an eye on what they are doing.

People have bad handwriting and have valid reasons for it.

And forcing people to handwrite will result in an extreme slowdown in grading. Speaking from experience: the standard way tests were marked in middle and high school was to shuffle the tests and hand them out to random students to mark, so whoever got the students with the worst handwriting often had to ask the teacher for so much help that the teacher may as well have marked it themselves.

 

 

1 minute ago, Mihle said:

Personally, I learn a lot more from doing tasks than I do from just reading on my own, no matter if it's at school or on my own.

That wasn't the point of what I said. Sending kids home to copy things from the textbook or the internet for marks was always a stupid waste of time, because the teachers would not bloody coordinate with each other. You'd go home with "read 2 chapters of X book", "write an essay on Y subject", "do all the math questions on page 420", and "coordinate with your lab partner to do Q project", while also having to do hours of chores at home. You know what gets sidelined? The schoolwork.

 

I always wound up doing all that homework that counted for marks in the break before that class started (eg lunch), because there was no way I was doing that on the hour+ school bus rides to and from the school.

 

Let's see: if school ends at 3PM and the bus doesn't arrive till 3:20, you don't get home before 3:40 at the earliest and as late as 4:30, and your parents aren't home until 5, at which point you have to cook dinner, so you're literally not getting to the schoolwork until like 7.

 

Anyway, the point was that the teachers blindly assigned everyone homework as busywork. Not as study material. If I only had to read it, I could read it before class. But when I had to take 30lbs of books home every day and spend around an hour on the bus just to go home, to do all this homework busywork, no, that was stupid and was always stupid.

 

1 minute ago, Mihle said:

Graded stuff should be done at school, but I am also against just having one big test at the end of the year; have more things that are done at school and graded.

Final exams were not a great way to ensure the students actually learned something, because they were prone to unintended cheating. An example: for the "final exams" in most classes, the students got the previous year's exam as "test prep" to study a week before the exam. If you're going to do that, why spend the entire half year learning it when you can just memorize only the subjects on the exam?

 

Things like math, physics, chemistry, and even cooking you actually have to "know", or you won't pass the class anyway. But English, History and other classes that are mostly about regurgitating the textbook (which may even be incorrect if your school uses 20-year-old books) are the subjects that ChatGPT is going to steamroll if the student is allowed to use it. It's not like the old conundrum of "no calculators on tests".

 

Calculators tell you the answer, but you still need to know the order of operations to get a correct result. Asking ChatGPT to tell you about the battle of Gettysburg is going to result in something not unlike the existing Wikipedia article.

 


10 hours ago, dizmo said:

I'd love to see if you can show ChatGPT the app in question, and ask it to write in a way that circumvents it 🤔

It'd probably be enough to tell ChatGPT to imagine a situation where its writing is being detected and to rewrite it in a way that'll break the detection.


1 minute ago, Beskamir said:

It'd probably be enough to tell ChatGPT to imagine a situation where its writing is being detected and to rewrite it in a way that'll break the detection.

That's giving ChatGPT too much credit. 

 

GPT is auto-complete; it possesses no imagination, no mind's eye. Asking it to write in a way that avoids being detected as GPT is outside the scope of how it works. It's like asking a diphone TTS to sing opera. It can't do that because the ML was not trained to deceive; it was only trained to regurgitate and average out what it was trained on.

That's why ChatGPT can spit out incorrect information. It gets that information wrong because the trained input was wrong more often than right. Training ChatGPT on Wikipedia and its edit history (where things were not deleted, at least) would likely result in a lot of this incorrect and false information getting into the training weights. The pages look nearly the same except for added/removed sections.

Put all those inputs together and you may end up with a result that is a mixture of correct and incoherently incorrect if you give ChatGPT too much freedom. It does not have a mind's eye and it cannot create; it can only auto-complete to the next piece of input. Where auto-complete in your phone is a predictive speller, ChatGPT takes that from subject to subject.
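
To show what "auto-complete to the next piece of input" means at its very simplest, here is a toy next-word predictor built from word-pair counts. This is a deliberately crude illustration: GPT models are neural networks working on tokens, not lookup tables like this, and the function names and sample sentence are made up.

```python
from collections import Counter, defaultdict


def train_bigrams(text):
    # For each word, count which words follow it in the training text.
    words = text.lower().split()
    table = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        table[current][following] += 1
    return table


def autocomplete(table, start, length=5):
    # Repeatedly append the most frequent follower of the last word.
    out = [start]
    for _ in range(length):
        followers = table.get(out[-1])
        if not followers:
            break  # never saw this word in training, nothing to predict
        out.append(followers.most_common(1)[0][0])
    return " ".join(out)


table = train_bigrams("the cat sat on the mat and the cat sat down")
print(autocomplete(table, "the", length=2))
```

The predictor can only ever replay word sequences that were frequent in what it was trained on, which is the phone-keyboard end of the comparison made above.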


2 hours ago, LAwLz said:

Apparently cheating in schools, at least in Sweden, has gotten really bad in the last 6 or so years. It didn't use to be a big problem, but nowadays people pay others to write their assignments for them, so this was my sister's standard practice even before ChatGPT came out.

Most of the western world is so wealthy now that average students can afford to pay someone else to write their assignments? I wonder if these students are also the ones using "boomer" as a derogatory term blaming older generations for the cost of housing, living, etc?

 

 

Grammar and spelling is not indicative of intelligence/knowledge.  Not having the same opinion does not always mean lack of understanding.  


6 minutes ago, Kisai said:

That's giving ChatGPT too much credit. 

 


I was mostly joking, since making something fictional or imaginary is one way in which GPT's own filters/safeties can be bypassed. Although, I've been really impressed with what GPT can do and wouldn't be too surprised if it could change its tone or reword things in a way that stops the detector from figuring out it's GPT.
