> *2. Non-technical users (teachers) treat this like a 100% certainty* This is t...

munificent · on Jan 31, 2023

I think you're misunderstanding the primary purpose of essays.

Teachers don't have the time to do deep critical reasoning about each student's essay. An essay is only partially an evaluation tool.

The primary purpose of an essay is that the act of writing an essay teaches the student critical reasoning and structured thought. Essays would be an effective tool even if they weren't graded at all. Just writing them is most of the value. A big part of the reason they're graded at all is just to force students to actually write them.

The main problem with AI generated essays isn't that teachers will lose out on the ability to evaluate their students. It's that students won't do the work and learn the skills they get from doing the work itself.

It's like building a robot to do push ups for you. Not only does the teacher no longer know how many push ups you can do, you're no longer exercising your muscles.

YeGoblynQueenne · on Jan 31, 2023

>> The primary purpose of an essay is that the act of writing an essay teaches the student critical reasoning and structured thought. Essays would be an effective tool even if they weren't graded at all. Just writing them is most of the value. A big part of the reason they're graded at all is just to force students to actually write them.

That's our problem, I think. Education keeps failing to convince students of the need to be educated.

Zababa · on Feb 1, 2023

I think that students know they need to be educated, but they also know that grading/academic success, in the form of good grades and going to prestigious universities, matters more than actual knowledge in the real world. And the funny thing is that if you teach critical reasoning to someone, there's a good chance they will use that skill to realize that the grade of the essay matters more than the actual process of writing it.

I think companies face a similar problem when they try to introduce metrics to evalute performance, either of individual employees or of whole parts of the company, and people start focusing on gaming these metrics instead of doing what's actually beneficial to the company. One reason for that is probably that it's really hard to evalute what actually beneficiates the company, and what part you played in it.

Back to students, maybe writing that essay instead of asking GPT-3 is more beneficial in the long run, but on the other hand you're also learning to use a new tech that will keep getting better, but maybe you're not learning the "value of hard work correctly", etc etc. Evaluating what's good for you is very hard, focusing on a good grade is easier and has noticable positive results. I think getting educated is very important, but I also think no one can certainly known if learning to use AI is actually a worse thing that doing stuff "yourself".

All in all, it's a very hard problem. It's trying to see the consequences of our own actions in very complex systems. And different people work differently. For example, when I use ChatGPT or Copilot, I end up spending more time overall working, and producing way more stuff even without counting what the AI "produced", because the back and forth between me and the AI is a more natural way of working for me. In the same vein, it's easier for me to write or even think by acting out a conversation. Maybe for some people it's the exact opposite and they need to be alone with their thoughts to be more productive.

munificent · on Feb 1, 2023

Delaying gratification is hard for all of us. We're just primates doing the best we can with our limited wetware.

hndamien · on Feb 1, 2023

Seem like it would be fairly trivial to make a document writer that measured if a human was doing the typing such that it was much more likely to have been written by a human sitting and thinking at a keyboard. We do it in ad fraud detection all the time at scale with much less willing participants.

BurningFrog · on Feb 1, 2023

The value of a degree is very clear.

The value of an education is much less clear.

I'm saying the students are probably right.

Al-Khwarizmi · on Jan 31, 2023

> It's like building a robot to do push ups for you. Not only does the teacher no longer know how many push ups you can do, you're no longer exercising your muscles.

While I already knew what you have described, I love this analogy, it's really spot on.

thelock85 · on Jan 31, 2023

For this exact reason, I feel like education systems and curriculum providers (teachers are just point of contact from a requirements perspective) should develop much more complex essay prompts and invite students to use AI tools in crafting their responses.

Then it’s less about the predetermined structure (5 paragraphs) and limited set of acceptable reasoning (whatever is on the rubric), and more about using creative and critical thinking to form novel and interesting perspectives.

I feel like this is what a lot of universities and companies currently claim they want from HS and college grads.

desro · on Jan 31, 2023

This is what I'm doing as an instructor at some local colleges. A lot of the students are completely unaware of these tools, and I really want to make sure they have some sense of how things are changing (inasmuch as any of us can tell...)

So I invite them to use chatGPT or whatever they like to help generate ideas, think things out, or learn more. The caveat is that they have to submit their chat transcript along with the final product; they have to show their work.

I don't teach any high-stakes courses, so this won't work for everyone. But educators are deluded if they think anyone is served by pretending that (A) this doesn't/shouldn't exist, and that (B) this and its successors are going away.

All of this stuff is going to change so much. It might be a bigger deal than the Internet. Time will tell.

sitkack · on Feb 1, 2023

I like this technique. You could also take a ChatGPT essay and have the students rewrite it or analyze for style.

Or have a session on how to write the prompts to generate the good stuff. In the hands of a skilled liberal artist, the models produce amazing results.

Yes the tool is powerful, but it still requires skills, knowledge and an ascetic voice.

Al-Khwarizmi · on Jan 31, 2023

A student can't go from zero to "much more complex essay prompts", though. Education has to go step by step. The truth is that humans start at a lower writing skill that ChatGPT. Before getting better than it, they need to first reach its level.

And then, there is the problem that those complex prompts might also become automatable when GPT-4 or GPT-5 is released.

class4behavior · on Jan 31, 2023

>Teachers don't have the time to do deep critical reasoning about each student's essay.

Projection much? Who are you speaking for? What countries, what states?

It's difficult to dive in that deep into someone's essay in any case. That's the challenge, not the lacking quality of one's education system.

Fomite · on Feb 1, 2023

I read every student essay I grade twice. Small classes, admittedly, but this has always been my practice.

geph2021 · on Jan 31, 2023

   ask their students to write things in class and get a feeling for what those individual students are capable of. Then those that turn in essays written at 10x their normal writing level will be obvious

I think that's a flawed approach. Plenty of people simply don't perform or think well under imposed time-limited situations. I believe I can write close to 10x better with 10x the time. To be clear, I don't mean writing more, or a longer essay, given more time. Personally, the hardest part of writing is distilling your thoughts down to the most succinct, cogent and engaging text.

deepspace · on Jan 31, 2023

> Plenty of people simply don't perform or think well under imposed time-limited situations

From first-hand experience, the difference between poor stress-related performance and a total lack of knowledge is night and day.

I have personally witnessed students who could not speak or understand the simplest English, and were unable to come up with two coherent sentences in a classroom situation, but turned in graduate level essays. The difference is blindingly obvious.

giovannibonetti · on Jan 31, 2023

> I have personally witnessed students who could not speak or understand the simplest English, and were unable to come up with two coherent sentences in a classroom situation, but turned in graduate level essays. The difference is blindingly obvious.

Maybe someone helped them with their homework?

remexre · on Jan 31, 2023

Unless their in-class performance increases as well, isn't that help "probably cheating"? (That's the "moral benchmark" I'd use, at least; if your collaboration resulted in you genuinely learning the material, it's probably not cheating.)

runarberg · on Jan 31, 2023

The point is for the teacher to get a sense of the students style and capabilities. Even if your home essay is 10x better and 10x more concise as your in class work, a good teacher that knows you—unlike an inference model—will be able to extrapolate and spot commonalities. Also a good teacher (that isn’t overworked) will also talk to students and get a sense of their style and capabilities that way, this allows them to extrapolate even better then a computer could ever hope to.

zopa · on Jan 31, 2023

Sure, but what about all the students with mediocre and/or overworked teachers? If our plan assumes the best-case scenario, we're going to have problems.

runarberg · on Jan 31, 2023

Honestly if we can’t have nice things and we keep skimping out on education, I’d rather we just accept the fact that some will students cheat, then to introduce another subpar technical solution to a societal problem.

londons_explore · on Jan 31, 2023

> blindly trusting the computer.

Professors blindly trust the computer not out of laziness, but to protect themselves from accusations of unfairness...

"The work was detected as plagiarism, but the professor overrode it for the pretty girl in class, but not for me"

mitchdoogle · on Jan 31, 2023

Seems like something like this should only be used as a first-level filter. If the writing doesn't pass, it warrants more investigation. If no proof of plagiarism is found, then there's nothing else to do and professor must pass the student

TchoBeer · on Jan 31, 2023

with a 26% true positive rate that seems flawed.

busyant · on Jan 31, 2023

I asked chatgpt to write an essay as if it were written by a mediocre 10th grader. It did a reasonably good job. It threw in a little bit of slang and wasn’t particularly formal.

Edit. I sometimes tell my students “if you’re going to cheat, don’t give yourself a perfect score, especially if you’ve failed the first exam. It fires off alarm bells.”

But the students who struggle usually can’t calibrate a non-suspicious performance.

I guess the same applies here.

Baeocystin · on Jan 31, 2023

You've touched upon a central issue that is not often addressed in these conversations. People who have difficulty comprehending and composing essays also struggle to work with repeated prompts in AI systems like ChatGPT to reach a solution. I've found in practice that when showing someone how prompting works, their understanding either clicks instantly, or they fail to grasp it at all. There appears to be very little in between.

asah · on Jan 31, 2023

seems like this is the future... 1. first day of class, write a N word essay and sign a release permitting this to be used to detect cheating. The essay topic is chosen at random.

2. digitize & feed to learning model, which detects that YOU are cheating.

upside: this also helps detect students who are getting help (e.g. parents)

downside: arms race as students feed their cheat-essays (memorize their essays?) into AI-detection models that are similarly trained.

kaibee · on Jan 31, 2023

The funniest implication here is that the student's writing skill isn't expected to improve.

eh9 · on Jan 31, 2023

I was just asking my partner who’s a writer if it would even be fair to train a model based on a student at Nth grade if the whole point is to measure growth. Would there be enough “stylistic tokens” developed in a young person’s writing style?

ask_b123 · on Jan 31, 2023

Personally, I feel mildly embarrassed when reading my essays from years prior. And I probably still count as a 'young person'.

That said, there's no need to consider changes in years when stylistic choices can change from one day to another depending on one's mood, recent thoughts, relationship with the teacher, etc.

That's why I've always been a little confused about how some (philologists?) treat certain ancient texts as not being written by some authors due to the text's style, as if ancient people could not significantly deviate from their usual style.

Aransentin · on Jan 31, 2023

> first day of class, write a N word essay

Initially I thought you meant having the student write an essay about slurs, as the AI will refuse to output anything like that. Then I realized you meant "N" as in "Number of words".

Still, that first idea might actually work; make the students write about hotwiring cars or something that's controversial enough for the AI to ban it but not controversial enough that anybody will actually care.

dragonwriter · on Feb 1, 2023

> upside: this also helps detect students who are getting help (e.g. parents)

Downside: it also likely detects, without differentiation, students whose writing style undergoes a major jump because of learning, which is, you know, the actual thing you are trying to promote.

JumpCrisscross · on Jan 31, 2023

> first day of class, write a N word essay and sign a release permitting this to be used to detect cheating

Why once? Most students need writing skills more than half the high-school curriculum.

feanaro · on Jan 31, 2023

There are also some countries that don't fetishize cheating this much so perhaps they will just continue not caring.

Zababa · on Feb 1, 2023

Arms race are not really an issue, you've managed to make your student work, one way or another.

userbinator · on Feb 1, 2023

Programming is fortunately one of those subjects where there's something objectively close to a correct/optimal solution. A trivial example is that there aren't very many sane ways to write a "Hello world" program, but this seems to hold for more complex tasks too. In fact, in my experience, the ones who cheat and get it wrong are the most obvious.

Unfortunately, the software industry also has plenty of literal tools who are far too trusting of what the computer says (or authority in general, but that's another rant...)

Gigachad · on Feb 1, 2023

I once got called up because my work was flagged as 100% copied. I had uploaded it, made a mistake so I deleted it and uploaded a new file. Second file was flagged as copied. Was able to explain it by pointing at the screen that was claiming I plagiarized my own name.

runarberg · on Jan 31, 2023

So the computer’s evaluation model assumed that each student’s learning is independent? That seems like a ludicrous assumption to put in a model like this, unless the model authors have never been in a class setting (which I doubt).

TheDudeMan · on Jan 31, 2023

You are asking teachers to be good at their job. But is teaching a merit-based profession?