LLM generated submissions should be disallowed

155 points by orib 8 hours ago on lobsters | 58 comments

Internet_Janitor | 8 hours ago

Sounds good to me. Even if slop occasionally slips through, an explicit policy against LLM-generated content (ideally without carveouts and exceptions to squabble over in the comments) should reduce junk on the front page and provide clear-cut grounds for banning regular offenders.

GavinAnderegg | 7 hours ago

I agree in general with this. If someone can't be bothered to write something themselves, I'm not interested in reading it. That said, I don't know of a foolproof way of identifying LLM-generated text. I don't love the idea of people (or sources) being banned because the articles they post might be generated. I've been accused of using LLMs in my writing because I sometimes use em-dashes… even though I've been using them for over 25 years now.

Internet_Janitor | 6 hours ago

The occasional false-positive shouldn't be a problem so long as it isn't a zero-tolerance policy, and I don't see why it would need to be. My reading of the OP is that bans would be at the discretion of mods and in response to a pattern of posting slop repeatedly.

I think it's very important to avoid letting imperfect detection of slop get in the way of having a policy against it. Mistakes will happen from time to time, but we must apply back-pressure against the onslaught of LLM-generated garbage flooding the web and choking out human-authored articles.

hailey | 2 hours ago

The em-dash meme needs to go away. Sure, it's a trope, but I don't actually care about it. There are far more obvious tells. Reading LLM-generated text makes me feel concussed: there are a lot of words in front of me, but for all the text I'm reading, I'm unable to actually pull much meaning out of it.

madhadron | 5 hours ago

This seems reasonable to me. If someone can't take the time to lay out their thoughts, why should I take the time to read them? If they want to use a chatbot as a rubber duck for working on their argument or checking their grammar, fine. I don't think we even need particular detection, just the expectation of community members and, in blatant cases, removal.

addison | 6 hours ago

I really despise LLM-generated articles and want to see them gone. This extreme case is obvious and likely easy to identify, and I suspect very few people would object to removing them.

Let's suppose someone now submits software where they have accepted some LLM-generated commits. Or, maybe they've generated it entirely with LLMs, but have documented the process as an analysis of doing so. These whataboutisms are me playing devil's advocate, but it's clear that there is a spectrum of tolerance on Lobsters. I highly doubt that a ban on any content touched by LLMs would be accepted. I think the answer most likely to be widely accepted is flagging without negative karma consequences, just as a way for people to drop a "hey, this is generated past my threshold of acceptability, heads up" for subsequent viewers. That is largely what the big comment threads do now, and perhaps we can reduce the fighting in the comments and give people some signal about the content they are exposing themselves to.

[OP] orib | 6 hours ago

The other scenarios you provided are different categories, and fairly clearly delineated ones. If you want them treated differently from how they are now, start a new thread.

addison | 6 hours ago

I suspect that were this policy implemented, such categories would be flagged the same. People tend to look for witches when you give them torches.

As an aside, I'm not certain if it's intended, but your message comes off a bit hostile (since edited). I'm genuinely trying to engage with what you're proposing. I suspect that such a blunt instrument will not be effective, given that it will be used inappropriately and subjectively. I also want these things gone, but I do not feel that this is the way to do it. Spammers will already get cleared, and slop that only serves as engagement bait already gets pretty quickly marked as spam. This additional step will just invite people to debate even more under every post containing what they perceive or do not perceive to be intolerable slop.

[OP] orib | 6 hours ago

I trust the mods to look at and remove repeat violations of the policy they intend to enforce. I don't believe it would make sense for flagging to do anything other than alert a moderator of a potential policy violation.

I don't think this is a real problem.

I do think it would make sense to discuss if we should allow vibe coding here, but that's a different thread.

hailey | 5 hours ago

Agreed.

It's usually pretty obvious when something is LLM generated, and in many cases I've seen the author has posted about using LLMs elsewhere on their site, even if they haven't disclosed it in the article in question. That tends to make it pretty clear.

The community's slop radar seems pretty accurate too - I can't recall seeing any big comment threads accusing the author of using LLMs when in fact they have not. If nobody can tell then nobody can tell.

I'm happy to proceed assuming good faith in the truly ambiguous cases, because usually it's blatantly obvious and it's the blatantly obvious stuff that's causing problems. Nobody is trying to game lobsters by sneaking in as many undetected LLM written posts as they can.

hongminhee | 3 hours ago

As a non-native English speaker, I'm worried this proposal will hit translators first.

I write in Korean first and use an LLM to translate into English. Sometimes that's Kagi Translate; sometimes Claude, when the subject needs more background. The thoughts are mine and I don't paste the output verbatim. Even after editing, I've been told my writing smells like slop.

Native speakers are much better at noticing what sounds like slop. I can catch it fairly well in Korean, but not in English. I can revise for hours and still miss the tells. The weak point is my English ear, not the argument.

If the test is “does this sound off to native speakers?”, non-native writers will lose. The rule may say “quality”, but the effect is: people like me post less. That pressure is already here on Lobsters. I feel it.

If the question is who came up with the ideas, then translation should not count against the post. It's no different from a grammar checker or a fluent friend's edits. The style may change. The claim does not.

This comment was also written with LLM assistance. To make that check possible, I'm sharing the Korean original here.

wareya | 3 hours ago

In most cases, "this was partly machine translated from <language X> into english" or something similar at the top of the post would go a long way toward convincing most people that it's not slop. It's not perfect but I think for most cases it would be enough. Maybe I'm just being optimistic though.

cole-k | an hour ago

I've said this before, but my gripe with LLM-generated prose is that it breaks the (what is now, I guess, antiquated) social contract that, without disclosure, whatever you are reading is text someone has thought about and composed word-for-word.

I personally wouldn't mind a different policy that requires proper disclosure of LLM use for writing in someone's article. So even if the whole thing is completely LLM-generated that's fine to me because at least I can close the tab immediately without feeling deceived. Also machine-translation is IMO a reasonable use of a language model.

It’s very easy to immediately yell “LLM slop!” on an article that you don’t like. And then, where are we? I want to see on-topic articles that I agree with, and also that I disagree with. That’s healthy.

I’m not sure how to evaluate articles as being “slop.” There are some obvious examples, and there are some not-so-obvious examples. There’s also the real possibility of legitimate articles appearing sloppy because the author happens to use a certain style that Lemons tend to copy.

I think that a submitter’s “overall sloppiness” on “authored by” submissions might be a “fair” way to deal with this. Consistently posting obvious slop flags the author as sloppy. Maybe a mod reaches out, says “stop”, and if they don’t, they get banned.

Not sure if “sloppy submissions” via someone else should be counted the same. It seems that the software could cool down, but not ban, someone’s ability to post if they’re constantly submitting what appears to be slop. But forcing every submitter to defend an article’s provenance or else get banned wouldn’t make for a good time.

wareya | 5 hours ago

Not every rule is a slippery slope. There are cases of actual obvious LLM slop and those are enough to be moderated. I think that you underappreciate just how antisocial LLM slop writing is. Right now those very obvious cases hover on the front page for multiple days because the AI bandwagoners are upvoting them.

There are cases of actual obvious LLM slop and those are enough to be moderated

Of course there are extremes, and if there’s a slop button, people will agree, in the same way “spam” is well filtered today.

You conveniently dismissed my point, which ironically, is the point.

I think that you underappreciate just how antisocial LLM slop writing is.

I think you underappreciate just how toxic assuming someone else doesn’t understand is.

wareya | 4 hours ago

I didn't dismiss your point, I argued past it because I don't think it's well-supported. You're making a slippery slope argument. You might not have thought about it to yourself that way, but that's what your first and final paragraphs are doing.

Tell me: what are the indicators of a slop post?

If your answer is anything close to “I know it when I see it”—there’s the argument. Again, we can all agree that obvious slop is obvious slop. In the absence of objective evaluation, you’ll get subjective bias.

wareya | 4 hours ago

Extremely high frequency of LLMisms, which change every year-ish but really are distinct from human-written text when published unfiltered. Having way too many em-dashes and "it's not just X, it's Y" and 3-item bullet-point lists and breaking everything down into high school essay format are the tells of about 6 months ago. Human writers do those things, but not with anywhere near the density that LLMs do them.
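To be concrete about the kind of density-based judgment I mean, here is a sketch (the patterns and threshold are hypothetical and uncalibrated, purely for illustration, not a real detector):

```python
import re

# Illustrative "tells", counted per 1,000 words. A human writer uses
# each of these devices too, just far less densely than an LLM does.
TELL_PATTERNS = [
    r"\u2014",                                 # em-dash
    r"(?i)\bnot just \w+[^.]{0,40}?\bbut\b",   # "not just X, but Y"
    r"(?m)^\s*[-*\u2022] ",                    # bullet-point lines
]

def tells_per_1000_words(text: str) -> float:
    words = len(text.split())
    if words == 0:
        return 0.0
    hits = sum(len(re.findall(p, text)) for p in TELL_PATTERNS)
    return hits * 1000 / words

def looks_sloppy(text: str, threshold: float = 25.0) -> bool:
    # The threshold is a made-up number; the point is that the signal
    # is frequency, not the mere presence of any single tell.
    return tells_per_1000_words(text) > threshold
```

The point of the sketch is that no single em-dash or bullet list condemns a post; it's the density of many tells at once that does.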

You don't need an objective evaluation. You only need a good-enough evaluation. There are existing rules with subjective evaluations; in fact, the very first item in the posting guidelines is vibe-centric: "Lobsters is more of a garden party than a debate club." It even specifically notes that judgment is often necessary when moderators act: "There isn't a clear-cut line between this and discussing trends and advocating for improvements in the field, so expect frustrating judgement calls." This is normal for rules and guidelines written by mature people for other mature people.

If you can enumerate “LLMisms” then I can tell an LLM to mimic a different style that doesn’t use them.

Your argument is “I can tell you when I see it.” We’re cooked.

And yes, moderation is always subjective in the absence of objective rules…

wareya | 3 hours ago

My argument is not "I can tell you when I see it." I gave you a list of specific things. When there are too many of them, then I know it's LLM slop. Really. Yes, it's a statistical argument, but it's not "I know it when I see it".

If you make your LLM avoid emitting any LLMisms, then yeah, you're not going to be able to tell that its output came out of an LLM.

Yes, we're probably going to be cooked eventually.

I've seen actually obvious LLM slop articles that were like 15 pages long and probably seeded from like 1.5 pages of real human writing sit on the front page for like two days. If someone wants to get their thinking across to people, they should do it themselves and respect their readers' time, and also respect their own thoughts. When something gets mechanically expanded to 15 pages by an LLM, the arguments and bits of logic get confused and self-contradictory. Same with LLM-driven machine translation into languages that the author can't read or isn't sufficiently literate in.

That's the kind of low-hanging fruit that an LLM content ban needs to address ASAP. It doesn't actually really matter that much that people who are more "careful" about their LLM writing use are going to slip through the filters. I'm not on a purity crusade here. I just want my time to be respected and for people to put forth their own ideas instead of putting them through a statistical blender for no good reason, or at least use the blender well enough that I can't tell that they did.

And yes, moderation is always subjective in the absence of objective rules…

Every good moderation system has a carveout for the mods to deal with people who are intentionally abusing the margins of overly-objective rules. Moderation is always subjective, even when it pretends to be objective. Rules aren't laws! They aren't a program!

bitshift | 2 hours ago

It doesn't actually really matter that much that people who are more "careful" about their LLM writing use are going to slip through the filters. I'm not on a purity crusade here. I just want my time to be respected and for people to put forth their own ideas instead of putting them through a statistical blender for no good reason, or at least use the blender well enough that I can't tell that they did.

Well said! Thank you so much for writing this. I'm not anti-AI by a long stretch, but I am against people wasting my time, and I love rules that can be evaluated by readers without relying on suspicions. Banning all LLM content is overly broad, but I'm on board with banning "slop" in the pejorative sense.

Spam is a great comparison. It's okay to post an article from your company's blog—even if technically you got paid for writing it! The rule is against spam, not against money. And it's okay to ban an advertisement that's all fluff—even if you don't know for a fact that the author received cash versus a fruit basket in compensation.

The rule is against spam

Some people believe self/company promotion of any kind is spam in this community. It’s very subjective except in extremely obvious cases and, honestly, not applied fairly. There are interesting self-promotion posts that get squashed while, at the same time, a run of 15 self-promoted, not-very-interesting posts makes it through.

That’s fine! Moderation is imperfect, but it goes to the point that “slop” can’t always so easily be classified either.

[OP] orib | 33 minutes ago

And yet, we try to remove it. Let's do the same with the output of the dehumanizer.

I’m not arguing against you. I agree that blatant slop should be removed outright.

However, I am worried that non-blatant slop that people disagree with will just be marked as slop! This already happens. Yet, in this very thread, a person admits that they’re OK with slop if they can’t tell it was slop.

Would that person mark it as slop, still, if they agreed with it?

If they disagree? Yes. And this is why there’s no down vote button…

or at least use the blender well enough that I can't tell that they did.

I just caught this highlighted in a reply. This is a wild take. “It’s totaaally fine as long as it tricks me.”

How many times have you been catfished? And when it’s finally revealed… do you chuckle a bit and go “Damn! You got me! That’s the third time this week!”

or at least use the blender well enough that I can't tell that they did.

This is a wild take.

It is a take. I find it to be a sensible take. Someone needs to say this here. You don't need to be unnecessarily antagonistic while replying. The person you are replying to is taking the time to put forward their thoughts on an important matter and sharing it with us here. I think they are doing it well and I appreciate their comments. It is alright to disagree with them but you can do it without being so antagonistic.

From your other comment:

So… you agree. Got it.

Sheesh! Can we stop with this type of reply, please? This is not the quality of conversation I expect when I come to a meta thread, or any thread for that matter. I demand better from our crustaceans.

You have been here for a long time, so you should know this from https://lobste.rs/about already:

Climate: Lobsters is more of a garden party than a debate club. We're learning things we didn't know to be curious about and sharing what we've made. Disagreements are normal but fights are not; it's OK to make your point, share a resource, and let someone be wrong.

It is a take. I find it to be a sensible take. Someone needs to say this here. You don't need to be unnecessarily antagonistic while replying.

This person is admitting that they may get value out of LLM generated content if they are tricked into reading it!

What if they are tricked into reading it, and then they disagree? Do they then go claim it’s slop? If they agreed, is it still slop?

This is the exact point I’ve been trying to make. What is “slop” is too nuanced, except in the very obvious cases… which just means that heavy moderation will result in people trying harder to hide the fact that they didn’t write it.

I’d also invite you to reconsider who is the antagonist.

So… you agree. Got it.

If it were that simple then the labs would bake “avoid LLMisms” into the system prompt.

fleebee | 3 hours ago

If you can enumerate “LLMisms” then I can tell an LLM to mimic a different style that doesn’t use them.

I don't think it's very likely that the kind of "author" who types a 2-sentence prompt into ChatGPT and publishes the result verbatim on their blog will take the time to cover their tracks.

LenFalken | 2 hours ago

They literally gave you examples. If someone makes the effort to mask their LLM slop, is it still slop? Isn't it just LLM-aided writing at that point?

There are people in this thread that believe any LLM usage is slop and would prefer it not be allowed.

LenFalken | an hour ago

Yeah, but I'm kind of arguing in your favor too: any well-masked LLM output is effectively good enough to share if no human can tell it's from an LLM.

Otherwise, you get the ban :)

xyproto | 5 hours ago

We could just fight fire with fire and use an LLM slop detector.

Do you have an accurate slop detector? Because I’ve never seen one, and since Lemon squeezers are ultimately optimizers, they’ll just make subtle changes until it’s defeated.

wareya | 4 hours ago

You don't need one. Just like you don't need absolute accurate detectors for every other rule violation that's currently on the rules. You're being disingenuous.

disingenuous

No. I am pointing out that you can’t “fight fire with fire” if you don’t have a slop detector. You can accept an imperfect detector if you want, but the goal is to continue to have a community that posts quality content to discuss. Too many false positives reduce engagement. Too many false negatives, and all you’ve done is move the slop bar. Moving the slop bar just repeats the cycle.

k749gtnc9l3w | 6 hours ago

You are talking about selected worthless examples, but for all the garbage quality of LLM-generated text in https://lobste.rs/s/hfnps5/osmand_s_faster_offline_navigation it actually has unique on-topic content too…

[OP] orib | 6 hours ago

No. I'm talking about not wanting any LLM generated text to be posted, to the best of our ability. Note the people voting it as spam. If I had seen it, I would have been one of them.

k749gtnc9l3w | 5 hours ago

More people upvoted it, though, because it does have real content. (As you can see, I did complain about garbage LLM writing in the comments, but also tried to guess-the-prompt with bullet points of the meaningful content from the post)

Helithumper | 8 hours ago

Related: https://lobste.rs/s/wee21u/this_is_written_by_llm_comments_should_be

I do agree that LLM Generated text should be filterable and/or flag-able.

The issue with labelling the content as "off-topic" is the case where the post is LLM-generated but actually on-topic. This may result in a contradictory use of "off-topic". I think a new flagging option was also discussed in the past (e.g. https://lobste.rs/s/po97lh/new_tag_suggestion_genai_assisted)

I still think a new flag option is a better choice than abusing the off-topic flag, as discussed in https://lobste.rs/s/rkjpob/proposal_add_ai_generated_as_flag_reason.

[OP] orib | 7 hours ago

I accept your feedback; I phrased that poorly: it should be disallowed.

I don't particularly care about filterable or flaggable. The users posting it should be removed from the site. Flagging or tagging it is a waste unless it leads to action being taken.

cyberia | 4 hours ago

Concerns:

  • It is not easy to know with perfect accuracy when text is LLM-generated (although in a majority of cases it is obvious).
  • Occasionally, a post which is somehow "important" or "notable" is LLM-generated. For example the CopyFail report.

Proposal:

  • LLM-generated content be disallowed except under exceptional circumstances such as high-impact security vulnerabilities.
  • Judgement of whether text is LLM-generated should be conservative, giving benefit of the doubt in borderline cases.
  • "Exceptional circumstances" should be at mod discretion, or a list of qualifying circumstances could be specified and iterated on as the policy evolves.

I'd rather prioritize seeing every blog post from every incredible person here than ever see an LLM generated article.

Some don't hesitate to post their own content, but others do. It'd be nice to have a mass list of blogs from people on this site so I could add them to my RSS reader. "Homepage" is in our profiles already, so maybe there's a way to generate and make that info available.

duncan_bayne | an hour ago

Strong upvote from me. Disallowed; repeated posting equals ban; allow flagging as such.

DanOpcode | an hour ago

Yes please. At the very least remove the most blatant AI generated blog posts. If it's not obvious, let it be.

If the author can't be bothered to type it out themselves, I can't be bothered to read it. And even if it's the author's own original thoughts, the AI writing style is tiresome.

Well, I'd say this is a good idea, but I also think it's going to turn into witch hunts.

The reason LLMs write like they do is because someone out there writes like that.

Or close enough to it.

I don't know what to advise. Other than caution.

[OP] orib | 3 hours ago

Yes, there will be mistakes. The problem with rushing to build an unpleasant future is that things tend to get worse. People that shouldn't have to care about certain problems start needing to. We already opted out of the best outcome; now we're trying to find ways to minimize the tech industry's harm.

It may be worth thinking about what the things that get built may be used to do, and not merely what you hope they will do.

Fixing things after they get broken is hard, and even a good job leaves scars.

oceanhaiyang | 4 hours ago

All forms of dashes are illegal now

icefox | 4 hours ago

I don't know why people seem to think em-dashes are some kind of smoking gun. Pandoc's HTML output will generate them from "--" in Markdown, for example.

bitshift | 2 hours ago

First they came for the hyphens, and I did not speak out, because I used en-dashes for numeric ranges.

Then they came for the en-dashes, and I did not speak out, because I used em-dashes for the cutting phrases in my haiku…


Then they came for the horizontal rules, and there was no one left to speak out for me.

singpolyma | 2 hours ago

I think users posting them regularly should be banned from the site.

This presumes the user has any way of knowing this is the case.

Student | 6 hours ago

Dvorak keyboard generated posts should be banned. I think users posting them regularly should be banned from the site.

Edit: I have flagged this post as spam for being written on a Dvorak keyboard.

hoistbypetard | 5 hours ago

Can you explain why you think this is a reasonable analogue? The topic here is how posts that aren't generated by humans should be disallowed. There's a big difference between a data-entry method and how the post was generated.

This is a contentious comment (I saw it at -10, and now at -8), but it’s a fair criticism of the idea as far as I’m concerned.

Rovanion | 5 hours ago

Care to explain why you think that? I read it as both inflammatory and an apples to oranges comparison.

I’m on the side of limited LLM generated submissions, and banning those who continue to submit them, fwiw.

However, a part of me thinks that good content is good content, and I don’t necessarily care how it was written. If someone authors a blog post with a spell checker, or a grammar checker, or by speech-to-text, but the thought came from them, we probably can agree that it’s OK.

If an author has a thought, uses that to prompt an LLM to build an argument for or against it in a manner that treats the LLM as an “assistant” … is that OK? Where does the line get drawn exactly?

“Vibeblogging”, we can definitely agree, is just slop and should be banned. “Write a post about how a panda should have been the Linux mascot.” But, “Help me restructure this argument about why Object Oriented Programming blah blah blahs under the OOPS theorem of blah” … not sure?

The OP, here, points at this, albeit, as you say, in a potentially inflammatory way.