Sounds good to me. Even if slop occasionally slips through, an explicit policy against LLM-generated content, ideally without carveouts and exceptions to squabble over in the comments, should reduce junk on the front page and provide clear-cut grounds for banning regular offenders.
Absolutely, people severely underestimate the value of a simple policy that minimizes the amount of debating, even if it might not be able to pull it all the way down to zero.
As much as I hate "hey gork, there are bulletpoints, make it big" articles, I think the distinction is kind of hazy in here. One "edge" case I can think of is using "rephrase" (or something like that, I'm too lazy to check right now) in LanguageTool. Is it LLM-generated? Yeah. Should it be banned? That's a trickier question, especially when it comes to non-native English speakers. I don't think stuff I write is LLM-generated, but I consider my English at most mediocre, so I'm using this feature sometimes. And if it's allowed sometimes, we've just made a carve-out and we either need a cut-off or mod's vibe-based
That gets brought up every time, but come on - what percentage of AI articles submitted to aggregators or reshared on social media are malicious slop, versus "oh, I just asked ChatGPT to correct typos and now I'm being unfairly ostracized"? It's something like 1,000-to-1.
It's as if we were arguing that it's not OK to mark emails from Nigerian princes as spam because one of them could be from an actual prince. An excellent hypothetical, a terrible approach to spam filtering.
If someone is using an LLM for translation or minor copyediting, most of the time, it won't even be noticeable, but disclosure is a good way to avoid misunderstandings.
I have seen people on this very website treated very harshly when they have said they are only using LLMs for translation and minor copyediting. The sentiment is usually, better minor grammatical errors than any use of LLMs at all.
This disclosure is also often used as an excuse to derail discussions. The end result is people who use LLMs to help them write prefer just to not participate in the community at all.
It's not like in the past when someone said English wasn't their native language and they used Google Translate to help them write their post, even though fundamentally it's the same underlying technology!
I have seen people on this very website treated very harshly when they have said they are only using LLMs for translation and minor copyediting
Oh, I've seen people say that. And in almost all instances, it was about getting caught red-handed and then insisting it's not your hand. In one case, it was a blog that was posting one long-form article about AI every single day. Dozens or hundreds of them, perfectly on schedule. Such a remarkable display of human creativity and persistence!
I'm not saying that there are no jerks or misguided people on the internet - I had someone call my own blog "slop" - but again, in the vast majority of cases, slop is slop and there isn't a whole lot of nuance to it.
I worry treating all people who claim they only use LLMs for copyediting as dirty rotten liars that are probably spammers will only lead to more toxic discourse. It also won't reduce the slop. Better to expand the spam and self-promotion policies to include LLM use and go from there.
I think articles that are heavily or entirely LLM generated should be flagged and automated systems deployed to detect them when posted.
I think people who are using LLMs should be upfront about it so others can more easily filter their comments. But I think the line between LLM aided posts and slop is fuzzy and benefits from careful moderation.
I'm not sure how this is different from my proposal, other than adding a few details on tools provided to moderators for finding attackers.
I left mechanism unspecified in my post deliberately, because I believe we will change the mechanism for removing slop as we gain experience on what works.
I do, however, believe that the specific threshold for this is a matter of trusting the moderation team to come up with and adjust policy, and I believe warnings and bans should come from people.
A policy that disproportionately harms non-native English speakers doesn't sit well with me. We are already seeing fairly exclusionary language being used elsewhere on the page https://lobste.rs/c/whnsrp and I don't think that's an acceptable price for preventing slop.
I would encourage non-native speakers to post in their native languages. Machine translation improves over time, but me seeing the improvement requires the original text. We may even be able to add a 'translate' button to lobste.rs, that would be nice.
In addition, I have seen people here offer to translate and edit for non-native speakers, and I would like to see that continue.
As a non-native speaker myself, it is worth noting that most readers on sites such as lobste.rs are not receptive to texts written by non-native speakers unless they pass a very high bar of either technical quality or English proficiency. This sudden defence of "non-native" speakers seems to come out of nowhere really. I don't really understand what non-native speakers have to do with the AI discussion at all.
In fact, from my perspective as a "non-native" speaker, I'd much rather see the text in its original language, or a manual translation by a human, as automatic translations can frequently fail to correctly translate the feelings and intentions behind the text.
I think articles that are heavily or entirely LLM generated should be flagged and automated systems deployed to detect them when posted.
I completely agree with your second paragraph. But this, i don't know...
First, it reads like implicitly asking the site maintainer to implement a complicated system to deal with a complicated social problem. And therefore it shifts the burden of fixing this problem to the maintainer and a hypothetical detection system. What happens when that automated solution inevitably fails and has negative consequences for actual people? After all, this whole thread is precisely about automated systems having unintended consequences on individuals and whole communities.
I'm sure this was not your intention to suggest this. This is simply where my mind went when reading "automated systems deployed to detect them when posted", and i just wanted to expand a bit on this idea.
I don't think an automated system could be a satisfactory solution for a site like Lobsters.
As i see it, this place is built on a foundation of trust between members. For any user here, you can see who invited them. For example, i was invited by someone i know in real life, someone whom i've had shared beers and experiences with. And they probably were invited by someone they knew, and so on.
It's important that we can trust that everyone here is a real person, with real interest in participating in this community. Of course, sometimes this trust is betrayed, and that's why moderation is so important too.
But just as the core of this site is "people-based" —i.e. members can join via invitation only, and you are responsible for not inviting a slew of bots or assholes—, i think that a proper policy against AI slop should also be people-based, and not automation-based.
First of all, we could extend the me-too flag to articles. LLMs are built, first of all, to say again things already said a million times.
Yes, a human article saying again the same things already said in each of the three previous articles on the topic submitted to Lobste.rs this week will also suffer from this flag.
And, I guess, medium-term make a visible switch about how much to weight the flags. I will keep it at flag=downvote, some people will probably hide-on-three-flags.
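To make the per-reader switch concrete, here is a minimal sketch of how flag weighting and a hide-on-N-flags threshold could be modeled. Everything in it is an assumption for illustration: `ReaderPrefs`, `displayed_score`, and `is_hidden` are hypothetical names, not anything in the Lobsters codebase, and the default of one flag equalling one downvote is just the "flag=downvote" setting mentioned above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReaderPrefs:
    """Hypothetical per-reader settings; not part of any real site's code."""
    flag_weight: float = 1.0             # 1.0 treats each flag as one downvote
    hide_at_flags: Optional[int] = None  # None means never hide on flag count alone


def displayed_score(upvotes: int, flags: int, prefs: ReaderPrefs) -> float:
    """Score as this reader sees it: upvotes minus weighted flags."""
    return upvotes - prefs.flag_weight * flags


def is_hidden(flags: int, prefs: ReaderPrefs) -> bool:
    """Hide the story for this reader once flags reach their chosen threshold."""
    return prefs.hide_at_flags is not None and flags >= prefs.hide_at_flags
```

A reader who wants hide-on-three-flags would use `ReaderPrefs(hide_at_flags=3)`, while one who ignores flags entirely would set `flag_weight=0.0`. Keeping the weighting on the reader's side means flags never have to affect the submitter's karma.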
Hi, I'm the one thousandth case. Asked Claude to proofread at least two of my most recent articles. Even ignored its perfectly good advice in order to maintain as much of my "humanity" as possible.
So how do you define slop? 400+ upvotes, nearly 200 comments, and not a single answer to this perfectly reasonable question. Someone even had the gall to flag it as a troll comment as though it was asked in bad faith.
When is the community going to answer the question? You might as well ban my domain from the site depending on how you define it. This "I know it when I see it" nonsense isn't cutting it.
I agree in general with this. If someone can't be bothered to write something themselves, I'm not interested in reading it. That said, I don't know of a foolproof way of identifying LLM-generated text. I don't love the idea of people (or sources) being banned because the articles they post might be generated. I've been accused of using LLMs in my writing because I sometimes use em-dashes… even though I've been using them for over 25 years now.
The occasional false-positive shouldn't be a problem so long as it isn't a zero-tolerance policy, and I don't see why it would need to be. My reading of the OP is that bans would be at the discretion of mods and in response to a pattern of posting slop repeatedly.
I think it's very important to avoid letting imperfect detection of slop get in the way of having a policy against it. Mistakes will happen from time to time, but we must apply back-pressure against the onslaught of LLM-generated garbage flooding the web and choking out human-authored articles.
The em-dash meme needs to go away. Sure it's a trope, but I don't actually care about it. There are far more obvious tells. Reading LLM generated text makes me feel concussed - there are a lot of words in front of me but for all the text I'm reading I am unable to actually pull much meaning out of it.
I've never been concussed but I agree -- after reading one or two screenfuls of LLM-generated text, I begin to zone out. Again apologies to actual sufferers, but I imagine this is what it feels like to have dyslexia, maybe?
I feel I should be able to take in the writing and ingest it into my conscious cognitive workspace, but it feels like I'm reading words and they just slip out of my mind right after I've read them. It's an extremely uncanny feeling.
I think we've lost this battle, with even people like Daniel Lemire shitting up good technical writing with sloppy images. It might be too late for pushback now, I don't think the community here would appreciate losing e.g. Lemire's writing.
As a non-native English speaker, I'm worried this proposal will hit translators first.
I write in Korean first and use an LLM to translate into English. Sometimes that's Kagi Translate; sometimes Claude, when the subject needs more background. The thoughts are mine and I don't paste the output verbatim. Even after editing, I've been told my writing smells like slop.
Native speakers are much better at noticing what sounds like slop. I can catch it fairly well in Korean, but not in English. I can revise for hours and still miss the tells. The weak point is my English ear, not the argument.
If the test is “does this sound off to native speakers?”, non-native writers will lose. The rule may say “quality”, but the effect is: people like me post less. That pressure is already here on Lobsters. I feel it.
If the question is who came up with the ideas, then translation should not count against the post. It's no different from a grammar checker or a fluent friend's edits. The style may change. The claim does not.
This comment was also written with LLM assistance. To make that check possible, I'm sharing the Korean original here.
In most cases, "this was partly machine translated from <language X> into english" or something similar at the top of the post would go a long way toward convincing most people that it's not slop. It's not perfect but I think for most cases it would be enough. Maybe I'm just being optimistic though.
And/or having the original language version linked so we can translate it ourselves. Again, not sure people would, but it also doesn’t hurt, since it may expand the readership.
No LLMs for comments on the bug tracker, including translation. English is encouraged, but not required. You are welcome to post in your native language and rely on others to have their own translation tools of choice to interpret your words.
Yeah this is definitely the better approach. Most browsers nowadays have translation built-in, and it feels almost utopian to imagine lobste.rs users posting freely in their native languages and being mutually intelligible.
Reddit silently did this recently, and while I'm annoyed about it from like three dimensions (Reddit, as usual, making sweeping changes against the wishes of their users; as a language-learner, the Russian subs I'm on are all now by default English and require extra taps to reset; the translations are often poor and notably sloppy ...)
... it's also absolutely kinda breathtakingly utopian, too?
Although it's hurt my language-learning efforts, it's also utterly mind-bending to read, at full speed and comprehension, a deep thread on a Russian subreddit about their opinions about women. (That particular thread was unpleasant for other reasons, but ...)
Ich bezweifle das ein solches Vorgehen hier wirksam wäre. Moderation wäre zum einen schonmal fast unmöglich das kulturelle Neubegriffe wie "Sprich Deutsch Du Hurensohn" oder "Dieser Kommentarbereich ist nun Eigentum der BRD GmbH" nur schwer zu übersetzen sind, es wird zuviel kultureller Kontext benötigt um dies einheitlich über alle möglichen Sprachen so umzusetzen. Dazu kommt natürlich das nach meiner Erfahrung, solch mehrsprachliche Fäden oder Pfosten meist eher zu Fehlkommunikationen führen. Ich lehne eine solche Einstellung grundsätzlich ab, insbesondere da damit nur noch mehr GSM Nutzung gefördert wird in Form von der ausweitenden Nutzung von solchen Übersetzungstools die alle auf GSMs basieren.
I doubt that this would work. Moderators lack context for cultural neologisms they do not understand, and even just a handful of languages would make this impossible. Additionally, my experience with Reddit's implementation in threads and posts is that it leads to misunderstandings and miscommunications. I therefore reject such a mode of operation fundamentally, especially if it will only lead to more LLM usage in the form of promoting browser translation tools, which are all LLM based.
English is not my native language either. We should stop being so afraid of making grammatical mistakes and trying to get perfect grammar. Our unique voices, even if in slightly broken English, are important.
I do not want a tech forum I frequent to be value-neutral on authenticity. It is normal to have values, and to establish rules and norms for the spaces you are a part of based on the values you share with other people who frequent that space.
This comment section is largely a conversation about the values that people in this space have, and the potential establishment of a new norm based on these values. You have different values, and do not want this norm to be established. Your comment reads like you’re against the concept of having norms entirely, when what’s actually happening is that you want a different norm.
I've said this before, but my gripe with LLM-generated prose is that it breaks the (what is now I guess antiquated) social contract that --- without disclosure --- whatever you are reading is text someone has thought about and composed word-for-word.
I personally wouldn't mind a different policy that requires proper disclosure of LLM use for writing in someone's article. So even if the whole thing is completely LLM-generated that's fine to me because at least I can close the tab immediately without feeling deceived. Also machine-translation is IMO a reasonable use of a language model.
It is not about writing my own thing. It is about reading someone else's blog post: if the content looks legit, as a non-native English speaker I may not be able to recognize that it was "improved" with the help of an LLM. So when I then submit it to Lobsters, it gets spam modded, and with this possible new rule I even get banned. This is really not a nice way to treat foreign speakers, who may still have valuable input. Rules like this will make me even more hesitant to submit anything on Lobsters.
Thanks for sharing this. I think this is a very important point, and we should consider it.
Whatever policy gets put in place (or not), i think it's crucial that it leaves some room for mistakes, common human fuckups, and so on. Especially, discouraging newcomers from submitting stories, for fear of getting struck by the moderation hammer, would be a very sad outcome. And more so if this ends up discouraging people from ever writing or creating their own stuff in the first place.
I don't know how any of this will turn out. But for the time being, what i would say to you is: if you have something that you've personally written (/created), or someone whom you know has written, and you feel it's on-topic and want to share it, please do so. Even if it's not in perfect English, or it's too basic, or a "terrible hack", or whatever. I'm sure me and many other people here would appreciate having those genuine stories submitted :)
And, in a way, it's also our collective responsibility to push back against overzealous attempts at shutting people down for ambiguous claims of "this looks like AI-generated" when there's a person on the receiving end.
I guess the more complicated cases would be things like you mention: you read an article that you find interesting and worth sharing here, and it might end up being flagged as AI slop. In those cases, TBH, if you've actually read the thing and didn't catch it, it's an honest mistake, which anyone can make. Again, i would still encourage you to share in such cases. It's better to be sincere and wrong than to self-censor.
Notice that that case is very different than someone (or a bot) sharing slop they haven't even read, just to promote some site, or to collect karma on their account here for more nefarious purposes. Those are the cases we need to protect against.
When I started blogging, I was like you: writing in French and translating (manually) into English. It is well known that this does not work well, with or without LLMs. I think it would be a good idea for you to write in English and translate to Korean instead. Your English will sound better. You can still reach for a translator for a few sentences.
If tools for using LLMs to translate are ubiquitous, then why not post in your native language and let the readers translate if they want or need to? You should be posting the highest signal form of your writing possible and let others distill it if they want. Who knows, maybe I know Korean, and you’ve removed a chance for me to read it.
Text which is translated through ML is not the same thing as text which is wholesale generated through prompting an LLM, which is what this thread is about. I don't think anybody cares about the former, it is clearly quite useful (as you can attest) and it carries none of the issues people have with the latter.
Thanks for sharing. I was able to put your original Korean through several different translation tools to try to get a consensus, which is something that I really don't mind doing. That exercise has helped me empathize with you more. All of the translations are a little bit awkward—you're never going to be able to get rid of that completely. And your translation was different in a couple places from the automated translation—some places were better and some places were worse. (e.g. "not the argument" in your third paragraph has a sort of negative connotation; the machine translation was "not my writing ability" but "not my writing" or "not my ideas" would sound good.) I'm at risk of over-analyzing your writing at this point, for which I apologize, but it does make more sense to me now.
Very good content. Obviously some writing issues (machine-translated), even analogies that are unexpected in English. Big disclaimer about being machine translated, link to the original.
This sidesteps the worry about being generated content completely in my view
I think people should just write in whatever language they are comfortable, and let others use a translator (and potentially ask for clarification) if they do not understand. We should embrace the diversity of the community. I would enjoy seeing people hold multilingual conversations; it would make the internet feel more world-spanning.
Language knowledge is not a binary off or on. Some people know a lot of a language that is not their mother tongue, others know a little. Some know more than others whose mother tongue is the language in question.
Participating in discussion is the best way to learn any language and that will likely involve using tools, such as dictionaries, to figure out how to say what you want to say.
I assume the people using translators do know some English, but aren't very confident in their skills so they prefer to write in their original language and then translate it.
But, more importantly - I'm happy to hear from people (in English) even if their English is at a very basic level, who I fear might feel excluded by your comment (even if that wasn't your intention). There are bloggers out there whose English clearly isn't that good, but whose blogs I've thoroughly enjoyed regardless.
Some of us were lucky (?) enough to be born in an English-speaking country. Some of us (like me) aren't native speakers, but were lucky enough to have had access to quality English education as a kid. There's an entire generation of people in my country that didn't - and, in general, the older you get, the harder it is to learn languages. I wouldn't want to exclude people based on that.
While I do understand that people are downvoting the post above because of the tone, and the tone does indeed come across as gate-keepy, this whole rhetoric about non-native speakers is extraordinary to me.
I'm a non-native speaker. For many years I have been reading HN and lobste.rs, and other communities. In this time, I have seen the extremely high standards we apply to foreigners in these forums: if the language is not perfect, it proves that the ideas behind the text are bad and wrong. Any submissions must contain perfect English, especially when written by foreigners.
I mentioned above that this response is extraordinary. I mean this in the most specific meaning of that word. In this post alone, there are more than 10 people claiming this as a legitimate defense for LLM-generated content. In my lifetime, I've never seen such a strong defense of non-native speakers in such a forum. While heartening in current times, and I take it sincerely, I would hope that it does not alter the course of this discussion.
This is because it is in my opinion entirely unrelated to this topic, and this defense of LLM generated content is weak. The logic is that by using an LLM, non-native speakers may sound more native, and therefore improve their writing. I don't think I need to quote a whitepaper on this, but I'm sure I could if needed: asking an LLM to rewrite a text does not improve it, but rather removes specificity and makes it more bland and generic.
This seems reasonable to me. If someone can't take the time to lay out their thoughts, why should I take the time to read them? If they want to use a chatbot as a rubber duck for working on their argument or checking their grammar, fine. I don't think we even need particular detection, just the expectation of community members and, in blatant cases, removal.
I really despise LLM-generated articles and want to see them gone. This extreme case is obvious and likely easy to identify, and I believe there are exceptionally few who would dislike seeing these gone.
Let's suppose someone now submits software where they have accepted some LLM-generated commits. Or, maybe they've generated it entirely with LLMs, but have documented the process as an analysis of doing so. These whataboutisms are me playing devil's advocate, but it's clear that there is a spectrum of tolerance in lobsters. I highly doubt that a ban on any content touched by LLMs will be accepted. I think the answer most likely to be widely accepted is flagging without negative karma consequences, just as a way for people to drop a "hey, this is generated to my threshold of unacceptable, heads up" to subsequent viewers. That is largely what the big comment threads now do, and perhaps we can reduce the fighting in the comments + give people some signal about the content they are exposing themselves to.
The other scenarios you provided are different categories, and fairly clearly delineated ones. If you want to have them treated differently from how they are now, start a new thread.
I suspect that were this policy implemented, such categories would be flagged the same. People tend to look for witches when you give them torches.
As an aside, I'm not certain if it's intended, but your message comes off a bit hostile (since edited). I'm genuinely trying to engage with what you're proposing. I suspect that such a blunt instrument will not be effective, given that it will be used inappropriately and subjectively. I also want these things gone, but I do not feel that this is the way to do it. Spammers will already get cleared, and slop that only serves as engagement bait already gets pretty quickly marked as spam. This additional step will just invite people to debate even more under every post containing what they perceive or do not perceive to be intolerable slop.
I trust the mods to look at and remove repeat violations of the policy they intend to enforce. I don't believe that it would make sense for flagging to do anything other than alert a moderator to a potential policy violation.
I don't think this is a real problem.
I do think it would make sense to discuss if we should allow vibe coding here, but that's a different thread.
When posting a link to lobste.rs, the link ought to be to human generated content. That means, if you post a link to source code, that source code ought to be written by a human. If you post a link to a blog post about a piece of software, that blog post ought to have been written by a human. I don't see the problem.
I agree. Not everyone does. People will pursue corner cases and gotchas, and even define "human-generated" differently. I'm fairly firm in my "I will not waste time with repos containing .claude" and "I refuse to read machine prose". Others are not, and I worry that moves like this will be rejected, leading to nothing being done instead. That is all.
I think we can treat these as two different categories.
By and large, I'm fine with LLM-assisted software here (though I think Show Lobsters-style posts should now have a higher threshold), but I'm uninterested in reading LLM-assisted/generated writing.
If Lobsters refuses posts about LLM-assisted software, I predict it will be a ghost town in a few years.
Me too! But if you post a link to source code as a lobste.rs submission, that source code better be written by a human. If you post a link to a blog post about some source code, the blog post ought to be written by a human.
It's usually pretty obvious when something is LLM generated, and in many cases I've seen the author has posted about using LLMs elsewhere on their site, even if they haven't disclosed it in the article in question. That tends to make it pretty clear.
The community's slop radar seems pretty accurate too - I can't recall seeing any big comment threads accusing the author of using LLMs when in fact they have not. If nobody can tell then nobody can tell.
I'm happy to proceed assuming good faith in the truly ambiguous cases, because usually it's blatantly obvious and it's the blatantly obvious stuff that's causing problems. Nobody is trying to game lobsters by sneaking in as many undetected LLM written posts as they can.
I’ve called someone out on at least two different stories for adding a “this is LLM slop” style comment, when it turned out they just didn’t like the author’s writing style and no LLM was in sight. There’s probably more that weren’t challenged.
Those weren’t even that hard to disprove, the blog had posts going back 10 years that read identically in style.
You assert that there was no LLM in sight, but this is just as unknowable.
In both cases where you've called people out recently, it was just one person saying the style seemed LLM generated (no broader consensus) and they yielded immediately when challenged. That seems like the ideal outcome to me?
It’s very easy to immediately yell “LLM slop!” on an article that you don’t like. And then, where are we? I want to see on-topic articles that I agree with, and also that I disagree with. That’s healthy.
I’m not sure how to evaluate articles as being “slop.” There are some obvious examples, and there are some not so obvious examples. There’s also the real possibility of legitimate articles appearing to be sloppy because the author happens to use a certain style that LLMs tend to copy.
I think that a submitter’s “overall sloppiness” on “authored by” submissions might be a “fair” way to deal with this. Consistently posting obvious slop flags the author as sloppy. Maybe a mod reaches out and says “stop” and if they don’t, they get banned.
Not sure if “sloppy submissions” of someone else’s work should be counted the same. It seems that the software could put a cool-down on, rather than ban, someone’s ability to post if they’re constantly submitting what appears to be slop. But forcing every submitter to defend an article’s provenance or else get banned wouldn’t make for a good time.
Not every rule is a slippery slope. There are cases of actual obvious LLM slop and those are enough to be moderated. I think that you underappreciate just how antisocial LLM slop writing is. Right now those very obvious cases hover on the front page for multiple days because the AI bandwagoners are upvoting them.
On the other hand, a rule that could maybe apply to anything means flagging becomes even more of a downvote!
Look at https://lobste.rs/s/wee21u/this_is_written_by_llm_comments_should_be -- 5 offtopic flags and 8 spam flags. I mean, that's obviously a meta post so it's on-topic... and obviously not commercial so it doesn't fall under the strict definition of spam in the about page. And yet, people are flagging it! I think it's better not to hand those people a flag reason where they have plausible deniability.
I intentionally did not suggest flagging. I don't particularly care about flags, beyond notifying the moderators that attention is needed. I would be happy with making flags invisible.
What I want is warnings for first time slop posters, which eventually builds to removal if they consistently keep posting.
What I want is warnings for first time slop posters
Getting a warning even if it's your "first time" is a bit harsh. People differ in their ability to detect slop - and even for what I feel is "obvious" slop, there can be a reasonable explanation for posting it[1].
[1] I'm a bit embarrassed about my top comment there, it was too toxic towards the submitter. Usually I don't really blame people for submitting slop, stuff slips by (and there can be valuable thoughts behind the slop), but I got a bit too pissed there.
I think there's a difference between a comment from a regular user pointing out something you've posted is slop, and a mod (who has power over you) warning you about it.
Oops, I misunderstood your proposal then. (I can't believe people are flagging this proposal too! I'm soooo going to make a meta post asking for there to be an additional confirmation step before flagging, once this all cools down...)
I didn't dismiss your point, I argued past it because I don't think it's well-supported. You're making a slippery slope argument. You might not have thought about it to yourself that way, but that's what your first and final paragraphs are doing.
If your answer is anything close to “I know it when I see it”—there’s the argument. Again, we can all agree that obvious slop is obvious slop. In the absence of objective evaluation, you’ll get subjective bias.
Extremely high frequency of LLMisms, which change every year-ish but really are distinct from human-written text when published unfiltered. Having way too many em-dashes and "it's not just X, it's Y" and 3-item bullet-point lists and breaking everything down into high school essay format are the tells of about 6 months ago. Human writers do those things, but not with anywhere near the density that LLMs do them.
You don't need an objective evaluation. You only need a good-enough evaluation. There are existing rules that have subjective evaluations. In fact there are vibe-centric items in the posting guidelines. In fact, the very first item on the posting guidelines is vibe-centric: "Lobsters is more of a garden party than a debate club." It even specifically outlines that judgment is often necessary when moderators act: "There isn't a clear-cut line between this and discussing trends and advocating for improvements in the field, so expect frustrating judgement calls." This is normal for rules and guidelines written by mature people for other mature people.
My argument is not "I can tell you when I see it." I gave you a list of specific things. When there are too many of them, then I know it's LLM slop. Really. Yes, it's a statistical argument, but it's not "I know it when I see it".
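To make the "statistical argument" concrete, here is a rough sketch of counting tells per 1000 words. The two patterns come from the examples listed above; the names `TELLS`, `tell_density`, and `looks_dense`, the normalization, and the threshold of 10 are all illustrative assumptions, not a real or validated detector.

```python
import re

# Hypothetical tell patterns based on the examples in this comment.
# The em-dash is written as an escape (\u2014) so the source stays ASCII.
TELLS = {
    "em_dash": re.compile("\u2014"),
    "not_just_x_its_y": re.compile(
        r"\bnot just\b[^.!?]*?\bit['\u2019]s\b", re.IGNORECASE
    ),
}


def tell_density(text: str) -> dict[str, float]:
    """Return occurrences of each tell per 1000 whitespace-separated words."""
    words = len(text.split())
    if words == 0:
        return {name: 0.0 for name in TELLS}
    return {
        name: 1000 * len(pattern.findall(text)) / words
        for name, pattern in TELLS.items()
    }


def looks_dense(text: str, threshold: float = 10.0) -> bool:
    """Flag text only when the combined density exceeds an arbitrary threshold."""
    return sum(tell_density(text).values()) > threshold
```

The point of normalizing by length is that a human who uses one em-dash in a long post scores low, while text saturated with the same constructions scores high. Any real threshold would have to be tuned by hand and would still produce false positives.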
If you make your LLM avoid emitting any LLMisms, then yeah, you're not going to be able to tell that its output came out of an LLM.
Yes, we're probably going to be cooked eventually.
I've seen actually obvious LLM slop articles that were like 15 pages long and probably seeded from like 1.5 pages of real human writing sit on the front page for like two days. If someone wants to get their thinking across to people, they should do it themselves and respect their readers' time, and also respect their own thoughts. When something gets mechanically expanded to 15 pages by an LLM, the arguments and bits of logic get confused and self-contradictory. Same with LLM-driven machine translation into languages that the author can't read or isn't sufficiently literate in.
That's the kind of low-hanging fruit that an LLM content ban needs to address ASAP. It doesn't actually really matter that much that people who are more "careful" about their LLM writing use are going to slip through the filters. I'm not on a purity crusade here. I just want my time to be respected and for people to put forth their own ideas instead of putting them through a statistical blender for no good reason, or at least use the blender well enough that I can't tell that they did.
And yes, moderation is always subjective in the absence of objective rules…
Every good moderation system has a carveout for the mods to deal with people who are intentionally abusing the margins of overly-objective rules. Moderation is always subjective, even when it pretends to be objective. Rules aren't laws! They aren't a program!
It doesn't actually really matter that much that people who are more "careful" about their LLM writing use are going to slip through the filters. I'm not on a purity crusade here. I just want my time to be respected and for people to put forth their own ideas instead of putting them through a statistical blender for no good reason, or at least use the blender well enough that I can't tell that they did.
Well said! Thank you so much for writing this. I'm not anti-AI by a long stretch, but I am against people wasting my time, and I love rules that can be evaluated by readers without relying on suspicions. Banning all LLM content is overly broad, but I'm on board with banning "slop" in the pejorative sense.
Spam is a great comparison. It's okay to post an article from your company's blog—even if technically you got paid for writing it! The rule is against spam, not against money. And it's okay to ban an advertisement that's all fluff—even if you don't know for a fact that the author received cash versus a fruit basket in compensation.
Some people believe self / company promotion of any kind is spam in this community. It’s very subjective except in extremely obvious cases, and, honestly, not applied fairly. There are interesting self promotion posts that get squashed, while at the same time a run of 15 in a row self promoted, not very interesting, posts that make it through.
That’s fine! Moderation is imperfect, but it goes to the point that “slop” can’t always so easily be classified either.
I’m not arguing against you. I agree that blatant slop should be removed outright.
However, I am worried that non-blatant slop that people disagree with will just be marked as slop! This already happens. Yet, in this very thread, a person admits that they’re OK with slop if they can’t tell it was slop.
Would that person mark it as slop, still, if they agreed with it?
If they disagree? Yes. And this is why there’s no down vote button…
In the extreme case, do you shoot, and then bury people you disagree with? Of course not. You ignore them.
Disagreeing with an article doesn’t make it spam. Disagreeing with the tools that someone chooses to use to produce a piece of writing doesn’t necessarily make it “slop.”
It could be spam! It could be slop! And I agree that in blatantly obvious cases you mark it as such to protect others from wasting time.
I don’t believe we should blindly encourage immediately discarding anything that appears to be LLM-assisted.
As now repeatedly stated, people are “fine” with LLM-assisted writing as long as they can’t tell.
However, rather than engage in thought, the moment it is accused of being “slop,” engagement stops and it’s suddenly irrelevant hogwash. This is stupid.
or at least use the blender well enough that I can't tell that they did.
I just caught this highlighted in a reply. This is a wild take. “It’s totaaally fine as long as it tricks me.”
How many times have you been catfished? And when it’s finally revealed… do you chuckle a bit and go “Damn! You got me! That’s the third time this week!”
or at least use the blender well enough that I can't tell that they did.
This is a wild take.
It is a take. I find it to be a sensible take. Someone needs to say this here. You don't need to be unnecessarily antagonistic while replying. The person you are replying to is taking the time to put forward their thoughts on an important matter and sharing it with us here. I think they are doing it well and I appreciate their comments. It is alright to disagree with them but you can do it without being so antagonistic.
From your other comment:
So… you agree. Got it.
Sheesh! Can we stop with this type of reply, please? This is not the quality of conversation I expect when I come to a meta thread or any thread for that matter. I demand better from our crustaceans.
You have been here for a long time, so you should know this from https://lobste.rs/about already:
Climate: Lobsters is more of a garden party than a debate club. We're learning things we didn't know to be curious about and sharing what we've made. Disagreements are normal but fights are not; it's OK to make your point, share a resource, and let someone be wrong.
It is a take. I find it to be a sensible take. Someone needs to say this here. You don't need to be unnecessarily antagonistic while replying.
This person is admitting that they may get value out of LLM generated content if they are tricked into reading it!
What if they are tricked into reading it, and then they disagree? Do they then go claim it’s slop? If they agreed, is it still slop?
This is the exact point I’ve been trying to make. What is “slop” is too nuanced, except in the very obvious cases… which just means that heavy moderation will result in people trying harder to hide the fact that they didn’t write it.
I’d also invite you to reconsider who is the antagonist.
These things do have a personality baked into the prompt… that’s kind of the idea?
What they aren’t… is a tool designed to write deceptive blog posts by default. Blogging died with the rise of the 140 character “microblog”, for the general public.
If you can enumerate “LLMisms” then I can tell an LLM to mimic a different style that doesn’t use them.
I don't think it's very likely that the kind of "author" who types a 2-sentence prompt into ChatGPT and publishes the result verbatim in their blog will take the time to cover their tracks.
They literally gave you examples. If someone makes the effort to mask their LLM slop, is it still slop? Isn't it now just LLM-aided writing at the forefront?
Quality is neither objective nor subjective. Rather, Quality emerges from the interaction between subjects and objects. For the complete explanation, read Pirsig 1974.
Who says analyzing writing styles is impossible? The billions of people over the years that have suffered through a literature class would like a word.
That doesn’t mean that all of those people can definitely determine if something is “slop” or not. The most commonly cited “slop clue” is “It uses em dashes correctly!” I guess that probably does work well, because the vast majority of US Americans read and write at a 4th grade level, so of course they don’t know how to use punctuation…
Do you have an accurate slop detector? Cause I’ve never seen one, and since Lemon squeezers are ultimately optimizers, people will just make subtle changes until it’s defeated.
You don't need one. Just like you don't need absolute accurate detectors for every other rule violation that's currently on the rules. You're being disingenuous.
No. I am pointing out that you can’t “fight fire with fire” if you don’t have a slop detector. You can accept an imperfect detector if you want, but the goal is to continue to have a community that posts quality content to discuss. Too many false positives reduce engagement. Too many false negatives, and all you’ve done is move the slop bar. Moving the slop bar just repeats the cycle.
EDIT: To make my take clearer, I don't like LLMs on principle and will never voluntarily use them to generate content in my personal work. I'd go so far as to say everyone should follow that rule, but for ethical, financial, and environmental reasons that are all off-topic for this site. The only on-topic objection I have is the flood of slop, defined as low-quality content that is low-quality because it was LLM-generated. Note that this does not imply all LLM-generated content is slop.
Empirically, there is quite a lot of this out there; from a community management perspective, it makes sense to have a rule against LLM-generated content in general, not because the goal is to actually enforce a strict ban against LLM-generated content, but because the goal is to enforce a quality bar and people who put in the work to evade the rule-as-written are going to inadvertently satisfy the quality bar, satisfying the rule-in-spirit.
Society has tons of these kinds of norms, sometimes reasonable and sometimes not. The point of censoring certain words on TV isn't to stop people from saying them - it's to signal that they aren't appropriate for "polite" society. The point of requiring people to get and carry driver's licenses in order to drive has little to do with the little bit of plastic - it's to force people to do the learning you're supposed to in order to get the plastic. And the point of banning commercial content as spam isn't to prevent any content that could conceivably generate revenue - it's to signal that content posted for the primary purpose of generating revenue isn't welcome even if it tangentially discusses computing, so works that resemble that sort of content have to meet a higher bar of being clearly relevant than other works.
For these purposes we do not need an objectively verifiable, consistently reliable, philosophically pure method for detecting violators of the written rule. We just need a method that's good enough to cause empirical improvements in the empirical problem the rule is intended to address.
These threads are a travesty because nothing is stopping one person from replying to multiple threads saying "you're wrong, actually". LLMs aren't even a unique issue on this site when it comes to dissent in the comments. It's not exactly rare for a topic to be mentioned on this site, and for arguments against the existence/utility of that topic to be argued to death in every thread related to the topic.
It is not shocking LLMs are one of these topics. It turns out that LLMs become relevant every time it turns out someone used them.
My opinion is that there are no cases where submission of LLM output is justified. Do not pretend to speak languages you do not speak, do not pretend to know things you don't know, and don't pretend to have written things you did not write. It's simply anti-social to insist that your lack of effort (whether to the content, or to following netiquette) be respected and enjoyed the same way as real content.
In order to meet the reality of the situation, I would ultimately engage with "LLM content with declaration of LLM use" as a solution, as it would make avoiding the content as simple as if it were banned. I started with my actual opinion, this is the dilution of said opinion into a tolerable outcome that is less strict.
Additionally, the burden to speak something other than English falls on the native speakers of English.
I'm sorry to hear some folk feel like they don't get to publish content in their own language, fearing that an English-speaking audience will never form due to language barriers. Slopping your native-language text isn't the solution. (I have no solution to prescribe as I got to learn English natively therefore I get to engage with localization of my content into other languages I speak as an enriching activity)
I'd rather prioritize seeing every blog post from every incredible person here than ever see an LLM generated article.
Some don't hesitate to post their own content, but others do. It'd be nice to have a mass list of blogs from people on this site so I could add to my RSS reader list. "Homepage" is in our profile already so maybe there's a way to generate and make that info available.
Yes please. At the very least remove the most blatant AI generated blog posts. If it's not obvious, let it be.
If the author cannot be bothered to type it out himself/herself, I cannot be bothered to read it. And even if it's the author's own original thoughts, the AI writing style is tiresome.
A lot of rules in this site are already fuzzy, and I don’t see an issue with adding this one as well. It would definitely make the site better in terms of quality content, in my opinion. It would also stop two behaviors that I am tired of seeing:
Endless cascades of comments discussing whether something is LLM generated (“this is AI generated” “and?” “ai bad for environment/thinking/society/etc”). There was a post mentioning this yesterday and I am so tired of seeing this. I also want this content gone! I’m so fucking tired of seeing AI generated crap online, it sucks. If I see those comments I then don’t look at the post because I don’t want to spend more time reading AI slop. But I am also tired of thinking “oh hey this post I dismissed earlier has 30 comments, there is probably some interesting discussion happening”, and then it turns out it’s all discussing the AI usage on a post that has nothing to do either with it.
Slapping the “vibecoding” label on anything that AI touched at all. It’s effectively made this useless. I do use this as a signal for whether it’s about usage of AI, and I am tired of thinking I’ll see something that could be interesting with that tag (e.g. the recent post about curl’s experience with Mythos) only for it to turn out to be a slop article about something else. This is also something that gets complained about ad eternum on comments and it leads to the same situation as the above point.
I saw some people mentioning things like the recent Linux vulnerabilities that were found by AI and how the site about the report was AI generated and could not have been posted. And honestly, good! I am so tired of AI slop sites and that extends to those. It’s not like you couldn’t post about the vulnerability, you could have posted a secondary high quality source like LWN, which is something that happens all the time when the primary source is not allowed to be posted for some reason.
And regarding translation/proofreading/em-dashes/etc: again, rules can be fuzzy, but honestly none of that makes a text “slop” in my opinion, and I am fine with applying the rule in a way that has some false negatives and only excludes obvious slop. There is a massive difference between a post that had some grammar mistakes fixed by an LLM and something that was so completely rewritten that it completely lost its character.
It is not easy to know with perfect accuracy when text is LLM-generated (although in a majority of cases it is obvious).
Occasionally, a post which is somehow "important" or "notable" is LLM-generated. For example the CopyFail report.
Proposal:
LLM-generated content be disallowed except under exceptional circumstances such as high-impact security vulnerabilities.
Judgement of whether text is LLM-generated should be conservative, giving benefit of the doubt in borderline cases.
"Exceptional circumstances" should be at mod discretion, or a list of qualifying circumstances could be specified and iterated on as the policy evolves.
I do agree that LLM Generated text should be filterable and/or flag-able.
The issue with labelling the content as "off-topic" is the case where the post is LLM-Generated, but actually on-topic. This may result in a contradictory use of off-topic. I think in the past a new option for flagging was discussed as well (e.g. https://lobste.rs/s/po97lh/new_tag_suggestion_genai_assisted)
I accept your feedback, I phrased that poorly: It should be disallowed.
I don't particularly care about filterable or flaggable. The users posting it should be removed from the site. Flagging or tagging it is a waste unless it leads to action being taken.
I do agree that LLM Generated text should be filterable
New/casual users don't use filters. While I like the idea of a containment tag that I can filter out to make the site more pleasant to use, it's not a long-term solution. It biases the demographic of new users towards people who are fine with whatever we use the containment tag for, because those who are not will just be put off by it and leave.
The issue with labelling the content as "off-topic" is the case where the post is LLM-Generated, but actually on-topic. This may result in a contradictory use of off-topic.
The contradiction is in the premise, "LLM-generated, but actually on-topic". If we define LLM generated stuff to be off-topic, they're off-topic.
No. I'm talking about not wanting any LLM generated text to be posted, to the best of our ability. Note the people voting it as spam. If I had seen it, I would have been one of them.
More people upvoted it, though, because it does have real content. (As you can see, I did complain about garbage LLM writing in the comments, but also tried to guess-the-prompt with bullet points of the meaningful content from the post)
Yes, there will be mistakes. The problem with rushing to build an unpleasant future is that things tend to get worse. People that shouldn't have to care about certain problems start needing to. We already opted out of the best outcome, now we're trying to find ways to minimize the tech industry's harm.
It may be worth thinking about what the things that get built may be used to do, and not merely what you hope they will do.
Fixing things after they get broken is hard, and even a good job leaves scars.
There already are witch hunts, see yesterday’s meta post about this. I’d rather have some false negatives in the rules and let some plausibly deniable writing through and just keep out the obvious slop than have nothing at all. And at least then you could say “these are the rules, it’s decided the post is staying/leaving, comments arguing about the rules are off topic” and be done with it.
Also, I’ve never seen someone write like ChatGPT does. Maybe if all they do is post worthless fake positivity entrepreneurial shit on LinkedIn? But that’s a different kind of low effort slop that shouldn’t be allowed.
I'd be fine with that too, honestly. But before AI, we basically would never see low effort content posted here, and AI makes a lot of things low effort, e.g. an entire desktop written in assembly (which was posted a while back). Without AI, seriously impressive. With AI, meh.
Not a single response from @pushcx here. Continues the pattern of weak/absent moderation and inability to set clear guardrails on this issue, or take any action which makes lobste.rs policy clear and calms the community.
I disagree. I think an LLM is but a tool. We should ban based on content and value instead of the tool used to create the article. Spamming accounts obviously should be banned. Bad quality articles are already subject to downvoting.
The issue is that you just downvote comments based on a reason, and “AI slop” or “low quality” isn’t one of them. What happens is that those posts just end up flagged as spam (which they technically aren’t) or slapped with the vibecoding tag (which isn’t really accurate).
Agreed, flag as “low quality” which is what we actually care about. Someone could produce an extremely well thought out and high quality article that fits right into this site but could have used a tool to help them. It would be a shame to ban those posts.
I'd like this. What has the process for policy changes on Lobsters been in the past? I've seen multiple proposals of varying magnitude surrounding LLM submissions in the last months, but haven't noticed any official reaction or changes.
I strongly support this. I've discussed why in the past so I won't rehash it here, especially since other people have raised most of my concerns with LLM-generated articles.
I’ve felt for a few years that EM DASH (—) is too narrow a lot of the time. I sometimes use TWO-EM DASH (⸺) or in extremities THREE-EM DASH (⸻).
I went through a phase of using HYPHEN instead of HYPHEN-MINUS about 5–8 years ago, but I gave up on that one. Just a little bit too inconvenient to type, and interacts poorly with too many fonts. But I do always use MINUS SIGN if that’s what I mean.
(The tricky thing with HYPHEN-MINUS is that in various situations it is actually correct and HYPHEN would be wrong. If talking about left-pad or HYPHEN-MINUS, for example, you need to use HYPHEN-MINUS rather than HYPHEN. The existence of straight quotation marks and hyphen-minus is one of the more disappointing parts of typewriter/computer history to me.)
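For anyone who wants to check which character they are actually typing, a small sketch using Python's standard `unicodedata` module prints the code point and official name of each character mentioned above. The `DASHES` list and the `describe` helper are names made up for this illustration.

```python
import unicodedata

# The characters named above, written as escapes so they survive any font:
# HYPHEN-MINUS, HYPHEN, MINUS SIGN, EM DASH, TWO-EM DASH, THREE-EM DASH.
DASHES = ["\u002d", "\u2010", "\u2212", "\u2014", "\u2e3a", "\u2e3b"]


def describe(ch: str) -> str:
    """Format one character as 'U+XXXX NAME' using the Unicode database."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for ch in DASHES:
    print(describe(ch))
```

Only the first of these, HYPHEN-MINUS, is in ASCII, which is why it is the one that identifiers like left-pad have to use.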
Whole article generation? LLM draft with human finish? Human draft with LLM finish? Is proof reading OK? Or is it permanently tainted the second an LLM touches it?
You're discussing articles specifically. What about projects and code? Low effort vibecoding banned? How about high effort? LLM programming plus human review OK? What about human programming plus LLM review? Or is the project permanently tainted the second you commit a CLAUDE.md file?
It's important for me to know exactly where the limits are so that I know where I stand. Depending on which categories you decide to ban, I may or may not be excluded from ever sharing anything I make.
I like the intention of leaving lobste.rs hand-picked and human-made, but I don't support policies that are nearly impossible to enforce transparently.
Broadly, I agree that LLM-generated text is not something I want to see on a site dedicated to discussion about technology. I believe that discussion of ideas requires the ideas to come from a genuine human source, so an LLM slop article is not something that should be considered on-topic in my opinion.
I am a bit more sympathetic to those who use LLMs for translation of non-English content, however. There is a definite bias towards English-language content in the tech space, and I think it's unfortunately correct that a non-English article/video will generally perform less well and lead to less informed discussion. A carveout for this is therefore probably sensible, but I would want there to be a link to the original content clearly accessible so that people can validate the translation.
As for people using LLMs to "clean up" content: I don't feel this is as violative as purely LLM-generated content, but disclosing it would be wise. I'm generally untrusting of this statement, though. "Cleaning up" content using generative methods that essentially rewrite entire sections can easily change the meaning behind what's written.
I continue to believe that this polarizing topic is best solved not by banning, but marking these submissions as such. Just like we have PDF and Video tags that let me filter out, for example, large file downloads that I don't want on my phone, LLM-generated content should be a meta tag that identifies it as such and lets users filter as they like.
This gets around the usual complaint of "A tag will say it is on topic" because those meta tags don't allow a submission to go through. It also allows users who don't want to read LLM-generated material to hide it, while those who want to read it can continue to do so.
The other complaint is going to be the submitter might not know it is LLM-generated to tag it as such and that's fine. We have a tag submission system for a reason and it is used every day. Just check the mod log.
At no point does this excuse slop, which can still be flagged as spam, whether it was written by a human or a machine.
Yes, it is OK to post videos and PDFs. And unless site rules change, whether you or I like it, LLM-generated content is within bounds. My argument is that rather than argue every time, let's just clearly mark it so folks can filter and be done.
When posting your own work I agree 100%, but I think it’s unfair to expect submitters to become “LLM police”. I speak English well, but it’s not my native language and I cannot reliably tell if something is LLM-generated outside of extremely obvious cases that Pangram also detects…
We should get really clear about what constitutes "LLM generated" (e.g., is it "LLM generated" if an LLM is used for spell check, grammar, punctuation, etc cleanup? Or cleaning up the language on behalf of foreign speakers? Or automatic translation?)
We should also be clear about how we are going to identify whether something is LLM generated or not. Is it just "a lot of people think it is"? I would also like to see a complementary rule against comments that accuse an article of being LLM generated--we don't need the top 50 comments in every article to be speculating about whether an article is LLM generated or not.
That's impossible to tell, so obviously not. We've also had spellcheckers for decades now that don't come with all the baggage of LLMs.
More broadly, I think trying to figure out exact rules is counterproductive. This rule will never be fully objective; it will by necessity be left to the discretion of the moderators and our community. As a community, we'll never reach consensus on where to draw the line, and trying to will just delay the introduction of the rule endlessly. Perfect is the enemy of good.
we don't need the top 50 comments in every article to be speculating about whether an article is LLM generated or not.
If we rely on human judgment to distinguish between LLMs and authentic humans, we're creating a competition that LLMs are already very capable of winning. It will be very difficult for humans to beat LLMs at the task of convincing this community that their content is authentic, especially as LLMs improve. The extent to which we think we can distinguish between LLM content and authentic human content today is the extent to which LLMs are not being prompted to sound like a Lobste.rs or HN submission.
Relying on human judgment will multiply the work required of moderators and make it impossible for human submitters and authors to know whether their content will be flagged by this community. It sets up an arms race between individual people and LLMs, and LLMs (trained on the entire Internet, including every Lobste.rs submission and all the comments, moderation log, etc) are already very capable of winning that race.
Instead of trying to litigate whether an LLM was used, I think we should craft rules and tools around the things that we actually care about. If we hate LLMs because their content is bland or the information is low quality or the content is uninteresting, then we should have tools to moderate those things (and to an extent, we already have upvotes and flags and the hide button and so on). Trying to fight against LLMs via judgment is a losing battle.
The thing that's valuable to me is that it was written by humans.
I hate LLMs because of the dehumanization project that they are a result of. I don't care to debate the quality of output of a machine designed to devalue and replace human creativity. Remember, we successfully automated art before we automated code.
If you support this endeavor, I think you are an asshole, and you should remove yourself from human society. Go spend your time talking to the machines instead, and stop harming people.
LLM text is written in the blood of future generations. I am only being a little overdramatic when I say this. If it works better, our future becomes more grim.
It’s precisely because I agree with you that I don’t want to see us set up an arms race that we will lose. Regulating the use of AI by relying on human judgment about what is/n’t authentic is only viable if AI is unable to match human patterns as well as humans, and we are already past that point. The only reason you can identify some articles as “LLM generated” is because no one asked a frontier model to try to fool you. For all you know, you have been happily upvoting articles that have been written by LLMs that you believe to be written by humans.
This is not a problem that we can solve by simply burying our head deeper in the sand. We have to confront the practical reality that we have no good way to distinguish between LLMs and humans exclusively by examining outputs—any such rule is going to benefit LLMs more than people.
Historically, building a consensus against a technology takes a couple times longer than building a consensus about regulating an abusive monopoly. Arguably, we have an abusive oligopoly here, but still a more promising way.
I dislike LLM slop, but there's a reason LLMs write like the lowest common denominator writer. People write like that too. How are you enforcing this? You might as well try to ban low-quality content altogether.
I see people in support of this rule saying "yes! I'm so tired of reading slop, make it go away!" I'm tired of it too, but this is the reality now, and you can't just legislate it away.
My major gripe with LLMs is that people outsource their thinking to them, including their writing. So I share many folks' frustrations because I want to hear your thoughts, not some shortcut you took. But banning content that sounds vaguely LLM-generated is a shortcut too.
I'd support this if I thought such a rule could be effective. But rules are blunt instruments.
I'm tired of it too, but this is the reality now, and you can't just legislate it away.
The "this is slop" comments have been working pretty nicely so far, many people expressed appreciation for these under the other story. It makes sense to take this a step further.
I respectfully but fundamentally disagree. Too much harm is already done by "No LLM related submissions" policies in communities I was forced to leave because people are too lazy to differentiate neurodivergent writing from neuronormative writing, or even just non-native speakers and their writing from an LLM. By instituting such a policy, people will be pushed out and their dissent is going to be silenced on the back of "you're just posting slop" yet again and I would seriously reconsider if this community is a place I consider safe.
The only cost for you for non-implementation is the minor inconvenience of sometimes you might be looking at slop that hasn't been downvoted yet or that you have to write "LLM slop" below an article you disagree with. The cost of implementation is that everyone who doesn't write like you becomes effectively ostracized and silenced.
We don't have downvotes. Flags barely have any effect on story standing, we've had stories take up the first spot on the front page even though they had around the same amount of flags as upvotes (IIRC - it might've been half the amount of flags).
Could we get any examples of neurodivergent writing that got confused for LLM output? I don't think I've seen anyone bring that up as a response to articles being accused of being slop here. Even aside from this proposal, it'd just be useful to learn what might trigger false positives in my brain's AI detection algorithm.
The neurodivergence argument is persuasive. I got Claude to proofread one of my articles for me and it actually flagged the narrative and style as an actual problem. I ignored that to maintain my authenticity and posted it anyway. Then HN said I got issues and I've been regretting not listening to Claude ever since.
Please recommend platform like lobsters where LLM-haters are disallowed.
I'm really tired of "I think this was written by AI" comments under every article.
I don't care who wrote it or how. If it's informative, interesting and useful I want to see it. I want people to discuss and even argue the subject in the comments, and not count how many em-dashes the article contained.
...news.ycombinator.com? Not disallowed, but you won't see pretty much any such comments there, and people are happy to discuss AI-generated things there.
If it's informative, interesting and useful I want to see it.
Exactly. Awhile back I read an article and posted it here, not knowing that it was generated by an LLM. Now that I know, I think that it’s still an interesting article. The author (i.e., the man who directed the machine) conveyed an interesting message.
I think it’s kind of like a speech: while the speaker may not have written the words he is speaking, he ought to have collaborated closely with his writing team, and he is accountable for what he says.
Can you explain why you think this is a reasonable analogue? The topic here is about how posts that aren't generated by humans should be disallowed. There's a big difference between a data entry method and how the post was generated.
I've seen @Student make the argument that since no one would submit a PR with "I used IntelliJ/Eclipse/vi to write this code", no one should have to disclose the use of an LLM either, as it's "just a tool." I think Student is saying that banning LLM-written blog posts/programs is just as silly as banning someone for using a non-QWERTY keyboard. It's a form of satire. Not good satire, but satire nonetheless.
There’s no difference whatsoever. In both cases you’re saying nothing about the text just complaining about the tools you imagine were used.
I say “imagine” advisedly because the complaint gets thrown around without any evidence and is often rejected by the authors. It’s most reminiscent of “transvestigators” throwing around accusations because they want to be in the know about a secret while dunking on an outgroup.
At best it’s deeply tedious and drives out on topic discussion.
I’m on the side of limited LLM generated submissions, and banning those who continue to submit them, fwiw.
However, a part of me thinks that good content is good content, and I don’t necessarily care how it was written. If someone authors a blog post with a spell checker, or a grammar checker, or by speech-to-text, but the thought came from them, we probably can agree that it’s OK.
If an author has a thought, uses that to prompt an LLM to build an argument for or against it in a manner that treats the LLM as an “assistant” … is that OK? Where does the line get drawn exactly?
“Vibeblogging” — we can definitely agree is just slop and ban it. “Write a post about how a panda should have been the Linux mascot.” But, “Help me restructure this argument about why Object Oriented Programming blah blah blahs under the OOPS theorem of blah” … not sure?
The OP, here, points at this, albeit, as you say in a potentially inflammatory way.
I've seen Grammarly advertisements for years now, and I've always had this unease about the product. I don't have an issue with spell checkers, but grammar is something else. Helping a person use "their", "they're" and "there" correctly is one thing, but I found Grammarly to be overtly paternalistic in its advice. There's a line between "helping an author" and "erasing an author's authentic voice" that I feel Grammarly crosses. I even have friends who are writers (one a published author!) who have similar misgivings about the tool. Now that LLMs can act as Grammarly dialed to 11 ... um ... I think something gets lost in translation.
If there were no difference, it wouldn't be immediately obvious that someone basically copy-pasted from an LLM for their blog. On the other hand, I cannot tell that someone wrote a blog post with a different keyboard.
This is silly whataboutism, if I wanted to read what an LLM had to say on a subject, I'd just open a prompt. I come to this website to read what people think and experience. Two things LLM's cannot do, as evidenced by the endless bland nonsense produced by the technology.
Have an idea, prompt an LLM to “write a blog post about”, copy and paste into WordPress …
AND
Have an idea, prompt an LLM to explore the space around it a bit… survey some references… perhaps suggest ways in which to tackle an argument. Then, you write it by hand using the suggestions. Finally, you feed it to an LLM to ask for “obvious problems” (e.g. spelling, bad grammar, etc)
… then I’m not sure we’ll ever get anywhere. If you can’t look past the fact that the author used the “Dvorak keyboard layout” or (heaven forbid!) the nano editor, or (omg what were you thinking?) ispell… to write a blog post that you might actually find something valuable from, god help us all. I’ve actually heard people make arguments such as “he’s using nano? dude is clearly not a serious programmer,” and all this hate feels too familiar.
Look, I get it. LLMs are horrific in sooo many ways, and I wish the bubble would pop, and their champions would get some sort of punishment for their malfeasance along the way. But the reality is, people find value in these things, and I don’t think they’ll completely disappear anytime soon. How we access them will definitely change. But, unless you engage only with content that authors have certified as being inference / LLM free, and then fully trust them to not lie to you, you’re going to get “tricked” into reading something that came from an LLM. Sorry to disappoint you. You’ll be fine—I promise.
The position is, in my view, incoherent; a fact best illustrated by the impossibility of categorically defining "LLM-generated." Any attempt to enforce such a boundary will inevitably devolve into a witch hunt.
High-effort, highly coherent work can be drafted with the aid of an LLM, just as low-effort slop can be produced without one. I'm more than happy to advocate against slop, human or machine-made, but policing the tool rather than the output is counterproductive, performative moralisation.
The only thing I can reliably detect beneath this growing sentiment is unearned moral superiority married to a peculiar blend of Ludditism and elitism. Worst of all is the dismissal of a technology that will, in the final analysis, refuse to be ignored; causing us to forfeit the opportunity to meaningfully shape how it is used and how it evolves.
Nearly 200 comments of almost nothing other than calls for blanket bans. Not a single coherent definition of what "slop" even is, or where the limits should be set. Asking the obvious question is apparently trolling.
It's becoming obvious that this is ideological opposition to AI and LLMs. That's fine but it'd be better if it's simply framed as such instead of using proxies like quality and effort.
Internet_Janitor | a month ago
Sounds good to me. Even if slop occasionally slips through, an explicit policy against LLM-generated content- ideally without carveouts and exceptions to squabble over in the comments- should reduce junk on the front page and provide clear-cut grounds for banning regular offenders.
kristoff | a month ago
Absolutely, people underestimate severely the value of a simple policy that minimizes the amount of debating, even if it might not be able to pull it all the way down to zero.
patryk | a month ago
As much as I hate "hey gork, there are bulletpoints, make it big" articles, I think the distinction is kind of hazy in here. One "edge" case I can think of is using "rephrase" (or something like that, I'm too lazy to check right now) in LanguageTool. Is it LLM-generated? Yeah. Should it be banned? That's trickier question, especially when it comes to non-native English speakers. I don't think stuff I write is LLM-generated, but I consider my English at most mediocre, so I'm using this feature sometimes. And if it's allowed sometimes, we've just made a carve-out and we either need a cut-off or mod's vibe-based
lcamtuf | a month ago
That gets brought up every time, but come on - what percentage of AI articles submitted to aggregators or reshared on social media are malicious slop, versus "oh, I just asked ChatGPT to correct typos and now I'm being unfairly ostracized"? It's something like 1,000-to-1.
It's as if we were arguing that it's not OK to mark emails from Nigerian princes as spam because one of them could be from an actual prince. An excellent hypothetical, a terrible approach to spam filtering.
If someone is using an LLM for translation or minor copyediting, most of the time, it won't even be noticeable, but disclosure is a good way to avoid misunderstandings.
zxtx | a month ago
I have seen people on this very website treated very harshly when they have said they are only using LLMs for translation and minor copyediting. The sentiment is usually, better minor grammatical errors than any use of LLMs at all.
This disclosure is also often used as an excuse to derail discussions. The end result is people who use LLMs to help them write prefer just to not participate in the community at all.
It's not like in the past when someone said English wasn't their native language and they used Google translate to help them write their post, even though fundamentally it's the same underlying technology!
lcamtuf | a month ago
Oh, I've seen people say that. And in almost all instances, it was about getting caught red-handed and then insisting it's not your hand. In one case, it was a blog that was posting one long-form article about AI every single day. Dozens or hundreds of them, perfectly on schedule. Such a remarkable display of human creativity and persistence!
I'm not saying that there are no jerks or misguided people on the internet - I had someone call my own blog "slop" - but again, in the vast majority of cases, slop is slop and there isn't a whole lot of nuance to it.
David_Gerard | a month ago
to be fair, I do literally a blog post and podcast about AI every single day ;-) I've yet to be accused of being a bot though. Artisanal.
zxtx | a month ago
I worry treating all people who claim they only use LLMs for copyediting as dirty rotten liars that are probably spammers will only lead to more toxic discourse. It also won't reduce the slop. Better to expand the spam and self-promotion policies to include LLM use and go from there.
[OP] orib | a month ago
The thing best known for unbounded growth is cancer. The entire point of this proposal is to remove slop posters from the community.
However, if you have alternative ideas that will effectively reduce slop, feel free to suggest them.
zxtx | a month ago
I think articles that are heavily or entirely LLM generated should be flagged and automated systems deployed to detect them when posted.
I think people who are using LLMs should be upfront about it so others can more easily filter their comments. But I think the line between LLM aided posts and slop is fuzzy and benefits from careful moderation.
[OP] orib | a month ago
I'm not sure how this is different from my proposal, other than adding a few details on tools provided to moderators for finding attackers.
I left mechanism unspecified in my post deliberately, because I believe we will change the mechanism for removing slop as we gain experience on what works.
I do, however, believe that the specific threshold for this is a matter of trusting the moderation team to come up with and adjust policy, and I believe warnings and bans should come from people.
zxtx | a month ago
I was more responding to the fairly draconian policy being outlined in https://lobste.rs/c/1ecn7s
A policy that disproportionately harms non-native English speakers doesn't sit well with me. We are already seeing fairly exclusionary language being used elsewhere on the page https://lobste.rs/c/whnsrp and I don't think that's an acceptable price for preventing slop.
[OP] orib | a month ago
I would encourage non-native speakers to post in their native languages. Machine translation improves over time, but me seeing the improvement requires the original text. We may even be able to add a 'translate' button to lobste.rs, that would be nice.
In addition, I have seen people here offer to translate and edit non-native speakers, and I would like to see that continue.
SamRW | a month ago
As a non-native speaker myself, it is worth noting that most readers in sites such as lobste.rs are not receptive to texts written by non-native speakers unless they pass a very high bar of either technical quality or English proficiency. This sudden defence of "non-native" speakers seems to come out of nowhere really. I don't really understand what non-native speakers have to do with the AI discussion at all.
In fact, from my perspective as a "non-native" speaker, I'd much rather see the text in its original language, or a manual translation by a human, as automatic translations can frequently fail to correctly translate the feelings and intentions behind the text.
epidemian | a month ago
I completely agree with your second paragraph. But this, i don't know...
First, it reads like implicitly asking the site maintainer to implement a complicated system to deal with a complicated social problem. And therefore it shifts the burden of fixing this problem to the maintainer and a hypothetical detection system. What happens when that automated solution inevitably fails and has negative consequences for actual people? After all, this whole thread is precisely about automated systems having unintended consequences on individuals and whole communities.
I'm sure this was not your intention to suggest this. This is simply where my mind went when reading "automated systems deployed to detect them when posted", and i just wanted to expand a bit on this idea.
I don't think an automated system could be a satisfactory solution for a site like Lobsters.
As i see it, this place is built on a foundation of trust between members. For any user here, you can see who invited them. For example, i was invited by someone i know in real life, someone whom i've had shared beers and experiences with. And they probably were invited by someone they knew, and so on.
It's important that we can trust that everyone here is a real person, with real interest in participating in this community. Of course, sometimes this trust is betrayed, and that's why moderation is so important too.
But just as the core of this site is "people-based" —i.e. members can join via invitation only, and you are responsible for not inviting a slew of bots or assholes—, i think that a proper policy against AI slop should also be people-based, and not automation-based.
zxtx | a month ago
These are great points. I was trying to suggest that this is actually a spam problem and needs to be treated that way.
k749gtnc9l3w | a month ago
First of all we could extend me-too flag to the articles. LLMs are built first of all to say again things already said a million times.
Yes, a human article saying again the same things already said in each of the three previous articles on the topic submitted to Lobste.rs this week will also suffer from this flag.
And, I guess, medium-term make a visible switch about how much to weight the flags. I will keep it at flag=downvote, some people will probably hide-on-three-flags.
[OP] orib | a month ago
How does this lead to removing the articles from the feed, and what are the consequences for people repeatedly violating community norms?
Flagging is pointless. It doesn't affect the abuser, and it doesn't improve the quality of the site for users.
matheusmoreira | a month ago
Hi, I'm the one thousandth case. Asked Claude to proofread at least two of my most recent articles. Even ignored its perfectly good advice in order to maintain as much of my "humanity" as possible.
So how do you define slop? 400+ upvotes, nearly 200 comments, and not a single answer to this perfectly reasonable question. Someone even had the gall to flag it as a troll comment as though it was asked in bad faith.
When is the community going to answer the question? You might as well ban my domain from the site depending on how you define it. This "I know it when I see it" nonsense isn't cutting it.
GavinAnderegg | a month ago
I agree in general with this. If someone can't be bothered to write something themselves, I'm not interested in reading it. That said, I don't know of a foolproof way of identifying LLM-generated text. I don't love the idea of people (or sources) being banned because the articles they post might be generated. I've been accused of using LLMs in my writing because I sometimes use em-dashes… even though I've been using them for over 25 years now.
Internet_Janitor | a month ago
The occasional false-positive shouldn't be a problem so long as it isn't a zero-tolerance policy, and I don't see why it would need to be. My reading of the OP is that bans would be at the discretion of mods and in response to a pattern of posting slop repeatedly.
I think it's very important to avoid letting imperfect detection of slop get in the way of having a policy against it. Mistakes will happen from time to time, but we must apply back-pressure against the onslaught of llm-generated garbage flooding the web and choking out human-authored articles.
hailey | a month ago
The em-dash meme needs to go away. Sure it's a trope, but I don't actually care about it. There are far more obvious tells. Reading LLM generated text makes me feel concussed - there are a lot of words in front of me but for all the text I'm reading I am unable to actually pull much meaning out of it.
hoistbypetard | a month ago
I think I understand the spirit of what you're saying, but being concussed is so much worse. It's not really comparable, IMO.
pzel | a month ago
I've never been concussed but I agree -- after reading one or two screenfuls of LLM-generated text, I begin to zone out. Again apologies to actual sufferers, but I imagine this is what it feels like to have dyslexia, maybe?
I feel I should be able to take in the writing and ingest it into my conscious cognitive workspace, but it feels like I'm reading words and they just slip out of my mind right after I've read them. It's an extremely uncanny feeling.
vbernat | a month ago
I am also in this case. I don't like slop posts, but I am not able to detect them reliably. I have been caught posting one of them here.
Tangent: what about posts with AI-generated images? Should they be treated differently if the text is not LLM-generated?
radio | a month ago
Unless the image is very necessary to make the point, it should be banned too.
dzwdz | a month ago
I think we've lost this battle, with even people like Daniel Lemire shitting up good technical writing with sloppy images. It might be too late for pushback now, I don't think the community here would appreciate losing e.g. Lemire's writing.
Previously, on lobste.rs. Also, a separate anti-anti-social link aggregator could be an interesting experiment.
radio | a month ago
The world is very weird if even Daniel Lemire is so devoid of taste.
hongminhee | a month ago
As a non-native English speaker, I'm worried this proposal will hit translators first.
I write in Korean first and use an LLM to translate into English. Sometimes that's Kagi Translate; sometimes Claude, when the subject needs more background. The thoughts are mine and I don't paste the output verbatim. Even after editing, I've been told my writing smells like slop.
Native speakers are much better at noticing what sounds like slop. I can catch it fairly well in Korean, but not in English. I can revise for hours and still miss the tells. The weak point is my English ear, not the argument.
If the test is “does this sound off to native speakers?”, non-native writers will lose. The rule may say “quality”, but the effect is: people like me post less. That pressure is already here on Lobsters. I feel it.
If the question is who came up with the ideas, then translation should not count against the post. It's no different from a grammar checker or a fluent friend's edits. The style may change. The claim does not.
This comment was also written with LLM assistance. To make that check possible, I'm sharing the Korean original here.
wareya | a month ago
In most cases, "this was partly machine translated from <language X> into english" or something similar at the top of the post would go a long way toward convincing most people that it's not slop. It's not perfect but I think for most cases it would be enough. Maybe I'm just being optimistic though.
gabeio | a month ago
And/or having the original language linked so we can translate it ourselves. Again, not sure people would, but it also doesn’t hurt since it may expand the readership.
n3t | a month ago
That reminds me of Zig’s policy:
pzel | a month ago
Yeah this is definitely the better approach. Most browsers nowadays have translation built-in, and it feels almost utopian to imagine lobste.rs users posting freely in their native languages and being mutually intelligible.
ELLIOTTCABLE | a month ago
Reddit silently did this recently, and while I'm annoyed about it from like three dimensions (Reddit, as usual, making sweeping changes against the wishes of their users; as a language-learner, the Russian subs I'm on are all now by default English and require extra taps to reset; the translations are often poor and notably sloppy ...)
... it's also absolutely kinda breathtakingly utopian, too?
Although it's hurt my language-learning efforts, it's also utterly mind-bending to read, at full speed and comprehension, a deep thread on a Russian subreddit about their opinions about women. (That particular thread was unpleasant for other reasons, but ...)
It feels, uh, very The Future, indeed.
alexandria | a month ago
Cytunaf yn llwyr!
cultpony | a month ago
Ich bezweifle das ein solches Vorgehen hier wirksam wäre. Moderation wäre zum einen schonmal fast unmöglich das kulturelle Neubegriffe wie "Sprich Deutsch Du Hurensohn" oder "Dieser Kommentarbereich ist nun Eigentum der BRD GmbH" nur schwer zu übersetzen sind, es wird zuviel kultureller Kontext benötigt um dies einheitlich über alle möglichen Sprachen so umzusetzen. Dazu kommt natürlich das nach meiner Erfahrung, solch mehrsprachliche Fäden oder Pfosten meist eher zu Fehlkommunikationen führen. Ich lehne eine solche Einstellung grundsätzlich ab, insbesondere da damit nur noch mehr GSM Nutzung gefördert wird in Form von der ausweitenden Nutzung von solchen Übersetzungstools die alle auf GSMs basieren.
I doubt that this would work. Moderation lacks context for cultural neologisms they do not understand, even just a handful of languages would make this impossible. Additionally, my experience with Reddit's implementation in threads and posts is that it leads to misunderstandings and miscommunications. I therefore reject such a mode of operation fundamentally, especially if it will only lead to more LLM usage in the form of promoting browser translations tools, which are all LLM based.
Aks | a month ago
English is not my native language either. We should stop being so afraid of making grammatical mistakes and trying to get perfect grammar. Our unique voices, even if in slightly broken English, are important.
Student | a month ago
Yes but not everyone feels that way. They don’t have your belief in authenticity and that’s ok too.
otde | a month ago
I do not want a tech forum I frequent to be value-neutral on authenticity. It is normal to have values, and to establish rules and norms for the spaces you are a part of based on the values you share with other people who frequent that space.
This comment section is largely a conversation about the values that people in this space have, and the potential establishment of a new norm based on these values. You have different values, and do not want this norm to be established. Your comment reads like you’re against the concept of having norms entirely, when what’s actually happening is that you want a different norm.
[OP] orib | a month ago
They can join the LLMs on "hacker" "news". That place is half posts by LLMs, you'd feel right at home.
radio | a month ago
It's also ok for us to ban them.
cole-k | a month ago
I've said this before, but my gripe with LLM-generated prose is that it breaks the (what is now I guess antiquated) social contract that --- without disclosure --- whatever you are reading is text someone has thought about and composed word-for-word.
I personally wouldn't mind a different policy that requires proper disclosure of LLM use for writing in someone's article. So even if the whole thing is completely LLM-generated that's fine to me because at least I can close the tab immediately without feeling deceived. Also machine-translation is IMO a reasonable use of a language model.
Student | a month ago
Would you apply the same thing to ghostwritten content?
[OP] orib | a month ago
Yeah, I always found the idea of ghostwriting strange.
cole-k | a month ago
Well, even if ghostwritten you still get someone thinking and composing the text, so it seems an orthogonal issue.
[OP] orib | a month ago
If you want help editing without LLMs, feel free to reach out. Same goes to anyone willing to put in the effort.
My email is ori@eigenstate.org
fab23 | a month ago
It is not about writing an own thing. It is about when reading another blog posting and if the content looks legit, as a non-native English speaker I may not be able to recognize that it was "improved" with the help of an LLM. So when I submit this then to Lobsters, it does get spam modded and with this possible new rule I even get banned. This is really not a nice way to treat foreign speaking people, who still may have valuable inputs. Rules like this will make me even more hesitant to submit anything on Lobsters.
epidemian | a month ago
Thanks for sharing this. I think this is a very important point, and we should consider it.
Whatever policy gets put in place (or not), i think it's crucial that it leaves some room for mistakes, common human fuckups, and so on. Especially, discouraging newcomers from submitting stories, for fear of getting struck by the moderation hammer, would be a very sad outcome. And more so if this ends up discouraging people from ever writing or creating their own stuff in the first place.
I don't know how any of this will turn out. But for the time being, what i would say to you is: if you have something that you've personally written (/created), or someone whom you know has written, and you feel it's on-topic and want to share it, please do so. Even if it's not in perfect English, or it's too basic, or a "terrible hack", or whatever. I'm sure me and many other people here would appreciate having those genuine stories submitted :)
And, in a way, it's also our collective responsibility to push back against overzealous attempts at shutting people down for ambiguous claims of "this looks like AI-generated" when there's a person on the receiving end.
I guess the more complicated cases would be things like you mention: you read an article that you find interesting and worth sharing here, and it might end up being flagged as AI slop. In those cases, TBH, if you've actually read the thing and didn't catch it, it's an honest mistake, which anyone can make. Again, i would still encourage you to share in such cases. It's better to be sincere and wrong than to self-censor.
Notice that that case is very different than someone (or a bot) sharing slop they haven't even read, just to promote some site, or to collect karma on their account here for more nefarious purposes. Those are the cases we need to protect against.
addison | a month ago
This is precisely why a karma-neutral flag is likely a useful solution IMO.
[OP] orib | a month ago
I would personally be strongly in support of flagging being an invisible state that requests moderator attention.
gerikson | a month ago
You saved me the effort to write a comment expressing this. Thanks.
tumdum | a month ago
If you are unsure, just don’t submit the link - it’s that simple (I’m also not native).
vbernat | a month ago
When I started blogging, I was like you: writing in French and translating (manually) into English. It is well known that this does not work well, with or without LLMs. I think it would be a good idea for you to write in English and translate to Korean instead. Your English will sound better. You can still reach out to a translator for a few sentences.
Fingel | a month ago
If tools for using LLMs to translate are ubiquitous, then why not post in your native language and let the readers translate if they want or need to? You should be posting the highest signal form of your writing possible and let others distill it if they want. Who knows, maybe I know Korean, you’ve removed a chance for me to read it.
kel | a month ago
Text which is translated through ml is not the same thing as text which is wholesale generated through prompting an LLM, which is what this thread is about. I don't think anybody cares about the former, it is clearly quite useful (as you can attest) and it carries none of the issues people have with the latter.
matthiasportzel | a month ago
Thanks for sharing. I was able to put your original Korean through several different translation tools to try to get a consensus, which is something that I really don't mind doing. That exercise has helped me empathize with you more. All of the translations are a little bit awkward—you're never going to be able to get rid of that completely. And your translation was different in a couple places from the automated translation—some places were better and some places were worse. (e.g. "not the argument" in your third paragraph has a sort of negative connotation; the machine translation was "not my writing ability" but "not my writing" or "not my ideas" would sound good.) I'm at risk of over-analyzing your writing at this point, for which I apologize, but it does make more sense to me now.
deivid | a month ago
Here is a good example:
https://blog.lyc8503.net/en/post/2023-hardware-collection/
Very good content. Obviously some writing issues (machine-translated), even analogies that are unexpected in English. Big disclaimer about being machine translated, link to the original.
This sidesteps the worry about being generated content completely in my view
0x57696c6c | a month ago
I think people should just write in whatever language they are comfortable, and let others use a translator (and potentially ask for clarification) if they do not understand. We should embrace the diversity of the community. I would enjoy seeing people hold multilingual conversations; it would make the internet feel more world-spanning.
radio | a month ago
If you don't speak English you shouldn't be participating in English discussions.
thesnarky1 | a month ago
Language knowledge is not a binary off or on. Some people know a lot of a language that is not their mother tongue, others know a little. Some know more than others whose mother tongue is the language in question.
Participating in discussion is the best way to learn any language and that will likely involve using tools, such as dictionaries, to figure out how to say what you want to say.
dzwdz | a month ago
I assume the people using translators do know some English, but aren't very confident in their skills so they prefer to write in their original language and then translate it.
But, more importantly - I'm happy to hear from people (in English) even if their English is at a very basic level, who I fear might feel excluded by your comment (even if that wasn't your intention). There are bloggers out there whose English clearly isn't that good, but whose blogs I've thoroughly enjoyed regardless.
Some of us were lucky (?) enough to be born in an English-speaking country. Some of us (like me) aren't native speakers, but were lucky enough to have had access to quality English education as a kid. There's an entire generation of people in my country that didn't - and, in general, the older you get, the harder it is to learn languages. I wouldn't want to exclude people based on that.
codekobold | a month ago
This is the weirdest form of gatekeeping I have ever seen in the wild
dzwdz | a month ago
SamRW | a month ago
While I do understand that people are downvoting the post above because of the tone, and the tone does indeed come across as gate-keepy, this whole rhetoric about non-native speakers is extraordinary to me.
I'm a non-native speaker. For many years I have been reading HN and lobste.rs, and other communities. In this time, I have seen the extremely high standards we apply to foreigners in these forums: if the language is not perfect, it proves that the ideas behind the text are bad and wrong. Any submissions must contain perfect English, especially when written by foreigners.
I mentioned above that this response is extraordinary. I mean this in the most specific meaning of that word. In this post alone, there are more than 10 people claiming this as a legitimate defense for LLM-generated content. In my lifetime, I've never seen such a strong defense of non-native speakers in such a forum. While heartening in current times, and I take it sincerely, I would hope that it does not alter the course of this discussion.
This is because it is in my opinion entirely unrelated to this topic, and this defense of LLM generated content is weak. The logic is that by using an LLM, non-native speakers may sound more native, and therefore improve their writing. I don't think I need to quote a whitepaper on this, but I'm sure I could if needed: asking an LLM to rewrite a text does not improve it, but rather removes specificity and makes it more bland and generic.
madhadron | a month ago
This seems reasonable to me. If someone can't take the time to lay out their thoughts, why should I take the time to read them? If they want to use a chatbot as a rubber duck for working on their argument or checking their grammar, fine. I don't think we even need particular detection, just the expectation of community members and, in blatant cases, removal.
addison | a month ago
I really despise LLM-generated articles and want to see them gone. This extreme case is obvious and likely easy to identify, and I believe there are exceptionally few who would dislike seeing these gone.
Let's suppose someone now submits software where they have accepted some LLM-generated commits. Or, maybe they've generated it entirely with LLMs, but have documented the process as an analysis of doing so. These whataboutisms are me playing devil's advocate, but it's clear that there is a spectrum of tolerance in lobsters. I highly doubt that banning any content touched by LLMs will be accepted. I think the most likely to be widely accepted answer is flagging without negative karma consequences, just as a way for people to drop a "hey, this is generated to my threshold of unacceptable, heads up" to subsequent viewers. That is largely what the big comment threads now do, and perhaps we can reduce the fighting in the comments + give people some signal about the content they are exposing themselves to.
[OP] orib | a month ago
The other scenarios you provided are different categories, and fairly clearly delineated ones. If you want to have them treated differently from how they are now, start a new thread.
addison | a month ago
I suspect that were this policy implemented, such categories would be flagged the same. People tend to look for witches when you give them torches.
As an aside, I'm not certain if it's intended, but your message comes off a bit hostile (since edited). I'm genuinely trying to engage with what you're proposing. I suspect that such a blunt instrument will not be effective, given that it will be used inappropriately and subjectively. I also want these things gone, but I do not feel that this is the way to do it. Spammers will already get cleared, and slop that only serves as engagement bait already gets pretty quickly marked as spam. This additional step will just invite people to debate even more under every post containing what they perceive or do not perceive to be intolerable slop.
[OP] orib | a month ago
I trust the mods to look and remove repeat violations of the policy they intend to enforce. I don't believe that it would make sense for flagging to do anything other than alert a moderator of a potential policy violation.
I don't think this is a real problem.
I do think it would make sense to discuss if we should allow vibe coding here, but that's a different thread.
mort | a month ago
When posting a link to lobste.rs, the link ought to be to human generated content. That means, if you post a link to source code, that source code ought to be written by a human. If you post a link to a blog post about a piece of software, that blog post ought to have been written by a human. I don't see the problem.
addison | a month ago
I agree. Not everyone does. People will pursue corner cases and gotchas, and even define "human-generated" differently. I'm fairly firm in my "I will not waste time with repos containing .claude" and "I refuse to read machine prose". Others are not, and I worry that moves like this will be rejected, leading to nothing being done instead. That is all.
kingmob | a month ago
I think we can treat these as two different categories.
By and large, I'm fine with LLM-assisted software here (though I think Show Lobsters-style posts should now have a higher threshold), but I'm uninterested in reading LLM-assisted/generated writing.
If Lobsters refuses posts about LLM-assisted software, I predict it will be a ghost town in a few years.
mort | a month ago
Why are they two different categories? If you link me to a source file with the expectation that I read it, it better be written by a human.
kingmob | a month ago
Because code is not the same category as prose, and not everyone shares your opinion.
mort | a month ago
How often are people linking to LLM-generated source files without surrounding commentary?
kingmob | a month ago
Very rarely, but that's not what we're talking about.
I'm directly addressing your quote above: "if you post a link to source code, that source code ought to be written by a human".
I'm personally OK with a human writing prose about LLM-assisted code.
mort | a month ago
Me too! But if you post a link to source code as a lobste.rs submission, that source code better be written by a human. If you post a link to a blog post about some source code, the blog post ought to be written by a human.
duncan_bayne | a month ago
Strong upvote from me. Disallowed; repeated posting equals ban; allow flagging as such.
hailey | a month ago
Agreed.
It's usually pretty obvious when something is LLM generated, and in many cases I've seen the author has posted about using LLMs elsewhere on their site, even if they haven't disclosed it in the article in question. That tends to make it pretty clear.
The community's slop radar seems pretty accurate too - I can't recall seeing any big comment threads accusing the author of using LLMs when in fact they have not. If nobody can tell then nobody can tell.
I'm happy to proceed assuming good faith in the truly ambiguous cases, because usually it's blatantly obvious and it's the blatantly obvious stuff that's causing problems. Nobody is trying to game lobsters by sneaking in as many undetected LLM written posts as they can.
caius | a month ago
I’ve called someone out on at least two different stories for adding a “this is LLM slop” style comment, when it turned out they just didn’t like the author’s writing style and no LLM was in sight. There’s probably more that weren’t challenged.
Those weren’t even that hard to disprove, the blog had posts going back 10 years that read identically in style.
hailey | a month ago
You assert that there was no LLM in sight, but this is just as unknowable.
In both cases where you've called people out recently, it was just one person saying the style seemed LLM generated (no broader consensus) and they yielded immediately when challenged. That seems like the ideal outcome to me?
Aks | a month ago
Yes please. I want to read your input, not your word salad generator's input.
apg | a month ago
It’s very easy to immediately yell “LLM slop!” on an article that you don’t like. And then, where are we? I want to see on-topic articles that I agree with, and also that I disagree with. That’s healthy.
I’m not sure how to evaluate articles as being “slop.” There are some obvious examples, and there are some not so obvious examples. There’s also the real possibility of legitimate articles appearing to be sloppy because the author happens to use a certain style that Lemons tend to copy.
I think that a submitter’s “overall sloppiness” on “authored by” submissions might be a “fair” way to deal with this. Consistently posted, obvious slop flags the author as sloppy. Maybe a mod reaches out and says “stop” and if they don’t, they get banned.
Not sure if “sloppy submissions” via someone should be counted the same. It seems that the software could cool down, but not ban someone’s ability to post if they’re constantly submitting what appears to be slop. But, forcing every submitter to defend an article’s provenance or else they get banned, wouldn’t make for a good time.
wareya | a month ago
Not every rule is a slippery slope. There are cases of actual obvious LLM slop and those are enough to be moderated. I think that you underappreciate just how antisocial LLM slop writing is. Right now those very obvious cases hover on the front page for multiple days because the AI bandwagoners are upvoting them.
a5rocks | a month ago
On the other hand, a rule that could maybe apply to anything means flagging becomes even more of a downvote!
Look at https://lobste.rs/s/wee21u/this_is_written_by_llm_comments_should_be -- 5 offtopic flags and 8 spam flags. I mean, that's obviously a meta post so it's on-topic... and obviously not commercial so it doesn't fall under the strict definition of spam in the about page. And yet, people are flagging it! I think it's better not to hand those people a flag reason where they have plausible deniability.
[OP] orib | a month ago
I intentionally did not suggest flagging. I don't particularly care about flags, beyond notifying the moderators that attention is needed. I would be happy with making flags invisible.
What I want is warnings for first time slop posters, which eventually builds to removal if they consistently keep posting.
dzwdz | a month ago
Getting a warning even if it's your "first time" is a bit harsh. People differ in their ability to detect slop - and even for what I feel is "obvious" slop, there can be a reasonable explanation for posting it[1].
[1] I'm a bit embarrassed about my top comment there, it was too toxic towards the submitter. Usually I don't really blame people for submitting slop, stuff slips by (and there can be valuable thoughts behind the slop), but I got a bit too pissed there.
[OP] orib | a month ago
That's why the warning is useful. If you post slop by accident and nobody points it out, how will you know?
dzwdz | a month ago
I think there's a difference between a comment from a regular user pointing out something you've posted is slop, and a mod (who has power over you) warning you about it.
a5rocks | a month ago
Oops, I misunderstood your proposal then. (I can't believe people are flagging this proposal too! I'm soooo going to make a meta post asking for there to be an additional confirmation step before flagging, once this all cools down...)
apg | a month ago
Of course there are extremes, and if there’s a slop button, people will agree, in the same way “spam” is well filtered today.
You conveniently dismissed my point, which ironically, is the point.
I think you underappreciate just how toxic assuming someone else doesn’t understand is.
wareya | a month ago
I didn't dismiss your point, I argued past it because I don't think it's well-supported. You're making a slippery slope argument. You might not have thought about it to yourself that way, but that's what your first and final paragraphs are doing.
apg | a month ago
Tell me: what are the indicators of a slop post?
If your answer is anything close to “I know it when I see it”—there’s the argument. Again, we can all agree that obvious slop is obvious slop. In the absence of objective evaluation, you’ll get subjective bias.
wareya | a month ago
Extremely high frequency of LLMisms, which change every year-ish but really are distinct from human-written text when published unfiltered. Having way too many em-dashes and "it's not just X, it's Y" and 3-item bullet-point lists and breaking everything down into high school essay format are the tells of about 6 months ago. Human writers do those things, but not with anywhere near the density that LLMs do them.
You don't need an objective evaluation. You only need a good-enough evaluation. There are existing rules that have subjective evaluations. In fact there are vibe-centric items in the posting guidelines. In fact, the very first item on the posting guidelines is vibe-centric: "Lobsters is more of a garden party than a debate club." It even specifically outlines that judgment is often necessary when moderators act: "There isn't a clear-cut line between this and discussing trends and advocating for improvements in the field, so expect frustrating judgement calls." This is normal for rules and guidelines written by mature people for other mature people.
apg | a month ago
If you can enumerate “LLMisms” then I can tell an LLM to mimic a different style that doesn’t use them.
Your argument is “I can tell you when I see it.” We’re cooked.
And yes, moderation is always subjective in the absence of objective rules…
wareya | a month ago
My argument is not "I can tell you when I see it." I gave you a list of specific things. When there are too many of them, then I know it's LLM slop. Really. Yes, it's a statistical argument, but it's not "I know it when I see it".
If you make your LLM avoid emitting any LLMisms, then yeah, you're not going to be able to tell that its output came out of an LLM.
Yes, we're probably going to be cooked eventually.
I've seen actually obvious LLM slop articles that were like 15 pages long and probably seeded from like 1.5 pages of real human writing sit on the front page for like two days. If someone wants to get their thinking across to people, they should do it themselves and respect their readers' time, and also respect their own thoughts. When something gets mechanically expanded to 15 pages by an LLM, the arguments and bits of logic get confused and self-contradictory. Same with LLM-driven machine translation into languages that the author can't read or isn't sufficiently literate in.
That's the kind of low-hanging fruit that an LLM content ban needs to address ASAP. It doesn't actually really matter that much that people who are more "careful" about their LLM writing use are going to slip through the filters. I'm not on a purity crusade here. I just want my time to be respected and for people to put forth their own ideas instead of putting them through a statistical blender for no good reason, or at least use the blender well enough that I can't tell that they did.
Every good moderation system has a carveout for the mods to deal with people who are intentionally abusing the margins of overly-objective rules. Moderation is always subjective, even when it pretends to be objective. Rules aren't laws! They aren't a program!
bitshift | a month ago
Well said! Thank you so much for writing this. I'm not anti-AI by a long stretch, but I am against people wasting my time, and I love rules that can be evaluated by readers without relying on suspicions. Banning all LLM content is overly broad, but I'm on board with banning "slop" in the pejorative sense.
Spam is a great comparison. It's okay to post an article from your company's blog—even if technically you got paid for writing it! The rule is against spam, not against money. And it's okay to ban an advertisement that's all fluff—even if you don't know for a fact that the author received cash versus a fruit basket in compensation.
apg | a month ago
Some people believe self / company promotion of any kind is spam in this community. It’s very subjective except in extremely obvious cases, and, honestly, not applied fairly. There are interesting self promotion posts that get squashed, while at the same time a run of 15 in a row self promoted, not very interesting, posts that make it through.
That’s fine! Moderation is imperfect, but it goes to the point that “slop” can’t always so easily be classified either.
[OP] orib | a month ago
And yet, we try to remove it. Let's do the same with the output of the dehumanizer.
k749gtnc9l3w | a month ago
Rules explicitly say that we do not in fact try to fully remove self-promotion, only to put some conditions on it.
apg | a month ago
I’m not arguing against you. I agree that blatant slop should be removed outright.
However, I am worried that non-blatant slop that people disagree with will just be marked as slop! This already happens. Yet, in this very thread, a person admits that they’re OK with slop if they can’t tell it was slop.
Would that person mark it as slop, still, if they agreed with it?
If they disagree? Yes. And this is why there’s no down vote button…
[OP] orib | a month ago
People already do this with spam. Do you believe we should welcome spam in response?
apg | a month ago
In the extreme case, do you shoot, and then bury people you disagree with? Of course not. You ignore them.
Disagreeing with an article doesn’t make it spam. Disagreeing with the tools that someone chooses to use to produce a piece of writing doesn’t necessarily make it “slop.”
It could be spam! It could be slop! And I agree that in blatantly obvious cases you mark it as such to protect others from wasting time.
I don’t believe we should blindly encourage discarding anything that appears to be LLM-assisted.
As now repeatedly stated, people are “fine” with LLM-assisted writing as long as they can’t tell.
However, rather than engage in thought, the moment it is accused of being “slop,” engagement stops and it’s suddenly irrelevant hogwash. This is stupid.
apg | a month ago
I just caught this highlighted in a reply. This is a wild take. “It’s totaaally fine as long as it tricks me.”
How many times have you been catfished? And when it’s finally revealed… do you chuckle a bit and go “Damn! You got me! That’s the third time this week!”
repl | a month ago
It is a take. I find it to be a sensible take. Someone needs to say this here. You don't need to be unnecessarily antagonistic while replying. The person you are replying to is taking the time to put forward their thoughts on an important matter and sharing it with us here. I think they are doing it well and I appreciate their comments. It is alright to disagree with them but you can do it without being so antagonistic.
From your other comment:
Sheesh! Can we stop with this type of reply please? This is not the quality of conversation I expect when I come to a meta thread or any thread for that matter. I demand better from our crustaceans.
You have been here for a long time, so you should know this from https://lobste.rs/about already:
apg | a month ago
This person is admitting that they may get value out of LLM generated content if they are tricked into reading it!
What if they are tricked into reading it, and then they disagree? Do they then go claim it’s slop? If they agreed, is it still slop?
This is the exact point I’ve been trying to make. What is “slop” is too nuanced, except in the very obvious cases… which just means that heavy moderation will result in people trying harder to hide the fact that they didn’t write it.
I’d also invite you to reconsider who is the antagonist.
apg | a month ago
So… you agree. Got it.
mk12 | a month ago
If it were that simple then the labs would bake “avoid LLMisms” into the system prompt.
apg | a month ago
These things do have a personality baked into the prompt… that’s kind of the idea?
What they aren’t… is a tool designed to write deceptive blog posts by default. Blogging died with the rise of the 140 character “microblog”, for the general public.
fleebee | a month ago
I don't think it's very likely that the kind of "author" who types a 2-sentence prompt into ChatGPT and publishes the result verbatim in their blog will take the time to cover their tracks.
LenFalken | a month ago
They literally gave you examples. If someone makes the effort to mask their LLM slop, is it still slop? Isn't it now just LLM-aided writing at the forefront?
apg | a month ago
There are people in this thread that believe any LLM usage is slop and would prefer it not be allowed.
LenFalken | a month ago
Yeah, but I'm kind of arguing in your favor too: any well masked LLM output is effectively good enough to share if no human can know it's from an LLM.
Otherwise, you get the ban :)
Corbin | a month ago
Quality is neither objective nor subjective. Rather, Quality emerges from the interaction between subjects and objects. For the complete explanation, read Pirsig 1974.
apg | a month ago
That’s clearly slop!
Here’s what the slop machine I used said, instead: “Quality is what you feel when something is good but you cannot fully explain why.”
David_Gerard | a month ago
I keep being surprised by people who resolutely disbelieve that others can tell writing styles apart, and will insist repeatedly that's not possible.
apg | a month ago
Who says analyzing writing styles is impossible? The billions of people over the years that have suffered through a literature class would like a word.
That doesn’t mean that all of those people can definitely determine if something is “slop” or not. The most commonly cited “slop clue” is “It uses em dashes correctly!” I guess that probably does work well, because the vast majority of US Americans read and write at a 4th grade level, so of course they don’t know how to use punctuation…
matheusmoreira | a month ago
When exactly does something become slop? Is it just a case of classifying everything that comes out of an LLM as slop?
Does the LLM's output cease to be slop if I edit it? If I write something and have an LLM edit it, does my writing become slop in the process?
xyproto | a month ago
We could just fight fire with fire and use an LLM slop detector.
apg | a month ago
Do you have an accurate slop detector? Cause I’ve never seen one, and since Lemon squeezers are ultimately optimizers, subtle changes until it’s defeated will just be done.
wareya | a month ago
You don't need one. Just like you don't need absolute accurate detectors for every other rule violation that's currently on the rules. You're being disingenuous.
apg | a month ago
No. I am pointing out that you can’t “fight fire with fire” if you don’t have a slop detector. You can accept an imperfect detector if you want, but the goal is to continue to have a community that posts quality content to discuss. Too many false positives reduce engagement. Too many false negatives, and all you’ve done is move the slop bar. Moving the slop bar just repeats the cycle.
dulaku | a month ago
Moving the bar far enough can be fine, see https://xkcd.com/810/
EDIT: To make my take clearer, I don't like LLMs on principle and will never voluntarily use them to generate content in my personal work. I'd go so far as to say everyone should follow that rule, but for ethical, financial, and environmental reasons that are all off-topic for this site. The only on-topic objection I have is the flood of slop, defined as low-quality content that is low-quality because it was LLM-generated. Note that this does not imply all LLM-generated content is slop.
Empirically, there is quite a lot of this out there; from a community management perspective, it makes sense to have a rule against LLM-generated content in general, not because the goal is to actually enforce a strict ban against LLM-generated content, but because the goal is to enforce a quality bar and people who put in the work to evade the rule-as-written are going to inadvertently satisfy the quality bar, satisfying the rule-in-spirit.
Society has tons of these kinds of norms, sometimes reasonable and sometimes not. The point of censoring certain words on TV isn't to stop people from saying them - it's to signal that they aren't appropriate for "polite" society. The point of requiring people to get and carry driver's licenses in order to drive has little to do with the little bit of plastic - it's to force people to do the learning you're supposed to in order to get the plastic. And the point of banning commercial content as spam isn't to prevent any content that could conceivably generate revenue - it's to signal that content posted for the primary purpose of generating revenue isn't welcome even if it tangentially discusses computing, so works that resemble that sort of content have to meet a higher bar of being clearly relevant than other works.
For these purposes we do not need an objectively verifiable, consistently reliable, philosophically pure method for detecting violators of the written rule. We just need a method that's good enough to cause empirical improvements in the empirical problem the rule is intended to address.
apropos | a month ago
~Irene, is this "community consensus" enough for you? https://lobste.rs/top/10y
[OP] orib | a month ago
Three of the top ten are about reducing the impact of AI.
addison | a month ago
Oh man I was kind of wondering where this stacked up. Surprisingly high.
ambee | a month ago
These threads are a travesty because nothing is stopping one person from replying to multiple threads saying "you're wrong, actually". LLMs aren't even a unique issue on this site when it comes to dissent in the comments. It's not exactly rare for a topic to be mentioned on this site, and for arguments against the existence/utility of that topic to be argued to death in every thread related to the topic.
It is not shocking LLMs are one of these topics. It turns out that LLMs become relevant every time it turns out someone used them.
My opinion is that there are no cases where submission of LLM output is justified. Do not pretend to speak languages you do not speak, do not pretend to know things you don't know, and don't pretend to have written things you did not write. It's simply anti-social to insist that your lack of effort (whether to the content, or to following netiquette) be respected and enjoyed the same way as real content.
In order to meet the reality of the situation, I would ultimately engage with "LLM content with declaration of LLM use" as a solution, as it would make avoiding the content as simple as if it were banned. I started with my actual opinion, this is the dilution of said opinion into a tolerable outcome that is less strict.
Additionally, the burden to speak something other than English falls on the native speakers of English.
I'm sorry to hear some folk feel like they don't get to publish content in their own language, fearing that an English-speaking audience will never form due to language barriers. Slopping your native-language text isn't the solution. (I have no solution to prescribe as I got to learn English natively therefore I get to engage with localization of my content into other languages I speak as an enriching activity)
pyj | a month ago
I'd rather prioritize seeing every blog post from every incredible person here than ever see an LLM generated article.
Some don't hesitate to post their own content, but others do. It'd be nice to have a mass list of blogs from people on this site so I could add to my RSS reader list. "Homepage" is in our profile already so maybe there's a way to generate and make that info available.
DanOpcode | a month ago
Yes please. At the very least remove the most blatant AI generated blog posts. If it's not obvious, let it be.
If the author cannot bother to type it out himself/herself, I cannot bother reading it. And even if it's the author's own original thoughts, the AI writing style is tiresome.
pta2002 | a month ago
A lot of rules in this site are already fuzzy, and I don’t see an issue with adding this one as well. It would definitely make the site better in terms of quality content, in my opinion. It would also stop two behaviors that I am tired of seeing:
I saw some people mentioning things like the recent Linux vulnerabilities that were found by AI and how the site about the report was AI generated and could not have been posted. And honestly, good! I am so tired of AI slop sites and that extends to those. It’s not like you couldn’t post about the vulnerability, you could have posted a secondary high quality source like LWN, which is something that happens all the time when the primary source is not allowed to be posted for some reason.
And regarding translation/proofreading/em-dashes/etc: again, rules can be fuzzy, but honestly none of that makes a text “slop” in my opinion, and I am fine with applying the rule in a way that has some false negatives and only excludes obvious slop. There is a massive difference between a post that had some grammar mistakes fixed by an LLM and something that was so completely rewritten that it completely lost its character.
cyberia | a month ago
Concerns:
Proposal:
FrostKiwi | a month ago
Excellent decision, full support from me as well.
Helithumper | a month ago
Related: https://lobste.rs/s/wee21u/this_is_written_by_llm_comments_should_be
Some Examples:
I do agree that LLM Generated text should be filterable and/or flag-able.
The issue with labelling the content as "off-topic" is the case where the post is LLM-Generated, but actually on-topic. This may result in a contradictory use of off-topic. I think in the past a new option for flagging was discussed as well (e.g. https://lobste.rs/s/po97lh/new_tag_suggestion_genai_assisted)
I still think that a new flag option is a better option instead of abuse of the off-topic flag as discussed in https://lobste.rs/s/rkjpob/proposal_add_ai_generated_as_flag_reason.
[OP] orib | a month ago
I accept your feedback, I phrased that poorly: It should be disallowed.
I don't particularly care about filterable or flaggable. The users posting it should be removed from the site. Flagging or tagging it is a waste unless it leads to action being taken.
dzwdz | a month ago
New/casual users don't use filters. While I like the idea of a containment tag that I can filter out to make the site more pleasant to use, it's not a long-term solution. It biases the demographic of new users towards people who are fine with whatever we use the containment tag for, because those who are not will just be put off by it and leave.
The contradiction is in the premise, "LLM-generated, but actually on-topic". If we define LLM generated stuff to be off-topic, they're off-topic.
k749gtnc9l3w | a month ago
You are talking about selected worthless examples, but for all the garbage quality of LLM-generated text in https://lobste.rs/s/hfnps5/osmand_s_faster_offline_navigation it actually has unique on-topic content too…
[OP] orib | a month ago
No. I'm talking about not wanting any LLM generated text to be posted, to the best of our ability. Note the people voting it as spam. If I had seen it, I would have been one of them.
k749gtnc9l3w | a month ago
More people upvoted it, though, because it does have real content. (As you can see, I did complain about garbage LLM writing in the comments, but also tried to guess-the-prompt with bullet points of the meaningful content from the post)
atk | a month ago
Well, I'd say this is a good idea, but I also think it's going to turn into witch hunts.
The reason LLMs write like they do is because someone out there writes like that.
Or close enough to it.
I don't know what to advise. Other than caution.
[OP] orib | a month ago
Yes, there will be mistakes. The problem with rushing to build an unpleasant future is that things tend to get worse. People that shouldn't have to care about certain problems start needing to. We already opted out of the best outcome, now we're trying to find ways to minimize the tech industry's harm.
It may be worth thinking about what the things that get built may be used to do, and not merely what you hope they will do.
Fixing things after they get broken is hard, and even a good job leaves scars.
pta2002 | a month ago
There already are witch hunts, see yesterday’s meta post about this. I’d rather have some false negatives in the rules and let some plausibly deniable writing through and just keep out the obvious slop than have nothing at all. And at least then you could say “these are the rules, it’s decided the post is staying/leaving, comments arguing about the rules are off topic” and be done with it.
Also, I’ve never seen someone write like ChatGPT does. Maybe if all they do is post worthless fake positivity entrepreneurial shit on LinkedIn? But that’s a different kind of low effort slop that shouldn’t be allowed.
atk | a month ago
If the goal is to ban low effort content, why not just ban that?
pta2002 | a month ago
I'd be fine with that too, honestly. But before AI, we basically would never see low effort content posted here, and AI makes a lot of things low effort, e.g. an entire desktop written in assembly (which was posted a while back). Without AI, seriously impressive. With AI, meh.
james | a month ago
Not a single response from @pushcx here. Continues the pattern of weak/absent moderation and inability to set clear guardrails on this issue, or take any action which makes lobste.rs policy clear and calms the community.
sluongng | a month ago
I disagree. I think an LLM is but a tool. We should ban based on contents and values instead of the tool used to create the article. Spamming accounts obviously should be banned. Bad quality articles are already subject to downvoting.
pta2002 | a month ago
The issue is that you just downvote comments based on a reason, and “AI slop” or “low quality” isn’t one of them. What happens is that those posts just end up flagged as spam (which they technically aren’t) or slapped with the vibecoding tag (which isn’t really accurate).
erock | a month ago
Agreed, flag as “low quality” which is what we actually care about. Someone could produce an extremely well thought out and high quality article that fits right into this site but could have used a tool to help them. It would be a shame to ban those posts.
blinry | a month ago
I'd like this. What has the process for policy changes on Lobsters been in the past? I've seen multiple proposals of varying magnitude surrounding LLM submissions in the last months, but haven't noticed any official reaction or changes.
SamRW | a month ago
I strongly support this. I've discussed why in the past so I won't rehash it here, especially since other people have raised most of my concerns with LLM-generated articles.
rs86 | a month ago
Sounds good. Slop posting is today’s version of let me google that for you
oceanhaiyang | a month ago
All forms of dashes are illegal now
icefox | a month ago
I don't know why people seem to think em dashes are some kind of smoking gun. Pandoc's HTML output will generate them from markdown --, for example.
bitshift | a month ago
First they came for the hyphens, and I did not speak out, because I used en-dashes for numeric ranges.
Then they came for the en-dashes, and I did not speak out, because I used em-dashes for the cutting phrases in my haiku…
Then they came for the horizontal rules, and there was no one left to speak out for me.
chrismorgan | a month ago
I’ve felt for a few years that EM DASH (—) is too narrow a lot of the time. I sometimes use TWO-EM DASH (⸺) or in extremities THREE-EM DASH (⸻).
I went through a phase of using HYPHEN instead of HYPHEN-MINUS about 5–8 years ago, but I gave up on that one. Just a little bit too inconvenient to type, and interacts poorly with too many fonts. But I do always use MINUS SIGN if that’s what I mean.
(The tricky thing with HYPHEN-MINUS is that in various situations it is actually correct and HYPHEN would be wrong. If talking about left-pad or HYPHEN-MINUS, for example, you need to use HYPHEN-MINUS rather than HYPHEN. The existence of straight quotation marks and hyphen-minus is one of the more disappointing parts of typewriter/computer history to me.)
singpolyma | a month ago
This presumes the user has any way of knowing this is the case.
matheusmoreira | a month ago
Please define "LLM generated".
Whole article generation? LLM draft with human finish? Human draft with LLM finish? Is proof reading OK? Or is it permanently tainted the second an LLM touches it?
You're discussing articles specifically. What about projects and code? Low effort vibecoding banned? How about high effort? LLM programming plus human review OK? What about human programming plus LLM review? Or is the project permanently tainted the second you commit a CLAUDE.md file?
It's important for me to know exactly where the limits are so that I know where I stand. Depending on which categories you decide to ban, I may or may not be excluded from ever sharing anything I make.
vlnn | a month ago
I like the intention of leaving lobste.rs hand-picked and human-made, but I don't support policies that are nearly impossible to enforce transparently.
sporiff | a month ago
Broadly, I agree that LLM-generated text is not something I want to see on a site dedicated to discussion about technology. I believe that discussion of ideas requires the ideas to come from a genuine human source, so an LLM slop article is not something that should be considered on-topic in my opinion.
I am a bit more sympathetic to those who use LLMs for translation of non-English content, however. There is a definite bias towards English-language content in the tech space, and I think it's unfortunately correct that a non-English article/video will generally perform less well and lead to less informed discussion. A carveout for this is therefore probably sensible, but I would want there to be a link to the original content clearly accessible so that people can validate the translation.
As for people using LLMs to "clean up" content: I don't feel this is as violative as purely LLM-generated content, but disclosing it would be wise. I'm generally untrusting of this statement, though. "Cleaning up" content using generative methods that essentially rewrite entire sections can easily change the meaning behind what's written.
thesnarky1 | a month ago
I continue to believe that this polarizing topic is best solved not by banning, but marking these submissions as such. Just like we have PDF and Video tags that let me filter out, for example, large file downloads that I don't want on my phone, LLM-generated content should be a meta tag that identifies it as such and lets users filter as they like.
This gets around the usual complaint of "A tag will say it is on topic" because those meta tags don't allow a submission to go through. It also allows users who don't want to read LLM-generated material to hide it, while those who want to read it can continue to do so.
The other complaint is going to be that the submitter might not know it is LLM-generated to tag it as such, and that's fine. We have a tag submission system for a reason and it is used every day. Just check the mod log.
At no point does this excuse slop, which can still be flagged as spam, whether it was written by a human or a machine.
dzwdz | a month ago
They still imply that it's fine to post things covered by these tags.
thesnarky1 | a month ago
Yes, it is OK to post videos and PDFs. And unless site rules change, whether you or I like it, LLM-generated content is within bounds. My argument is that rather than argue every time, let's just clearly mark it so folks can filter and be done.
mrexodia | a month ago
When posting your own work I agree 100%, but I think it’s unfair to expect submitters to become “LLM police”. I speak English well, but it’s not my native language and I cannot reliably tell if something is LLM-generated outside of extremely obvious cases that Pangram also detects…
weberc2 | a month ago
We should get really clear about what constitutes "LLM generated" (e.g., is it "LLM generated" if an LLM is used for spell check, grammar, punctuation, etc cleanup? Or cleaning up the language on behalf of foreign speakers? Or automatic translation?)
We should also be clear about how we are going to identify whether something is LLM generated or not. Is it just "a lot of people think it is"? I would also like to see a complementary rule against comments that accuse an article of being LLM generated--we don't need the top 50 comments in every article to be speculating about whether an article is LLM generated or not.
dzwdz | a month ago
That's impossible to tell, so obviously not. We've also had spellcheckers for decades now that don't come with all the baggage of LLMs.
More broadly, I think trying to figure out exact rules is counterproductive. This rule will never be fully objective; it will by necessity be left to the discretion of the moderators and our community. As a community, we'll never reach consensus on where to draw the line, and trying to will just delay the introduction of the rule endlessly. Perfect is the enemy of good.
I wonder if we could somehow make it so these comment threads start out collapsed. I don't know how this would be achieved. It's been argued that comments are better for accountability than anonymous flags, which is probably useful for such a controversial topic.
weberc2 | a month ago
If we rely on human judgment to distinguish between LLMs and authentic humans, we're creating a competition that LLMs are already very capable of winning. It will be very difficult for humans to beat LLMs at the task of convincing this community that their content is authentic, especially as LLMs improve. The extent to which we think we can distinguish between LLM content and authentic human content today is the extent to which LLMs are not being prompted to sound like a Lobste.rs or HN submission.
Relying on human judgment will multiply the work required of moderators and make it impossible for human submitters and authors to know whether their content will be flagged by this community. It sets up an arms race between individual people and LLMs, and LLMs (trained on the entire Internet, including every Lobste.rs submission and all the comments, moderation log, etc) are already very capable of winning that race.
Instead of trying to litigate whether an LLM was used, I think we should craft rules and tools around the things that we actually care about. If we hate LLMs because their content is bland or the information is low quality or the content is uninteresting, then we should have tools to moderate those things (and to an extent, we already have upvotes and flags and the hide button and so on). Trying to fight against LLMs via judgment is a losing battle.
[OP] orib | a month ago
The thing that's valuable to me is that it was written by humans.
I hate LLMs because of the dehumanization project that they are a result of. I don't care to debate the quality of output of a machine designed to devalue and replace human creativity. Remember, we successfully automated art before we automated code.
If you support this endeavor, I think you are an asshole, and you should remove yourself from human society. Go spend your time talking to the machines instead, and stop harming people.
LLM text is written in the blood of future generations. I am only being a little overdramatic when I say this. If it works better, our future becomes more grim.
weberc2 | a month ago
It’s precisely because I agree with you that I don’t want to see us set up an arms race that we will lose. Regulating the use of AI by relying on human judgment about what is/n’t authentic is only viable if AI is unable to match human patterns as well as humans, and we are already past that point. The only reason you can identify some articles as “LLM generated” is because no one asked a frontier model to try to fool you. For all you know, you have been happily upvoting articles that have been written by LLMs that you believe to be written by humans.
This is not a problem that we can solve by simply burying our head deeper in the sand. We have to confront the practical reality that we have no good way to distinguish between LLMs and humans exclusively by examining outputs—any such rule is going to benefit LLMs more than people.
[OP] orib | a month ago
Yes, but shutting down the data centers first requires building consensus that the slop purveyors have worn out their welcome in society.
I'm happy to change tools and approaches for making slop unacceptable, but the consensus that slop is unacceptable needs to be the starting point.
And, I still think that a human should be in the loop to pick thresholds and hand out bans.
k749gtnc9l3w | a month ago
Historically, building a consensus against a technology takes a couple times longer than building a consensus about regulating an abusive monopoly. Arguably, we have an abusive oligopoly here, but still a more promising way.
[OP] orib | a month ago
Absolutely. Sounds like we'd better get a start as soon as possible, then.
Let's get to work.
coby | a month ago
I dislike LLM slop, but there's a reason LLMs write like the lowest common denominator writer. People write like that too. How are you enforcing this? You might as well try to ban low-quality content altogether.
I see people in support of this rule saying "yes! I'm so tired of reading slop, make it go away!" I'm tired of it too, but this is the reality now, and you can't just legislate it away.
My major gripe with LLMs is that people outsource their thinking to them, including their writing. So I share many folks' frustrations because I want to hear your thoughts, not some shortcut you took. But banning content that sounds vaguely LLM-generated is a shortcut too.
I'd support this if I thought such a rule could be effective. But rules are blunt instruments.
dzwdz | a month ago
The "this is slop" comments have been working pretty nicely so far, many people expressed appreciation for these under the other story. It makes sense to take this a step further.
[OP] orib | a month ago
We currently ban spam. I would treat this identically. Sometimes spam slips through, but on the whole we have little spam.
cultpony | a month ago
I respectfully but fundamentally disagree. Too much harm is already done by "No LLM related submissions" policies in communities I was forced to leave because people are too lazy to differentiate neurodivergent writing from neuronormative writing, or even just non-native speakers and their writing from an LLM. By instituting such a policy, people will be pushed out and their dissent is going to be silenced on the back of "you're just posting slop" yet again and I would seriously reconsider if this community is a place I consider safe.
The only cost for you for non-implementation is the minor inconvenience of sometimes you might be looking at slop that hasn't been downvoted yet or that you have to write "LLM slop" below an article you disagree with. The cost of implementation is that everyone who doesn't write like you becomes effectively ostracized and silenced.
dzwdz | a month ago
We don't have downvotes. Flags barely have any effect on story standing, we've had stories take up the first spot on the front page even though they had around the same amount of flags as upvotes (IIRC - it might've been half the amount of flags).
Could we get any examples of neurodivergent writing that got confused for LLM output? I don't think I've seen anyone bring that up as a response to articles being accused of being slop here. Even aside from this proposal, it'd just be useful to learn what might trigger false positives in my brain's AI detection algorithm.
matheusmoreira | a month ago
The neurodivergence argument is persuasive. I got Claude to proofread one of my articles for me and it actually flagged the narrative and style as an actual problem. I ignored that to maintain my authenticity and posted it anyway. Then HN said I got issues and I've been regretting not listening to Claude ever since.
dpc_pw | a month ago
Please recommend a platform like lobsters where LLM-haters are disallowed.
I'm really tired of "I think this was written by AI" comments under every article.
I don't care who wrote it or how. If it's informative, interesting and useful I want to see it. I want people to discuss and even argue the subject in the comments, and not count how many em-dashes the article contained.
dzwdz | a month ago
...news.ycombinator.com? Not disallowed, but you won't see pretty much any such comments there, and people are happy to discuss AI-generated things there.
rau | a month ago
Exactly. A while back I read an article and posted it here, not knowing that it was generated by an LLM. Now that I know, I think that it’s still an interesting article. The author (i.e., the man who directed the machine) conveyed an interesting message.
I think it’s kind of like a speech: while the speaker may not have written the words he is speaking, he ought to have collaborated closely with his writing team, and he is accountable for what he says.
pta2002 | a month ago
stig | a month ago
I understand where this is coming from, but it's a tall order asking moderators to enforce it.
[OP] orib | a month ago
I would happily volunteer to help, and I think others would as well.
Student | a month ago
Dvorak keyboard generated posts should be banned. I think users posting them regularly should be banned from the site.
Edit: I have flagged this post as spam for being written on a Dvorak keyboard.
hoistbypetard | a month ago
Can you explain why you think this is a reasonable analogue? The topic here is about how posts that aren't generated by humans should be disallowed. There's a big difference between a data entry method and how the post was generated.
spc476 | a month ago
I've seen @Student make the argument that since no one would submit a PR with "I used IntelliJ/Eclipse/vi to write this code", no one should have to disclose the use of an LLM either, as it's "just a tool." I think Student is saying that banning LLM-written blog posts/programs is just as silly as banning someone for using a non-QWERTY keyboard. It's a form of satire. Not good satire, but satire nonetheless.
Student | a month ago
There’s no difference whatsoever. In both cases you’re saying nothing about the text just complaining about the tools you imagine were used.
I say “imagine” advisedly because the complaint gets thrown around without any evidence and is often rejected by the authors. It’s most reminiscent of “transvestigators” throwing around accusations because they want to be in the know about a secret while dunking on an outgroup.
At best it’s deeply tedious and drives out on topic discussion.
hoistbypetard | a month ago
That's inaccurate. Saying that a text resembles the output of an LLM is exactly saying something about the text.
apg | a month ago
This is a contentious comment (I saw it at -10, and now at -8), but it’s a fair criticism of the idea as far as I’m concerned.
Rovanion | a month ago
Care to explain why you think that? I read it as both inflammatory and an apples to oranges comparison.
apg | a month ago
I’m on the side of limited LLM generated submissions, and banning those who continue to submit them, fwiw.
However, a part of me thinks that good content is good content, and I don’t necessarily care how it was written. If someone authors a blog post with a spell checker, or a grammar checker, or by speech-to-text, but the thought came from them, we probably can agree that it’s OK.
If an author has a thought, uses that to prompt an LLM to build an argument for or against it in a manner that treats the LLM as an “assistant” … is that OK? Where does the line get drawn exactly?
“Vibeblogging” — we can definitely agree is just slop and ban it. “Write a post about how a panda should have been the Linux mascot.” But, “Help me restructure this argument about why Object Oriented Programming blah blah blahs under the OOPS theorem of blah” … not sure?
The OP, here, points at this, albeit, as you say in a potentially inflammatory way.
spc476 | a month ago
I've seen Grammarly advertisements for years now, and I've always had this unease about the product. I don't have an issue with spell checkers, but grammar is something else. Helping a person use "their", "they're" and "there" correctly is one thing, but I found Grammarly to be overtly paternalistic in its advice. There's a line between "helping an author" and "erasing an author's authentic voice" that I feel Grammarly crosses. I even have friends who are writers (one a published author!) who have similar misgivings about the tool. Now that LLMs can act as Grammarly dialed to 11 ... um ... I think something gets lost in translation.
Student | a month ago
How nice that you don’t need assistive technology.
kel | a month ago
If there were no difference, it wouldn't be immediately obvious that someone basically copy-pasted from an LLM for their blog. On the other hand, I cannot tell that someone wrote a blog post with a different keyboard.
This is silly whataboutism. If I wanted to read what an LLM had to say on a subject, I'd just open a prompt. I come to this website to read what people think and experience. Two things LLMs cannot do, as evidenced by the endless bland nonsense produced by the technology.
apg | a month ago
If you can’t see the difference between:
AND
… then I’m not sure we’ll ever get anywhere. If you can’t look past the fact that the author used the “Dvorak keyboard layout” or (heaven forbid!) the nano editor, or (omg what were you thinking?) ispell… to write a blog post that you might actually find something valuable from, god help us all. I’ve actually heard people make arguments such as “he’s using nano? dude is clearly not a serious programmer,” and all this hate feels too familiar.
Look, I get it. LLMs are horrific in sooo many ways, and I wish the bubble would pop, and their champions would get some sort of punishment for their malfeasance along the way. But the reality is, people find value in these things, and I don’t think they’ll completely disappear anytime soon. How we access them will definitely change. But, unless you engage only with content that authors have certified as being inference / LLM free, and then fully trust them to not lie to you, you’re going to get “tricked” into reading something that came from an LLM. Sorry to disappoint you. You’ll be fine, I promise.
nrdxp | a month ago
The position is, in my view, incoherent; a fact best illustrated by the impossibility of categorically defining "LLM-generated." Any attempt to enforce such a boundary will inevitably devolve into a witch hunt.
High-effort, highly coherent work can be drafted with the aid of an LLM, just as low-effort slop can be produced without one. I'm more than happy to advocate against slop, human or machine-made, but policing the tool rather than the output is counterproductive, performative moralisation.
The only thing I can reliably detect beneath this growing sentiment is unearned moral superiority married to a peculiar blend of Ludditism and elitism. Worst of all is the dismissal of a technology that will, in the final analysis, refuse to be ignored; causing us to forfeit the opportunity to meaningfully shape how it is used and how it evolves.
matheusmoreira | a month ago
Completely agree.
Nearly 200 comments of almost nothing other than calls for blanket bans. Not a single coherent definition of what "slop" even is, or where the limits should be set. Asking the obvious question is apparently trolling.
It's becoming obvious that this is ideological opposition to AI and LLMs. That's fine but it'd be better if it's simply framed as such instead of using proxies like quality and effort.