  • Human intent in AI is the part benchmarks miss

    Caleb Gross’s “You can just say it” makes a clean argument about human intent in AI: defending people by saying they still outperform models is a weak move. The stronger claim is simpler. Humans matter before the comparison starts, and creative work should be judged by more than surface polish.

    The short version

    • Gross argues that tying human worth to better output than AI is fragile because model capability keeps moving.
    • His sharper definition of AI slop is work with form but little readable intent, not merely bad work or machine-made work.
    • The Hacker News discussion mostly found the intent framing useful, especially for writing, email, and AI-assisted coding.
    • The hard question is whether readers can still feel a person’s judgment when AI has cleaned up every sentence.

    What happened

    Caleb Gross published “You can just say it” on May 28, 2026. The essay pushes back on a common defense of human value in the age of generative AI: people are special because they can still do some things better than machines.

    That argument may feel reassuring for a while. It also makes human dignity depend on the next benchmark run. Gross’s alternative is intentionally plain: humans are valuable. You do not need to attach that claim to writing speed, design quality, coding productivity, or any other measure of output.

    The essay then moves from human value to creative quality. Gross describes creation as intent taking form. A resignation letter, a drawing, a design, a piece of code, or a message all carry some mix of what the maker meant and what the maker produced. Generative AI changes that balance because it can produce convincing form from a thin prompt.

    That is where the essay’s useful definition of AI slop appears. Slop is not automatically “content made with AI.” It is output where the intent is hard to find. A human can make it. A person using AI can avoid it. The difference is whether judgment, taste, and purpose remain visible.

    Why this is worth watching: human intent in AI

    The phrase “human intent in AI” can sound abstract until you apply it to ordinary work. Think about the email example in the essay. If someone uses a model to turn a blunt request into a long, polite message, the result may be smoother. It may also make the recipient work harder to infer what the sender actually wants.

    That matters for product teams and app builders. AI writing tools often sell polish: clearer tone, better structure, faster drafting. Polish is useful. The risk is that a product can make every message sound finished while removing the cues that tell the reader what the sender chose, cared about, or understood.

    The same applies to AI-assisted coding. A generated patch can look complete. The better question is whether the prompts, review comments, tests, and edits add up to a coherent specification. If they do, AI is helping a human express intent. If they do not, the model may be producing code-shaped material that nobody fully owns.

    What Hacker News readers are arguing about

    The main Hacker News thread was unusually substantive for an AI culture argument: 383 points and more than 200 comments. The most productive camp liked the essay because it separated a complaint about AI misuse from a blanket complaint about AI itself.

    One widely upvoted line of discussion treated the essay’s slop definition as a better mental model for AI-assisted coding. The useful distinction was between a chain of prompts that forms a real specification and a chain of retries that amounts to “it does not work, try again.” In the first case, the human is still steering. In the second, the human may be outsourcing responsibility.

    Another cluster focused on communication. Several commenters reacted to the quoted line about preferring the raw prompt over an AI-written email. The shared irritation was not that a machine touched the prose. It was that the sender might be asking the reader to decode a polished message the sender did not bother to write or fully understand.

    There was also pushback. Some readers disliked the essay’s religious reference to Genesis as support for human value, even when they agreed with the broader claim. Others argued over whether “valuable” was the right word at all, since it can imply something measurable. “Invaluable” felt closer to what some commenters wanted to say.

    The liveliest disagreement was about intent itself. One commenter prompted Claude to make something unconstrained and asked how anyone could be sure there was no intent in the result. Replies split between people who saw that as anthropomorphism and people who thought dismissing machine intent by saying “it is numbers” was too glib. That argument is not settled by Gross’s essay, but the essay gives readers a cleaner vocabulary for having it.

    The practical read

    If you are building with generative AI, the practical test is not “did AI touch this?” That question is already too blunt. Ask whether a reader, user, or teammate can still see the human intent in AI-assisted work.

    For writing tools, that means preserving the user’s point rather than inflating it into generic professional language. For coding tools, it means making review, tests, and constraints visible enough that the generated output has a responsible owner. For content teams, it means rejecting pieces that look finished but do not seem to come from anyone in particular.

    This is also a useful editorial standard. Bad AI output is easy to mock. Polished, empty output is harder to catch because it passes a quick scan. Gross’s essay is worth reading because it names that problem without pretending the answer is to avoid every AI tool.

    Human intent in AI is not nostalgia for manual labor. It is the part that tells another person, “someone meant this.” When that disappears, even technically competent output starts to feel cheap.

  • LLM smells are getting easy to spot

    LLM smells are the tiny tells that make AI-assisted writing or AI-built websites feel oddly familiar. A short post by Shiv After Dark put a useful name on the pattern: punchline-heavy prose, repeated sentence shapes, monospace-heavy pages, badges, cards, and step sections that keep appearing across unrelated work.

    The short version

    • LLM smells are not proof that a piece of work is bad. They are signs that the draft may still be too close to the model’s default style.
    • The clearest writing tells are punchline sentences, repeated short sentences, “X is the Y of Z” metaphors, and tidy contrast formulas.
    • The web design tells are just as visible: JetBrains Mono, step layouts, badge dots, familiar cards, and generic call-to-action buttons.
    • The useful editorial move is to treat AI output as a draft, then add concrete details, uneven human rhythm, and product-specific design choices.
    • Hacker News readers mostly pushed the argument toward code quality: AI output looks strongest when you do not yet know enough to judge it.

    What happened

    Shiv After Dark published “Various LLM smells” on May 28, 2026, after noticing that prose once polished by an LLM had started to resemble a lot of other writing on the web. The post is short, but the examples are sharp: aphoristic one-liners, strings of clipped sentences, metaphor templates, and the familiar “not merely X” style of contrast.

    The second half moves from prose to AI-generated websites. The author points to the same stack of visual habits showing up again and again: monospace typography, step sections, cards, buttons, blinking badge dots, and footnote-style flourishes. None of those choices are wrong by themselves. They become LLM smells when they arrive as a bundle, without much relationship to the product or audience.

    If you follow AI writing and web tooling, this fits a larger pattern. Models are good at producing plausible defaults. Plausible defaults are useful for a first pass. They are also easy to recognize once enough people publish them unchanged.

    Why this is worth watching

    LLM smells are worth watching because they are an editing problem, not a purity test. The author is not arguing that people should stop using AI for creative work. The better reading is more practical: if a model gives you a draft in seconds, you still need to remove the model’s house style before the work feels like yours.

    For writing, that means checking whether a sentence adds information or only adds mood. Punchy lines can work, but a whole page of them starts to feel assembled. The same goes for neat metaphors. “X is the visible signature of Y” may sound elegant the first time. By the tenth version, it reads like a preset.

    For web teams, LLM smells are a useful QA category. A landing page can be clean and still generic. If the typography, cards, steps, icons, and microcopy could belong to any AI startup, the page probably needs one more design pass. App builders should pay special attention here, because store listings, onboarding screens, and extension directories reward clarity but punish sameness.

    What Hacker News readers are arguing about

    The Hacker News discussion quickly widened from writing to competence. One of the strongest recurring points was that LLM output looks best in domains where the user is least able to judge it. That explains the split many people see in coding threads: beginners may experience the model as a dramatic productivity boost, while experienced engineers see the rework, missing context, and bad abstractions.

    Several commenters gave concrete coding examples. One described an assistant proposing an unsafe approach that would have bypassed a WebAssembly sandbox and executed submitted Python in the application container. Others complained about agent-generated codebases growing too large because each feature gets built in isolation: every modal is different, every button drifts, and business logic ends up scattered.

    There was a more positive camp too. Some readers said LLMs are genuinely useful for format conversions, API mappings, learning unfamiliar concepts, or getting past small obstacles. The practical distinction was not “use AI” versus “do not use AI.” It was whether the user has enough taste, tests, and domain knowledge to catch the smells before they harden into the final product.

    LLM smells checklist

    Before the final edit, look for the repeated shapes: punchline stacking, metaphor templates, tidy contrast lines, generic cards, and typography that says more about the model than the product.

    The practical read

    Use LLM smells as a checklist before publishing. In prose, look for punchline stacking, repeated short sentences, decorative metaphors, tidy contrast formulas, and abstract claims that do not name a real example. Replace them with specifics. Add the thing you actually saw, measured, built, shipped, or changed.
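
    The prose half of that checklist can be roughed out as a script. What follows is a minimal sketch, not a method from the original post: the function names, the regular expressions, and the thresholds (six words counts as a short sentence, three in a row counts as a run) are all assumptions chosen for illustration. It only flags candidates for a human editor to look at; a match is not proof that a sentence is bad.

```python
import re

# Hypothetical thresholds, chosen for illustration only; tune for your own drafts.
SHORT_SENTENCE_WORDS = 6   # a sentence this long or shorter counts as "short"
SHORT_RUN_LENGTH = 3       # this many short sentences in a row counts as a run

# Rough, assumed patterns for two of the prose tells named in the text.
# "X is the Y of Z" style metaphors (Y may be one or two words).
METAPHOR_TEMPLATE = re.compile(r"\b\w+ is the \w+(?: \w+)? of \w+", re.IGNORECASE)
# Tidy contrast formulas such as "not merely X" or "not just X".
CONTRAST_TEMPLATE = re.compile(r"\bnot (?:merely|just|only)\b", re.IGNORECASE)


def split_sentences(text):
    """Naive sentence splitter: break after ., !, or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def find_smells(text):
    """Return a list of (smell_name, sentence) pairs worth a second look."""
    findings = []
    sentences = split_sentences(text)

    # Flag the first sentence of each run of consecutive short sentences.
    run = 0
    for i, sentence in enumerate(sentences):
        if len(sentence.split()) <= SHORT_SENTENCE_WORDS:
            run += 1
        else:
            run = 0
        if run == SHORT_RUN_LENGTH:
            findings.append(("short_sentence_run", sentences[i - SHORT_RUN_LENGTH + 1]))

    # Flag template-shaped metaphors and contrast formulas.
    for sentence in sentences:
        if METAPHOR_TEMPLATE.search(sentence):
            findings.append(("metaphor_template", sentence))
        if CONTRAST_TEMPLATE.search(sentence):
            findings.append(("contrast_formula", sentence))

    return findings


if __name__ == "__main__":
    draft = (
        "Speed matters. Taste matters more. Ship it. "
        "Polish is the visible signature of care and it is not merely decoration."
    )
    for name, sentence in find_smells(draft):
        print(f"{name}: {sentence}")
```

    A checker like this cannot judge whether a claim names a real example, so the last item on the checklist still needs a person. Treat the output as a list of places to add specifics, not as a score.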

    In interface work, scan for the default AI landing page kit: monospace labels, gradient cards, step grids, badge dots, identical buttons, and generic hero copy. Keep the pieces that fit. Cut the ones that only make the page look “AI polished.” The goal is not to hide the tool. The goal is to make the result specific enough that the tool is no longer the most visible author.

    The same rule applies to code. AI can get you moving, especially on routine or verifiable tasks. But if you cannot review the output, you are outsourcing judgment. That is where LLM smells stop being cosmetic and start turning into maintenance work.
