Markdown Hard-Wrapping Is an Inherited Convention That Hurts LLM Workflows

If you build with Claude or any LLM as an active collaborator rather than autocomplete, this one is worth a look.

TL;DR

Hard-wrapping markdown at 80 characters is a 50-year-old terminal convention that LLMs now reproduce by default because their training data is full of it. In an LLM-collaborative workflow it actively hurts you: diffs balloon, grep misses matches, retrieval chunks badly, and model output drifts when half your files use the convention and half don't. The fix is to stop hard-wrapping anywhere in your project, pick semantic paragraphs (one paragraph per line) or semantic line breaks (one sentence per line) per folder based on how that folder's content gets read and edited, and let your editor handle soft-wrap for display. Treat project files as a runtime, not as documentation.

The discovery

I came across a Claude system prompt posted to Reddit, dense and content-rich but only 71 lines long, with each paragraph sitting on a single line. Then I looked at my own project files. Around 100 markdown files of decision logs, architecture specs, reference docs, and glossaries, and most paragraphs were broken across multiple lines with manual newlines at roughly 70-80 characters. Same amount of text, two to three times the line count.

The system prompt was written for a machine to read. Mine were written by following an inherited convention without anyone noticing.

How the convention got there

The 80-column rule is roughly 50 years old. Punch cards were literally 80 columns, VT100 terminals inherited the width, and early Unix tooling assumed fixed-width displays everywhere. Email reinforced it with RFC 2822 recommending 78 characters. By the time Markdown was invented in 2004, every developer had internalized "wrap at 80" as a virtue, and Gruber wrapped his own examples that way. The spec never required it, GitHub renders unwrapped markdown identically, but the convention propagated through every README, every dotfiles repo, every "how to write good docs" tutorial.

If you've learned ICM from

@Jake Van Clief

and traced context engineering back to its origins, this should feel familiar. Same lineage, same source. The text-handling defaults we live with today were shaped by hardware that hasn't existed since most of us were kids.

How it got into LLM outputs

LLMs learn formatting from training data, and the vast majority of public markdown (open source READMEs, documentation sites, technical blogs, Stack Overflow answers, GitHub wikis) is hard-wrapped because developers wrote it following the inherited convention. So when you ask Claude or GPT or Gemini to write markdown, the model produces hard-wrapped output by default. It isn't choosing to wrap, it's pattern-matching to what "markdown" statistically looks like in its training corpus.

Which means every LLM is silently re-encoding a 50-year-old terminal convention into your project files, every time it writes one. If you don't notice and correct it, your files inherit the convention without anyone making a decision about it.

Anthropic's own system prompt isn't wrapped at 80. They want every paragraph readable as a single semantic unit.

Why unwrapped is better for LLM workflows

Diffs become accurate. Change one word in a hard-wrapped paragraph and the wrapping reflows every following line, so the diff shows eight lines changed for a one-word edit. Unwrapped, the diff is the change.

Grep starts working. Hard wraps break phrases across lines, so a search for a multi-word phrase in a wrapped file may return no hits because the phrase straddles a newline. Unwrapped text is searchable.

Retrieval and chunking match author intent. Most retrieval systems split documents on paragraph boundaries (blank lines), and hard wraps create false paragraph-like breaks inside what is logically one paragraph. Unwrapped, the chunks match what the author actually wrote. Better chunking has a downstream effect on hallucination risk too, since the model sees complete context units rather than fragments and has less reason to fill gaps with plausible-sounding inventions.

Model outputs become more consistent. If half your files are wrapped and half aren't, Claude pattern-matches to whichever style appears first in its context window and produces inconsistent output. Consistent input gives consistent output.

Input vs output

The wins above apply when an LLM is reading your files, but the story is different for what it writes back.

If the model is writing into a markdown file that lives in your project, it should write unwrapped to match the source convention. If the model is generating prose for the chat window or anywhere a renderer will soft-wrap on its own, hard wraps in the output don't help, they just clutter.

The only places hard-wrapped output genuinely helps are destinations that lack soft-wrap, like git commit messages (50/72 still applies), some plain-text email contexts, or certain CLI-bound text. Match the wrap convention to whoever consumes the output, and for LLM-as-consumer, unwrapped wins.

Semantic paragraphs vs semantic sentences

The two conventions aren't interchangeable, they fit different content types.

Semantic paragraphs (one paragraph per line) is the right default for prose that reads continuously and gets edited in chunks: README intros, design rationale, narrative documentation, explainers. The paragraph is the unit of meaning and the unit of change, so paragraph-level diffs match how you actually edit. Raw view stays compact, soft-wrap handles display.

Semantic line breaks (one sentence per line) is the right default for content where individual sentences are discrete claims that get amended in isolation: decision logs, specs, glossaries, rule lists, anything append-only where you might come back and correct one sentence six months later. Sentence-level diffs match sentence-level edits, and git blame stays precise.

You can mix both in the same project, you just shouldn't mix them inside the same file. Pick the one that fits the read-and-edit pattern of the content.

Putting it in your folder context

This is where ICM folder-level steering files do real work. A project-wide CLAUDE.md is the wrong place to set the rule because the rule depends on what's in the folder. Better to put it in the folder context file for each section of the project.

In a decisions/ folder README or context file:

Files in this folder use semantic line breaks. One sentence per line, no mid-sentence wraps. Append new entries above any living section.

In a docs/ or explainers/ folder:

Files in this folder use semantic paragraphs. One paragraph per line, soft-wrapped by the editor. No hard-wrapping at 80 characters.

At the project-level CLAUDE.md, all you need is the pointer:

Markdown wrap convention is set per folder. Check the folder context file before writing. Never hard-wrap at 80 characters anywhere in this project.

That last line is the one project-wide rule worth keeping at the top. Hard-wrapping is always wrong for this stack. Which of the two unwrapped conventions to use is a folder-scoped decision.

If your ICM structure doesn't have folder-level context files yet, this is a good reason to add them. Until then, put both conventions and which folders use which directly in the project-root CLAUDE.md so the rule still travels with the work.

Pair any of this with "prettier.proseWrap": "never" in your editor settings and an on-touch migration approach where you normalize a wrapped file as part of editing it, and the corpus converges over a few weeks without a big-bang reformat.

The bigger principle

File formatting is a context engineering concern, not an aesthetic one. Most developers think of markdown formatting as how it looks, but if your workflow involves LLMs reading and writing those files, formatting becomes how it parses, diffs, retrieves, and re-enters context.

A useful test for any inherited convention: what does this optimize for, and is that still my constraint? Hard-wrapping optimizes for fixed-width terminals. My constraint is clean diffs and LLM-readable context.

This applies well beyond markdown. Tabs vs spaces, 80-char limits in code, file naming, directory structures, most teams follow rules whose original justification evaporated a decade ago. In an LLM-collaborative workflow, it's worth auditing every default with one question: does this still serve the system I'm actually building in?

Caveats

Doesn't matter for short or rarely-touched files, so don't bulk-convert a 30-line README that gets edited twice a year. Some tools still expect hard wraps in older email clients, certain static site generators, or some review tools, so check before mass-reformatting.

Takeaway

I assumed hard-wrapped markdown was a neutral default. It isn't. It's an inherited convention from terminal-era tooling, propagated into modern LLM outputs through training data, that actively works against the workflows I run. Anthropic's own prompt is dense, unwrapped, sentence-economical, and mine were padded out 2-3x in line count for no benefit. Once I switched, diffs got cleaner, search started working, and model outputs became more consistent because the input was.

Treat your project files as a runtime, not as documentation. They're the context window, so format them like it.