What AI Is Actually Doing in Document Review
When people say AI is used for document review, what they often imagine is a machine reading a contract and instantly understanding its meaning — like a legal assistant with infinite patience. That’s… not what’s happening.
The reality is closer to this: a language model (like ChatGPT or Claude) is prompted with a massive chunk of legal text. The prompt instructs the model to summarize key sections (e.g., the termination clause), identify obligations, or flag specific risk language. AI isn’t interpreting the law; it is pattern-matching based on mountains of training data. Huge difference.
Let’s take a lease agreement as an example. You paste the lease into a prompt window and ask: “Does this lease require the tenant to maintain HVAC systems?” Sometimes, it returns a clear yes with a quote from the document. Other times, it gives you a foggy, noncommittal answer like “It appears the tenant is responsible… but consult an attorney.” Not super confidence-inspiring.
Main issue: Long contracts often exceed the input token limits of typical AI models. When that happens, the model either skips sections or, worse, truncates mid-clause (I’ve seen this with a 40-page supplier agreement: the AI cut off right in the definition of “Force Majeure” and started imagining clauses that weren’t there 🤯).
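One way to catch truncation before it happens is to count tokens up front. A minimal sketch using the tiktoken library; the 8,000-token limit here is illustrative, so check your model's actual context window:

```python
# Count tokens before sending a contract to the model (sketch).
# The 8,000-token limit is illustrative; check your model's real window.
import tiktoken

def fits_in_context(text: str, model: str = "gpt-4", limit: int = 8000) -> bool:
    enc = tiktoken.encoding_for_model(model)
    n_tokens = len(enc.encode(text))
    print(f"Contract is {n_tokens} tokens")
    return n_tokens <= limit
```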
Some tools solve this with a pre-parsing approach: they chunk documents into readable segments and summarize each chunk before merging the insights. Others use vector embeddings: chunks are converted into numerical representations so the tool can search by meaning rather than over the raw text.
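The chunk-then-merge approach is simple enough to sketch in a few lines of Python. Here ask_llm is a hypothetical stand-in for whatever API call you're actually using, and the fixed-size chunking is a deliberate simplification:

```python
# Chunk-then-summarize sketch. ask_llm() is a hypothetical stand-in
# for your actual model call (OpenAI, Anthropic, etc.).

def ask_llm(prompt: str) -> str:
    raise NotImplementedError("wire this to your model API")

def chunk_text(text: str, size: int = 3000) -> list[str]:
    # Naive fixed-size chunking; real tools split on clause boundaries.
    return [text[i:i + size] for i in range(0, len(text), size)]

def review_contract(text: str) -> str:
    summaries = [
        ask_llm(f"Summarize the obligations in this contract excerpt:\n{chunk}")
        for chunk in chunk_text(text)
    ]
    # Merge the per-chunk summaries into one combined overview.
    return ask_llm("Combine these partial summaries:\n" + "\n".join(summaries))
```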
| AI Capability | How It Behaves in Legal Review |
| --- | --- |
| Clause Recognition | Works well if exact language is standard. Poor with creative or customized wording. |
| Obligation Extraction | Often accurate, but may miss conditional dependencies (e.g., “unless X happens”). |
| Summarization | Great for overviews, but frequently omits detail needed for compliance/legal risk. |
To wrap up, AI in document review is more like a context-aware scanner than a silver-bullet lawyer — useful, but not yet a final authority.
Prompt Engineering That Actually Works for Contracts
Not all prompts are created equal. I’ve tested plain questions like “Summarize this contract” and more granular ones like “Extract only lease obligations of the lessee, ignoring background.” The second one gets drastically better results.
The best prompt format I’ve found (after dozens of tweaks) is:

> “You are a Legal Analyst trained in identifying actionable obligations in business contracts. Given the document below, extract each obligation with party name and clause number. Do NOT summarize. Focus on lessee’s responsibilities only.”
That structure usually triggers the model to behave more like a structured parser and less like a storyteller. Avoid open-ended prompts. Do not say “Is this contract safe?” — it will hallucinate an opinion and waffle around liability terms. Ask very specific questions like “Does Clause 11 require prior written notice for termination?”
Things that almost always improve prompt quality (combined into a sketch after this list):
- Use role-based instructions (“You are a contract analyst…”)
- List output format (“Return in table with three columns: Clause, Party, Obligation”)
- Tell it what to ignore (reduces noise hallucination)
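Put together, those three rules turn into a reusable template. A minimal sketch; the wording and placeholder names are my own, not a canonical format:

```python
# Reusable prompt template combining role, output format, and ignore rules.
# The exact wording and placeholder names are illustrative assumptions.
PROMPT_TEMPLATE = """You are a contract analyst reviewing a {contract_type}.
Extract each obligation of the {party} into a table with three columns:
Clause, Party, Obligation.
Ignore recitals, background sections, and boilerplate notices.

Contract text:
{contract_text}
"""

prompt = PROMPT_TEMPLATE.format(
    contract_type="commercial lease",
    party="lessee",
    contract_text="...",  # paste or load the contract text here
)
```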
Interesting discovery: When I used the word “compliance” in prompts, especially in enterprise vendor agreements, the AI would almost always veer into analyzing GDPR or data security. Removing that word brought it back to the actual clauses.
Ultimately, contract-focused prompts have to be treated like configuration files — tuned carefully, not written casually.
What Happens When You Feed Complex Contracts
This is one of the biggest traps: long, highly negotiated contracts (like SaaS master service agreements or partnership deals) don’t just break prompt windows. They break comprehension.
Here’s what happened in my test with a SaaS MSA that had over 25 pages:
- Pasting the whole thing into GPT-4 as one text block exceeded token limits — got clipped at midpoint.
- Chunking it manually worked, but context broke: the model didn’t connect Exhibit B (full of obligations) with Definitions on page 2.
- I tried embedding tools like LangChain to split the document and maintain a vector index (roughly as sketched below). Context tracking improved, but responses were significantly slower, and it required building a basic interface in Streamlit or similar.
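The split-and-index step looked approximately like this. The imports reflect recent LangChain packaging and have moved around between versions; the chunk sizes and the example query are assumptions:

```python
# Split a contract and build a searchable vector index (sketch).
# Import paths reflect recent LangChain packaging and vary by version;
# chunk sizes and the query are illustrative.
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_openai import OpenAIEmbeddings

contract_text = open("msa.txt").read()

splitter = RecursiveCharacterTextSplitter(chunk_size=1500, chunk_overlap=200)
chunks = splitter.split_text(contract_text)

# Embed each chunk and index it for similarity search.
index = FAISS.from_texts(chunks, OpenAIEmbeddings())

# Retrieve the chunks most relevant to a question before prompting.
hits = index.similarity_search("termination obligations", k=4)
```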
The best result came from hybrid prompting: break the contract into logical sections manually (Definitions, Payment Terms, Termination), run prompts on each, and save the responses into a flat doc. At the end, run a meta-prompt: “Summarize cross-section obligations and conflicts.” That loop is sketched below.
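Again, ask_llm is a hypothetical stand-in for your API call, and the section names come from the manual split:

```python
# Hybrid prompting loop: per-section prompts, then one meta-prompt.
# ask_llm() is a hypothetical stand-in for your model API call.

def ask_llm(prompt: str) -> str:
    raise NotImplementedError("wire this to your model API")

sections = {
    "Definitions": "...",     # paste each manually split section here
    "Payment Terms": "...",
    "Termination": "...",
}

notes = {
    name: ask_llm(f"Extract all obligations from this '{name}' section:\n{text}")
    for name, text in sections.items()
}

# Meta-prompt over the accumulated per-section notes.
combined = "\n\n".join(f"## {name}\n{note}" for name, note in notes.items())
summary = ask_llm("Summarize cross-section obligations and conflicts:\n" + combined)
```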
To sum up, large contract handling requires setup time — if you skip setup, the AI just politely misleads you.
Comparing AI Tools for Legal Doc Review
I ran the same set of prompts on four different assistant platforms using the exact same vendor agreement. Here’s what actually came out:
| Tool | Strength | Weakness | Output Style |
| --- | --- | --- | --- |
| ChatGPT-4 | Consistent structure, understands roles clearly | Sometimes truncates long contracts | Formal summary with bullet points |
| Claude | Accepts longer document input | Misses references to footnotes or exhibits | Natural language, some paraphrasing |
| Harvey | Domain-tuned for legal use (uses models trained on law data) | Not public access; enterprise only | Redlined outputs based on risk flags |
| Juro AI | Contracts-first UX; previews key terms visually | Limited support for complex custom contracts | Structured term pairs and summaries |
Overall, raw AI APIs like GPT work well if you manually prepare doc chunks and prompts. Specialized tools like Harvey or Juro skip that step, but with limited flexibility.
To wrap this section, the best choice comes down to your workload volume and how much prep you’re willing to do manually.
Setting Up a Stable AI Prompting Environment
If you’re using AI for legal review regularly, doing it through a basic chatbot window is going to fail you eventually. Trust me.
Instead, I’ve moved to a local notebook + API sync flow. Here’s the setup (a condensed sketch follows the list):
- Extract the contract text with a Python library like pdfplumber as the preprocessing step
- Use LangChain or LlamaIndex to chunk the text and build embeddings
- Store prompt templates in a YAML or JSON config
- Trigger GPT-4 or Claude via API call with the retrieved chunks, not the whole text
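Condensed into code, the flow looks roughly like this. pdfplumber, PyYAML, and the OpenAI client are real libraries; the file names, the prompts.yaml key, and the fixed 3,000-character chunking are my own illustrative assumptions:

```python
# Condensed pipeline: extract text, load a versioned prompt template,
# and call the model API per chunk. File names and YAML keys are
# illustrative assumptions.
import pdfplumber
import yaml
from openai import OpenAI

# 1. Preprocess: pull raw text out of the contract PDF.
with pdfplumber.open("vendor_agreement.pdf") as pdf:
    text = "\n".join(page.extract_text() or "" for page in pdf.pages)

# 2. Load a versioned prompt template from config.
config = yaml.safe_load(open("prompts.yaml"))
template = config["obligation_extraction"]  # assumed key name

# 3. Call the API one chunk at a time, logging each output.
client = OpenAI()
chunks = (text[j:j + 3000] for j in range(0, len(text), 3000))
for i, chunk in enumerate(chunks):
    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": template.format(chunk=chunk)}],
    )
    print(f"--- chunk {i} ---\n{response.choices[0].message.content}")
```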
Benefits:
- No manual re-pasting errors
- You get to log outputs, catch errors, re-run chunks
- You can version your prompts as you discover better phrasing
Setup time: about an afternoon. But after that, it saves hours per contract.
Situations where this helped me the most: high-volume NDAs where only clause triggers mattered (“Is there a unilateral termination?”). Also in vendor reviews where we had to extract warranties across 30 contracts in bulk.
The bottom line is, once you’re using AI for legal doc review routinely, a structured environment prevents hallucinations and dropped obligations.
When AI Legal Review Goes Wrong: Real Errors
Here’s a fun one (painful at the time): I was reviewing a reseller agreement and used ChatGPT to extract all revenue share clauses.
The contract had this line buried in the appendix: “Additional incentives are calculated net of applicable taxes and fees.” GPT completely missed that. It interpreted the previous section (“Reseller receives up to 30% commission”) as the entire answer. Missed post-deductions entirely 😤.
Another time, a clause defined “termination for cause” uniquely. GPT ignored that custom definition and applied a generic interpretation — dangerously wrong during dispute resolution planning.
Guidance to prevent this:
- Always check for Appendix or Footnote references — AI often skips these unless explicitly asked.
- Pre-define all uncommon terms before prompting (e.g., first ask “How does this contract define ‘Service Tail’?” and carry that definition into later prompts; see the sketch after this list)
- Use version control to test prompt responses across slightly different contracts
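On the second point, here is a minimal sketch of pre-defining a custom term as a two-step prompt; the wording is mine, and ask_llm is again a hypothetical stand-in:

```python
# Two-step prompting: pin down the contract's own definition first,
# then reuse it verbatim in the substantive question. ask_llm() is a
# hypothetical stand-in for your model API call.

def ask_llm(prompt: str) -> str:
    raise NotImplementedError("wire this to your model API")

contract = open("reseller_agreement.txt").read()

# Step 1: force the model to quote the custom definition.
definition = ask_llm(
    "Quote, verbatim, how this contract defines 'termination for cause':\n"
    + contract
)

# Step 2: carry that definition into the real question.
answer = ask_llm(
    f"Using ONLY this definition:\n{definition}\n\n"
    "Does Clause 11 require prior written notice for termination?\n"
    + contract
)
```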
Errors like these also show up when the formatting breaks: weird bullet points, tables, or track changes. If your contract came in DOCX form, convert it to clean Markdown or plain text before prompting.
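For the DOCX-to-Markdown step, the pypandoc wrapper (which requires pandoc to be installed) makes it a one-liner; the file name is illustrative:

```python
# Convert a DOCX contract to clean Markdown before prompting (sketch).
# Requires pandoc plus the pypandoc wrapper; the file name is illustrative.
import pypandoc

markdown = pypandoc.convert_file("reseller_agreement.docx", "markdown")
open("reseller_agreement.md", "w").write(markdown)
```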
As a final point, assume the AI will miss nuance unless proven otherwise — always double-check any automated outcome involving risk or payment duties.