Self Hosted AI for Law Firms | Attorney-Client Privilege Protected

Cloud AI tools can void attorney-client privilege, and a February 2026 SDNY ruling says exactly that. Private AI for contract review, legal research, and document analysis that never leaves your infrastructure.


Self Hosted AI for Law Firms

In February 2026, the U.S. District Court for the Southern District of New York ruled that documents created using commercial AI platforms are not protected by attorney-client privilege or the work product doctrine. A self hosted AI system, running entirely inside your firm's infrastructure, is the only architecture that preserves privilege by construction. Client data never reaches a third-party platform, so confidentiality is never broken.

If your attorneys use commercial AI tools on client matters, work product created through those tools may now be discoverable. That is a concrete exposure problem, not a theoretical one.


The SDNY ruling every law firm needs to understand#

The legal industry's AI problem is no longer theoretical. It has a case citation.

What the February 2026 ruling actually said#

Judge Jed S. Rakoff of the U.S. District Court for the Southern District of New York ruled that materials created using a commercial AI platform are not protected by attorney-client privilege or the work product doctrine (Debevoise Data Blog, February 2026). His reasoning was direct: public AI tools collect user inputs, may disclose data to third parties, and cannot form the confidential relationship that privilege requires. An intermediary that lacks confidentiality breaks the privilege. Simple as that.

The implication for any firm using commercial AI, including cloud-based chatbots, AI-assisted drafting tools, and third-party legal research platforms, is concrete. Work product created using those tools may be discoverable. Communications that passed through those platforms may not be privileged. How much exposure your firm has depends on what your attorneys used those tools for, and what data they fed them.

Why commercial AI tools cannot satisfy confidentiality requirements#

Privilege is a legal doctrine, not a commercial arrangement. The moment client information passes through a third-party platform, the confidentiality is broken, regardless of what the vendor's terms of service say. A data processing agreement does not restore privilege to communications that have already transited a non-confidential system.

Enterprise tiers, zero-data-retention claims, and SOC 2 certifications address security and contractual liability. None of that touches the legal test for privilege, which requires the communication to have been confidential in the first instance. Data that transits a third-party server, even briefly, has left the confidential channel.

ABA Formal Opinion 512 and Model Rule 1.6: the ethics layer#

ABA Formal Opinion 512 (July 2024) states that under Model Rule 1.6, lawyers must understand how a generative AI platform uses client data and implement adequate safeguards. Boilerplate engagement letter consent is explicitly insufficient; informed, specific consent is required (American Bar Association, 2024). That standard has not yet been fully litigated in the context of AI tool usage. Given the SDNY ruling, the risk of a court deciding that routing client data through commercial AI failed the "reasonable efforts" test is no longer speculative.

A self hosted system eliminates both problems by design: client data never transits a third-party platform, so privilege is preserved by construction, and there is no third-party AI usage to disclose or obtain consent for.


Where law firms are already using AI#

AI-assisted contract review cuts analysis time by 25-50% and flags risks that human reviewers miss, according to LegalOnTech's 2025 survey. Seventy-eight percent of corporate legal departments and law firms are actively using, evaluating, or exploring AI for document work (LegalOnTech, 2025). The question is not whether to use it. The question is whether the architecture you pick creates privilege exposure.

Contract review and clause extraction#

The model reviews your contract documents, extracts clause types, flags non-standard provisions, and surfaces risk based on your firm's own precedent library. Everything runs inside your network. Results are indexed by matter. No client document leaves your environment.
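As a heavily simplified illustration of the local pre-screen step in such a pipeline, here is a pure-Python sketch. The clause names and regex patterns are illustrative only; a real deployment works from the firm's precedent library and layers an LLM review pass on top of rule-based flagging.

```python
import re

# Illustrative clause patterns; a production system would derive these
# from the firm's own precedent library, not a hard-coded dictionary.
CLAUSE_PATTERNS = {
    "indemnification": re.compile(r"\bindemnif(y|ies|ication)\b", re.I),
    "limitation_of_liability": re.compile(r"\blimitation of liability\b", re.I),
    "auto_renewal": re.compile(r"\bautomatic(ally)? renew", re.I),
}

def flag_clauses(contract_text: str) -> dict:
    """Return clause types found, keyed to the sentences that triggered them."""
    hits = {}
    for sentence in re.split(r"(?<=[.;])\s+", contract_text):
        for clause, pattern in CLAUSE_PATTERNS.items():
            if pattern.search(sentence):
                hits.setdefault(clause, []).append(sentence.strip())
    return hits
```

Because this runs in-process, the contract text never crosses a network boundary at all.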

Legal research over internal sources#

Legal research AI runs over your internal precedent database and can be configured against specific jurisdictional sources. Research queries, citations, and analysis stay inside the firm. The model does not transmit queries to an external API, and it does not pull results from a third-party platform that logs your firm's research patterns.
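To show the shape of fully local retrieval, here is a toy bag-of-words ranker over an in-memory precedent index. It stands in for the real embedding-based retrieval a deployment would do with ChromaDB or pgvector; the point is that ranking happens entirely in-process.

```python
import math
from collections import Counter

def _vec(text: str) -> Counter:
    """Toy bag-of-words vector; real deployments use an embedding model."""
    return Counter(text.lower().split())

def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_precedents(query: str, corpus: dict, k: int = 3) -> list:
    """Rank in-house documents by similarity to the query. Nothing leaves the process."""
    q = _vec(query)
    ranked = sorted(corpus.items(), key=lambda kv: _cosine(q, _vec(kv[1])), reverse=True)
    return [doc_id for doc_id, _ in ranked[:k]]
```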

Document drafting and summarization#

AI-assisted drafting for briefs, memos, and correspondence works from your firm's own precedents and style guides. The model learns from your documents, not a generic training corpus. Discovery summaries, deposition transcripts, regulatory filings: all generated locally. Nothing leaves.

Due diligence document processing at volume#

Due diligence reviews involve large volumes of documents under time pressure. A self hosted system processes those sets at scale: extracting key terms, flagging anomalies, categorizing by type and risk. Results are reviewable by matter and exportable to your document management system. None of the underlying documents transit an external platform.
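A minimal sketch of the triage step, using hypothetical risk terms and document types: group a document set by type, count it, and flag documents containing firm-defined risk language. A production pipeline would combine model output with review criteria the firm defines, not a fixed term list.

```python
from collections import defaultdict

# Hypothetical risk terms; in practice these come from the firm's
# due-diligence checklist for the matter.
RISK_TERMS = ("change of control", "termination for convenience", "unlimited liability")

def triage(documents: dict) -> dict:
    """Group a due-diligence set by document type and flag risk terms per document.

    `documents` maps doc_id -> (doc_type, text).
    """
    report = defaultdict(lambda: {"count": 0, "flags": []})
    for doc_id, (doc_type, text) in documents.items():
        bucket = report[doc_type]
        bucket["count"] += 1
        for term in RISK_TERMS:
            if term in text.lower():
                bucket["flags"].append((doc_id, term))
    return dict(report)
```

The resulting report is plain structured data, so exporting it to a document management system is a serialization step, not a data-sharing one.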


How we build it#

Step 1: infrastructure and compliance scope#

We start by mapping your current environment: document management system, matter organization, existing IT infrastructure, and your firm's specific risk tolerance around privilege and data handling. That produces a written deployment scope your general counsel and IT director can review before we build anything.

Step 2: model selection and document corpus preparation#

We select the model appropriate for legal language. General-purpose LLMs perform well on most legal tasks; specialized legal models are available for specific workflows. We prepare your document corpus for indexing, which means deciding what goes into the retrieval pipeline (precedents, templates, matter files, research memos) and what stays in standard document management.
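Corpus preparation ultimately reduces documents to indexable chunks. Here is a sketch of one common approach, overlapping word windows; the window and overlap sizes are illustrative, and many legal deployments chunk by clause or section boundary instead.

```python
def chunk_document(text: str, max_words: int = 200, overlap: int = 40) -> list:
    """Split a document into overlapping word-window chunks for a vector index.

    Overlap keeps a clause that straddles a boundary retrievable from
    either side of the split.
    """
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break
    return chunks
```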

Step 3: deployment, RAG pipeline, and access controls#

We deploy the inference layer inside your network, build the retrieval-augmented generation pipeline over your document corpus, and configure role-based access controls. Access is structured by practice group, matter type, or individual attorney. Your IT director defines the policy; we implement it. Every access decision is logged.
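The access-control policy can be sketched as a deny-by-default check. The group names, matters, and actions below are hypothetical; in a real deployment the policy maps onto LDAP or Active Directory groups rather than a Python dictionary.

```python
# Hypothetical policy: which practice groups may do what, on which matters.
POLICY = {
    "litigation": {"matters": {"acme-v-foo"}, "actions": {"query", "draft"}},
    "corporate":  {"matters": {"merger-2026"}, "actions": {"query"}},
}

def is_allowed(group: str, matter: str, action: str) -> bool:
    """Deny by default: access requires a known group, a permitted matter,
    and a permitted action, all three."""
    rule = POLICY.get(group)
    return bool(rule) and matter in rule["matters"] and action in rule["actions"]
```

Every call to a check like this is what gets written to the audit log.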

Step 4: validation, documentation, and team handoff#

Before the system goes into regular use, we run validation against representative legal workloads: contract review tasks, research queries, drafting prompts. We document every component, write the operational runbook, and run training sessions for the attorneys and staff who will use it. Your team should be able to operate and maintain this independently once we hand it over.
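The validation pass can be structured as a simple harness: each case pairs a representative prompt with a predicate over the response. This is a sketch of the shape, not our actual test suite; `model` is any callable from prompt to response, including a local Ollama or vLLM wrapper.

```python
def run_validation(model, cases: list) -> list:
    """Run representative prompts through the model; return names of failed checks.

    `cases` is a list of (name, prompt, check) tuples, where `check` is a
    predicate applied to the model's response.
    """
    failures = []
    for name, prompt, check in cases:
        response = model(prompt)
        if not check(response):
            failures.append(name)
    return failures
```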


Tech stack#

Every component in this stack runs inside your network. No component makes outbound calls to cloud services.

Inference: Ollama or vLLM running inside your network#

Ollama is our default for most law firm deployments: manageable by internal IT, GPU-optional for smaller workloads, and well-suited to document-heavy workflows. For firms with high document volume or many concurrent users, we deploy vLLM for its throughput under load. Both run on hardware inside your perimeter.
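To make the "inside your perimeter" point concrete: Ollama exposes a local REST endpoint (by default on port 11434), so an inference call is an HTTP request to localhost. The sketch below only builds the request; the model name is an illustrative placeholder.

```python
import json

def build_generate_request(prompt: str, model: str = "llama3.1",
                           host: str = "http://localhost:11434") -> tuple:
    """Build a request for Ollama's local /api/generate endpoint.

    Sending this (with urllib, requests, etc.) targets localhost, so the
    prompt, and any client data in it, never leaves the machine.
    """
    url = f"{host}/api/generate"
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return url, body
```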

RAG and retrieval: LangChain, ChromaDB, or pgvector over your document corpus#

The retrieval layer indexes your firm's actual documents, including precedents, templates, matter files, and research, then makes them accessible to the AI model at inference time. We use LangChain for pipeline orchestration and either ChromaDB or pgvector for vector storage, depending on your existing database infrastructure. The index lives on your hardware and is updated by your IT team.

Interface: Open WebUI with role-based access, or a custom matter portal#

For firms that want a familiar chat-style interface, Open WebUI has LDAP/Active Directory authentication and role-based access controls your IT team can administer. For firms that want deeper integration with their matter management or document systems, we build a custom portal that surfaces AI capabilities inside existing workflows. Either approach is fully self hosted.

Audit and logging: full request tracing, no data leaving the perimeter#

Every query, every response, every user session is logged with a complete audit trail. The log is retained inside your environment according to your firm's records policy. There is no external telemetry, no usage data sent to a third-party analytics platform, and no mechanism by which query content could be accessed by anyone outside your network.
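One way to make a local audit trail tamper-evident is to hash-chain it: each record embeds the hash of the previous one, so any retroactive edit breaks the chain. The field names below are illustrative, not a fixed schema.

```python
import hashlib
import json

def log_entry(prev_hash: str, user: str, matter: str, query: str, ts: str) -> dict:
    """Create a hash-chained audit record.

    Each record's hash covers its own fields plus the previous record's
    hash, so modifying any earlier entry invalidates every later one.
    """
    record = {"ts": ts, "user": user, "matter": matter,
              "query": query, "prev": prev_hash}
    digest = hashlib.sha256(json.dumps(record, sort_keys=True).encode()).hexdigest()
    record["hash"] = digest
    return record
```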


What this costs and what it replaces#

Self hosted legal AI deployments typically range from $12,000 to $50,000 for initial build-out, depending on firm size, document corpus volume, number of practice groups being served, and whether the interface is Open WebUI or a custom matter portal.

What it replaces: per-seat SaaS subscriptions for AI drafting and research tools, document review vendor costs for large due diligence matters, and the ongoing compliance exposure of routing client data through platforms whose privilege status is now in question.

For firms that use AI on any regular basis, the fixed cost of a self hosted deployment typically amortizes against per-seat SaaS fees within 12-18 months for mid-size configurations. The privilege protection is not a cost. It is a condition of professional responsibility.
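The break-even arithmetic is straightforward. The seat count and per-seat fee below are hypothetical; plug in your own numbers against a build cost in the $12,000-$50,000 range above.

```python
def breakeven_months(build_cost: float, seats: int, per_seat_monthly: float) -> float:
    """Months until a fixed build-out matches cumulative per-seat SaaS fees."""
    monthly_saas = seats * per_seat_monthly
    return build_cost / monthly_saas
```

For example, a $30,000 build replacing 25 seats at $80/month breaks even in 15 months, inside the 12-18 month range cited above.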

We scope every engagement before quoting. Contact us to start with an infrastructure and compliance assessment.


FAQ#

Does using ChatGPT or other cloud AI void attorney-client privilege?

Based on the February 2026 SDNY ruling, documents created using commercial AI platforms may not be protected by attorney-client privilege, because public AI tools lack the confidentiality required for privilege to attach. The ruling was specific to the SDNY and may be litigated differently in other jurisdictions, but it establishes a material risk that work product created through commercial AI could be discoverable. Your privilege counsel should assess your firm's specific exposure.

What is the SDNY ruling on AI and attorney-client privilege?

In February 2026, Judge Jed S. Rakoff ruled that materials created using a commercial AI platform are not protected by attorney-client privilege or the work product doctrine, because public AI tools collect user inputs, may disclose data to third parties, and cannot form the confidential relationship required for privilege to attach. A self hosted AI system, where all processing occurs inside the firm's own infrastructure, is the only architecture that preserves privilege by construction.

What AI tools can law firms use without privilege waiver risk?

Any AI tool where inference runs entirely inside the firm's own infrastructure, with no client data transiting a third-party server, eliminates the privilege waiver risk at the architecture level. That means running the LLM locally (using tools like Ollama or vLLM), building retrieval pipelines over your own document corpus, and confirming that no component of the workflow makes outbound API calls carrying client data.

Can law firms run AI locally to protect confidential client data?

Yes. The LLM runs on hardware inside your network. Document retrieval indexes your own files. Staff access the system through a web interface that connects to your local inference server. Nothing in the workflow routes client data to an external API.

How much does private self hosted AI cost for a law firm?

Deployments typically range from $12,000 to $50,000 for initial build-out, depending on firm size, workflow scope, and interface requirements. Ongoing support and model maintenance are available as a separate retainer. Every engagement starts with a scoped assessment. We do not quote from a standard price list because firm environments vary significantly.

Does ABA Formal Opinion 512 apply to self hosted AI?

ABA Formal Opinion 512 requires that lawyers understand how a generative AI platform uses client data and implement adequate safeguards. With a self hosted deployment running inside your firm's own infrastructure, client data does not pass through any generative AI platform operated by a third party. The opinion's concerns about data retention, training use, and third-party disclosure do not apply when the AI runs on your hardware under your control.


Request an infrastructure and compliance assessment#

The fastest way to understand your firm's exposure is to map where client data currently goes when attorneys use AI tools. We identify every exit point, assess the privilege risk for each workflow, and scope what a self hosted deployment would require for your document environment and practice groups.

Request a compliance and infrastructure assessment, or contact us directly to discuss your specific workflows.


Last updated: March 16, 2026

[ How It Works ]

Free Automation Audit

We find the 20% of your manual work that costs you the most, then show you exactly how to eliminate it.

STEP 1.0
Tell Us What Hurts

A 30-minute call. Walk us through your daily operations and we'll spot the bottlenecks you've stopped noticing.

STEP 2.0
We Rank the Wins

We score every opportunity by impact and effort, so you can see where AI saves the most time and money.

STEP 3.0
You Get the Playbook

A prioritized roadmap you can act on. Execute it with us or on your own. Yours to keep either way.