The Contract You Signed May Not Describe the AI Reading Your Data

Most companies believe they know where their data goes. They signed an agreement. They read the security page. They moved on.

In 2026, that confidence is getting tested. Not by a breach. By a series of quiet default changes that redraw the data boundary without anyone leaving their desk.

The pattern is simple. A tool you already use adds AI. The setting that decides whether your content trains that AI is turned on by default. Somewhere in the fine print, the answer to a question you never asked has changed.

A default changed, and your data moved with it

Consider two announcements from this year.

Atlassian confirmed that on August 17, 2026, it will begin using data from its cloud products, including Jira and Confluence, to train its AI offerings. The change affects roughly 300,000 customers. The control to stop it already exists in the admin settings, but you have to go and switch it off yourself. Coverage of the change also noted that contributed data can flow to the United States, including to third-party model providers.

GitHub announced a similar shift for Copilot. Starting April 24, 2026, interaction data from its Free, Pro, and Pro Plus tiers, including code and surrounding context, would be used to train models by default unless the user opted out. Business and Enterprise customers were exempt under their existing contracts.

Notice what these have in common. Neither is a data breach, and neither is illegal. Each is a policy change. The burden to notice it, understand it, and act on it sits with you.

For most content, that is a manageable decision. For source code, patient-adjacent records, financial models, or anything covered by a data residency rule, a changed default is a governance event. Your source code alone can carry proprietary logic, internal system references, and business rules you would never hand to anyone on purpose.

The subprocessor you never approved

Even when you read the contract, the contract may not tell you the whole story.

A 2026 report from the privacy platform DataGrail analyzed 2,400 popular business software providers. It found that 63.6 percent of vendors advertising AI capabilities did not disclose a third-party AI subprocessor in their legal documentation.

Sit with that number. For any given AI-enabled tool you buy, there is a better than even chance the vendor has not told you which model is actually processing your data behind the interface.

The data processing agreement is supposed to be the document that lets your team evaluate risk. When most vendors have not updated it for the AI they have already added, the document stops doing its job. The software says AI. The paperwork says nothing.

This is not a reason to fear the cloud

It would be easy to read all this and conclude that public AI tools are unsafe and everything should be locked down. That conclusion is wrong, and it is expensive.

Cloud AI is not the problem. Frontier models are not the problem. Vendors adding AI features is not the problem. The problem is unmanaged dependence. It is sending data to an environment before anyone decided that environment was appropriate for it.

Plenty of work belongs in the cloud, on frontier models, with major providers. Marketing copy, public research, routine drafting, general questions. For that work, convenience is a feature, not a risk.

The trouble starts when the same default carries sensitive work along with the routine work, because no one drew a line between them.

Privacy is a routing decision, not a blanket ban

The better model is not lockdown. It is routing.

Sensitive work should be routed according to clear policy before it reaches a model, not after. That means classifying work by how sensitive it is, then sending each class to an approved environment. Keep the most sensitive work local or in a private, company-controlled environment. Route permitted work outward through paths you have reviewed. Let the routine work use the convenient tools it was always fine to use.

This turns privacy from a wall into a decision. Instead of one anxious yes-or-no about the cloud, you get a repeatable rule: this kind of work goes here, that kind goes there, and the boundary holds even when a vendor changes a default.
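One way to picture such a rule is as a small lookup that runs before any work reaches a model. The sketch below is illustrative only: the tag names, the three sensitivity classes, and the environment names are all assumptions made up for this example, not a standard and not any product's API. The one design choice worth noting is that untagged or unrecognized work is treated as restricted, so the rule fails closed rather than falling through to a vendor default.

```python
from enum import Enum


class Sensitivity(Enum):
    ROUTINE = 1      # e.g. marketing copy, public research, general questions
    PERMITTED = 2    # work cleared for an external path you have reviewed
    RESTRICTED = 3   # e.g. source code, financial models, residency-bound data


# Hypothetical tag vocabulary. Anything not listed here counts as restricted.
ROUTINE_TAGS = frozenset(
    {"marketing-copy", "public-research", "routine-draft", "general-question"}
)
PERMITTED_TAGS = frozenset({"internal-doc"})

# Hypothetical environment names; substitute whatever your policy approves.
ROUTES = {
    Sensitivity.ROUTINE: "public-cloud-ai",
    Sensitivity.PERMITTED: "reviewed-vendor-path",
    Sensitivity.RESTRICTED: "private-local-model",
}


def classify(tags: set[str]) -> Sensitivity:
    """Return the strictest class implied by the tags on a piece of work."""
    # No tags, or any tag outside the known lists: fail closed.
    if not tags or not tags <= (ROUTINE_TAGS | PERMITTED_TAGS):
        return Sensitivity.RESTRICTED
    if tags & PERMITTED_TAGS:
        return Sensitivity.PERMITTED
    return Sensitivity.ROUTINE


def route(tags: set[str]) -> str:
    """Return the approved environment for a piece of work."""
    return ROUTES[classify(tags)]


if __name__ == "__main__":
    print(route({"marketing-copy"}))
    print(route({"internal-doc", "routine-draft"}))
    print(route({"source-code"}))
    print(route(set()))
```

Because the rule depends on your own classification and not on a vendor setting, a changed default on the vendor side does not move restricted work anywhere new.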

The data boundary becomes something you control, rather than something a settings page controls for you.

Questions to ask before you sign, and after

You do not need a legal team to start protecting the boundary. You need a short list of questions that every AI tool has to answer.

Do you use any third-party AI models or providers to process our data?
Are those providers named in the data processing agreement as subprocessors?
Where is our data processed, and does it leave our primary region?
Is our data used to train or fine-tune any model, and is that on or off by default?
If you change these defaults later, how and when will we be told?
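If it helps to keep the answers somewhere more durable than an email thread, the five questions above can be recorded per vendor and checked the same way every time. This is a minimal sketch under stated assumptions: the field names, the issue wording, and the idea that an unanswered question counts as a gap are choices made for this example, not part of any compliance framework.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VendorAnswers:
    """Hypothetical record of a vendor's answers; None means unanswered."""
    uses_third_party_ai: Optional[bool] = None
    subprocessors_named_in_dpa: Optional[bool] = None
    data_leaves_primary_region: Optional[bool] = None
    trains_on_data_by_default: Optional[bool] = None
    notifies_on_default_change: Optional[bool] = None


def open_issues(answers: VendorAnswers) -> list[str]:
    """List what still needs review before sensitive work goes to this vendor."""
    issues = []
    # An unanswered question is itself a finding.
    for name, value in vars(answers).items():
        if value is None:
            issues.append(f"unanswered: {name}")
    if answers.uses_third_party_ai and answers.subprocessors_named_in_dpa is False:
        issues.append("third-party AI provider not named in the DPA")
    if answers.data_leaves_primary_region:
        issues.append("data leaves the primary region")
    if answers.trains_on_data_by_default:
        issues.append("training on customer data is on by default")
    if answers.notifies_on_default_change is False:
        issues.append("no commitment to announce default changes")
    return issues


if __name__ == "__main__":
    example = VendorAnswers(
        uses_third_party_ai=True,
        subprocessors_named_in_dpa=False,
        data_leaves_primary_region=False,
        trains_on_data_by_default=True,
        notifies_on_default_change=True,
    )
    for issue in open_issues(example):
        print(issue)
```

Rerunning the same check after a vendor announces a policy change shows at a glance which answers have moved.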

The companies making real progress with AI are not the ones with the thickest policy binders. They are the ones that set these standards early, so a vendor’s quiet default change is handled by a routing rule already in place instead of setting off a fire drill.

Your content may belong to you. Whether it stays inside the boundary you intended is a separate question, and it is the one worth governing.

Know where your work goes before you send it.
