AI has moved from experimental to operational across most industries, but the velocity of adoption has made it difficult for General Data Protection Regulation (GDPR) compliance programs to keep pace.
The rise of generative AI, automated decision-making, and increasingly large training datasets has created new data processing activities that existing privacy measures weren’t designed to handle.
To maintain GDPR compliance as AI systems and data privacy requirements evolve, businesses must stay informed and ready to adapt.
This article maps GDPR obligations across the full AI lifecycle, from data sourcing to deployment, to help you understand where the data privacy law applies and what you need to do to stay compliant.
At a Glance
- AI systems fall under the scope of the GDPR whenever they process personal data, including during collection, model training, inference, profiling, or behavioral monitoring.
- GDPR compliance for AI must be assessed across the full lifecycle, as data collection, training, deployment, and automated decision-making are separate processing activities with distinct obligations.
- Personal data in AI environments goes far beyond obvious identifiers and can include indirect identifiers, linkable data, interaction logs, prompt histories, outputs, and inferred traits.
- Teams that use AI can’t rely on a legal basis alone for compliance. They also need clear notices, RoPAs, DPIAs, retention rules, transfer safeguards, and workable processes for handling data subject rights.
- Operational AI compliance depends on embedding privacy controls into day-to-day workflows, which includes documentation, review gates, consent records, and ongoing monitoring.
Does the GDPR Apply To AI Systems?
One of the most important GDPR questions facing marketers and compliance teams today is how the regulation applies to the use of AI. The short answer: AI systems are subject to the GDPR whenever they process personal data.
“In practice, that’s across the entire AI lifecycle,” explains Usercentrics CMO Adelina Peltea.
“From collecting data for training, to generating predictions, to profiling individuals through automated decision-making, these activities and others qualify as data processing under the GDPR. AI amplifies the need for lawful basis, transparency, accountability, and meaningful user control.”
The organization deploying an AI tool is almost always the data controller per GDPR definitions. The AI vendor is typically a processor, and the two are bound by a data processing agreement.
Joint controllership becomes relevant when a vendor uses customer data to improve its own models. What’s more, some AI tools can reidentify anonymous datasets or infer new personal data about a subject’s private traits, both of which are considered forms of processing that require a valid legal basis.
What Counts as Personal Data in AI Systems?
Art. 4 GDPR defines personal data as “any information relating to an identified or identifiable natural person”.
This definition is intentionally broad, and it covers three tiers of data: direct identifiers, indirect identifiers, and linkable data.
- Direct identifiers: Unique data points that directly link to an individual, like names, email addresses, and ID numbers
- Indirect identifiers: Attributes that can be combined with other data to re-identify a person, like IP addresses, device IDs, and location data
- Linkable data: Information that, when combined with other datasets, can single out an individual even if no single field does so on its own
In terms of AI use, these tiers cover more ground than most organizations expect.
- Training datasets can contain personal data that, even when it appears anonymized, can sometimes be traced back to individuals.
- Interaction logs, prompt histories, and output records frequently capture identifiable content.
- Predictions about a person’s health, finances, political views, or purchasing behavior count as personal data, even if the individual never volunteered that information.
What Are the Lawful Bases for AI Data Processing?
Lawful basis, as outlined by Art. 6 GDPR, needs to be assessed at each phase of AI data processing. Collecting data, training a model on it, and running inference against live users all count as distinct processing activities that may each require their own justification.
Where consent is the lawful basis, it must be freely given, specific, informed, and unambiguous, and it must be as easy to withdraw as it was to give. However, this lawful basis is difficult to maintain across a model’s lifetime. New consent must also be obtained every time stated processing purposes change, which could be often for AI uses.
Legitimate interest is more flexible, but it requires a documented balancing test that demonstrates that the organization’s interests don’t override individuals’ rights. That test needs to hold up at every stage, not just at the point of collection.
“One of the most common GDPR pitfalls in AI adoption is assuming that existing data can simply be repurposed for training models without reassessing the lawful basis,” explains Peltea. “Purpose limitation still applies, and you may need to request new consent.”
Is AI Training Considered Data Processing Under the GDPR?
If personal data is involved, training an AI model is considered processing under the GDPR.
Art. 4 GDPR defines processing as “any operation or set of operations which is performed on personal data or on sets of personal data, whether or not by automated means.” Model training fits cleanly into this definition.
Sourcing training data is collection. Loading and preparing it is retrieval and use. Sharing data with a fine-tuning vendor is disclosure. And how you retire a dataset at the end of a training run is a retention and erasure question. Each of these activities requires a lawful basis and appropriate safeguards.
Which GDPR Obligations Are Relevant To AI Use?
The use of AI introduces a number of responsibilities for businesses that process the personal data of people residing in the EU/EEA. Some of these GDPR obligations include:
- Updated privacy policies that explain AI-driven processing in plain language
- Records of Processing Activities (RoPAs) for each AI processing activity
- Data Protection Impact Assessments (DPIAs) for high-risk uses
- Legitimate Interest Assessments (LIAs) where legitimate interest is the legal basis
- Retention policies that cover not just source data but also logs, outputs, and model artifacts, with separate retention periods documented for each
- Standard Contractual Clauses (SCCs) for transfers to third countries. SCCs apply to AI just as they do to any other processing, and if a vendor hosts or accesses data outside the EU/EEA (including remote access by engineers in third countries), a transfer impact assessment is required
- Supplementary technical measures, such as end-to-end encryption, for transfers to countries without an adequacy decision
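To illustrate the retention obligation above, here is a minimal Python sketch of how separate retention periods for different AI data categories might be expressed and checked. The categories and durations are hypothetical examples, not legal guidance:

```python
from datetime import date, timedelta

# Hypothetical retention schedule: each AI data category gets its own
# documented retention period (durations here are illustrative only).
RETENTION_PERIODS = {
    "source_data": timedelta(days=365),
    "interaction_logs": timedelta(days=90),
    "model_outputs": timedelta(days=180),
    "model_artifacts": timedelta(days=730),
}

def is_expired(category: str, created_on: date, today: date) -> bool:
    """Return True when a record has exceeded its category's retention period."""
    return today - created_on > RETENTION_PERIODS[category]

# An interaction log from January 1 has exceeded its 90-day window by June 1.
print(is_expired("interaction_logs", date(2025, 1, 1), date(2025, 6, 1)))  # True
```

Keeping the schedule in one place makes it easy to show an auditor that each category has a documented period and that expiry is actually enforced.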
GDPR Obligations Across the AI Lifecycle
Different GDPR obligations apply to different processing activities at each stage of the lifecycle. What’s required at the data collection phase is different from what’s required during training, and requirements change again once a system is live.
Below, we map the key obligations to each stage.
Data Collection for AI
- Lawful basis: Identify and document a valid Art. 6 GDPR basis before collection begins. Consent, legitimate interests, and contract are the most common, but the basis must be realistic and maintainable for the full lifecycle of the AI system.
- Privacy notices: Notices must disclose, in plain language, that data will be used for AI processing. If AI use wasn’t covered in the original notice, it needs to be updated before that processing begins.
- Data minimization: Collect only what the specific AI use case requires. Broad collection “for future AI projects” is not a valid purpose under the GDPR.
- ePrivacy/cookie consent: Tracking data used as AI input requires cookie consent under the ePrivacy Directive, separate from and in addition to the GDPR lawful basis.
Model Training
- Dataset governance: Maintain records of what data was used, where it came from, how it was prepared, and who had access.
- Purpose compatibility: Assess whether using data for training is compatible with the purpose that you disclosed to individuals. If not, a fresh legal basis is required before training begins.
- Training run documentation: Log what data was used, how it was processed, and what model version was produced. These records should be retained for the operational life of the model.
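A training run record like the one described above can be as simple as a structured log entry. The following Python sketch uses hypothetical field names and values; adapt them to your own governance scheme:

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

@dataclass
class TrainingRunRecord:
    # Hypothetical minimum fields for documenting a single training run.
    run_id: str
    model_version: str
    datasets_used: list
    lawful_basis: str           # e.g. "consent" or "legitimate_interest"
    preprocessing_steps: list
    started_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

record = TrainingRunRecord(
    run_id="run-0042",
    model_version="churn-model-v3",
    datasets_used=["crm_export_2025_q1"],
    lawful_basis="legitimate_interest",
    preprocessing_steps=["pseudonymize_emails", "drop_free_text_fields"],
)
print(asdict(record)["model_version"])  # churn-model-v3
```

Serializing such records (for example with `asdict`) makes them straightforward to retain for the operational life of the model.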
Deployment
- Transparency: Users must be told when AI is processing their data, what it is being used for, and the logic behind any automated decision-making.
- Logging policies: Define retention periods for interaction logs and enforce them. Logs should be retained only as long as necessary for their purpose, not indefinitely by default.
- Rights request handling: You need documented procedures for handling access, erasure, and rectification requests. They should cover interaction history, inferred data, and model artifacts.
- Automated decision-making safeguards: Where AI materially influences decisions that affect individuals, Art. 22 GDPR protections apply.
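As a small illustration of the logging policy above, the following Python sketch purges interaction logs that have exceeded a defined retention window. The 30-day period and log format are assumptions for the example:

```python
from datetime import datetime, timedelta, timezone

LOG_RETENTION = timedelta(days=30)  # hypothetical retention period

def purge_expired_logs(logs, now):
    """Keep only interaction log entries still within the retention window."""
    return [entry for entry in logs if now - entry["timestamp"] <= LOG_RETENTION]

now = datetime(2025, 7, 1, tzinfo=timezone.utc)
logs = [
    {"id": 1, "timestamp": datetime(2025, 6, 25, tzinfo=timezone.utc)},  # 6 days old: kept
    {"id": 2, "timestamp": datetime(2025, 5, 1, tzinfo=timezone.utc)},   # 61 days old: purged
]
print([e["id"] for e in purge_expired_logs(logs, now)])  # [1]
```

Running a purge like this on a schedule replaces the default of keeping logs indefinitely with enforcement of the documented period.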
Complementary Regulations That Impact GDPR AI Compliance
The GDPR is part of a larger picture when it comes to AI governance. Compliance teams need to understand where frameworks overlap and how to avoid duplicating work across them.
The most relevant regulation is the EU AI Act, which introduces a risk-tiered framework for AI systems operating in the EU. For example, high-risk AI systems, including those used in hiring, healthcare, and law enforcement, face strict requirements around risk management, data governance, transparency, and human oversight.
These obligations run parallel to, and in some cases extend beyond, GDPR responsibilities. However, documentation artifacts can often be used to prove compliance with multiple regulations.
For example, a DPIA conducted for GDPR purposes can be modified for the EU AI Act requirements. You can adapt the risk registers, data flow maps, and transparency disclosures to meet the requirements of both regulations.
Beyond the GDPR and the EU AI Act, sector-specific rules are starting to emerge, which layer additional requirements on top of both frameworks. When organizations in those sectors, like financial services or healthcare, use AI to make or inform decisions about individuals, the cumulative obligations mean that an in-depth compliance review may be necessary.
AI and GDPR Data Subject Rights
Standard GDPR data subject rights — like access, erasure, rectification, and portability — apply in full when AI processes personal data. But enforcing them in these cases can be more complicated than with conventional data processing.
For example, deleting a record from a training dataset doesn’t remove its influence from a trained model. Full retraining is often the cleanest solution, but it’s costly and resource intensive. Machine unlearning techniques may help, but they are not yet reliable enough to guarantee erasure.
Where technical erasure isn’t feasible, controllers should suppress outputs derived from the affected data and document the limitation to compensate.
These operational challenges compound when AI is used to make or support decisions about individuals. In those cases, organizations also need to consider Art. 22 GDPR, which governs automated individual decision-making, including profiling.
According to the Article: “The data subject shall have the right not to be subject to a decision based solely on automated processing, including profiling, which produces legal effects concerning him or her or similarly significantly affects him or her.”
In many cases, this means that purely automated decisions, like a job application screening or an insurance quote generated by an AI model, can’t be the default where the outcome significantly affects an individual. Businesses must give data subjects the option to opt out of this AI-powered automated processing, request human review of the decision, or both.
How To Operationalize GDPR Compliance for AI
Understanding GDPR obligations in theory is a start, but you need to build a program that holds up in practice. GDPR compliance isn’t a one-time checkbox exercise, especially when it comes to AI use.
Models evolve, use cases expand, and the regulatory landscape is still developing. The following steps are a practical foundation for organizations that want to stay ahead of privacy compliance obligations and implement responsible AI use.
1. Conduct a DPIA Before Launching AI Systems
Run a DPIA before deploying any AI tools to assess the risk and identify what measures you need to take. Treat it as a living document, and review it whenever the system changes materially.
2. Build a Data Processing Register That Maps AI Activities to GDPR Requirements
Your RoPA should reflect AI systems as distinct entries and document the legal bases, data categories, processors, retention periods, and any international transfers for each.
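One lightweight way to keep AI-specific RoPA entries complete is to validate them against a required-field checklist. This Python sketch uses a hypothetical minimum field set; a real register will likely need more:

```python
# Hypothetical minimum fields for an AI-specific RoPA entry.
REQUIRED_FIELDS = {"activity", "legal_basis", "data_categories",
                   "processors", "retention_period", "transfers"}

def validate_ropa_entry(entry: dict) -> list:
    """Return a sorted list of required fields missing from a RoPA entry."""
    return sorted(REQUIRED_FIELDS - entry.keys())

entry = {
    "activity": "Chatbot inference on customer support tickets",
    "legal_basis": "legitimate_interest",
    "data_categories": ["contact details", "ticket text"],
    "processors": ["ExampleAI GmbH"],  # hypothetical vendor name
    "retention_period": "90 days",
    "transfers": [],
}
print(validate_ropa_entry(entry))  # [] means the entry is complete
```

A check like this can run whenever a new AI system is registered, so incomplete entries are caught before the system goes live.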
3. Embed GDPR Best Practices in AI Use
Build in privacy from the start by applying data minimization at the training stage, pseudonymizing data where full identification isn’t necessary, and limiting the use of personal data to what each use case genuinely requires.
4. Add Compliance Gates to Your AI Delivery Workflow
Build structured review points into your delivery process at key stages, such as before data sourcing, before training, before deployment, and before any significant change in scope.
5. Monitor Automated Decision-Making Practices
Maintain a clear inventory of where AI makes or influences decisions that affect individuals. Be sure to provide data subjects the option to opt out, and verify that human review mechanisms are functioning.
6. Keep Records of Consent and Legal Bases for Processing Data
Every AI processing activity needs a documented, defensible legal basis. If consent is your legal basis, you need to keep records of how and when it was obtained.
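As a minimal sketch of the record-keeping described above, the following Python example stores consent events with timestamps and treats the most recent record for a user and purpose as authoritative, so a withdrawal overrides an earlier grant. The data model is hypothetical; in practice a CMP or database would hold these records:

```python
from datetime import datetime, timezone

# Hypothetical in-memory consent log; each event records when and for
# what purpose consent was granted or withdrawn.
consent_log = [
    {"user_id": "u-1", "purpose": "ai_personalization",
     "granted": True, "recorded_at": datetime(2025, 3, 1, tzinfo=timezone.utc)},
    {"user_id": "u-1", "purpose": "ai_personalization",
     "granted": False, "recorded_at": datetime(2025, 5, 1, tzinfo=timezone.utc)},
]

def has_valid_consent(log, user_id, purpose):
    """The most recent record for a user and purpose decides; withdrawal wins."""
    entries = [e for e in log if e["user_id"] == user_id and e["purpose"] == purpose]
    if not entries:
        return False
    return max(entries, key=lambda e: e["recorded_at"])["granted"]

print(has_valid_consent(consent_log, "u-1", "ai_personalization"))  # False: withdrawn
```

Keeping the full event history, rather than just the current state, is what lets you demonstrate how and when consent was obtained or withdrawn.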
7. Continuously Monitor AI Use as It Evolves
Models are retrained, data sources change, and vendors update their own practices. Build in a regular review cadence to ensure your compliance practices always reflect your current AI use.
Where Usercentrics Fits: Consent Management for Privacy-Compliant AI Use
AI and data privacy are deeply intertwined, and understanding your GDPR obligations is the first step towards achieving and maintaining privacy compliance. Enforcing those requirements at an operational level is where most organizations struggle.
Usercentrics helps you bridge that gap. Consent is often the correct basis for AI data processing, and the Usercentrics Consent Management Platform (CMP) supports compliant and auditable data collection and use. It passes consent signals downstream so that AI systems only operate on data that individuals have actually consented to sharing.
It also maintains secure, accessible records of when and how consent was given over time to enable proof of compliance in the event of an audit.
When you automate consent management with Usercentrics, you reduce your reliance on manual governance processes and gain confidence that the data feeding your AI systems complies with GDPR requirements.
