Vendor lock-in risk in AI service agreements

What legal rights help companies escape AI vendor lock-in?

The best legal lever is usually the EU Data Act for cloud-style services, while GDPR portability only helps with personal data. In the United States, counsel should treat portability as a negotiated contract and architecture issue unless a specific statute applies.

Perhaps the closest thing to an anti-lock-in statute for AI services is not an AI statute at all. It is Chapter VI of Regulation (EU) 2023/2854, the Data Act. Article 23 requires providers of data processing services to remove “commercial, technical, contractual and organisational obstacles” to switching. Article 25 requires exportable data and digital assets to move without undue delay and within a transition period that may not exceed 30 calendar days. Article 29 adds that, from January 12, 2027, providers shall not impose any switching charges.
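The two numbers in that paragraph lend themselves to a quick date check. What follows is a minimal sketch, not a reading of the statute: it assumes the 30 calendar days run from a transition start date that the contract itself would have to define, and the function names are illustrative.

```python
from datetime import date, timedelta

# Figures taken from the paragraph above; everything else is illustrative.
MAX_TRANSITION_DAYS = 30                        # Article 25 ceiling, in calendar days
NO_SWITCHING_CHARGES_FROM = date(2027, 1, 12)   # Article 29 cutoff


def latest_transition_end(transition_start: date) -> date:
    """Return the last day of a transition that uses the full 30 calendar days.

    Assumption: the count starts on `transition_start`, a date the contract
    would have to define (for example, the end of a notice period).
    """
    return transition_start + timedelta(days=MAX_TRANSITION_DAYS)


def switching_charges_barred(on: date) -> bool:
    """True when `on` falls on or after the Article 29 cutoff."""
    return on >= NO_SWITCHING_CHARGES_FROM


start = date(2027, 3, 1)  # hypothetical transition start
deadline = latest_transition_end(start)
print(deadline.isoformat(), switching_charges_barred(start))
```

A contract schedule that names the start event explicitly makes a check like this mechanical rather than arguable.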

That regime was drafted for data processing services, not foundation-model APIs as such. Still, the practical fit is obvious enough. Many enterprise AI products are sold as managed cloud services, even when the model is the visible surface. In that setting, the strongest present legal argument against technical and contractual lock-in comes through cloud-switching law rather than model law.

GDPR Article 20 does something narrower. It gives the data subject a right to receive personal data in a “structured, commonly used and machine-readable format” and to transmit those data to another controller without hindrance. That matters where an AI deployment processes customer or employee personal data. It does not create a general enterprise right to port prompt libraries, system instructions, evaluation history, ranking models, vector indexes, or fine-tuning deltas just because they sit inside an AI product.

The rest is mostly contract and system design. In the source set, there is no U.S. federal analogue that plays the same role as the Data Act, and no reported appellate decision squarely answers whether prompts, embeddings, or fine-tuned artifacts must be made portable absent express contract language. That is why public vendor commitments and law-firm commentary do so much of the practical work here.

RPC is the most direct. Its June 10, 2025 procurement checklist says AI sourcing often requires significant onboarding and fine-tuning, and states flatly that “vendor lock-in is a risk that is heightened when procuring AI due to its complexity”. RPC's companion piece on AI-as-a-service then treats exit as a knowledge-transfer and interoperability problem, not just a termination-right problem.

DLA Piper comes at the same issue from one layer lower in the stack. Its 2023 piece says the “typical procurement and due diligence process should now be adjusted” for generative AI. The point is not only that the direct vendor's paper matters. The upstream model's terms, restrictions on competitive use, ownership language around generated material, and internal AI policies can all become hidden switching costs.

Alston & Bird and Hogan Lovells supply the stronger legal analogy. Alston's September 2025 note reads the Data Act as requiring cloud contracts to support switching within a 30-day transition and to specify the data and digital assets that move on exit. Hogan Lovells emphasizes the same statutory purpose in slightly broader terms: removing “commercial, technical, contractual and organizational obstacles” and enabling even simultaneous use of multiple providers.

There is not much real disagreement across these firms. The disagreement, if any, is about where to look first. RPC emphasizes onboarding knowledge and interoperability. DLA Piper emphasizes the upstream vendor chain. Alston and Hogan emphasize statutory switching mechanics. Together they point to the same conclusion: AI lock-in usually arrives through technical dependency that the contract merely ratifies.

The first unresolved issue is coverage. The EU Data Act clearly reaches data processing services, but how neatly that category maps onto foundation-model APIs sold through different commercial forms remains largely untested. The more closely an AI product resembles managed cloud infrastructure, the stronger the analogy. The more bespoke the arrangement, the less certain the fit.

Sources for this answer

Primary law

A.1 Regulation (EU) 2023/2854, art. 23

Supports the cited proposition. (Regulation (EU) 2023/2854, art. 23)

commercial, technical, contractual and organisational obstacles

See Regulation (EU) 2023/2854, art. 23.

Primary law

A.4 Regulation (EU) 2016/679, art. 20(1)

Supports the cited proposition. (Regulation (EU) 2016/679, art. 20(1))

structured, commonly used and machine-readable format

See Regulation (EU) 2016/679, art. 20(1).

Law-firm commentary

A.5 RPC commentary

Supports the cited proposition. (RPC commentary)

vendor lock-in is a risk that is heightened when procuring AI due to its complexity

See RPC, Procuring AI – commercial considerations checklist.

Law-firm commentary

A.7 DLA Piper commentary

Supports the cited proposition. (DLA Piper commentary)

typical procurement and due diligence process should now be adjusted

See DLA Piper, Before creating or acquiring a technology solution that is generated by AI, consider your contract terms.

Law-firm commentary

A.2 Hogan Lovells commentary

Supports the cited proposition. (Hogan Lovells commentary)

commercial, technical, contractual and organizational obstacles

See Hogan Lovells, EU Data Act Series (part 7): Easy switching between data processing services (SaaS, IaaS, PaaS…).

Law-firm commentary

A.3 Alston & Bird commentary

The EU Data Act establishes a regulatory framework for cloud service providers, including extraterritorial application, mandatory switching requirements, and enforcement mechanisms by Member State authorities.

The purpose of the Data Act is to promote the EU’s digital economy by making data more accessible and usable, as well as by increasing fairness and competition.

See Alston & Bird, The Data Act: Switching Requirements for Cloud Services Providers.

Law-firm commentary

A.6 RPC commentary

While AI-as-a-Service (AIaaS) shares many contracting characteristics with traditional SaaS, it presents unique challenges regarding performance assurances, liability, and regulatory transparency that require careful consideration during procurement.

Artificial Intelligence-as-a-Service (AIaaS), in the same vein as Software-as-a-Service and Infrastructure-as-a-Service, refers to cloud-based tools that allow businesses to gain access to an AI model hosted by a third party provider.

See RPC, AI-as-a-service – key issues.

Law-firm commentary

A.8 Hogan Lovells commentary

The Hogan Lovells commentary summarizes recent international regulatory developments in digital transformation, including the enforcement of the EU Data Act, reporting obligations under the EU AI Act, and new AI labeling requirements in China.

Under the AI Act, the provider is required to report any serious incident to the market surveillance authorities of the Member States where that incident occurred.

See Hogan Lovells, The month in 5 bytes | October.

Can prompts and fine-tuned AI models move to another vendor?

Usually not cleanly. Text prompts may export, but the behavior they assume, custom model artifacts, and evaluation baselines often do not move with them. Ownership language does not by itself create operational portability.

The contract can be portable while the system is not. Prompt libraries export as text. What often fails to travel is the behavior those prompts assume. Provider documentation still reflects model-specific conventions around tool use, message structure, or reasoning scaffolds. Abstraction frameworks help with the common denominator, but they also acknowledge provider-specific fields and integrations. The result is that prompt engineering can start to look less like customer data and more like application logic. A provider switch may therefore preserve the words while still requiring retuning, regression testing, and new eval baselines.
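One way to make that split visible is to store each prompt with its portable text separated from its provider-bound settings. The sketch below is an assumption about how a team might do this; `PromptAsset`, `provider_overrides`, and the `vendor_a` keys are hypothetical names, not any vendor's or framework's schema.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class PromptAsset:
    """A prompt kept outside any vendor, split into portable and provider-bound parts."""
    name: str
    system_text: str      # plain text: exports cleanly
    user_template: str    # plain text: exports cleanly
    # Hypothetical provider-specific settings (tool schemas, reasoning flags, ...).
    provider_overrides: dict = field(default_factory=dict)

    def export_portable(self) -> str:
        """Serialize only the text that moves to any provider unchanged."""
        data = asdict(self)
        data.pop("provider_overrides")
        return json.dumps(data, indent=2, sort_keys=True)

    def retune_list(self, old_provider: str) -> list:
        """List the override keys that must be re-derived and re-tested after a switch."""
        return sorted(self.provider_overrides.get(old_provider, {}))


asset = PromptAsset(
    name="claims-triage",
    system_text="You classify insurance claims.",
    user_template="Claim text: {claim}",
    provider_overrides={"vendor_a": {"tool_schema": "v2", "reasoning_mode": "extended"}},
)
print(asset.retune_list("vendor_a"))
```

The longer the retune list grows relative to the exported text, the more the prompt behaves like application logic and the less like customer data.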

Fine-tunes split into two markets. OpenAI and Anthropic are customer-favorable on ownership of inputs and outputs and on not training on customer content by default. That matters, but it is not the same as promising export of the resulting custom model weights or a portable fine-tuning artifact. In the source set, Google's public documentation is friendlier to export for some model artifacts, Azure exposes downloadable artifacts for open models in its catalog, and AWS is the clearest on the open-weight path because Bedrock supports import of customized open-source models in hosted form. The consequence is that a closed-provider custom model can be yours in the ordinary commercial sense while still not being portable in the operational sense.

Prompt portability is also still contested in practice. One side says frontier models are now similar enough that most prompt assets move with limited revision, especially when the workload stays near ordinary chat, tools, and retrieval. The other side points to provider-specific tool semantics, reasoning scaffolds, and safety behavior that make heavily optimized systems much harder to move than a text export suggests. We think both are true at different depths of integration.

Sources for this answer

Vendor documentation

B.1 Anthropic Docs, Features overview

The Anthropic Claude platform categorizes features by availability, distinguishing between stable, generally available tools recommended for production and beta features that are subject to change or discontinuation.

Generally available (GA) | Feature is stable, fully supported, and recommended for production use.

See Anthropic Docs, Features overview.

Vendor documentation

B.2 DeepSeek-R1 model card

The DeepSeek-R1 model series is released under an MIT license that explicitly permits commercial use, modification, and the creation of derivative works such as distilled models.

DeepSeek-R1 series support commercial use, allow for any modifications and derivative works, including, but not limited to, distillation for training other LLMs.

See DeepSeek-R1 model card.

Vendor documentation

B.3 LangChain Docs, Models

LangChain provides a standardized interface for integrating and swapping various large language model providers while supporting advanced capabilities like tool calling, multi-step reasoning, and local execution.

LangChain supports all major model providers through dedicated integration packages. Each provider package implements the same standard interface, so you can swap providers without rewriting application logic.

See LangChain Docs, Models.

Vendor documentation

B.4 LlamaIndex Docs, Available LLM integrations

LlamaIndex provides native support for a wide range of large language model integrations, including major providers like OpenAI, Anthropic, and Google.

We support integrations with OpenAI, Anthropic, Google, Hugging Face, and more.

See LlamaIndex Docs, Available LLM integrations.

Vendor documentation

B.5 OpenAI, OpenAI Services Agreement

Under the OpenAI Services Agreement, customers retain ownership of their input and are assigned ownership of the output generated by the services, while assuming responsibility for the input's legality and the output's appropriateness.

As between Customer and OpenAI, to the extent permitted by applicable law, Customer: (a) retains all ownership rights in Input; and (b) owns all Output.

See OpenAI, OpenAI Services Agreement.

Vendor documentation

B.6 Anthropic, Commercial Terms of Service

Anthropic's Commercial Terms of Service disclaim warranties regarding the accuracy or completeness of AI-generated outputs and limit the company's liability for damages arising from the use of its services.

Customer acknowledges, and must notify its Users, that factual assertions in Outputs should not be relied upon without independently checking their accuracy, as they may be false, incomplete, misleading or not reflective of recent events or information.

See Anthropic, Commercial Terms of Service.

Vendor documentation

B.7 Google Cloud, Export model artifacts for inference and explanation

Vertex AI requires that model artifacts be exported in specific formats and comply with framework-specific requirements when using prebuilt containers for inference.

To use one of these prebuilt containers, you must save your model as one or more model artifacts that comply with the requirements of the prebuilt container.

See Google Cloud, Export model artifacts for inference and explanation.

Vendor documentation

B.8 Microsoft Learn, Explore Microsoft Foundry Models in Azure Machine Learning

Microsoft distinguishes between its own first-party AI models and third-party models in Azure Machine Learning, with the latter being subject to separate terms and conditions as non-Microsoft products.

Models from providers other than Microsoft are Non-Microsoft Products as defined in Microsoft Product Terms and are subject to the terms provided with the models.

See Microsoft Learn, Explore Microsoft Foundry Models in Azure Machine Learning.

Vendor documentation

B.9 AWS Docs, Use Custom model import to import a customized open-source model into Amazon Bedrock

Amazon Bedrock allows users to import customized open-source foundation models from external environments like Amazon SageMaker AI, provided the import complies with applicable model licenses and uses the required Hugging Face weights format.

You can create a custom model in Amazon Bedrock by using the Amazon Bedrock Custom Model Import feature to import Foundation Models that you have customized in other environments, such as Amazon SageMaker AI.

See AWS Docs, Use Custom model import to import a customized open-source model into Amazon Bedrock.

Vendor documentation

B.10 LangChain Docs, Chat model integrations

LangChain provides a standardized interface for chat models that supports various provider-specific features, including the use of routers and proxies to simplify multi-provider integrations.

Chat models are language models that use a sequence of messages as inputs and return messages as outputs.

See LangChain Docs, Chat model integrations.

Why do retrieval systems make AI vendor switching more expensive?

RAG lock-in usually comes from embeddings, chunking, metadata, reranking, guardrails, and evaluation history rather than the raw corpus. Migration often means re-embedding and retesting retrieval quality.

RAG lock-in is usually embedding lock-in plus evaluation lock-in. The raw corpus is rarely the hard part. The hard part is the combination of chunking, metadata, embedding space, reranking, guardrails, and evaluation history built around one stack. Google currently documents 3072-dimensional Gemini text embeddings. AWS Bedrock knowledge-base materials list different supported embedding models and dimensions across providers and regions. A migration can therefore mean re-embedding the corpus, changing vector-store assumptions, and revalidating retrieval quality. Companies that kept raw corpora, chunking logic, and eval sets outside the managed retrieval layer still face work. Companies that kept none of those outside it face a much more expensive kind of work.
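The two halves of that migration, re-embedding and revalidation, can be sketched in a few lines. This is an illustration under stated assumptions: the 3072 figure comes from the paragraph above, while the model labels, the 1024-dimensional target, and the document ids are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EmbeddingSpace:
    model: str
    dimensions: int


def needs_reembedding(index_space: EmbeddingSpace, target_space: EmbeddingSpace) -> bool:
    """Vectors are only comparable within one model's space.

    A dimension mismatch makes the old index unusable outright; a different
    model with the same dimension still produces incomparable vectors.
    """
    return index_space != target_space


def recall_at_k(retrieved: list, relevant: set, k: int) -> float:
    """Fraction of the relevant document ids found in the top-k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & relevant)
    return hits / len(relevant)


current = EmbeddingSpace("gemini-text-embedding", 3072)   # model label is hypothetical
target = EmbeddingSpace("other-vendor-embedding", 1024)   # hypothetical target
print(needs_reembedding(current, target))

# Hypothetical eval case kept outside the managed retrieval layer.
baseline = recall_at_k(["d1", "d7", "d3"], {"d1", "d3"}, k=3)
after_migration = recall_at_k(["d7", "d1", "d9"], {"d1", "d3"}, k=3)
print(baseline, after_migration)
```

The comparison only works if the labeled queries and relevant-document sets were stored outside the old stack, which is the practical meaning of evaluation lock-in.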

Sources for this answer

Vendor documentation

C.1 Google Cloud, Get text embeddings

Google Cloud's Vertex AI provides dense vector embedding models that facilitate semantic search by representing text meaning rather than relying on direct keyword matching.

Dense vector embedding models use deep-learning methods similar to the ones used by large language models.

See Google Cloud, Get text embeddings.

Vendor documentation

C.2 AWS Docs, Supported models and Regions for Amazon Bedrock knowledge bases

Amazon Bedrock Knowledge Bases provides flexible configuration options for model selection, cross-region inference, and data parsing, while noting that specific feature availability and data handling practices vary by region.

Amazon Bedrock Knowledge Bases also supports the use of inference profiles for parsing data or when generating responses.

See AWS Docs, Supported models and Regions for Amazon Bedrock knowledge bases.

Can AI residency and audit logs create vendor lock-in?

Yes. Residency commitments and audit-log exports can narrow the replacement vendor set even when they look like compliance features.

Residency can become a lock-in term even when no one calls it that. OpenAI now distinguishes between storage residency and inference residency for business customers. Anthropic exposes supported regional controls, but not every product path or feature set carries the same geography promise. AWS, Google, and Azure offer broader regional deployment patterns, but those patterns can move the dependence upward from the model vendor to the cloud vendor. The practical effect is that a compliance-approved design can narrow the substitute set before price or capability enters the analysis.

Observability does not travel cleanly either. OpenAI exposes audit-log and admin APIs. Anthropic offers usage and cost history. Google, Azure, and AWS all publish some mix of audit logs, billing export, CloudTrail, Activity Log, or Log Analytics pathways. That is real portability, but only of one layer. It does not recreate provider-side moderation outcomes, workspace analytics semantics, evaluation history, or the exact way the original platform rendered and classified activity. Historical visibility moves. System behavior usually does not.
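The layer that does move can be kept in a neutral shape so that history survives a provider change. The sketch below assumes two invented record formats; the field names (`effective_at`, `actor_email`, `eventTime`, `principal`, and so on) are placeholders for illustration, not any provider's actual export schema.

```python
from datetime import datetime, timezone


def normalize(provider: str, record: dict) -> dict:
    """Map a provider-shaped audit record onto a small neutral schema.

    The input field names are invented for illustration; real exports differ
    and should be mapped from each provider's own documentation.
    """
    if provider == "vendor_a":
        ts = datetime.fromtimestamp(record["effective_at"], tz=timezone.utc)
        actor, action = record["actor_email"], record["type"]
    elif provider == "vendor_b":
        ts = datetime.fromisoformat(record["eventTime"])
        actor, action = record["principal"], record["eventName"]
    else:
        raise ValueError(f"no mapping for provider {provider!r}")
    return {
        "timestamp": ts.isoformat(),
        "actor": actor,
        "action": action,
        "source": provider,
        "raw": record,  # keep the original so nothing is lost in translation
    }


row_a = normalize("vendor_a", {"effective_at": 0, "actor_email": "a@example.com", "type": "api_key.created"})
row_b = normalize("vendor_b", {"eventTime": "2025-06-10T12:00:00+00:00", "principal": "svc-1", "eventName": "InvokeModel"})
print(row_a["timestamp"], row_b["action"])
```

What the neutral schema cannot hold is the part the paragraph describes as left behind: moderation outcomes, analytics semantics, and the original platform's own classification of activity.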

Sources for this answer

Vendor documentation

D.1 OpenAI Help Center, Data residency and inference residency for ChatGPT

OpenAI provides data and inference residency features for ChatGPT Enterprise and Education customers to restrict the geographic storage and GPU-based processing of customer content, though these controls do not apply to all system data or metadata.

Data residency for ChatGPT allows customers to keep their customer content stored at rest in a specific geographic region.

See OpenAI Help Center, Data residency and inference residency for ChatGPT.

Vendor documentation

D.2 Anthropic Docs, Pricing and data residency notes

Anthropic's pricing structure for its API models incorporates various modifiers, including data residency multipliers, batch processing discounts, prompt caching, and session-based billing for managed agents.

For Claude Opus 4.7, Claude Opus 4.6, and newer models, specifying US-only inference via the inference_geo parameter incurs a 1.1x multiplier on all token pricing categories, including input tokens, output tokens, cache writes, and cache reads.

See Anthropic Docs, Pricing and data residency notes.

Vendor documentation

D.3 AWS Docs, Data protection - Amazon Bedrock

AWS advises users to avoid inputting sensitive information into free-form text fields when using Amazon Bedrock and clarifies that model providers do not have access to customer prompts or completions.

We strongly recommend that you never put confidential or sensitive information, such as your customers' email addresses, into tags or free-form text fields such as a Name field.

See AWS Docs, Data protection - Amazon Bedrock.

Vendor documentation

D.4 Microsoft Learn, Data, privacy, and security for Azure Direct Models in Microsoft Foundry

Microsoft's Azure Direct Models in Foundry maintain strict data privacy protections by ensuring customer prompts, completions, and training data are not used to train foundation models or improve third-party services without explicit authorization.

Your prompts (inputs) and completions (outputs), your embeddings, and your training data: - are NOT available to other customers. - are NOT available to OpenAI or other Azure Direct Model providers. - are NOT used by Azure Direct Model providers to improve their models or services.

See Microsoft Learn, Data, privacy, and security for Azure Direct Models in Microsoft Foundry.

Vendor documentation

D.5 OpenAI API Reference, audit logs and admin endpoints

The OpenAI API provides mechanisms for logging and tracking request identifiers to facilitate troubleshooting and auditability of API interactions.

OpenAI recommends logging request IDs in production deployments for more efficient troubleshooting with our support team, should the need arise.

See OpenAI API Reference, audit logs and admin endpoints.

Vendor documentation

D.6 Anthropic Docs, Usage and Cost API

Anthropic provides a dedicated Admin API that allows organizations to programmatically access granular, historical usage and cost data for monitoring and financial reconciliation purposes.

The Usage & Cost Admin API provides programmatic and granular access to historical API usage and cost data for your organization.

See Anthropic Docs, Usage and Cost API.

Vendor documentation

D.7 Google Cloud, Vertex AI audit logging information

Vertex AI audit logs are categorized into Admin Activity logs, which are enabled by default, and Data Access logs, which require explicit configuration to be recorded.

Admin Activity audit logs are always enabled; you can't disable them.

See Google Cloud, Vertex AI audit logging information.

Vendor documentation

D.8 Microsoft Learn, Activity log in Azure Monitor

Azure Monitor activity logs provide a default, 90-day record of management operations on Azure resources, serving as the exclusive source for identifying the creator of a resource.

Azure Monitor activity logs record management operations on your Azure resources.

See Microsoft Learn, Activity log in Azure Monitor.

Vendor documentation

D.9 AWS Docs, Monitor Amazon Bedrock API calls using CloudTrail

Amazon Bedrock integrates with AWS CloudTrail to log both management and data plane API operations, providing an audit trail of actions taken within the service.

Amazon Bedrock is integrated with AWS CloudTrail, a service that provides a record of actions taken by a user, role, or an AWS service in Amazon Bedrock.

See AWS Docs, Monitor Amazon Bedrock API calls using CloudTrail.

Do open models and abstraction layers prevent AI vendor lock-in?

They can improve leverage, but they do not eliminate switching cost. The company still needs portable artifacts, evaluations, routing rules, and governance outside the vendor boundary.

Open models improve the exit option without making it free. Meta, Mistral, and DeepSeek are now credible enough to matter in negotiations and emergency planning. But an open model becomes a true exit only if the company preserved the surrounding assets outside the provider boundary: source corpus, prompt assets, tokenizer assumptions, safety settings, evaluation data, and deployable model artifacts. Otherwise the lock-in has not disappeared. It has moved from the service agreement to hosting, tuning, and governance work.
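The asset list in that paragraph can be turned into a simple readiness check. This is a minimal sketch; the inventory format and the `"external"` and `"provider_only"` labels are assumptions made for illustration.

```python
# Asset names follow the list in the paragraph above.
REQUIRED_ASSETS = (
    "source_corpus",
    "prompt_assets",
    "tokenizer_assumptions",
    "safety_settings",
    "evaluation_data",
    "model_artifacts",
)


def missing_exit_assets(inventory: dict) -> list:
    """Return required assets that are absent or held only inside the provider.

    Assumption: each inventory value is "external" when a usable copy lives
    outside the provider boundary, and anything else otherwise.
    """
    return [asset for asset in REQUIRED_ASSETS if inventory.get(asset) != "external"]


# Hypothetical inventory for one deployment.
inventory = {
    "source_corpus": "external",
    "prompt_assets": "external",
    "evaluation_data": "provider_only",
    "model_artifacts": "external",
}
print(missing_exit_assets(inventory))
```

An empty result does not make the exit free; it only means the remaining cost is hosting, tuning, and governance work rather than reconstruction.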

Abstraction layers change the location of dependence more than they erase it. LangChain and LlamaIndex are right that a common interface lowers direct API switching cost. Box's multi-model AI posture shows the same idea at the application layer: keep the content and governance layer stable and let the inference provider vary. That is a meaningful reduction in one kind of lock-in. It is not the same thing as zero lock-in. The dependency often moves upward into routing rules, provider adapters, tracing, message schemas, and evaluation harnesses.
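A stripped-down version of the pattern shows both the benefit and where the dependence goes. The sketch below does not use LangChain, LlamaIndex, or any vendor SDK; `ChatProvider`, `VendorA`, `VendorB`, and `Router` are hypothetical stand-ins.

```python
from typing import Protocol


class ChatProvider(Protocol):
    """The common interface the application codes against."""

    def complete(self, prompt: str) -> str: ...


class VendorA:
    def complete(self, prompt: str) -> str:
        return f"[vendor_a] {prompt}"   # stand-in for a real API call


class VendorB:
    def complete(self, prompt: str) -> str:
        return f"[vendor_b] {prompt}"   # stand-in for a real API call


class Router:
    """Application code calls the router, never a vendor SDK directly."""

    def __init__(self, providers: dict, rules: dict, default: str):
        self.providers = providers
        self.rules = rules        # task name -> provider name
        self.default = default

    def complete(self, task: str, prompt: str) -> str:
        name = self.rules.get(task, self.default)
        return self.providers[name].complete(prompt)


router = Router({"a": VendorA(), "b": VendorB()}, rules={"summarize": "b"}, default="a")
print(router.complete("summarize", "hello"))

router.rules["summarize"] = "a"   # the provider switch is a one-line rule change
print(router.complete("summarize", "hello"))
```

The switch itself is cheap, but the `rules` table and the adapter classes are now assets the company has to maintain and test, which is the relocation the paragraph describes.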

Open-weight models may reduce bargaining asymmetry without eliminating total switching cost. They preserve a more credible exit path where weights and artifacts are preserved. They also shift cost into hosting, safety review, monitoring, and regional deployment. That looks less like no lock-in and more like a different lock-in curve.

The last open question is whether abstraction should be understood as lower lock-in or relocated lock-in. It probably depends on what was scarce in the first place. If the bottleneck was a single model API, a neutral orchestration layer can help a lot. If the bottleneck becomes one framework's tracing, routing, and eval conventions, the dependency has not disappeared. It has changed address.

Sources for this answer

Vendor documentation

E.4 LangChain Docs, LangChain overview

LangChain provides a standardized framework for model integration and agent development that facilitates vendor interoperability, rapid deployment, and observability.

LangChain standardizes how you interact with models so that you can seamlessly swap providers and avoid lock-in.

See LangChain Docs, LangChain overview.

Vendor documentation

E.5 LlamaIndex Docs, Using LLMs

LlamaIndex offers a standardized interface for integrating various LLMs and managing tokenization requirements to facilitate application development.

LlamaIndex provides a unified interface for defining LLM modules, whether it’s from OpenAI, Hugging Face, or LangChain, so that you don’t have to write the boilerplate code of defining the LLM interface yourself.

See LlamaIndex Docs, Using LLMs.

Vendor documentation

E.9 DeepSeek-R1 model card

The DeepSeek-R1 model series is released under an MIT license that explicitly permits commercial use, modification, and the creation of derivative works such as distilled models.

DeepSeek-R1 series support commercial use, allow for any modifications and derivative works, including, but not limited to, distillation for training other LLMs.

See DeepSeek-R1 model card.

Vendor documentation

E.8 AWS Docs, Use Custom model import to import a customized open-source model into Amazon Bedrock

You can create a custom model in Amazon Bedrock by using the Amazon Bedrock Custom Model Import feature to import Foundation Models that you have customized in other environments, such as Amazon SageMaker AI.

See AWS Docs, Use Custom model import to import a customized open-source model into Amazon Bedrock.

Commentary

E.1 Meta, Llama

Meta's Llama models provide flexible, open-source artificial intelligence capabilities that support diverse enterprise use cases ranging from multimodal data processing to automated content generation.

The open-source AI models you can fine-tune, distill and deploy anywhere.

See Meta, Llama.

Commentary

E.2 Mistral AI Docs, API Access with AI Studio - Serverless / Other Options

Mistral AI's Studio platform provides developers with programmatic API access to models for building AI applications and managing administrative tasks like key generation and usage monitoring.

Studio gives you programmatic access to Mistral models for text generation, agents, data processing, and more.

See Mistral AI Docs, API Access with AI Studio - Serverless / Other Options.

Vendor documentation

E.3 DeepSeek-V3.2 model materials

The DeepSeek-V3.2 model documentation specifies technical limitations regarding output parsing, role-based constraints, and variant-specific functional capabilities.

The output parsing function included in the code is designed to handle well-formatted strings only. It does not attempt to correct or recover from malformed output that the model might occasionally generate. It is not suitable for production use without robust error handling.

See DeepSeek-V3.2 model materials.

Commentary

E.6 Box, Box AI

Box AI maintains enterprise-grade security and privacy standards by ensuring that user data is protected and not utilized for model training without explicit written consent.

Rely on Box’s existing security and privacy for your most sensitive content generated by Box AI.

See Box, Box AI.

Commentary

E.7 Box Developer Docs, Supported AI models

Box categorizes its supported AI models by access level and capability tier, with certain models requiring specific administrative activation and potentially involving additional data processing terms.

Customer-enabled models | Require activation by Box admins in the Admin Console or a request to Box. Some models may be subject to additional terms or pricing.

See Box Developer Docs, Supported AI models.