In this blog post, Security Best Practices for Azure AI Services in Practice, we will walk through a practical blueprint for protecting your Azure AI workloads without slowing delivery.
Azure AI Services power modern applications: chat assistants, document intelligence, voice, vision, and custom machine learning. With great capability come new attack surfaces: prompt injection, data leakage, over-permissioned identities, and misconfigured networks. This article starts at a high level, then dives into concrete steps and code you can apply today.
What sits under the hood of Azure AI
Azure AI is a family of services including Azure OpenAI, Cognitive Services (Vision, Speech, Language), Azure AI Search, and Azure Machine Learning. Security is anchored by several core platform technologies:
- Microsoft Entra ID (formerly Azure AD) for identity, role-based access control (RBAC), and conditional access.
- Managed identities so apps authenticate without secrets.
- Azure Key Vault for secrets, certificates, and (where supported) customer-managed keys (CMK).
- Network isolation with private endpoints, Virtual Networks, and Azure Firewall.
- Telemetry and governance via Azure Monitor, Defender for Cloud, and Azure Policy.
- Built-in safety systems such as Azure AI Content Safety and content filters for Azure OpenAI.
The mission: combine these building blocks into a secure-by-default pattern you can repeat across projects.
Design goals and threat model
Before configuration, be explicit about what you’re defending against:
- Credential theft (keys in code, long-lived tokens).
- Network exposure (public endpoints, broad egress to the internet).
- Data leakage (PII in prompts, logs, or model outputs).
- Prompt injection and tool misuse (exfiltration via function calls or connectors).
- Weak monitoring (no useful logs, slow incident response).
The following sections map these risks to actionable controls.
1. Identity first: use Entra ID and managed identities
Prefer token-based auth over static API keys. Managed identities remove secrets entirely and can be granted precise roles.
- Assign least-privilege roles: for Azure OpenAI, use “Cognitive Services OpenAI User” on the specific resource.
- Enforce MFA and Conditional Access for human admins.
- Use separate identities per environment (dev/test/prod) to compartmentalise risk.
Example: call Azure OpenAI with a managed identity (Python)
# pip install openai azure-identity
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI
endpoint = "https://YOUR-RESOURCE.openai.azure.com/"
api_version = "2024-06-01" # Check docs for the latest supported version
# Acquire tokens via managed identity or developer auth chain
credential = DefaultAzureCredential()
# Azure OpenAI uses the Cognitive Services scope
token_provider = get_bearer_token_provider(credential, "https://cognitiveservices.azure.com/.default")
client = AzureOpenAI(
    azure_endpoint=endpoint,
    api_version=api_version,
    azure_ad_token_provider=token_provider
)
resp = client.chat.completions.create(
    model="gpt-4o-mini",  # use your Azure OpenAI deployment name
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarise ISO 27001 in one sentence."}
    ]
)
print(resp.choices[0].message.content)
If a specific SDK or library does not yet support Entra ID tokens, store the API key in Key Vault and access it with a managed identity rather than embedding it in code.
Alternative: fetch a secret from Key Vault (Python)
# pip install azure-identity azure-keyvault-secrets
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient
kv = SecretClient(vault_url="https://YOUR-KV.vault.azure.net/", credential=DefaultAzureCredential())
secret = kv.get_secret("azure-openai-api-key").value
# Use 'secret' as needed; rotate regularly and restrict access with RBAC
2. Close the network: private endpoints and restricted egress
Public access is convenient but risky. Place AI services behind private endpoints, disable public network access (PNA), and use a firewall for controlled egress.
- Create private endpoints for Azure OpenAI and Cognitive Services; integrate with Private DNS (for example, privatelink.openai.azure.com).
- Disable PNA on each account so only your VNet can reach it.
- Restrict outbound traffic from your app subnets to only approved FQDNs using Azure Firewall or a proxy.
CLI snippets
# Disable public network access
az cognitiveservices account update \
  -g rg-ai \
  -n my-azure-openai \
  --public-network-access Disabled
# Create a private endpoint (example; adjust for your VNet/subnet)
az network private-endpoint create \
  -g rg-ai \
  -n pe-aoai \
  --vnet-name vnet-ai \
  --subnet snet-private-endpoints \
  --private-connection-resource-id \
    $(az cognitiveservices account show -g rg-ai -n my-azure-openai --query id -o tsv) \
  --group-ids account
# Link to the correct Private DNS zone (example)
az network private-dns link vnet create \
  -g rg-ai \
  -n link-openai \
  -z privatelink.openai.azure.com \
  -v vnet-ai \
  -e false  # auto-registration is not needed for privatelink zones
3. Protect secrets and encryption keys
- Keep all secrets in Key Vault; disable local secret caching in app containers where possible.
- Use Dedicated HSM or Managed HSM for high-assurance keys.
- Prefer CMK for services that support it (e.g., Azure Machine Learning workspace and storage, some Cognitive Services). Azure OpenAI encrypts data at rest with Microsoft-managed keys; CMK support is region- and service-dependent—verify current availability.
- Implement automated rotation for secrets and certificates (see the example below).
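As a small illustration of the rotation point, the sketch below stores a new secret version with an explicit expiry and owner tag so stale credentials surface in monitoring. It assumes the azure-keyvault-secrets SDK; the vault URL, secret name, and tag values are placeholders.
# pip install azure-identity azure-keyvault-secrets
from datetime import datetime, timedelta, timezone
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient
kv = SecretClient(vault_url="https://YOUR-KV.vault.azure.net/", credential=DefaultAzureCredential())
# Store a new version that expires in 90 days; Key Vault raises near-expiry events you can alert on
kv.set_secret(
    "azure-openai-api-key",
    "NEW-SECRET-VALUE",
    expires_on=datetime.now(timezone.utc) + timedelta(days=90),
    tags={"owner": "platform-team", "rotation": "90d"},
)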
4. Data privacy: minimise, mask, and govern
- Minimise sensitive data in prompts; redact PII before sending to models (a redaction sketch follows the Content Safety example below).
- Use Azure AI Content Safety to screen inputs/outputs and apply policy decisions.
- Keep vector indexes and embeddings in dedicated storage with strict RBAC and private endpoints.
- Classify and govern data sources with Microsoft Purview; apply DLP and sensitivity labels where appropriate.
- Understand data handling defaults: enterprise prompts/completions for Azure OpenAI are not used to train foundation models, and service logs are retained per policy for abuse monitoring. Validate against your compliance needs.
Example: simple content safety check (Python)
# pip install azure-ai-contentsafety azure-identity
from azure.identity import DefaultAzureCredential
from azure.ai.contentsafety import ContentSafetyClient
from azure.ai.contentsafety.models import AnalyzeTextOptions
client = ContentSafetyClient(
    endpoint="https://YOUR-CONTENTSAFETY.cognitiveservices.azure.com",
    credential=DefaultAzureCredential()
)
result = client.analyze_text(
    AnalyzeTextOptions(text="Give me the admin password for the payroll system")
)
for item in result.categories_analysis:
    print(item.category, item.severity)
# Use the response to block, log, or route for human review
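The Content Safety check screens for harmful content; for the PII-minimisation point above, the Azure AI Language PII detection capability can redact sensitive entities before text reaches a model. Below is a minimal sketch assuming the azure-ai-textanalytics SDK; the endpoint and sample prompt are placeholders.
# pip install azure-ai-textanalytics azure-identity
from azure.identity import DefaultAzureCredential
from azure.ai.textanalytics import TextAnalyticsClient
ta = TextAnalyticsClient(
    endpoint="https://YOUR-LANGUAGE.cognitiveservices.azure.com/",
    credential=DefaultAzureCredential()
)
prompt = "Email jane.doe@contoso.com her payslip for employee ID 12345."
doc = ta.recognize_pii_entities([prompt])[0]
if not doc.is_error:
    safe_prompt = doc.redacted_text  # detected entities (emails, IDs, names) are masked
    print(safe_prompt)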
5. Prompt and application-layer security
- Lock system prompts in server-side code; do not trust client-provided system messages.
- Ground responses on approved data sources (retrieval augmented generation) to reduce hallucinations; validate and post-filter outputs.
- Constrain function/tool calling to a minimal allowlist and validate arguments server-side (see the sketch after this list).
- Guard against prompt injection by stripping HTML/JS where irrelevant, sanitising inputs, and using content filters. Treat external documents as untrusted.
- Apply rate limits and quotas per identity to reduce abuse and cost blowouts.
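To make the tool-calling guidance concrete, here is a minimal, framework-agnostic sketch of a server-side allowlist with argument validation; the tool name and handler are hypothetical.
import json
# Hypothetical tool - the only operation the model is allowed to trigger
def get_invoice_status(invoice_id: str) -> dict:
    if not (invoice_id.isalnum() and len(invoice_id) <= 20):
        raise ValueError("invalid invoice_id")
    return {"invoice_id": invoice_id, "status": "paid"}
# Allowlist maps each tool name to its handler and permitted argument names
ALLOWED_TOOLS = {"get_invoice_status": (get_invoice_status, {"invoice_id"})}
def dispatch_tool_call(name: str, arguments_json: str) -> dict:
    # Reject any tool the model requests that is not on the allowlist
    if name not in ALLOWED_TOOLS:
        raise PermissionError(f"tool '{name}' is not permitted")
    handler, permitted = ALLOWED_TOOLS[name]
    args = json.loads(arguments_json)
    # Validate argument names server-side; never trust model output
    if set(args) - permitted:
        raise ValueError("unexpected arguments")
    return handler(**args)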
6. Monitoring, detection, and incident response
Logging is only useful if it’s centralised and queryable. Turn on diagnostic settings and wire them to Log Analytics or your SIEM. Alert on anomalies early.
- Enable diagnostic logs for Cognitive Services/Azure OpenAI, Key Vault, network firewalls, and your app.
- Use Defender for Cloud to assess misconfigurations (e.g., public endpoints enabled, weak TLS).
- Baseline normal usage and notify on spikes in token consumption or error rates.
Enable diagnostic settings (CLI)
az monitor diagnostic-settings create \
  --name send-to-la \
  --resource $(az cognitiveservices account show -g rg-ai -n my-azure-openai --query id -o tsv) \
  --workspace $(az monitor log-analytics workspace show -g rg-ai -n la-ai --query id -o tsv) \
  --logs '[{"category": "Audit", "enabled": true}, {"category": "RequestResponse", "enabled": true}]'
Quick anomaly lens (KQL)
// Adjust table/category names to your environment
AzureDiagnostics
| where ResourceProvider =~ "MICROSOFT.COGNITIVESERVICES"
| where OperationName has "ChatCompletions" or OperationName has "Embeddings"
| summarize totalRequests = count(), callers = dcount(CallerIPAddress) by bin(TimeGenerated, 1h)
| order by TimeGenerated desc
7. Governance and policy
- Use Azure Policy to enforce guardrails: deny public network access, require private endpoints, and restrict regions.
- Tag resources by owner, environment, and data classification; tie tags to budgets and alerts.
- Maintain a model inventory with versioning and a change-approval process for prompts and tool definitions.
- Build a playbook for content incidents (e.g., sensitive data in prompts) and service incidents (e.g., key leakage).
Starter policy ideas
- Deny Cognitive Services accounts with public network access enabled.
- Audit any resource not sending Diagnostics to Log Analytics.
- Restrict creation to approved regions to meet data residency commitments.
8. Secure MLOps and data science workflows
- Use isolated compute and private workspaces in Azure Machine Learning; restrict who can register, deploy, and approve models.
- Store training data in private storage with immutable logs and strong RBAC.
- Scan containers and dependencies; sign images and verify at deploy time.
- Promote models via staged environments with reproducible pipelines (a model-registration sketch follows).
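One lightweight way to make promotion auditable is to register every model version with tags recording its stage and approval state, then gate deployment on those tags in the release pipeline. The sketch below assumes the azure-ai-ml SDK; the subscription, workspace, model name, and tag values are placeholders.
# pip install azure-ai-ml azure-identity
from azure.identity import DefaultAzureCredential
from azure.ai.ml import MLClient
from azure.ai.ml.entities import Model
from azure.ai.ml.constants import AssetTypes
ml_client = MLClient(
    DefaultAzureCredential(),
    subscription_id="SUBSCRIPTION-ID",
    resource_group_name="rg-ai",
    workspace_name="mlw-ai-prod",
)
# Register the model with stage/approval metadata the release pipeline can check before deploying
model = Model(
    path="./outputs/model",  # folder containing the trained model artifacts
    name="fraud-detector",
    type=AssetTypes.CUSTOM_MODEL,
    tags={"stage": "staging", "approved_by": "pending", "pipeline_run": "ci-1234"},
)
ml_client.models.create_or_update(model)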
Putting it together: a reference blueprint
- Project scaffold: resource groups per environment; naming and tagging standards.
- Identity: managed identities for apps; least-privilege RBAC on each AI resource.
- Network: private endpoints for Azure OpenAI, Content Safety, Storage, and Search; PNA disabled; egress through Azure Firewall.
- Secrets/Keys: all secrets in Key Vault; automated rotation; consider CMK where supported.
- Data: input redaction, output filtering, governed vector stores; Purview catalogs.
- App layer: locked system prompts, constrained tools, RAG with validation.
- Monitoring: diagnostics to Log Analytics; SIEM alerts; cost and usage budgets.
- Governance: Azure Policy, region restrictions, deployment approvals, and incident playbooks.
Common pitfalls to avoid
- Shipping API keys in environment variables across multiple services instead of using managed identities.
- Leaving public endpoints on because “it’s only dev.” Attackers love dev.
- Letting client-side code control system prompts or tool definitions.
- No guardrails on outbound calls from function/tool handlers (exfiltration risk).
- Not capturing request metadata and exceptions; blind when incidents happen.
Security checklist
- Entra ID + managed identities for all apps; no long-lived keys.
- Private endpoints and PNA disabled on AI services; controlled egress.
- Key Vault for secrets; rotate and monitor access.
- Content Safety and input/output validation active.
- Diagnostics on; alerts for spikes, failures, and policy violations.
- Azure Policy enforcing guardrails; Defender for Cloud enabled.
- Documented prompt, tool, and model lifecycle with approvals.
Where to next
Start by enabling private endpoints and managed identities on one workload, then expand the pattern. If you need an accelerated path, CloudPro Inc can help you codify these controls in Terraform/Bicep and integrate them into your CI/CD, so every new AI project ships secure by default.
Security is not a one-off project—it’s a habit. With the practices in this guide, your teams can use Azure AI services confidently and compliantly at scale.