Beyond E5: the extra Purview features worth knowing about
Microsoft Purview now has 15+ features that bill through Azure consumption, not your E5 licence. Most organisations have no idea they exist - or that some are already running. Here is what to watch and how to track the cost.
The PAYG layer most people ignore
Your E5 licence covers a lot: Information Protection, DLP, Insider Risk Management, and eDiscovery. But Microsoft has been quietly building a parallel billing layer of pay-as-you-go (PAYG) features that are charged through Azure consumption instead of your licence.
Most E5 admins think they have everything. They do not. Here is the list, grouped by category:
Data security:
- Data Security Investigations - storage and compute for investigation workspaces
- Information Protection - sensitivity labels applied to non-M365 sources (AWS, Azure SQL, Fabric, Box, Dropbox, Google Drive)
- Insider Risk Management - cloud and generative AI policy indicators for non-M365 locations
AI and generative AI compliance (M365 Copilot excluded - these apply to non-Microsoft AI like ChatGPT, Gemini, and third-party agents):
- Audit - audit logs for user interactions with non-Microsoft AI apps
- Communication Compliance - detect inappropriate or risky AI interactions (standard and premium tiers)
- Data Lifecycle Management - retention policies for AI prompts and responses
- eDiscovery - storage and export of non-M365 AI interaction data
Other:
- On-demand classification - scan historical files that auto-labelling never touched
- OCR - read sensitive content inside images and scanned PDFs
- Collection policies - control which signals get ingested into Purview
- Network Data Security - monitor data sent to websites, cloud apps, and AI apps via a SASE partner (Microsoft's own Global Secure Access is currently file-only). See Inline web DLP explained for the deep dive.
- DLP for cloud apps in Edge for Business - DLP enforcement in the browser for unmanaged cloud apps. Managed Entra-connected apps sit under Enterprise apps and Devices instead. See Inline web DLP explained.
- Data Security for Gen AI Applications - classification and protection of sensitive content in non-M365 AI interactions
See the full billing models breakdown on Microsoft Learn for units of measure and pricing details. Use the Azure pricing calculator to estimate costs for your tenant.
Why are these pay-as-you-go? Microsoft does not explicitly say, but it most likely comes down to compute and storage. These features require processing that goes beyond what a flat per-user licence covers. On-demand classification re-reads historical content at rest and runs it through classifiers. OCR runs text extraction on every image. AI governance intercepts and analyses interactions with third-party services. The processing cost to Microsoft varies massively between tenants depending on data volume, so consumption-based billing makes sense for Microsoft.
Three of these are worth breaking down because they fill the biggest gaps for most E5 tenants. But first - the AI meters are worth understanding.
The AI meters catching people off guard
Microsoft is making AI governance features consumption-based rather than licence-included. If your organisation uses ChatGPT, Google Gemini, Copilot Studio agents, or any non-Microsoft AI tools and you want the same compliance controls you have for M365, you pay per interaction.
Audit logging, Communication Compliance scanning, retention policies, and eDiscovery for non-Microsoft AI interactions all have their own PAYG meters. Network Data Security and DLP for Edge for Business add further meters for monitoring data sent to unmanaged cloud and AI apps from endpoints.
One thing to be clear on: M365 Copilot interactions are excluded from these PAYG charges. Your existing licence covers Copilot. Everything else, you pay for.
Verdict: If you are not governing non-Microsoft AI usage yet, this is not urgent. But if you are - or plan to - budget for it before you turn anything on.
On-demand classification
The gap: Auto-labelling is event-driven - it only evaluates files when they are created or modified. Anything sitting untouched in SharePoint for years has never been classified. With Copilot now surfacing everything a user has access to, that unclassified data is a live risk.
What it does: Targeted scans against SharePoint, OneDrive, and endpoints. You define the scope, pick your classifiers, and Purview estimates the cost before you commit. Billed per 10,000 assets scanned.
What to watch out for:
- This is not a quiet background scan. Every scanned file gets evaluated against all your active DLP, Information Protection, Data Lifecycle, and Insider Risk policies. A broad scan on historical data can trigger thousands of alerts, apply labels, start retention clocks, and generate IRM signals - all on documents users have not touched in years
- Files in cold storage are invisible. SharePoint moves inactive content to cold storage automatically. On-demand classification cannot reach it. Your oldest, most dormant files - the ones most likely to be unclassified - may be silently skipped
- Scans are capped at 50,000 locations and 20 million files. Large tenants need multiple scans
Verdict: Worth it if you have years of unclassified content and are preparing for Copilot. Start with your highest-risk sites, not all of SharePoint.
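For planning, the caps and billing unit above reduce to simple arithmetic. The sketch below is a rough aid, not Purview's own estimator: `price_per_unit` is a placeholder you would fill in from the Azure pricing calculator, pro-rata billing is an assumption, and the scan count is only a lower bound because real sites do not split evenly.

```python
import math

# Caps and billing unit as quoted in this article; verify against current docs.
MAX_LOCATIONS_PER_SCAN = 50_000
MAX_FILES_PER_SCAN = 20_000_000
ASSETS_PER_BILLING_UNIT = 10_000


def scans_needed(locations: int, files: int) -> int:
    """Lower bound on the number of scans so neither cap is exceeded."""
    by_locations = math.ceil(locations / MAX_LOCATIONS_PER_SCAN)
    by_files = math.ceil(files / MAX_FILES_PER_SCAN)
    return max(by_locations, by_files, 1)


def estimated_cost(files: int, price_per_unit: float) -> float:
    """Pro-rata cost. price_per_unit is a hypothetical input, not a published rate."""
    return files / ASSETS_PER_BILLING_UNIT * price_per_unit


if __name__ == "__main__":
    # Hypothetical tenant: 120,000 locations and 45 million files in scope.
    print(scans_needed(locations=120_000, files=45_000_000))
    print(estimated_cost(files=45_000_000, price_per_unit=2.0))
```

Treat the result as a sanity check on the estimate Purview shows before you commit, not a replacement for it.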
OCR
The gap: Without OCR, someone photographs a screen with credit card numbers and DLP sees nothing. Scanned contracts, ID cards, receipts - all invisible to classification. This is one of the easiest DLP bypasses and users do not even need to be malicious.
What it does: Extends your existing SITs, trainable classifiers, and document fingerprints to scan text inside images. Works across Exchange, SharePoint, OneDrive, Teams, and endpoints. No new classifiers needed.
What it costs: $1 per 1,000 images. Each PDF page counts separately. 2,500 images per month free. Purview caches results per location.
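To make those rates concrete, here is a minimal sketch of the monthly arithmetic. It uses only the figures quoted above and assumes pro-rata billing, with no allowance for caching, so it will tend to overestimate. The function name and the example volumes are illustrative.

```python
FREE_IMAGES_PER_MONTH = 2_500
PRICE_PER_1000_IMAGES = 1.00  # USD, as quoted in this article


def monthly_ocr_cost(standalone_images: int, pdf_pages: int) -> float:
    """Estimate one month's OCR charge; each PDF page counts as one image."""
    scanned = standalone_images + pdf_pages
    billable = max(0, scanned - FREE_IMAGES_PER_MONTH)
    # Assumption: billing is pro-rata rather than rounded up to whole thousands.
    return billable / 1000 * PRICE_PER_1000_IMAGES


if __name__ == "__main__":
    # Hypothetical month: 40,000 images plus 12,500 scanned PDF pages.
    print(f"${monthly_ocr_cost(40_000, 12_500):.2f}")
```

The point of the exercise is the PDF term: a modest number of long scanned documents can outweigh every screenshot in the tenant.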
What to watch out for:
- The cost estimator can only be run once per tenant, ever. You get one 30-day window within 90 days. Dashboards reset if you restart. Reports are deleted after 90 days. Plan your scope before you enable it - you will not get a second chance
- The estimator has blind spots. It can estimate images inside PDFs only for Exchange and Teams, not for SharePoint and OneDrive. Your actual bill may be higher than the estimate
- Volume adds up. Enable OCR across all locations and every image in every email and every screenshot on every device gets scanned
Verdict: Essential if you run DLP on endpoints or Exchange. Start there, use the estimator carefully, expand later.
Collection policies
The gap: Enable endpoint monitoring or expand to non-M365 sources and you start ingesting a firehose of events. Every file copy, print, and cloud upload. Activity Explorer fills with noise. Insider Risk processes signals you do not care about.
What it does: Collection policies filter which events from which data sources get ingested into Purview. A policy is built from three parts:
- Conditions - what data to detect. Content containing specific classifiers, file extensions, or document size thresholds
- Activities - what user actions to capture. 19 device activities (USB copy, print, cloud upload, Bluetooth, delete) plus text and files sent to or received from unmanaged apps
- Data sources - where to apply. Devices, Copilot experiences, Enterprise AI, unmanaged cloud apps (via browser or network detection)
What to watch out for:
- Collection policies override Insider Risk indicators. If your collection policy filters out an activity type, IRM cannot see it - even if the indicator is enabled. Audit your IRM indicators against your collection policy filters before deploying
- No conditions on devices means everything gets collected. Other data sources default to classifier matches only, but devices collect all data regardless. Be explicit
- Some data sources are PAYG. Copilot experiences, Enterprise AI, and unmanaged cloud app activity via network data security all bill through Azure consumption
Verdict: Necessary if you have enabled Always Audit on endpoints or are expanding to non-M365 sources. Without collection policies, you either collect everything and drown in noise, or collect nothing from sources outside M365.
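The three-part policy shape and the IRM audit described above can be sketched as a small data model. This is an illustration, not the Purview schema or API: the class, field names, activity keys, and indicator names are all hypothetical. It also simplifies by ignoring which data source each activity came from.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CollectionPolicy:
    """Illustrative model only; field names are not Purview's."""
    name: str
    data_sources: set[str]   # where the policy applies
    activities: set[str]     # user actions the policy captures
    conditions: set[str] = field(default_factory=set)  # classifiers, extensions, size rules


def blind_indicators(policies: list[CollectionPolicy],
                     indicator_activities: dict[str, str]) -> dict[str, str]:
    """Return enabled IRM indicators whose underlying activity no policy captures."""
    collected: set[str] = set()
    for policy in policies:
        collected |= policy.activities
    return {name: activity
            for name, activity in indicator_activities.items()
            if activity not in collected}


def unconditioned_device_policies(policies: list[CollectionPolicy]) -> list[str]:
    """Device policies with no conditions collect everything; list them by name."""
    return [p.name for p in policies
            if "devices" in p.data_sources and not p.conditions]


if __name__ == "__main__":
    policies = [CollectionPolicy("endpoints", {"devices"}, {"usb_copy", "print"})]
    indicators = {"Printing documents": "print", "Uploading to cloud": "cloud_upload"}
    print(blind_indicators(policies, indicators))
    print(unconditioned_device_policies(policies))
```

Running this kind of check against an export of your real indicators and filters is one way to catch an IRM blind spot before the policy goes live.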
Track it in the Usage Center
Most of these features can be monitored and controlled from one place. The Purview Usage Center (currently in preview) gives you two views:
Pay-as-you-go tab - shows total units consumed by feature and workload. Pausable features (On-demand classification, OCR, Information Protection, Audit, Communication Compliance, Data Lifecycle Management, eDiscovery) can be toggled between active and paused directly from this page. If something is running up costs, you can stop it without deleting the policy.
Premium usage tab - shows two things most admins miss. Policy Scoping tells you how many unlicensed users are in scope of premium features - useful for catching under-licensing before Microsoft does. License Assignment shows licensed users who are not under any policy scope - meaning you are paying for licences nobody is using.
What to do: Check the Usage Center monthly. Set Azure budget alerts on your Purview resource group. If you turned something on for testing and forgot about it, this is where you will find it.
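A monthly check can be as simple as the sketch below. It assumes you have copied month-to-date spend per feature into a dictionary by hand or from a cost export. The input shape, the 80% alert threshold, and the function name are assumptions; only the list of pausable features comes from this article.

```python
# Features the article lists as pausable from the Usage Center.
PAUSABLE = {
    "On-demand classification", "OCR", "Information Protection", "Audit",
    "Communication Compliance", "Data Lifecycle Management", "eDiscovery",
}


def review(spend_by_feature: dict[str, float], monthly_budget: float,
           alert_at: float = 0.8) -> tuple[bool, list[tuple[str, float, bool]]]:
    """Return (alert?, features ranked by spend with a can-pause flag)."""
    total = sum(spend_by_feature.values())
    alert = total >= monthly_budget * alert_at
    # Largest spenders first, so the review starts with what matters most.
    ranked = sorted(spend_by_feature.items(), key=lambda kv: kv[1], reverse=True)
    return alert, [(name, cost, name in PAUSABLE) for name, cost in ranked]


if __name__ == "__main__":
    # Hypothetical month-to-date figures in USD.
    alert, rows = review({"OCR": 120.0, "Network Data Security": 300.0}, 500.0)
    print("Budget alert:", alert)
    for name, cost, can_pause in rows:
        print(f"{name}: ${cost:.2f} (pausable: {can_pause})")
```

This does not replace an Azure budget alert; it just makes the monthly review repeatable.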
The bottom line - it comes down to risk appetite
The FAIR (Factor Analysis of Information Risk) framework is worth thinking about here. Instead of asking "should we turn this on?" ask "what is the financial exposure if we do not?" Take 500,000 unclassified files in SharePoint with Copilot about to surface them: the cost of an on-demand scan is trivial against a data exposure incident. Or take DLP that cannot see text in scanned documents: the cost of OCR is a rounding error against a regulatory fine.
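That comparison can be written down as a back-of-envelope calculation. The sketch below is a heavily simplified, point-estimate reading of FAIR (loss event frequency multiplied by loss magnitude). A real FAIR analysis works with ranges rather than single numbers, and every figure in the example is a placeholder, not data.

```python
def annualised_loss_exposure(events_per_year: float, loss_per_event: float) -> float:
    """Point estimate: expected loss events per year times loss per event."""
    return events_per_year * loss_per_event


def worth_enabling(ale_without: float, ale_with: float,
                   annual_control_cost: float) -> bool:
    """True when the reduction in exposure exceeds what the control costs."""
    return (ale_without - ale_with) > annual_control_cost


if __name__ == "__main__":
    # Hypothetical inputs: one exposure incident every four years without
    # the control, one every eight years with it, 200,000 per incident.
    without = annualised_loss_exposure(0.25, 200_000)
    with_control = annualised_loss_exposure(0.125, 200_000)
    print(worth_enabling(without, with_control, annual_control_cost=1_200))
```

Even rough inputs are usually enough to show whether the PAYG charge is in the same order of magnitude as the exposure it reduces.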
Most organisations deploying Copilot or handling regulated data cannot afford to leave these gaps open. Turn features on deliberately, monitor them in the Usage Center, and pause anything that is not earning its keep.
Scope tightly. Estimate first. Set Azure budget alerts. Expand gradually.
Browse all 400+ built-in classifiers in the Data Classifier Explorer and plan which ones you need.