Microsoft Purview compliance portal: Optical Character Recognition (OCR) support for embedded images in Exchange Online and Microsoft Teams

🚨 The Signal: Microsoft Purview now performs Optical Character Recognition (OCR) on images embedded within various file types, including PDFs, Office documents, and archives. Text in those images, which was previously not indexed, becomes searchable and available to eDiscovery.

The Impact

Sensitive data contained in embedded images is now discoverable, so security and compliance teams should review their existing eDiscovery scopes.

  • Security teams: Increased risk of sensitive data exposure if images contain unmanaged information.
  • Compliance teams: Enhanced ability to identify and retrieve relevant information for legal and regulatory requests.
  • Data owners: Previously 'hidden' data in images is now discoverable, requiring awareness.
  • Legal teams: Broader scope for eDiscovery searches, which may increase review effort.

The Action

  1. Review existing Microsoft Purview eDiscovery and Content Search scopes for unintended data exposure.
  2. Evaluate current data loss prevention (DLP) policies to ensure embedded image content is adequately protected.
  3. Inform data owners about the enhanced discoverability of embedded image content.
  4. Update information governance policies to account for OCR capabilities in embedded images.
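
To make the DLP concern in step 2 concrete, the sketch below shows how text recovered from an image by OCR could match a sensitive-data pattern once it is available as plain text. It is an illustration only and does not represent how Purview classifies content: the function names, the regular expression, and the sample string are all hypothetical, and the card number used is a widely published test value.

```python
import re

# Hypothetical pattern (an assumption for this sketch): 13 to 16 digits,
# optionally separated by single spaces or hyphens.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(ocr_text: str) -> list[str]:
    """Return Luhn-valid, card-like numbers found in OCR-extracted text."""
    hits = []
    for match in CARD_CANDIDATE.finditer(ocr_text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


if __name__ == "__main__":
    # Stand-in for text that OCR might recover from a screenshot in a document.
    sample = "Card: 4111 1111 1111 1111 exp 12/27"
    print(find_card_numbers(sample))
```

The checksum step is there to cut false positives: a digit run of the right length that fails the Luhn check is ignored. A similar trade-off between broad matching and review effort applies when tuning policies for image-derived text.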

Domain: Purview · Impact: medium · Workload: Microsoft Purview