SharePoint: Optical Character Recognition (OCR) Backfill for SharePoint Sites

🚨 The Signal: SharePoint Online is introducing, in private preview, an admin capability to backfill Optical Character Recognition (OCR) for existing documents. Backfilling improves content discoverability and extends data loss prevention (DLP) coverage to previously unindexed files, which strengthens data governance and security posture.

The Impact

This change mainly affects tenant admins. Leaving it unused carries a low security risk, but enabling it brings a high benefit for data discoverability and protection.

  • Tenant Admins: Can improve data discoverability and DLP coverage for previously unindexed content.
  • Security Teams: Benefit from enhanced Purview policy enforcement on a broader range of documents.
  • Compliance Officers: Gain better visibility and control over sensitive information within legacy documents.

The Action

  1. Identify SharePoint Online sites with legacy documents requiring OCR processing.
  2. Contact the Microsoft feature team to enable the private preview for your tenant.
  3. Use the new admin capability to start the OCR backfill on the selected SharePoint sites.
  4. Verify that OCR processing has completed and that the affected documents show improved searchability and DLP coverage.
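
Step 1 can be roughed out offline before the preview is enabled. The sketch below is an illustration only, not part of the feature: it assumes you already have a file inventory exported as (site URL, file path) pairs, and it counts files whose extensions commonly hold scanned or image-only content. The function name, the inventory shape, the contoso URLs, and the extension list are all hypothetical choices, so adjust them to whatever your own export and the preview documentation specify.

```python
from collections import Counter
from pathlib import PurePosixPath

# Assumption: extensions that often hold scanned or image-only content.
# Adjust this set to match the file types the preview actually processes.
OCR_CANDIDATE_EXTENSIONS = {".pdf", ".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff"}


def rank_sites_for_ocr_backfill(inventory, min_candidates=1):
    """Rank sites by how many OCR-candidate files they contain.

    inventory: iterable of (site_url, file_path) tuples taken from a
    hypothetical inventory export; no SharePoint API is called here.
    Returns a list of (site_url, candidate_count), highest count first.
    """
    counts = Counter()
    for site_url, file_path in inventory:
        # Compare extensions case-insensitively (".PDF" and ".pdf" both match).
        if PurePosixPath(file_path).suffix.lower() in OCR_CANDIDATE_EXTENSIONS:
            counts[site_url] += 1

    ranked = [(site, n) for site, n in counts.items() if n >= min_candidates]
    # Sort by descending count, then by URL so the order is deterministic.
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    sample_inventory = [
        ("https://contoso.sharepoint.com/sites/finance", "Invoices/2015/scan-001.PDF"),
        ("https://contoso.sharepoint.com/sites/finance", "Invoices/2015/scan-002.tiff"),
        ("https://contoso.sharepoint.com/sites/hr", "Policies/handbook.docx"),
        ("https://contoso.sharepoint.com/sites/legal", "Contracts/signed-page.jpg"),
    ]
    for site, count in rank_sites_for_ocr_backfill(sample_inventory):
        print(f"{site}: {count} candidate file(s)")
```

Extension matching is only a rough proxy: a PDF may already contain a text layer, so treat the ranking as a way to prioritise which sites to review first, not as a definitive list.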

Domain: SharePoint · Impact: medium · Workload: SharePoint