Mastering AI: Unveiling the Power of MS Purview Trainable Classifiers

Luc Marolt

October 24, 2023

Today, any substantive debate on a topic soon turns to the influence of Artificial Intelligence (AI) and what it could mean in that specific context.

In the realm of Information Governance (IG), Microsoft has demonstrated the importance of AI through MS Purview. In this article, we zero in on one tool that marks a real shift in detecting breaches in how information is used and shared within an organization, measured against public, ethical, and corporate standards. Using Trainable Classifiers (TC) to strengthen information protection, retention, and communication compliance is a practical illustration of what AI makes possible.

Technically speaking, a Microsoft Purview trainable classifier is a tool you can train to recognize various types of content by giving it samples to look at. Once it is trained, you can use it to identify items for the application of Office sensitivity labels, communication compliance policies, and retention label policies.

What does this mean on the practical side of Information Governance (IG) and Protection?

In the catalogue of MS Purview’s conventional tools for steering Information Governance (IG) and data protection, Sensitive Information Types (SIT) have played, and still play, a pivotal role. Put simply, a SIT combines keywords, patterns, and a confidence level to identify valuable data such as credit card numbers, driver’s license details, medical records, and much more. These predefined, structured combinations are rule-based and leave no room for AI to intervene. With Trainable Classifiers, however, the landscape becomes a playground where AI can redefine the rules.
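To make the SIT idea concrete, here is a minimal Python sketch of a rule-based detector: a pattern, a checksum, and supporting keywords that raise the confidence of a match. It is an illustration only, not how Purview implements its SITs. The names `CARD_PATTERN`, `KEYWORDS`, `PROXIMITY`, and `detect_card_numbers`, as well as the keyword list and the 300-character window, are assumptions chosen for the example.

```python
import re

# Hypothetical, simplified stand-in for a Sensitive Information Type:
# a primary pattern, supporting keywords, and a confidence rating.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){15}\d\b")  # 16 digits, optional separators
KEYWORDS = ("credit card", "card number", "visa", "mastercard")  # assumed list
PROXIMITY = 300  # characters searched around a match for keywords (assumed value)


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def detect_card_numbers(text: str) -> list[tuple[str, str]]:
    """Return (card_number, confidence) pairs found in the text."""
    results = []
    lowered = text.lower()
    for m in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", m.group())
        if not luhn_valid(digits):
            continue  # the pattern matched, but the checksum rules it out
        window = lowered[max(0, m.start() - PROXIMITY): m.end() + PROXIMITY]
        # A supporting keyword near the match raises the confidence.
        confidence = "high" if any(k in window for k in KEYWORDS) else "low"
        results.append((digits, confidence))
    return results


if __name__ == "__main__":
    print(detect_card_numbers("Please charge my credit card 4111 1111 1111 1111 today."))
```

Everything in this sketch is fixed in advance by whoever writes the rule: the same text always produces the same result, and content that does not fit the pattern is never found.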

Consider a scenario where you want to be informed when a user sends an email containing ‘abusive or discriminatory language,’ or where you want to intercept and block such emails before they go out. What happens when a user shares a document that shows signs of ‘corporate sabotage’? Is it even possible to block storing or sharing any document in SharePoint or OneDrive that contains adult, racy, or gory imagery? Requirements like these often lie beyond the capabilities of traditional Sensitive Information Types (SITs). This is precisely where Trainable Classifiers come into play.
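The idea of learning from samples rather than from fixed rules can be sketched with a toy text classifier. The Python example below is a small Naive Bayes model trained on a handful of made-up sentences; it is not the model Purview uses, and the class name `TinyTextClassifier`, the labels, and the sample texts are all assumptions for illustration.

```python
import math
import re
from collections import Counter

LABELS = ("match", "not_a_match")


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class TinyTextClassifier:
    """Toy multinomial Naive Bayes; illustrative only, not Purview's model."""

    def __init__(self) -> None:
        self.word_counts = {label: Counter() for label in LABELS}
        self.doc_counts = Counter()

    def train(self, text: str, label: str) -> None:
        """Learn from one labelled sample."""
        self.word_counts[label].update(tokenize(text))
        self.doc_counts[label] += 1

    def classify(self, text: str) -> str:
        """Return the label with the higher log-probability score."""
        vocab = set().union(*self.word_counts.values())
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label in LABELS:
            counts = self.word_counts[label]
            # Smoothed prior, so an untrained label does not cause log(0).
            score = math.log((self.doc_counts[label] + 1) / (total_docs + len(LABELS)))
            # Add-one smoothing, with one extra slot for unseen words.
            denom = sum(counts.values()) + len(vocab) + 1
            for word in tokenize(text):
                score += math.log((counts[word] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


if __name__ == "__main__":
    clf = TinyTextClassifier()
    # Made-up positive samples: customer complaints.
    clf.train("I am very unhappy with the late delivery and want a refund", "match")
    clf.train("This product is broken and your support was terrible", "match")
    clf.train("Terrible service, I demand a refund immediately", "match")
    # Made-up negative samples: ordinary business mail.
    clf.train("Thank you for the quick delivery, the product works great", "not_a_match")
    clf.train("Please find attached the agenda for the meeting on Monday", "not_a_match")
    clf.train("The quarterly report is ready for your review", "not_a_match")
    print(clf.classify("The delivery was late and I want a refund"))
```

No keyword list or pattern was written by hand here: the behaviour comes entirely from the labelled samples, which is why the quality and variety of those samples matters so much.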

Within MS Purview, you’ll discover a robust library of approximately 60 pre-trained classifiers, and the flexibility to craft your own custom classifiers. While the majority of these classifiers are designed for English-language content, a substantial and continually expanding subset accommodates multiple languages, with Microsoft promising further language support enhancements in the future. These classifiers are subject to automatic updates and refinements by Microsoft, ensuring ongoing precision and performance. Beyond the categories previously highlighted, you’ll also find classifiers tailored for various specific purposes, such as Customer Complaints, Freight Documents, Financial Statements, Legal Agreements, and many more.

How do we create our own custom Trainable Classifier?

Once AI has done most of the hard work, use the “Test + review” cycle to test and improve the accuracy of the classifier before you publish it and start using it. You can view the number of matches a trainable classifier has in Content Explorer and under Trainable classifiers. You can also indicate whether an item really is a match by using the Match or Not a Match feedback mechanism, and then use that feedback to fine-tune your classifiers.
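One simple way to reason about the review cycle is to track how many of the classifier's matches reviewers confirm. The Python sketch below does that for a list of feedback entries. It does not call any Purview API; the `Feedback` structure, the function names, and the 0.9 threshold are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Feedback:
    item_id: str    # hypothetical identifier of a reviewed item
    is_match: bool  # True = reviewer chose "Match", False = "Not a Match"


def precision_from_feedback(feedback: list[Feedback]) -> float:
    """Share of the classifier's matches that reviewers confirmed."""
    if not feedback:
        raise ValueError("no feedback recorded yet")
    confirmed = sum(1 for f in feedback if f.is_match)
    return confirmed / len(feedback)


def ready_to_publish(feedback: list[Feedback], threshold: float = 0.9) -> bool:
    """Compare the confirmed share against an assumed quality threshold."""
    return precision_from_feedback(feedback) >= threshold


if __name__ == "__main__":
    reviewed = [Feedback(f"doc-{i}", is_match=(i != 3)) for i in range(10)]
    print(precision_from_feedback(reviewed), ready_to_publish(reviewed))
```

The threshold is a judgment call for each organization: a classifier that will block content probably deserves a stricter bar than one that only raises a notification.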

Practical use of Trainable Classifiers

Classifiers are available to use as a condition for:

Office auto-labelling with sensitivity labels
Auto-apply retention label policy based on a condition
Communication compliance to detect regulatory compliance and business conduct violations
Sensitivity labels can use classifiers as conditions
Data loss prevention

Does your company need support to design and launch compliance initiatives? Contact us to talk to one of our experts or request a demo. We are excited to show you Infotechtion’s teamwork and information governance solutions!
