When configuring Microsoft Purview Data Loss Prevention (DLP) policies or Microsoft Purview Information Protection auto-labelling policies, leveraging Sensitive Information Types (SITs) is crucial.
SITs play a pivotal role in identifying sensitive data by recognising specific patterns within text. These patterns are supported by evidence such as keywords, character proximity, and confidence levels.
When considering the need for a custom Sensitive Information Type that focuses on exact or nearly exact data values, rather than generic patterns, Microsoft Purview offers a powerful solution: Exact Data Match (EDM) based classification.
Here are the key benefits of creating a custom SIT using EDM:
- Dynamic and Easily Refreshed: The sensitive information source table behind an EDM-based SIT can be refreshed as the underlying data changes.
- Reduced False Positives: By pinpointing exact or highly similar data values, EDM minimises false positives.
- Structured Data Compatibility: EDM works with structured sensitive data, such as specific formats, codes, or identifiers.
- Enhanced Security: The sensitive values are hashed before upload, so they are not shared in clear text with anyone, including Microsoft.
- Integration with Microsoft Cloud Services: EDM-based custom SITs can be utilised across various Microsoft cloud services, providing consistent protection and compliance.
For instance, Exact Data Match can identify an exact match of a customer’s credit card number, rather than relying solely on generic patterns. This precision enhances detection accuracy and significantly reduces false positives.
The following diagram shows the fundamental workings of EDM classification.
What's different in an EDM SIT?
When working with EDM SITs, gaining familiarity with a few unique concepts is beneficial:
- Schema
  - An XML file that defines the name of the schema, the field names in your sensitive information source table, and which of those fields are searchable.
- Sensitive information source table
  - Contains the values that the EDM SIT looks for. The table is made up of columns and rows. The column headers are the field names, and each row is one instance of an item.
- Rule package
  - Defines the various components of your EDM SIT.
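To make the schema idea concrete, here is a minimal sketch in Python that writes a schema-like XML document. The namespace, element names and attribute names follow the general shape of Microsoft's published EDM schema samples as best recalled, so treat them as assumptions and check them against the current documentation. The data store name and field names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: namespace and element names mirror Microsoft's published EDM
# schema samples; verify against the current documentation before use.
EDM_NS = "http://schemas.microsoft.com/office/2018/edm"


def build_schema_xml(datastore_name, fields):
    """Return schema XML text for a list of (field_name, searchable) pairs."""
    ET.register_namespace("", EDM_NS)  # serialise EDM_NS as the default namespace
    root = ET.Element(f"{{{EDM_NS}}}EdmSchema")
    store = ET.SubElement(
        root,
        f"{{{EDM_NS}}}DataStore",
        {"name": datastore_name, "description": "Illustrative schema", "version": "1"},
    )
    for field_name, searchable in fields:
        ET.SubElement(
            store,
            f"{{{EDM_NS}}}Field",
            {"name": field_name, "searchable": "true" if searchable else "false"},
        )
    return ET.tostring(root, encoding="unicode")


schema_xml = build_schema_xml(
    "CustomerRecords",  # hypothetical data store name
    [("SSN", True), ("FirstName", False), ("LastName", False), ("DoB", False)],
)
print(schema_xml)
```

In this sketch the searchable field plays the role of the primary element, and the non-searchable fields act as supporting evidence.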
You supply your own schema and data
Microsoft Purview includes a range of built-in Sensitive Information Types. With EDM SITs, however, you are responsible for defining the schema and for specifying the primary and secondary fields that identify sensitive items. Notably, the source data is hashed and encrypted before it leaves your environment, so only hashed values are uploaded to the service and the clear-text values stay with you.
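The idea of uploading only hashed values can be illustrated with a short Python sketch. This is not the algorithm the EDM tooling actually uses; the salted SHA-256 scheme, the normalisation step and the function names below are all assumptions chosen for illustration.

```python
import hashlib
import secrets


def hash_value(value, salt):
    """Return a salted SHA-256 hex digest of a normalised cell value."""
    normalised = value.strip().lower()
    return hashlib.sha256(salt + normalised.encode("utf-8")).hexdigest()


def hash_table(rows, salt):
    """Hash every cell so that no clear-text value needs to leave the machine."""
    return [
        {field: hash_value(cell, salt) for field, cell in row.items()}
        for row in rows
    ]


salt = secrets.token_bytes(16)  # hypothetical per-upload salt
rows = [
    {"SSN": "987-65-4320", "FirstName": "Isaiah", "LastName": "Langer", "DoB": "05-05-1960"},
]
hashed_rows = hash_table(rows, salt)
print(hashed_rows[0]["SSN"])  # a 64-character hex digest, not the SSN itself
```

Because the same salt and normalisation give the same digest, a service holding only the hashes can still compare them with hashes of values found in content, without ever seeing the original table.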
How matching works
Consider an example where you need to detect U.S. Social Security numbers (SSNs). To enhance match confidence, your supporting elements include first name, last name, and date of birth (DoB).
Your source table might resemble the following:
| SSN | FirstName | LastName | DoB |
| --- | --- | --- | --- |
| 987-65-4320 | Isaiah | Langer | 05-05-1960 |
| 078-05-1120 | Ana | Bowman | 11-24-1971 |
When searching for matching supporting elements in a protected file, your EDM SIT examines each supporting element individually and in combination once the primary element is identified.
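The matching flow described here, primary element first and supporting elements second, can be sketched roughly as follows. The regular expression, the proximity window and the function name are assumptions made for illustration, and the sketch compares clear-text values for readability, whereas the service works on hashes.

```python
import re

# Assumption: a simple pattern for the primary element (SSN-shaped strings).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

SOURCE_TABLE = [
    {"SSN": "987-65-4320", "FirstName": "Isaiah", "LastName": "Langer", "DoB": "05-05-1960"},
    {"SSN": "078-05-1120", "FirstName": "Ana", "LastName": "Bowman", "DoB": "11-24-1971"},
]
INDEX = {row["SSN"]: row for row in SOURCE_TABLE}

WINDOW = 300  # hypothetical proximity window, in characters


def find_matches(text):
    """Find primary elements present in the table, then check supporting fields nearby."""
    results = []
    for match in SSN_PATTERN.finditer(text):
        row = INDEX.get(match.group())
        if row is None:
            continue  # SSN-shaped, but not an exact value from the source table
        nearby = text[max(0, match.start() - WINDOW):match.end() + WINDOW]
        supporting = [
            field
            for field, value in row.items()
            if field != "SSN"
            and re.search(r"\b" + re.escape(value) + r"\b", nearby, re.IGNORECASE)
        ]
        results.append({"primary": match.group(), "supporting": supporting})
    return results


sample = (
    "Employee Isaiah Langer (born 05-05-1960) has SSN 987-65-4320. "
    "Test value: 123-45-6789."
)
for result in find_matches(sample):
    print(result["primary"], result["supporting"])
```

The number of supporting fields found near the primary element is what a rule could use to assign a higher or lower confidence level. The second number in the sample is ignored because it is not in the source table, which is the behaviour that keeps false positives down.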
When a field contains multi-word values, for example an Address field with values like 1 Microsoft Way, Redmond, WA or 123 Main Street, New York, NY, select multi-token matching as the match option for that field. A source table with such fields might look like this:
| SSN | Name | Street Address |
| --- | --- | --- |
| 987-65-4320 | Isaiah Langer | 1432 Lincoln Road |
| 078-05-1120 | Ana Bowman | 8250 First Street |
With multi-token matching, the words found in the content are compared with the Name and Street Address fields both individually and in combination, so a multi-word value such as 1432 Lincoln Road can match the field as a whole.
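As a rough illustration of comparing several words against one field, the Python sketch below checks whether all tokens of a field value appear consecutively in the content. The tokenisation rule and the function names are assumptions; the article does not describe how the service tokenises text internally.

```python
import re


def tokens(text):
    """Split text into lower-case alphanumeric tokens (assumed tokenisation)."""
    return re.findall(r"[A-Za-z0-9]+", text.lower())


def multi_token_match(field_value, content):
    """Return True if the field's tokens appear consecutively in the content."""
    wanted = tokens(field_value)
    found = tokens(content)
    size = len(wanted)
    return size > 0 and any(
        found[i:i + size] == wanted for i in range(len(found) - size + 1)
    )


content = "Please ship the order to 1432 Lincoln Road for Isaiah Langer."
print(multi_token_match("1432 Lincoln Road", content))
print(multi_token_match("8250 First Street", content))
```

A single-word comparison would treat 1432, Lincoln and Road as three unrelated strings. Checking the combined sequence is what lets the whole address count as one supporting element.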
Where used
Services that support EDM include Microsoft Purview Data Loss Prevention, Microsoft Defender for Cloud Apps, auto-labelling (service and client side), eDiscovery, and Insider Risk Management.
Leverage EDM as a valuable tool in achieving GDPR compliance
By integrating EDM into their compliance framework, organisations can strengthen their data governance strategies, mitigate risks, and foster trust with stakeholders. EDM helps organisations handle sensitive data accurately and securely, which demonstrates a proactive commitment to safeguarding individual privacy rights under GDPR.
Feel free to contact us at contact@infotechtion.com if you need any help configuring similar scenarios.