Enhancing Data Security: The Advantages of Microsoft Purview Exact Data Match

Luc Marolt


When configuring Microsoft Purview Data Loss Prevention (DLP) policies or Microsoft Purview Information Protection auto-labelling policies, leveraging Sensitive Information Types (SITs) is crucial.

SITs play a pivotal role in identifying sensitive data by recognising specific patterns within text. These patterns are supported by evidence such as keywords, character proximity, and confidence levels.

When considering the need for a custom Sensitive Information Type that focuses on exact or nearly exact data values, rather than generic patterns, Microsoft Purview offers a powerful solution: Exact Data Match (EDM) based classification.

Here are the key benefits of creating a custom SIT using EDM:

Dynamic and Easily Refreshed: EDM-based SITs can adapt to changes in data over time.

Reduced False Positives: By pinpointing exact or highly similar data values, EDM minimises false positives.

Structured Data Compatibility: EDM works seamlessly with structured sensitive data, such as specific formats, codes, or identifiers.

Enhanced Security: Unlike some other solutions, EDM doesn’t share sensitive information with anyone, including Microsoft.

Integration with Microsoft Cloud Services: EDM-based custom SITs can be utilised across various Microsoft cloud services, providing consistent protection and compliance.

For instance, Exact Data Match can identify an exact match of a customer’s credit card number, rather than relying solely on generic patterns. This precision enhances detection accuracy and significantly reduces false positives.

The following diagram shows the fundamental workings of EDM classification

What's different in an EDM SIT?

When working with EDM SITs, gaining familiarity with a few unique concepts is beneficial:

  • Schema
    • An XML file that names the schema, lists the field names used in your sensitive information source table, and marks which of those fields are searchable.
  • Sensitive information source table
    • Contains the values that the EDM SIT looks for. The table is made up of columns and rows: the column headers are the field names, and each row is one instance of an item.
  • Rule package
    • Defines the various components of your EDM SIT.
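
To make the relationship between the schema and the source table concrete, here is a minimal Python sketch. It is not Purview's actual file format or API: it simply assumes that a schema is a named list of fields, some of which are flagged as searchable, and that every schema field should have a matching column header in the source table. The names EdmField, EdmSchema and validate_source_table are hypothetical.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class EdmField:
    """One schema field (hypothetical model, not the real XML format)."""
    name: str
    searchable: bool = False  # assumption: searchable fields can act as primary elements


@dataclass(frozen=True)
class EdmSchema:
    """A named collection of fields."""
    name: str
    fields: tuple


def validate_source_table(schema, csv_text):
    """Return the schema field names that have no matching column header."""
    reader = csv.reader(io.StringIO(csv_text))
    headers = next(reader)
    return [field.name for field in schema.fields if field.name not in headers]


schema = EdmSchema(
    name="CustomerRecords",  # illustrative schema name
    fields=(
        EdmField("SSN", searchable=True),
        EdmField("FirstName"),
        EdmField("LastName"),
        EdmField("DoB"),
    ),
)

table = "SSN,FirstName,LastName,DoB\n987-65-4320,Isaiah,Langer,05-05-1960\n"
missing = validate_source_table(schema, table)
print(missing)
```

A check like this is useful before any upload step, because a schema field without a corresponding column cannot be matched against anything.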

You supply your own schema and data

Microsoft Purview includes a range of built-in Sensitive Information Types. With EDM SITs, however, you are responsible for defining the schema and for specifying the primary and secondary fields that identify sensitive items. Notably, the data is hashed and encrypted before it is uploaded, so only hashed values, never the plaintext, reach the service.
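
The idea that only hashed values leave your environment can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the algorithm Microsoft's upload tooling uses: salted SHA-256 stands in for whatever hashing the service applies, and hash_value and hash_table are hypothetical helper names.

```python
import hashlib
import secrets


def hash_value(value, salt):
    """Return a salted SHA-256 hex digest of a normalised cell value."""
    normalised = value.strip().lower()
    return hashlib.sha256(salt + normalised.encode("utf-8")).hexdigest()


def hash_table(rows, salt):
    """Hash every cell so that no plaintext value remains in the output."""
    return [
        {column: hash_value(value, salt) for column, value in row.items()}
        for row in rows
    ]


salt = secrets.token_bytes(16)  # random salt generated for this example only
rows = [
    {
        "SSN": "987-65-4320",
        "FirstName": "Isaiah",
        "LastName": "Langer",
        "DoB": "05-05-1960",
    },
]

hashed_rows = hash_table(rows, salt)
print(hashed_rows[0]["SSN"])  # a 64-character hex digest, not the SSN itself
```

Because the same salt and normalisation produce the same digest, a value found in scanned content can be hashed the same way and compared against the table without the plaintext table ever being stored.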

How matching works

Consider an example where you need to detect U.S. Social Security numbers (SSNs). To enhance match confidence, your supporting elements include first name, last name, and date of birth (DoB).

Your source table might resemble the following:

SSN          FirstName  LastName  DoB
987-65-4320  Isaiah     Langer    05-05-1960
078-05-1120  Ana        Bowman    11-24-1971

Once the primary element (here, the SSN) has been identified in a protected file, your EDM SIT searches for the supporting elements, examining each one individually and in combination.
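
The primary-then-supporting order of evaluation can be sketched as follows. This is a simplified illustration, not how the service is implemented: it compares plaintext for readability (the real service works with hashed values), the 300-character window is an assumed value, and SOURCE_TABLE and find_matches are hypothetical names.

```python
import re

# Candidate primary elements: anything shaped like an SSN.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Source table keyed by the primary element (plaintext for illustration only).
SOURCE_TABLE = {
    "987-65-4320": {"FirstName": "Isaiah", "LastName": "Langer", "DoB": "05-05-1960"},
    "078-05-1120": {"FirstName": "Ana", "LastName": "Bowman", "DoB": "11-24-1971"},
}


def find_matches(text, window=300):
    """Find known SSNs, then list the supporting fields found near each one."""
    results = []
    for match in SSN_PATTERN.finditer(text):
        record = SOURCE_TABLE.get(match.group())
        if record is None:
            continue  # shaped like an SSN, but not a value in the source table
        start = max(0, match.start() - window)
        nearby = text[start:match.end() + window].lower()
        # Simple substring check; a real matcher would compare whole tokens.
        supporting = [
            field for field, value in record.items() if value.lower() in nearby
        ]
        results.append((match.group(), supporting))
    return results


sample = "Customer Isaiah Langer, born 05-05-1960, SSN 987-65-4320. Ref 123-45-6789."
matches = find_matches(sample)
print(matches)
```

In this sketch the number of supporting fields found could be used to choose a confidence level: an SSN on its own would be a weaker match than an SSN accompanied by the matching name and date of birth. The second number in the sample is ignored because it is not in the source table, which is the behaviour that keeps false positives down.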

When a field holds multi-word values, for example an Address field containing values like 1 Microsoft Way, Redmond, WA or 123 Main Street, New York, NY, select multi-token matching as the match option for that field.

SSN          Name           Street Address
987-65-4320  Isaiah Langer  1432 Lincoln Road
078-05-1120  Ana Bowman     8250 First Street

With multi-token matching, the Name and Street Address fields are matched both as independent supporting element strings and in combination as individual fields.
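
One way to picture why multi-word fields need a different match option is the sketch below. It rests on an assumption about the general technique rather than on documented service internals: single-token matching is modelled as comparing a value against one word at a time, and multi-token matching as comparing it against runs of consecutive words. The function names are hypothetical.

```python
def word_ngrams(text, n):
    """Return every run of n consecutive words in the text, lower-cased."""
    words = text.lower().split()
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def normalise(value):
    """Lower-case a value and collapse its whitespace."""
    return " ".join(value.lower().split())


def single_token_match(value, text):
    """Match only when the whole value equals one single word of the text."""
    return normalise(value) in word_ngrams(text, 1)


def multi_token_match(value, text):
    """Match when the words of the value appear consecutively in the text."""
    n = len(value.split())
    return normalise(value) in word_ngrams(text, n)


content = "Ship to Isaiah Langer at 1432 Lincoln Road today"
print(single_token_match("1432 Lincoln Road", content))
print(multi_token_match("1432 Lincoln Road", content))
```

Under these assumptions the three-word street address can never equal a single word, so only the multi-token comparison finds it.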

Where used

EDM-based SITs can be used in Microsoft Purview Data Loss Prevention, Microsoft Defender for Cloud Apps, auto-labelling (service and client side), eDiscovery, and Insider Risk Management.

Leverage EDM as a valuable tool in achieving GDPR compliance

By integrating EDM into their compliance framework, organisations can strengthen their data governance strategies, mitigate risks, and foster trust with stakeholders. EDM helps organisations handle sensitive data accurately and securely, which in turn demonstrates a proactive commitment to safeguarding individual privacy rights under GDPR.

Feel free to contact us at contact@infotechtion.com if you need any help configuring similar scenarios.

 © 2024 Infotechtion. All rights reserved 

