Custom sensitive information type features - Microsoft SC-400 Certification
The custom sensitive information type’s special features and their use cases are detailed as follows:
- Document fingerprinting: This converts a standard form into a sensitive information type. You can use it on a government form, an employee information form, or a patent template. Ideally, your business already has an established practice of using specific forms to send and receive sensitive information. In that case, upload an empty form, convert it into a document fingerprint, and then set up a corresponding policy so that any shared documents that match the fingerprint are detected.
- Keyword dictionaries: These let you manage large, reusable keyword lists when matching against large amounts of business data. A dictionary supports up to 1 MB of keywords in any language. For example, to identify inappropriate or explicit language, you can use a keyword dictionary to detect specific words and take the required action on them, such as enforcing organization guidelines.
- Exact Data Match (EDM)-based classification: This enables you to create custom sensitive information types that refer to exact values in a database of sensitive data. The database can be refreshed daily and can contain up to 100 million rows of information. EDM is best suited to businesses that are required to store large amounts of personal data, such as hospitals, which can use EDM-based classification to ensure that no personal information is being shared.
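To make the idea of exact data matching more concrete, here is a minimal Python sketch. It is only a conceptual illustration and not how the Microsoft service is implemented: the salted SHA-256 hashing, the function names (`hash_value`, `build_index`, `find_exact_matches`), the tokenizing rule, and the patient table are all assumptions made for this example.

```python
import hashlib
import re


def hash_value(value: str, salt: str) -> str:
    """Return a salted SHA-256 digest of a normalized value (illustrative choice)."""
    normalized = value.strip().lower()
    return hashlib.sha256((salt + normalized).encode("utf-8")).hexdigest()


def build_index(rows, column, salt):
    """Hash one column of the sensitive table so raw values need not be kept."""
    return {hash_value(row[column], salt) for row in rows}


def find_exact_matches(text, index, salt):
    """Return the tokens in the text whose hash appears in the index."""
    tokens = re.findall(r"[A-Za-z0-9-]+", text)
    return [token for token in tokens if hash_value(token, salt) in index]


# Hypothetical sensitive table; a real one could hold millions of rows.
patients = [
    {"patient_id": "PT-100234"},
    {"patient_id": "PT-100871"},
]

SALT = "example-salt"  # placeholder value for this sketch
index = build_index(patients, "patient_id", SALT)

message = "Please review chart PT-100234 and form PT-999999."
matches = find_exact_matches(message, index, SALT)
print(matches)
```

Only `PT-100234` is reported, because it is an exact value from the table. `PT-999999` has the same format but is not in the table, which is the difference between exact matching and a purely pattern-based sensitive information type.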
You should now understand how to access the Microsoft 365 compliance center, the components that make up a sensitive information type and the information they store, as well as the key features of custom sensitive information types. With this knowledge, you are ready to create a custom sensitive information type, which we are going to cover in the next part of this chapter.
Creating and managing custom sensitive information types
Protecting stored employee IDs, cost center numbers, and other human resources (HR) and finance-specific data is a common usage scenario for custom sensitive information types. The recommended way to make a new custom sensitive information type is to start from a built-in sensitive information type and modify its rules. Once you have completed your customization, you can upload it with a new name.
We will now go through the steps required to create a new, fully defined sensitive information type:
1. From the compliance center, navigate to Data classification and then Sensitive info types. At this point, select Create sensitive info type, as shown here:
Figure 3.3 – Create info types
2. Fill in the values for Name and Description and click Next:
Figure 3.4 – Name and Description
3. On the next screen, click on Create pattern, as shown in the following screenshot. You can create multiple patterns, each with different elements and confidence levels, which together define the new sensitive information type:
Figure 3.5 – Creating a pattern for the sensitive info type
4. The next option will be to select the default Confidence level setting, as shown in the following screenshot. The available options are low, medium, and high:
Figure 3.6 – Choosing a confidence level
5. Next, you must define the primary element. You can set it to a regular expression with an optional validator, Keyword list, Keyword dictionary, or a pre-constructed function, as follows:
Figure 3.7 – Primary element
6. Enter a value for Character proximity, as shown here:
Figure 3.8 – Character proximity
7. If you have any supporting elements, you can add them here.
8. In the final box at the bottom, add any additional checks you want to include from the available options, including Exclude specific matches, Start or Doesn’t start with characters, and Exclude duplicate characters, as well as the other options in the list:
Figure 3.9 – Additional checks
9. Click on Create:
Figure 3.10 – Create button
10. Click on Next:
11. You will now need to set Choose the recommended confidence level to show in compliance properties, as shown in the following screenshot. Here, you have the following choices:
• High confidence level: Matched items will contain the fewest false positives but the most false negatives.
• Medium confidence level: Matched items will contain a moderate number of false positives and false negatives.
• Low confidence level: Matched items will contain the fewest false negatives but the most false positives:
Figure 3.12 – Confidence level
12. Check that all the settings are correct and choose Submit.
13. You may need to click Refresh on the Data classification page before the custom sensitive information type you have just created appears.
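The settings chosen in the preceding steps (primary element, character proximity, supporting elements, additional checks, and confidence level) can be easier to reason about with a small model. The following Python sketch is a simplified illustration under stated assumptions, not the compliance center's actual matching engine: the `Pattern` class, the `EMP` plus six digits employee ID format, the supporting keywords, and the reading of Exclude duplicate characters as "skip values whose digits are all identical" are hypothetical choices for this example.

```python
import re
from dataclasses import dataclass

CONFIDENCE_RANK = {"low": 1, "medium": 2, "high": 3}


@dataclass(frozen=True)
class Pattern:
    confidence: str                 # "low", "medium", or "high"
    primary_regex: str              # primary element
    supporting_keywords: tuple = () # supporting elements (optional)
    proximity: int = 300            # character proximity window


def all_digits_identical(value: str) -> bool:
    """Simplified 'exclude duplicate characters' check, e.g. EMP111111."""
    digits = re.sub(r"\D", "", value)
    return len(digits) > 1 and len(set(digits)) == 1


def match_pattern(text: str, pattern: Pattern):
    """Return (position, value) for each primary match that passes the checks."""
    hits = []
    for m in re.finditer(pattern.primary_regex, text):
        if all_digits_identical(m.group(0)):
            continue
        if pattern.supporting_keywords:
            start = max(0, m.start() - pattern.proximity)
            window = text[start:m.end() + pattern.proximity].lower()
            if not any(k.lower() in window for k in pattern.supporting_keywords):
                continue
        hits.append((m.start(), m.group(0)))
    return hits


def classify(text: str, patterns):
    """Keep the highest-confidence pattern that matched each value."""
    best = {}
    for pattern in patterns:
        for pos, value in match_pattern(text, pattern):
            current = best.get(pos)
            if current is None or (
                CONFIDENCE_RANK[pattern.confidence] > CONFIDENCE_RANK[current[1]]
            ):
                best[pos] = (value, pattern.confidence)
    return [best[pos] for pos in sorted(best)]


EMPLOYEE_ID = r"\bEMP\d{6}\b"  # hypothetical ID format
patterns = [
    Pattern("high", EMPLOYEE_ID, ("employee id", "staff number"), proximity=40),
    Pattern("low", EMPLOYEE_ID),
]

sample = (
    "Employee ID: EMP482913 was updated. "
    + "x" * 80
    + " Ref EMP731055, test EMP111111."
)
results = classify(sample, patterns)
print(results)
```

In this sample, `EMP482913` is reported with high confidence because a supporting keyword appears within 40 characters of it, `EMP731055` falls back to the low-confidence pattern because no keyword is nearby, and `EMP111111` is dropped by the duplicate-character check. This mirrors the trade-off described in step 11: the high-confidence pattern is stricter and misses more, while the low-confidence pattern catches more but is more likely to flag harmless text.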
Once you have created the sensitive information type, you can test, modify, and remove it later if you so wish. The following steps will explain how to test a sensitive information type and how to modify a custom sensitive information type in the compliance center.