How to Add a Custom Data Type

This article describes how to add a custom data type, which only users with Administrator rights can create.

To add a custom Data Type, use the following steps:

  1. From the left-side navigation menu select Settings > Global Data Types.
  2. The Data Types page opens.
  3. Select the CUSTOM DATA TYPES tab at the top of the page.
  4. From the top right corner of the page click the blue Actions button.
  5. Click "Add Custom Data Type." (Admins only)

  6. In the Add Custom Data Type pop-up window, select a Data Type from the drop-down list.

  7. Follow the procedure below for the Data Type you selected.

Keyword Data Type

Procedure:

  1. Name: Type the name of your Data Type.

  2. Keyword: Type the keyword you want to use.

  3. Data Type Value: Use the numeric up-down control to set a value.
    Note: Data Type Value is the number you assign to the Data Type to give it weight.
    1. For example:
      1. Assign a Data Type Value of '1'.
      2. Run the scan and find five instances of that value.
      3. The sum total, or weight, is '5'.
      4. This enables you to add emphasis to certain Data Types to gain insight into your data.
  4. Click Save & Update to save the new Data Type or Cancel to discard.
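
The weighting arithmetic in the example above can be sketched in a few lines of Python. This is only an illustration of the calculation as described (assigned value multiplied by the number of instances found); the function name `total_weight` is hypothetical and is not part of the product.

```python
def total_weight(data_type_value: int, instances_found: int) -> int:
    """Weight contributed by one Data Type: its assigned value times its match count."""
    return data_type_value * instances_found


# A Data Type Value of 1 with five instances found gives a weight of 5.
print(total_weight(1, 5))  # 5

# Raising the value to 3 gives the same five matches three times the emphasis.
print(total_weight(3, 5))  # 15
```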

Regular Expression Data Type

Procedure:

  1. Name: Enter the name of your Data Type.

  2. Regular Expression: Enter the expression in the text box.

  3. Data Type Value: Use the numeric up-down control to set a value.
  4. Click Save & Update to save the new Data Type or Cancel to discard.
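
Before saving, it can help to try an expression against sample text. The sketch below uses Python's `re` module and a hypothetical employee ID pattern (`EMP-` followed by six digits). This article does not state which regular expression dialect the product uses, and it may differ from Python's, so treat this as a rough check only.

```python
import re

# Hypothetical pattern for illustration: "EMP-" followed by exactly six digits.
EMPLOYEE_ID = re.compile(r"\bEMP-\d{6}\b")


def find_matches(text: str) -> list[str]:
    """Return every non-overlapping match of the pattern in the text."""
    return EMPLOYEE_ID.findall(text)


sample = "Badges EMP-004211 and EMP-118730 were issued; EMP-12 is invalid."
print(find_matches(sample))  # ['EMP-004211', 'EMP-118730']
```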

Dictionary Data Type

Procedure:

  1. Click the folded paper icon to upload a Dictionary file (.dic) from your local computer.

  2. Name: This field is filled automatically from the file.

  3. Data Type Value: Use the numeric up-down control to set a value.

  4. Click Save & Update to save the new Data Type or Cancel to discard.
Note: Dictionary files must be in the following format and saved with a file extension of type .dic.
[Header]
Name="Dictionary_Name"

[Words]
Word_1
Word_2
Word_3
Word_4

For example:
[Header]
Name="My New Dictionary"
[Words]
Black
Blue
Canary
Indigo
Magenta
Orange
Red
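
If you keep word lists elsewhere, a short script can produce text in the layout shown above. This is a minimal sketch: the function name `build_dictionary_text` is hypothetical, and the line endings and UTF-8 encoding are assumptions, since this article does not specify them.

```python
def build_dictionary_text(name: str, words: list[str]) -> str:
    """Return .dic contents: a [Header] section with the name, then a [Words] section."""
    lines = ["[Header]", f'Name="{name}"', "", "[Words]", *words]
    return "\n".join(lines) + "\n"


text = build_dictionary_text("My New Dictionary", ["Black", "Blue", "Canary"])
print(text)

# To save it, write the text to a file whose name ends in .dic, for example:
# pathlib.Path("my_new_dictionary.dic").write_text(text, encoding="utf-8")
```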

Sensitive Data Definition (SDD) Data Type

Procedure:

  1. Name: Enter the name of your Data Type.

  2. In the Definition section, fill in the following:
    1. Left criteria drop-down list:
      1. Click the more options menu.

  3. In the Select Sensitive Data Types pop-up window, click a tile with a right arrow to move it to the available data types column.
    1. For example: Social Security number

  4. To remove an available Sensitive Data Type, click a tile with a left arrow.

  5. Click OK to save or Cancel to discard.

  6. Center criteria drop-down list: Select an option from the drop-down list.

  7. Right criteria: Click the more options menu and select an option from the drop-down list.

    1. For example: Near with Distance | Bank Account Number | 2.

    2. For example: Less Than | 100 | Unique Matches.

Sensitive Data Definitions

A Sensitive Data Definition generally consists of three fields. In some cases, depending on the operator selected, a fourth field is displayed. These fields are described in the table below:

Field

Description

Sensitive Data Type

The Sensitive Data Type specifies the type of data that you are filtering for.

  • "Social Security Number" and "Password" are examples of Data Types.
  • Any sensitive data types that you have created also display here as available selections.

Operator

The Operator defines how results are found based upon the Data Type.

  • "Equals" and "Require" are examples of operators.
  • See "Operator Types," just below.

Value field

The Value Field specifies the value used to qualify the data.

  • This can range from a secondary Sensitive Data Type to a predetermined list of values from which you can select.
  • In some cases both a secondary data type and a predetermined list of values display together, depending upon the Operator that you selected.
    - Examples of this are the operators "Near with Distance" and "Far with Distance."

Operator Types

Filters are used to capture instances of sensitive data types (such as Social Security and Credit Card numbers) based on rules/conditions set by you, the user. For example, you may wish to capture social security numbers, but only those which come after a user's email address, or within 50 characters of the user's address.

To create a filter for a sensitive data definition, you first select a Sensitive Data Type and then you select an operator. There are 22 types of operators available:

Operator

Description

Value

Near

Tests whether sensitive data types are within 50 characters of each other. This distance includes blank spaces.

  • The value is the maximum allowable number of characters between sensitive data types.
  • To manually specify the distance between characters, use the Near with Distance operator.
  • The Near operator applies the same distance value to every filter that uses it, so you do not need to specify the distance each time you use it in a sensitive data definition.

Near Example

  • If an SSN has a match in a file (or email) at location 3 with length 9, it ends at character 12.
  • A Credit Card number has a match in the same file (or email) at location 55 (with length 16).
  • 55-12 = 43. The sensitive data matches are 43 characters apart.
  • Since the distance between the matches is 50 characters or less, this rule is satisfied.
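
The distance calculation used in these examples can be sketched as follows. This is an illustration only: the `Match` class and the `distance` and `is_near` functions are hypothetical names, and the sketch assumes, as the examples do, that a match ends at its location plus its length and that distance is measured from the end of the earlier match to the start of the later one.

```python
from dataclasses import dataclass


@dataclass
class Match:
    location: int  # character position where the match starts
    length: int    # number of characters in the match

    @property
    def end(self) -> int:
        return self.location + self.length


def distance(a: Match, b: Match) -> int:
    """Characters from the end of the earlier match to the start of the later one."""
    first, second = sorted((a, b), key=lambda m: m.location)
    return second.location - first.end


def is_near(a: Match, b: Match, max_distance: int = 50) -> bool:
    """Near uses the 50-character default; Near with Distance passes its own maximum."""
    return distance(a, b) <= max_distance


ssn = Match(location=3, length=9)    # ends at character 12
ccn = Match(location=55, length=16)

print(distance(ssn, ccn))                  # 43
print(is_near(ssn, ccn))                   # True
print(is_near(ssn, ccn, max_distance=25))  # False
```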


Near with Distance

Tests whether sensitive data types are within a specified distance of each other. This distance includes blank spaces.

  • The distance is specified in the definition itself and specifies the maximum allowable distance between sensitive data types.
  • When you select Near with Distance, a value field is presented allowing you to set the specific distance.
  • If you do not want to specify the distance in the definition itself, use the Near operator, which specifies 50 characters between data types.

Near with Distance Example

  • If an SSN has a match in a file at location 3, with length 9, it ends at character 12.
  • A Credit Card number has a match in the same file at location 20 (with length 16).
  • 20-12 = 8. The sensitive data matches are 8 characters apart.
  • If the distance value set is 8 or higher, the sensitive data matches discovered satisfy this rule, because the value is the maximum allowable distance between the matches.

Enter the distance (in characters). This is the maximum number of characters allowed between data types.

Near Before

Tests whether specified sensitive data types are within 50 characters of each other, with the requirement that the first sensitive data type (A) appears before the second specified data type (B). This distance includes blank spaces.

  • The value (50) is the maximum allowable number of characters between sensitive data types.
  • To specify the distance yourself, select "Near Before with Distance." A value field appears in which you specify the maximum allowable distance between data types.

Near Before Example

  • If a Social Security number appears in a file first, at location '3' with length 9, it ends at character 12.
  • A Credit Card number has a match in the same file at location 60 (with length 16).
  • 60-12 = 48. The sensitive data matches are 48 characters apart, and the first data type - SSN - appears before the second data type - CCN. This rule is satisfied.


Near Before with Distance

Tests whether sensitive data types are within a specified distance of each other, with the requirement that the first sensitive data type (A) appears before the second specified data type (B). This distance includes blank spaces.

  • The distance between data types is specified in the definition itself and specifies the maximum allowable distance between data types.
  • When you select "Near Before with Distance," a value field appears in which you set the specific distance.
  • If you do not want to specify the distance in the definition itself, use the Near Before operator, which specifies 50 characters between data types.

Near Before with Distance Example

  • A Social Security number appears in a file first, at location '3' with length 9, so it ends at character 12.
  • A Credit Card number has a match in the same file at location 40 (with length 16).
  • 40-12 = 28. The sensitive data matches are 28 characters apart, and the first data type - SSN - appears before the second data type - CCN.
  • If the value set (maximum distance between characters) is 25, this rule is NOT satisfied. Even though the first data type (SSN) appears before the second data type (CCN), the distance between data types (28) exceeds the distance value set (25).
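
The combined order-and-distance test described above can be sketched as a single check. The function name `near_before` is hypothetical, and treating overlapping matches as not satisfying the order requirement is an assumption made for this sketch.

```python
def near_before(a_location: int, a_length: int, b_location: int,
                max_distance: int = 50) -> bool:
    """True when A ends at or before the start of B and the gap is within max_distance."""
    gap = b_location - (a_location + a_length)
    # A negative gap means B starts before A ends, so A does not come before B.
    return 0 <= gap <= max_distance


# SSN (A) at location 3 with length 9 ends at 12; CCN (B) starts at location 40.
print(near_before(3, 9, 40, max_distance=25))  # False: the gap of 28 exceeds 25
print(near_before(3, 9, 40, max_distance=30))  # True
print(near_before(3, 9, 40))                   # True with the 50-character default
```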

Enter the distance (in characters). This is the maximum number of characters allowed between data types.

Near After

Tests whether sensitive data types are within 50 characters of each other, with the requirement that the first specified sensitive data type (A) appears after the second specified data type (B). This distance includes blank spaces.

  • The value (50) is the maximum allowable number of characters between sensitive data types.
  • To specify the distance yourself, select "Near After with Distance." A value field appears in which you set the specific distance (in characters).

Near After Example

  • In this example, a Social Security number is set as the first data type in the rule and a Credit Card number is set as the second data type in the rule.
  • A Credit Card number match appears in a file first, at location '3' (with length 16), so it ends at character 19.
  • A Social Security number match appears in the same file, at location '68' (with length 9).
  • 68-19 = 49. The sensitive data matches are 49 characters apart, and the first data type - SSN - appears after the second data type - CCN, so this rule is satisfied.


Near After with Distance

Tests whether sensitive data types are no more than a specified distance from each other, with the requirement that the first sensitive data type (A) appears after the second specified data type (B). This distance includes blank spaces.

  • The value (specified by you, the user) is the maximum allowable number of characters between sensitive data types.
  • When you select "Near After with Distance," a value field appears in which you set the specific distance (in characters).
  • If you do not want to specify the distance in the definition itself, use the Near After operator, which specifies 50 characters between data types.

Near After with Distance Example

  • A Social Security number appears in a file first, at location '3' with length 9, so it ends at character 12.
  • A Credit Card number has a match in the same file at location '55' (with length 16).
  • 55-12 = 43. The sensitive data matches are 43 characters apart, and the SSN appears before the CCN.
  • If the value set (maximum distance between characters) is 25, this rule is NOT satisfied because the distance between data types (43) exceeds the distance value set (25). In addition, if the SSN is set as the first data type in the rule, it must appear after the CCN, and here it appears before it.

Enter the distance (in characters). This is the maximum number of characters allowed between data types.

Far

Tests whether sensitive data types are at least 50 characters away from each other. This distance includes blank spaces.

  • 50 characters is the minimum allowable number of characters between sensitive data types.
  • To specify the distance independently from the policy setting, use the Far with Distance operator.
  • Using the Far operator and configuring the distance in a policy enables you to use the same distance value for all filters that use the Far operator, without specifying the distance each time you use it in a sensitive data definition.

Far Example

  • A Social Security number appears in a file first, at location '3' with length 9, so it ends at character 12.
  • A Credit Card number has a match in the same file at location '85' (with length 16).
  • 85-12 = 73. The sensitive data matches are 73 characters apart.

The distance between data matches is 73, greater than the 50 character minimum, so the rule is satisfied.


Far with Distance

Tests whether sensitive data types are at least a specified distance away from each other. This distance includes blank spaces.

  • The distance is specified in the definition itself and specifies the minimum allowable distance between sensitive data types.
  • When you select Far with Distance, a value field is presented allowing you to set the specific distance.
  • If you do not want to specify the distance in the definition itself, use the Far operator.

Far with Distance Example

  • A Social Security number appears in a file first, at location '3' with length 9, so it ends at character 12.
  • A Credit Card number has a match in the same file at location '55' (with length 16).
  • 55-12 = 43. The sensitive data matches are 43 characters apart.
  • If the value set (minimum distance between characters) is 25, this rule is satisfied because the distance between data types (43) is at least the distance value set (25).

Enter the distance (in characters). This is the minimum number of characters allowed between data types.

Far Before

Tests whether sensitive data types are at least 50 characters away from each other, with the requirement that the first specified sensitive data type (A) appears before the second specified data type (B). This distance includes blank spaces.

  • The value is the minimum allowable number of characters (50) between sensitive data types.
  • To specify the distance independently from the policy setting, use the Far Before with Distance operator.

Using the Far Before operator and configuring the distance in a policy enables you to use the same distance value for all filters that use the Far Before operator, without specifying the distance each time you use it in a sensitive data definition.

Far Before Example

  • A Social Security number appears in a file first, at location '3' with length 9, so it ends at character 12.
  • A Credit Card number has a match in the same file at location '85' (with length 16).
  • 85-12 = 73. The sensitive data matches are 73 characters apart, and the first data type - SSN - appears before the second data type - CCN.

The distance between data matches is 73, greater than the 50 character minimum, so the rule is satisfied.


Far Before with Distance

Tests whether sensitive data types are at least a specified distance away from each other, with the requirement that the first specified sensitive data type (A) appears before the second specified data type (B). This distance includes blank spaces.

  • The value is the minimum allowable number of characters between sensitive data types.
  • If you do not want to specify the distance in the definition itself, use the Far Before operator.

Using the Far Before operator and configuring the distance in a policy enables you to use the same distance value for all filters that use the Far Before operator, without specifying the distance each time you use it in a sensitive data definition.

Far Before with Distance Example

  • A Social Security number appears in a file first, at location '3' with length 9, it ends at character 12.
  • A Credit Card number appears in the same file at location '18' (with length 16).
  • 18-12 = 6. The sensitive data matches are 6 characters apart.

If the value set (minimum distance between characters) is 5, the distance between data type characters (6) is greater than the value specified (5). The first specified data type (SSN) appears before the second specified data type set (CCN), as required. This rule is satisfied.
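
The minimum-distance check in this example can be sketched in the same way as the Near checks, with the comparison reversed. The function name `far_before` is hypothetical, and using "at least" (greater than or equal to) for the comparison follows the operator description above.

```python
def far_before(a_location: int, a_length: int, b_location: int,
               min_distance: int = 50) -> bool:
    """True when A starts before B and the gap between them is at least min_distance."""
    gap = b_location - (a_location + a_length)
    return a_location < b_location and gap >= min_distance


# SSN (A) at location 3 with length 9 ends at 12; CCN (B) starts at location 18.
print(far_before(3, 9, 18, min_distance=5))   # True: the gap of 6 is at least 5
print(far_before(3, 9, 18, min_distance=10))  # False: the gap of 6 is too small
print(far_before(3, 9, 18))                   # False with the 50-character default
```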

Enter the distance (in characters). This is the minimum number of characters allowed between data types.

Far After

Tests whether the first specified data type (A) appears in files/emails after the second specified data type (B) by a distance of at least 50 characters. This distance includes blank spaces.

  • The value is the minimum allowable number of characters between sensitive data types.
  • To specify the distance independently from the policy setting, use the Far After with Distance operator.

Far After Example

  • A Credit Card number is set as the second data type and appears first in a file, at location '18' with length 16, so it ends at character 34.
  • A Social Security number is set as the first data type and appears in the same file, after the Credit Card number, at location '64' with length 9.
  • 64-34 = 30. The sensitive data matches are 30 characters apart.

The first specified data type (SSN) appears after the second specified data type (CCN), as required. However, the distance between the matches (30) is less than the 50 character minimum, so this rule is NOT satisfied.


Far After with Distance

Tests whether the first specified data type (A) appears in files/emails after the second specified data type (B) by at least the specified distance. This distance includes blank spaces.

  • The value is the minimum allowable number of characters between sensitive data types.
  • If you do not want to specify the distance in the definition itself, use the Far After operator.

Far After with Distance Example

  • A Credit Card number is set as the second data type and appears first in a file, at location '18' with length 16, so it ends at character 34.
  • A Social Security number is set as the first data type and appears in the same file, after the Credit Card number, at location '64' with length 9.
  • 64-34 = 30. The sensitive data matches are 30 characters apart.

If the value set (minimum distance between characters) is 25, the distance between the matches (30) is greater than the value specified (25). The first specified data type (SSN) appears after the second specified data type (CCN), as required. This rule is satisfied.

Enter the distance (in characters). This is the minimum number of characters allowed between data types.

Before

Tests that one sensitive data type (such as Social Security number) appears before another (such as Credit Card number).

Enter the distance (in characters)

After

Tests that one sensitive data type appears after another.


Enter the distance (in characters)

Equals

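Tests whether the unique match count of a data type equals the specified value.

Enter the number of unique matches.

Does Not Equal

Tests whether the unique match count of a data type differs from the specified value.

Enter the number of unique matches.

Less Than

Tests whether the unique match count of a data type is less than the specified value.

Enter the number of unique matches.

Less Than or Equals

Tests whether the unique match count of a data type is less than or equal to the specified value.

Enter the number of unique matches.

Greater Than

Tests whether the unique match count of a data type is greater than the specified value.

Enter the number of unique matches.

Greater Than or Equals

Tests whether the unique match count of a data type is greater than or equal to the specified value.

Enter the number of unique matches.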

Require

Tests that a minimum number of sensitive data types are present.

  • When this operator is selected, the value field changes to "at least this many Sensitive Data Types:"
  • The numerical value cannot exceed the number of Sensitive Data Types selected.
  • For example, if you selected two Sensitive Data Types, then you cannot select a value greater than 2.
  • For example, if you select Credit Card Number, Password, Drivers License, and Passport Number as the Sensitive Data Types, Require as the Operator, and 3 as the Value, the definition returns True only when at least 3 of the selected Sensitive Data Types are present in a location.
  • If there are 2 or fewer of those Sensitive Data Types present, it is False.
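
The Require logic above amounts to counting how many of the selected types are present and comparing that count to the value. The sketch below is illustrative only; the function name `require` and the use of sets of type names are assumptions, not the product's implementation.

```python
def require(selected: set[str], present: set[str], at_least: int) -> bool:
    """True when at least `at_least` of the selected Sensitive Data Types are present."""
    if at_least > len(selected):
        raise ValueError("value cannot exceed the number of selected Sensitive Data Types")
    return len(selected & present) >= at_least


selected = {"Credit Card Number", "Password", "Drivers License", "Passport Number"}

# Three of the four selected types were found in the location.
print(require(selected, {"Credit Card Number", "Password", "Passport Number"}, 3))  # True

# Only two were found, so the definition is False.
print(require(selected, {"Password", "Passport Number"}, 3))  # False
```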

Enter the minimum number of Sensitive Data Types.

Allow

Tests that a maximum number of sensitive data types are present.

  • When this operator is selected, the value field changes to "no more than this many Sensitive Data Types:"
  • The numerical value cannot be less than the number of Sensitive Data Types selected.
  • For example, if you selected three Sensitive Data Types, then you cannot select a value less than 3.
  • For example, if you select Credit Card Number, Password, Drivers License, and Passport Number as the Sensitive Data Types, Allow as the Operator, and 5 as the Value, the definition returns True only when up to 5 of the selected Sensitive Data Types are present in a location.
  • If 6 or more of those Sensitive Data Types are present, it is False.

Enter the maximum number of Sensitive Data Types.

Note: The operators that specify a match count (Equals, Does Not Equal, Less Than, Less Than or Equals, Greater Than, Greater Than or Equals) all compare unique match counts. When you create a sensitive data definition and select a Keyword as the Sensitive Data Type, the Value must be 0 or 1 because a location can contain only a single unique instance of a specific Keyword. The result for a keyword is always the keyword itself, so a keyword can have at most one unique match, whereas a Regular Expression or Dictionary can have many unique matches.

  • For example, a sensitive data definition of "Keyword > 3" is never True because you cannot have more than one unique instance of the same keyword in one location.
  • A sensitive data definition of "Keyword <= 1" can be True.
  • A Regular Expression, on the other hand, can have multiple unique matches per definition because it does not specify a single fixed value.
  • The following is a list of operators and their valid numerical values when the Data Type = Keyword:

Operator and Value | Description
Equals 0 | Returns True only if there are no keyword matches found.
Equals 1 | Returns True if there is one unique instance of a specific keyword.
Does Not Equal 0 | Returns True if there is one unique instance of a specific keyword.
Does Not Equal 1 | Returns True only if there are no keyword matches found.
Less Than 1 | Returns True only if there are no keyword matches found.
Less Than or Equals 1 | Returns True if there are no matches or there is one unique instance of a specific keyword.
Greater Than 0 | Returns True if there is one unique instance of a specific keyword.
Greater Than or Equals 0 | Returns True if there are no matches or there is one unique instance of a specific keyword.
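
The keyword rows above follow directly from comparing a unique match count of 0 or 1 against the value. A minimal sketch using Python's standard `operator` module shows this; the `OPERATORS` mapping and `evaluate` function are hypothetical names used only for illustration.

```python
import operator

# Map each match-count operator name to the corresponding comparison.
OPERATORS = {
    "Equals": operator.eq,
    "Does Not Equal": operator.ne,
    "Less Than": operator.lt,
    "Less Than or Equals": operator.le,
    "Greater Than": operator.gt,
    "Greater Than or Equals": operator.ge,
}


def evaluate(op_name: str, unique_matches: int, value: int) -> bool:
    """Compare a unique match count against the value entered in the definition."""
    return OPERATORS[op_name](unique_matches, value)


# A keyword produces either 0 or 1 unique matches in a location.
for count in (0, 1):
    print(count,
          evaluate("Greater Than", count, 3),         # never True for a keyword
          evaluate("Less Than or Equals", count, 1))  # always True for a keyword
```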

Note: For database searches, the "Near," "Far," "Near with Distance" and "Far with Distance" operators apply only to data within a single cell, not across cells.
- For files, they apply within each file.
- For email, they apply within the content of the email.
