Scan Options

When you set up a new scan, you are presented with a number of options to use to tailor your scan. This topic describes these scan options. You can also find this information in the tooltips for each setting within Spirion Sensitive Data Platform.

Overview

Scan options are grouped into Basic and Advanced options.

The options vary by the type of Target you are scanning: Cloud, Local or Remote Files & Folders, Email, Collaboration Tools, or Website.

  • Several of the Target types share Basic and Advanced options.
  • This document also contains email-related options and discovery team settings.

Basic Options

Note: These settings are not available for Database Targets.

Basic options apply to the following Target types:

  • Cloud
  • Files & Folders (Local & Remote Scans)
  • Email
  • Collaboration Tools
  • Website

Additional Information

  • For information on the target type-specific options, see the individual Target type sections in How to Create a New Sensitive Data Scan.
  • The following Target types all have options specific to their type: Cloud, Files & Folders, Email, Collaboration Tools, and Website.

Select Files by Extension

  • A list of extensions to include or exclude during the search.
  • Values are entered one per line as "ext;1" to specify that the extension "ext" should be enabled for the list.
  • Enter file extensions in lowercase.
  • File types to search:
    • Common file
      • Microsoft Windows
        • Only search Microsoft Office, Adobe Acrobat PDF, text, web, and other common formats.
        • File types include:
          • .1st, .asm, .asp, .aspx, .btm, .c, .cc, .cpp, .cs, .css, .cxx, .def, .dic, .h, .hpp, .hxx, .idl, .idq, .inc, .inf, .ini, .inx, .java, .jsl, .log, .me, .rc, .reg, .rels, .snippet, .text, .txt, .url, .wtx, .xml, .xsl, .pdf, .edn, .fdf, .xdp, .xfd, .xfdf, .htm, .html, .rtf, .7z, .gz, .tar, .z, .rar, .bz, .bz2, .tgz, .tbz, .tbz2, .zip, .doc, .dot, .xls, .xla, .xlb, .xlc, .xld, .xlk, .xll, .xlm, .xlt, .xlv, .xlw, .dif, .slk, .ppt, .pot, .ppa, .pps, .pwz, .docx, .docm, .dotx, .dotm, .xlam, .xlsx, .xlsm, .xltm, .xltx, .pptm, .pptx, .potx, .potm
      • macOS
        • Mac Agent uses Spotlight to detect how to process files
        • Common extensions include:
          • Plugin name: /System/Library/Spotlight/PDF.mdimporter
          • Plugin key: PDF
          • UTIs:
            • com.adobe.pdf
          • Plugin name: /System/Library/Spotlight/RichText.mdimporter
          • Plugin key: RichText
          • UTIs:
            • public.rtf
            • public.html
            • public.xml
            • public.plain-text
            • com.apple.traditional-mac-plain-text
            • com.apple.rtfd
            • com.apple.webarchive
            • org.oasis-open.opendocument.text
            • public.comma-separated-values-text
            • public.delimited-values-text
            • public.text
          • Plugin name: /System/Library/Spotlight/Office.mdimporter
          • Plugin key: Office
          • UTIs:
            • org.openxmlformats.wordprocessingml.document
            • org.openxmlformats.wordprocessingml.template
            • org.openxmlformats.wordprocessingml.document.macroenabled
            • org.openxmlformats.wordprocessingml.template.macroenabled
            • org.openxmlformats.spreadsheetml.sheet
            • org.openxmlformats.spreadsheetml.template
            • org.openxmlformats.spreadsheetml.sheet.macroenabled
            • org.openxmlformats.spreadsheetml.template.macroenabled
            • org.openxmlformats.presentationml.presentation
            • org.openxmlformats.presentationml.template.macroenabled
            • org.openxmlformats.presentationml.template
            • org.openxmlformats.presentationml.presentation.macroenabled
            • org.openxmlformats.presentationml.slideshow
            • org.openxmlformats.presentationml.slideshow.macroenabled
            • com.microsoft.powerpoint.ppt
            • com.microsoft.powerpoint.pot
            • com.microsoft.powerpoint.pps
            • com.microsoft.excel.xls
            • com.microsoft.excel.xlt
            • com.microsoft.excel.xla
            • com.microsoft.word.doc
            • com.microsoft.word.dot
            • com.microsoft.excel.openxml.addin
            • com.microsoft.excel.openxml.template
            • com.microsoft.excel.openxml.workbook
            • com.microsoft.excel.openxml.template.macro-enabled
            • com.microsoft.excel.openxml.workbook.binary
            • com.microsoft.excel.openxml.workbook.macro-enabled
            • com.microsoft.powerpoint.openxml.presentation
            • com.microsoft.powerpoint.openxml.presentation.macro-enabled
            • com.microsoft.powerpoint.openxml.slideshow
            • com.microsoft.powerpoint.openxml.slideshow.macro-enabled
            • com.microsoft.powerpoint.openxml.template
            • com.microsoft.powerpoint.openxml.template.macro-enabled
            • com.microsoft.word.openxml.document
            • com.microsoft.word.openxml.document.macro-enabled
            • com.microsoft.word.openxml.template.macro-enabled
    • All Filterable
      • In addition to Common and image files, also search files for which you have a Windows IFilter.
    • Custom
      • Search the files with the extensions you choose.
    • All
      • Search all extensions.
    • Images (with OCR)
      • Only search image files such as JPG and GIF with Optical Character Recognition (OCR).
      • Note: "Images (with OCR)" is available only for the Windows endpoint and is a valid option in a policy only when the endpoint is licensed for and includes the OCR Image Search Module.
      • If the OCR Image Search Module is not licensed or the OCR files are not present and "Images (with OCR)" is selected, the search defaults to "Common."
    • All but common binary
      • Search all files except binary files such as EXE, DLL, or MP3.

File Extension List

  • A list of extensions to include during the search.
  • The values are entered one per line as ext;1 to specify that the extension "ext" should be enabled for the list.
  • Extensions should be entered in lowercase.
  • To turn the list into a list of excluded extensions, check the box next to "Extension list is exclude list."
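
As a minimal sketch of the entry format described above, the following Python helper turns a collection of extensions into "ext;1" lines. The function name build_extension_list is hypothetical and not part of Spirion; only the one-entry-per-line "ext;1" format and the lowercase requirement come from the text.

```python
def build_extension_list(extensions):
    """Return one 'ext;1' entry per line for the given extensions."""
    lines = []
    seen = set()
    for ext in extensions:
        # Normalize: drop surrounding whitespace and a leading dot, then lowercase.
        clean = ext.strip().lstrip(".").lower()
        if clean and clean not in seen:
            seen.add(clean)
            lines.append(f"{clean};1")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_extension_list([".PDF", "docx", "Xlsx", "pdf"]))
```

Duplicates are dropped after normalization, so ".PDF" and "pdf" produce a single entry.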

Use Advanced File Identification

  • By default, the endpoint application uses file extensions to identify file types and select an appropriate filter.
  • To change that behavior and use advanced file type identification, set this value to something other than "Disable" (0).
  • When enabled, the client will look at the header information of files (magic numbers) to determine their type.
  • While this method is more accurate, it will also cause a decrease in search speed due to the increased processing.
    • Valid values:
      • 0: Disable - File extensions are used to determine file type
      • 1: E-Mail Attachment Compressed Files - Analysis is only performed on e-mail attachments to determine if they are renamed compressed files
      • 2: Included File Types - Analysis is performed on any file whose extension is included in the list of file types to search.
        • The file is searched if its type is included in the list of file types to search.
      • 3: All Files - Analysis is performed on all files, and any file whose type is included in the list of file types to search is searched.
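
To illustrate what identifying a file by its header (magic number) means, here is a minimal Python sketch. It is not Spirion's implementation: the signature table is a small illustrative subset of well-known file signatures, and the names MAGIC_NUMBERS, identify_by_header, and identify_file are hypothetical.

```python
# A few well-known file signatures; real tools check many more formats.
MAGIC_NUMBERS = {
    b"%PDF-": "pdf",
    b"PK\x03\x04": "zip",  # also the container format for .docx/.xlsx/.pptx
    b"\x1f\x8b": "gzip",
    b"Rar!\x1a\x07": "rar",
}


def identify_by_header(header: bytes):
    """Return a file type based on the leading bytes, or None if unknown."""
    for signature, file_type in MAGIC_NUMBERS.items():
        if header.startswith(signature):
            return file_type
    return None


def identify_file(path):
    """Read the first bytes of a file and identify it, ignoring its extension."""
    with open(path, "rb") as handle:
        return identify_by_header(handle.read(8))
```

With this approach, a ZIP archive renamed to "report.txt" is still identified as "zip", which is the kind of renamed compressed file that value 1 targets for e-mail attachments. The extra read of each file is also why the text warns about reduced search speed.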

Analyze Files by

  • The type of analysis to perform when analyzing files.
  • Use this setting to enable or disable one or more of the analysis types when searching files.
  • "Analyze file content" is enabled by default and represents the behavior of search prior to version 8.0.
  • The analyze files options:
    • Analyze File Content - Default search to look through all data within a file for sensitive data.
    • Compare File Hash - Search for a representation of a file regardless of name or location.
      Note: It is not possible to get a file hash match on compressed files or Access databases.
    • Analyze File Name - Search for sensitive data within file names.
    • Analyze File Metadata - Search for sensitive data within the metadata of Office and PDF files.
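
The idea behind Compare File Hash, matching a file regardless of its name or location, can be sketched in Python as follows. The choice of SHA-256 is an assumption made for illustration; the text does not say which hash algorithm the product uses, and the function names hash_bytes and file_hash are hypothetical.

```python
import hashlib


def hash_bytes(data: bytes) -> str:
    """Return the SHA-256 hex digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def file_hash(path, chunk_size=65536):
    """Hash a file's content in chunks so large files are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Because only the content is hashed, two copies of the same file stored under different names or folders produce the same digest.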

Advanced Options

Note: These settings are not available for Database Targets.

Advanced options apply to the following Target types:

  • Cloud
  • Files & Folders (Local & Remote Scans)
  • Email
  • Collaboration Tools
  • Website

Skip files as text free binary

  • This setting determines the identification method used to skip a file type.
  • By default, when advanced file identification is enabled via the setting Use Advanced File Identification and a file is determined to be a common binary file type, it is searched if the file type is included in the scope of files to search as determined by the setting File Type Search Option.
    • For example, if the setting File Type Search Option is configured to search all files, the file is searched.
    • To skip the file in this instance, set this value to "Skip" (1).
  • Note: The list of common text free binary file types is noted in the settings file located at: Locations\Files\FileExtensions\TextFreeBinaryFileExtensionExcludeList
  • This setting is supported by Windows Agents only

Disable max file size for Access Database files

  • This setting disables testing the size of Microsoft Access database files.
  • By default, all file types (other than PST E-Mail files and compressed archive files) are tested against the MaxFileSize setting to determine if they should be searched.
  • Because Access database files are not searched under the same memory requirements as other file types, large Access database files can be searched.
  • To ignore the MaxFileSize setting for Access database files, set this to "Do not test file size" (1)
  • This setting is supported by Windows Agents only

Set Max Memory File Size

  • This setting sets the maximum size of files that are stored in memory (in bytes).
  • Default: 32 MB
  • Because the value is specified in bytes, multiply the desired number of megabytes by 1048576 to determine the appropriate number of bytes.
    • For example, the default value of 33554432 bytes is 32*1048576.
  • The maximum value for this setting is 134217728, which equals 128 MB.
    • A value greater than 128 MB can cause stability issues because the amount of memory that can be allocated to a process is controlled by Windows.
    • Spirion strongly recommends that this number be less than 134217728 (128 MB).
  • This setting is supported by Windows Agents only
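
The megabyte-to-byte conversion described above can be checked with a short Python sketch. The helper name megabytes_to_bytes is hypothetical; the multiplier and the 128 MB ceiling are taken from the text.

```python
BYTES_PER_MB = 1048576
MAX_BYTES = 134217728  # 128 MB, the documented maximum


def megabytes_to_bytes(megabytes):
    """Convert megabytes to the byte value expected by the setting."""
    value = megabytes * BYTES_PER_MB
    if value > MAX_BYTES:
        raise ValueError(f"{megabytes} MB exceeds the 128 MB maximum")
    return value


if __name__ == "__main__":
    print(megabytes_to_bytes(32))  # the default, 33554432
    print(megabytes_to_bytes(64))
```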

Max File Size

  • This setting specifies the maximum file size to search (in bytes).
  • To search files greater than the default (32 MB), change this value.
  • Because the value is specified in bytes, multiply the desired number of megabytes by 1048576 to determine the appropriate number of bytes.
    • For example, the default value of 33554432 bytes is 32*1048576.
  • The maximum value for this setting is 134217728, which equals 128 MB.
    • A value greater than 128 MB can cause stability issues because the amount of memory that can be allocated to a process is controlled by Windows.
    • Spirion strongly recommends that this value be less than 134217728 (128 MB).
  • This setting is supported by Windows Agents only

Enable Max Compressed File Size

  • This setting specifies the maximum size of compressed files to search (in bytes).
  • By default, any searchable file is extracted from compressed files regardless of the size of the archive itself.
  • To skip the processing of archives larger than a specified size, set this value to a number greater than 0.
  • This value is only read when the setting Enable Max Compressed File Size is set to "Enable" (1).
  • Because the value is specified in bytes, multiply the desired number of megabytes by 1048576 to determine the appropriate number of bytes
  • This setting is supported by Windows Agents only

Enable Scan Byte Limit

  • This setting controls whether to enforce a size limit on each file scanned (in bytes)
  • By default, the endpoint application searches each file in its entirety.
  • To search only the first specified number of kilobytes in each file, set this value to "Enable" (1) and set the value of Search Byte Limit to a number greater than 0.

OCR

Enable OCR for Files

  • Search supported file types via OCR.
  • By default, when OCR is licensed, the supported file types are searched via the OCR module.
  • To disable searching with OCR, set this to "Do not search" (0).
  • This setting is supported by Windows Agents only

Decomposition Mode

  • The method used to perform page analysis.
  • Different methods/algorithms can be used to analyze the page before performing the OCR.
  • By default, the method is selected automatically, but if the default setting is not producing acceptable output at an acceptable speed, a specific method can be forced.
  • Valid options are:
    • Auto (0):
      • The engine automatically determines the algorithm to use.
      • This is the default setting and produces the best output for the widest array of images and image types.
    • Legacy (1):
      • The engine uses simple page decomposition which generally executes faster than standard but does not produce as accurate results.
    • Standard (2):
      • The engine uses standard page decomposition which generally produces better results than legacy but may execute slower.
    • Fast (3):
      • The engine uses fast page decomposition which generally executes the fastest of the methods
      • Produces the least accurate results unless the images are very simple representations of text.
      • Performs the least amount of page analysis
      • Does not work well for forms, tables, differing font sizes, and so on
  • This setting is supported by Windows Agents only

Deskew

  • Automatically align skewed text.
  • By default, if an image appears to be skewed or angled, an attempt is made to straighten the image to improve the likelihood of obtaining accurate text.
  • To disable deskewing (which will increase speed if it is known that no images are skewed), set this to "Off" (0).
  • This setting is supported by Windows Agents only

Despeckle

  • Enhance image quality to reduce pixel artifacts.
  • It may be possible to improve the accuracy of text extracted from certain images by first attempting to remove information that does not appear to be part of a valid character.
  • This setting specifies whether the adaptive noise removal algorithm will be activated for black and white images with a resolution of 280 DPI or higher.
  • This setting might influence the recognition accuracy.
  • To enable despeckling, set this to "On" (1).

Document Type

  • The type of text included in the Target locations.
  • Machine Text - By default, the OCR module recognizes machine generated text and does not recognize handwritten characters.
  • Handwritten Characters - If the desired data is known to be comprised of only handwritten characters, set this to "Handwritten Characters" (2).
  • Both Machine Text and Handwritten Characters - If it is necessary to identify both machine text and handwritten characters, set this to "Both Machine Text and Handwritten Characters" (3).
  • Note: If the input text type is known, it is more efficient to specify Machine Text or Handwritten Characters, as appropriate.
  • This setting is supported by Windows Agents only

Scan Only This Page

  • Search all pages or only search the specified page number.
  • By default, all pages of an image are processed with OCR.
  • In some cases, it may be desirable to only search a specific page such as the first or second page.
  • To only search a specific page for all files processed via OCR, set this value to a number greater than 0.
  • This setting is supported by Windows Agents only

OCR PDFs

  • The method to use for searching PDF files when OCR is enabled.
  • The OCR PDFs setting (often labeled as "Search Image-only PDFs" or similar in the Console) determines whether the Spirion agent should treat PDF files as images when they do not contain a selectable text layer.
  • This setting performs the following functions:
    • Extracts Hidden Data: Enables the agent to discover sensitive information (like SSNs, birthdates, or Student IDs) that is "trapped" inside scanned images within a PDF wrapper.
    • Triggers High-Resource Processing: Because OCR is computationally expensive, enabling this setting tells the agent to use more CPU and memory to perform the pixel-to-text conversion.
    • Applies OCR Cleanup: Once enabled, the agent applies other OCR-related settings (such as Unknown Character Replacement and Additional Languages) to the text it extracts from the PDF.
  • Performance Impact: This is one of the most resource-intensive settings in Spirion. If you enable it, expect your scan times to increase significantly, especially if the target contains many large, scanned PDF files.
  • "Image-Only" vs. "All" PDFs:
    • Some versions allow you to OCR only image-only PDFs (saving time on PDFs that already have text).
    • Other configurations allow you to OCR every PDF. This is more thorough (in case the text layer is corrupt or incomplete) but much slower.
  • Platform Support: While the Windows agent has robust, built-in OCR capabilities for PDFs, the Mac and Linux agents have different dependencies for OCR. Always verify your specific agent version's requirements (some may require a separate OCR license or component).
  • FERPA Data: If your FERPA data includes scanned student records or historical transcripts saved as PDFs, OCR PDFs must be enabled, or those files will be completely skipped by your sensitive data definitions.

By default, an attempt is made to extract readable text from a PDF file. If there are both text and images in the PDF, only the text or the images are searched, based on the method selected:

  • Text only - To prevent OCR from being used when there is no text, set this value to "Text only" (0).
  • OCR when no text - If there is no readable text, OCR is used to attempt to identify text.
  • OCR Always - To always use OCR and not attempt to extract readable text, set this value to "OCR Always" (2).
  • This setting is supported by Windows Agents only
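
The three methods above amount to a small decision rule, sketched here in Python. This is an illustration only: the values 0 and 2 come from the text, while the value 1 for "OCR when no text" and the "skip" outcome for a text-free PDF under "Text only" are assumptions. The function name pdf_search_method is hypothetical.

```python
TEXT_ONLY = 0          # documented value
OCR_WHEN_NO_TEXT = 1   # assumed value; the text does not state it
OCR_ALWAYS = 2         # documented value


def pdf_search_method(mode, has_readable_text):
    """Return 'text', 'ocr', or 'skip' for a PDF under the given mode."""
    if mode == OCR_ALWAYS:
        return "ocr"   # never attempt text extraction
    if has_readable_text:
        return "text"  # readable text is searched directly
    if mode == OCR_WHEN_NO_TEXT:
        return "ocr"   # fall back to OCR for image-only PDFs
    return "skip"      # TEXT_ONLY with nothing to extract (assumed outcome)
```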

Fax Correction

  • Enhance scanned faxes.
  • Check the checkbox to enable fax correction. When fax correction is enabled, the resolution of black and white images with an approximate resolution of 200 x 100 DPI is doubled in the 'y' direction (vertically) in an attempt to improve character recognition.
  • This setting is supported by Windows Agents only

Invert

  • Swap black and white pixels before performing OCR.
  • The Invert setting is a specialized image pre-processing tool used during Optical Character Recognition (OCR) to improve the detection of sensitive data in documents with non-standard styling.
  • Most OCR engines are optimized to recognize dark text on a light background (example: black ink on white paper). The Invert setting instructs the Spirion agent to "flip" the color values of an image or PDF page before the OCR engine analyzes it.
    • Original: White text on a black or dark blue background.
    • Inverted: Black text on a white or light background.
  • This setting performs the following functions:
    • Improves Recognition of "Reverse" Text: In many industries, specific documents use inverted styles for headers, highlighted sections, or specialized forms. For example:
      • Blueprints or Technical Schematics: Often feature white text on dark backgrounds.
      • Web-captured Screenshots: If a user uses "Dark Mode," sensitive text may appear white-on-black.
      • Government/University Forms: Some older or stylized forms use dark header blocks with white lettering for field labels (e.g., "STUDENT NAME").
    • Reduces OCR Failure: Without Invert, the OCR engine may see a dark block with white text as a solid "blob" or a graphical element, completely skipping any sensitive data contained within it.
  • Best Practice Recommendation:
    • Selective Use: Do not enable Invert globally for every scan unless you know your data contains many "dark mode" or scanned negative documents. Inverting a standard black-on-white document makes it white-on-black, which can prevent the OCR engine from reading it correctly.
    • Performance: Like all OCR settings, Invert adds a small amount of processing time as the agent must manipulate the image in memory before scanning.
  • When enabled, black and white images are inverted: black on white becomes white on black, and the reverse.
  • In some cases, this can improve the recognition.
  • Check this checkbox to enable inversion.
  • This setting is supported by Windows Agents only

Recognition Mode

  • The method to use for character recognition.
  • When performing character recognition, there is a trade-off between speed and accuracy.
    • Favor accuracy - Default. Accuracy is favored over speed.
    • Favor speed - Favor speed over accuracy.
  • This setting is supported by Windows Agents only

Resolution Enhancement

  • Enhance resolution when recognizing color images.
  • When enabled, the resolution enhancement algorithm results in an image whose resolution is higher than the resolution of the original image.
  • This is done on the basis of information taken from the extra pixel depth in the grayscale and color images and may impact recognition accuracy.
  • Enable (1) - Enables resolution enhancement.
  • Note: This setting only applies to non-black and white and non-palette color images.
  • This setting is supported by Windows Agents only

Rotation

  • Rotate images before performing OCR.
  • By default, an attempt will be made to determine if an image is rotated 90, 180, or 270 degrees before recognition.
  • To disable automatic rotation (which will increase speed if it is known that all text is "right-side-up"), set this to "Off" (0).
  • This setting is supported by Windows Agents only

Unknown Character Replacement

  • The character to use when a character is not recognized.
  • When the Spirion engine performs OCR on an image (like a scanned PDF, a JPG of a Driver's License, or a fax), it attempts to identify every character in the image. If the engine encounters a character it cannot confidently identify due to low image quality, pixel artifacts, or unusual fonts, it must still place a "placeholder" in the resulting text string so that the search patterns (AnyFinds/Regular Expressions) can continue to evaluate the surrounding data.

Common Use Cases

  • Driver's License / ID Scans: As noted in internal guides for searching Driver's Licenses, this character ensures that even if a single letter in a name or ID number is blurry, the rest of the record is extracted and indexed.
  • Legacy UTF-8 Handling: Historically, in some older agent versions, characters outside the standard ASCII range in UTF-8 files were sometimes replaced with spaces (as noted in engineering tickets like AL-22825) if a proper parser wasn't available. The replacement setting helps standardize what that "fallback" character is.
  • Fax and Low-Res Documents: It is often paired with other OCR enhancements like Deskew (straightening), Despeckle (removing noise), and Invert (swapping black/white) to improve overall recognition.

The values that correspond to the settings in the endpoint UI are:

  • No Character - By default, no character is displayed.
  • Space - Enter a space character to see which characters were not recognized.
  • - (dash) - Enter a dash character to see which characters were not recognized.
  • ~ (tilde) - Enter a tilde character to see which characters were not recognized.
  • Other characters may be specified - A space or question mark is commonly used and is recommended by archTIS.
    • Most administrators set this to a space or a question mark. These are the safest characters.
    • A space is often best because many Spirion patterns are already designed to handle or ignore whitespace between characters.
    • Using a unique character like a question mark (?) can be helpful when reviewing logs or "Match Context" in the console, as it makes it immediately obvious that the OCR engine struggled with that specific part of the document.
  • This setting is supported by Windows Agents only

Important Notes

  • Don't Use Characters That Appear in Real Data: Never use 0, X, or * as replacement characters, as these are frequently used in actual sensitive data (e.g., "X" in redacted IDs or "0" in account numbers), which will spike your false positive rate.
  • Pair with "Max Substitutions": If your custom data type allows it, limit how many "unknowns" are acceptable in a single string. Allowing 1 unknown character in a 10-digit ID is usually safe; allowing 5 would lead to massive inaccuracy.

Additional Languages

  • The Additional Languages setting dictates which character sets and linguistic dictionaries the OCR engine uses to interpret text from images and PDFs.
  • When Spirion performs OCR, it doesn't just look for shapes; it uses "language packs" to increase the probability of a correct match.
    • For example, if the engine sees a shape that could be an "n" or an "ñ," selecting Spanish as an additional language tells the engine that "ñ" is a valid and likely character, whereas in an English-only scan, it might be misinterpreted as a pixel artifact or a standard "n."
  • Enables Non-English Character Recognition: It enables the engine to recognize accented characters, umlauts, cedillas, and other diacritics specific to languages like French, German, Spanish, and others.
  • Improves Dictionary Validation: The OCR engine uses these languages to "guess" words more accurately. If a word is partially blurry, the engine compares it against the dictionaries of the selected languages to find the most likely match.
  • Expands Search Scope: By default, many Spirion versions are optimized for English. Adding languages ensures that sensitive data embedded in foreign language documents (example: a French "Numéro de Sécurité Sociale") is extracted correctly for the search engine to evaluate.
  • By default, only English characters are recognized when performing a search with OCR.
  • To recognize characters from additional languages, select each desired language.
    • Spanish
    • French
    • German
  • Note: Each additional language has an impact on the performance of the recognition and therefore languages that are known not to be present in the target locations should not be enabled.
  • This setting is supported by Windows Agents only

Impact on Performance and Accuracy

  • Accuracy (Recall): Increases. It significantly reduces the number of "unknown characters" (the ? placeholders mentioned previously) in non-English documents, making it more likely that sensitive data will be found.
  • Scan Speed: Decreases. Adding more languages requires the OCR engine to perform more "checks" against more dictionaries for every single image processed.
  • Best Practice: Only enable the specific languages you expect to find in your environment. Do not select "All Languages" unless necessary, as it can substantially slow down your scan window.
  • OCR Compensation: This setting often works in tandem with OCR Compensation logic, which handles common misinterpretations (like "8" vs "B") across the character sets of the selected languages.

Higher Education Environments

  • If you are an organization with international operations or a university (handling FERPA data for international students), enable the languages relevant to those students' primary documents to ensure high-fidelity data extraction.

SharePoint On-Premise/Online Options

Use the following steps to complete the settings on both the "Select SharePoint options" and "Select advanced SharePoint options" pages.

  1. Under "Search this SharePoint Site Content" select the types to search within SharePoint sites.
  2. Specify the search types to include when searching SharePoint sites.
    • By default, all file types in the SharePoint document library are searched.
    • Alternatively, searches can be enabled for tasks, calendar events, or contacts that have been synchronized with Outlook or input into SharePoint.

When using this setting outside of the console, note that the value for this setting is a bitmask of the logical OR of any of these values.

  • When created in the Windows Registry, they are of type REG_DWORD.
  • When entered into the Windows Registry or a configuration XML file, they should be entered as hexadecimal values.
  • When entered into a security template (.inf) file, they should be entered in decimal.

Name              Value         Default
Documents/Files   0x00000001    On
Tasks             0x00000002    Off
Calendar          0x00000004    Off
Contacts          0x00000008    Off
Item              0x00000010    Off
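
As a sketch of how the bitmask is formed, the following Python snippet combines the content-type values listed above with a logical OR and prints the result in both hexadecimal (for the Windows Registry or a configuration XML file) and decimal (for a security template .inf file). The class name SharePointContent is hypothetical; only the flag values come from the text.

```python
from enum import IntFlag


class SharePointContent(IntFlag):
    DOCUMENTS_FILES = 0x00000001
    TASKS = 0x00000002
    CALENDAR = 0x00000004
    CONTACTS = 0x00000008
    ITEM = 0x00000010


# Search documents, tasks, and calendar events.
selection = (
    SharePointContent.DOCUMENTS_FILES
    | SharePointContent.TASKS
    | SharePointContent.CALENDAR
)

hex_value = f"0x{int(selection):08X}"  # form for the registry or XML
dec_value = int(selection)             # form for a .inf template

if __name__ == "__main__":
    print(hex_value)  # 0x00000007
    print(dec_value)  # 7
```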

  3. On the "Select advanced SharePoint options" page, select the settings to use.

SSL Settings

The SSL settings to use when searching SharePoint sites.

  • Ignore untrusted root: Continue searching if the root certificate of the SSL chain is not currently in the trust store.
  • Ignore invalid date: Continue searching if the SSL certificate for the URL has an invalid or expired date.
  • Ignore mismatched CN: Continue searching if the domain name on the SSL certificate does not match the URL configuration.
  • Ignore incorrect usage: Continue searching if the SSL certificate is intended for a purpose other than verifying the identity of the sender and encrypting server communications.

When using this setting outside of the console, note that the value for this setting is a bitmask of the logical OR of any of these values.

  • When created in the Windows Registry, they are of type REG_DWORD.
  • When entered into the Windows Registry or a configuration XML file, they should be entered as hexadecimal values.
  • When entered into a security template (.inf) file, they should be entered in decimal.

Name                    Value         Default
Ignore untrusted root   0x00000001    False
Ignore invalid date     0x00000002    False
Ignore mismatched CN    0x00000004    False
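
Going the other direction, a stored value can be decoded back into the options it enables. The Python sketch below assumes only the three values listed above; no value is given for "Ignore incorrect usage," so it is left out. The names SSL_FLAGS and decode_ssl_settings are hypothetical.

```python
SSL_FLAGS = {
    0x00000001: "Ignore untrusted root",
    0x00000002: "Ignore invalid date",
    0x00000004: "Ignore mismatched CN",
}


def decode_ssl_settings(value):
    """Return the names of the SSL options whose bits are set in value."""
    return [name for bit, name in SSL_FLAGS.items() if value & bit]


if __name__ == "__main__":
    print(decode_ssl_settings(0x00000005))
```

A value of 0x00000005 decodes to "Ignore untrusted root" and "Ignore mismatched CN," and a value of 0 decodes to an empty list, which matches the all-False defaults.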

Subsite Search Depth

This setting controls the number of subsites from a specified SharePoint site to search.

  • By default, SharePoint subsites are not searched.
  • To search subsites, specify the depth to traverse.
    • For example, when set to 1, only subsites directly below the site specified in the search are searched.
    • If the value is 3, subsites up to three levels below the site specified in the search are searched.

Note: Subsites are treated as though they were explicitly specified in the SharePoint search configuration.

Subsites are searched after completing the search of the parent site.
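
The depth behavior can be pictured as a depth-limited traversal in which each site is searched before its subsites. The Python sketch below is an illustration under that reading, not the product's implementation; the function name sites_to_search and the site paths are hypothetical.

```python
def sites_to_search(site, subsites, depth):
    """Return sites in search order: the parent first, then its subsites
    down to `depth` levels below the starting site."""
    order = [site]
    if depth > 0:
        for child in subsites.get(site, []):
            order.extend(sites_to_search(child, subsites, depth - 1))
    return order


# Hypothetical site hierarchy: parent -> direct subsites.
tree = {
    "/sites/hr": ["/sites/hr/benefits", "/sites/hr/payroll"],
    "/sites/hr/benefits": ["/sites/hr/benefits/2024"],
}

if __name__ == "__main__":
    for depth in (0, 1, 2):
        print(depth, sites_to_search("/sites/hr", tree, depth))
```

With a depth of 0 (the default), only "/sites/hr" is searched. A depth of 1 adds the two direct subsites, and a depth of 2 also reaches "/sites/hr/benefits/2024".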

SQL Advanced Options

The SQL advanced options page appears only for the following SQL database Targets:

  • Microsoft SQL Server
  • SQLBase
  • SAP SQL Anywhere
  • MySQL
  • SQLite

Include Primary Key Data

  • If a database includes a Primary Key column, it is possible to return the data in the cell of the Primary Key column for the row in which a result was found.
  • The data will be displayed in the Preview window along with the list of columns in which the result was found.
  • Do not include Primary Key data - Do not include the data from the cell of the Primary Key column for the row in which a result was found.
  • Include Primary Key data - Include the data from the cell of the Primary Key column for the row in which a result was found.

Set Non-Matching cells limit

When searching structured data (files or databases searched via a connection string), the endpoint application looks within a column for data that matches the specified data type.

  • After a specified number of cells in that column are searched without finding any matches, the search will move on to the next column.
  • The counter is started at the first row and continues until the limit is hit or a match is found.
  • If a match is found, all subsequent rows in that column will be searched.
  • To disable this limit and search all cells, use a value of 0.
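
The rules above can be sketched in Python as follows. The function and parameter names are assumptions made for illustration, and the matching test is supplied by the caller; the sketch only models when a column is abandoned.

```python
def search_column(cells, is_match, limit):
    """Return the indexes of matching cells in one column.

    The column is abandoned if the first `limit` cells contain no match.
    A limit of 0 disables the check and searches every cell.
    """
    matches = []
    misses = 0
    for index, cell in enumerate(cells):
        if is_match(cell):
            matches.append(index)
        elif not matches:
            # Misses are only counted until the first match is found.
            misses += 1
            if limit and misses >= limit:
                break  # give up and move on to the next column
    return matches
```

For example, with a limit of 3, a column whose first three cells do not match is skipped even if the fourth cell would have matched.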

Exclude Column Types

  • Specify which column types are excluded when searching databases.
  • When using this setting outside of the console, note that the value for this setting is a bitmask of the logical OR of any of these values.
  • When created in the Windows Registry, they are of type REG_DWORD.
  • When entered into the Windows Registry or a configuration XML file, they should be entered as hexadecimal values.
  • When entered into a security template (.inf) file, they should be entered in decimal.

Log Level

When searching databases via the Database Search Module, detailed logging information can be useful during configuration or troubleshooting.

  • The logging specified via this setting applies only when logging has been enabled, and specific log entries are displayed only if their corresponding log type (for example, Info or Error) has been enabled.

The following log levels are available:

  • Default logging: Standard logging. Includes basic information such as the name of the table being searched and errors.
  • Additional logging: Standard logging plus information about the status of the search for the current row.
  • Comprehensive logging: Additional logging plus details about each table, column and row.
  • Debug logging: Comprehensive logging plus the actual cell data for each cell searched.
  • Full logging: Data from the database is written in clear text to the client log file.

Note: Data from the database is written in clear text to the client log file.

Note: Logging beyond the default level, especially the maximum level, creates very large log files and may contain sensitive information.

Row Count Start - Logging

When the setting Settings\Locations\Databases\LogLevel is set to 1, it is possible to specify a row at which additional, detailed column information can be logged.

  • To enable this logging, specify the row number at which to start.
  • This setting should be set only after consulting with the Support Team.

Row Count Start

The row number at which to start detailed column logging:

  • When Log Level (Settings\Locations\Databases\LogLevel) is set to 1, it is possible to specify a row at which additional, detailed column information can be logged.
  • To enable this logging, specify the row number at which to start.
  • This setting should only be set after consulting with the Support Team.

Row Count Stop

  • By default, all rows in a Target database are searched.
  • To specify the maximum number of rows to search in each table, set this to a value greater than 0.
  • Once that number of rows has been searched, the search of that table stops and searching resumes with the next appropriate table.

Scan Column Names

By default, the column names listed in the field "Column Names to Include/Exclude" must match exactly to be included or excluded from the search.

  • Allow partial match - Enable partial name matching. For example, allow the value "zip" (when specified in "Column Names to Include/Exclude") to match the column "ZipCodes".
  • Require exact match - Default. If selected, the scanned column names must match exactly those specified in "Column Names to Include/Exclude".

*Windows-only setting. Mac/Linux are excluded.

Include/Exclude Columns (check to exclude)

  • Enabled (checked) - Search all columns except those specified in "Column Names to Include/Exclude" (one per line) 
     - The column name list applies to all databases configured to be searched.
  • Disabled (unchecked) - Default. Search all columns in the specified database.

*Windows-only setting. Mac/Linux are excluded.

Column Names to Include/Exclude

  • By default, all columns in a specified database are searched.
  • To search only specific columns, enter those column names, one per line.
  • The column name list applies to all databases configured to be searched.
  • To use this list as a list of columns to exclude from search, enable (check) the setting above, "Include/Exclude Columns (check to exclude)".

*Windows-only setting. Mac/Linux are excluded.

Scan Table Names

  • Require exact match - Default. The table names listed in the setting "Table Names to Include/Exclude" below must match exactly to be included or excluded from the search.
  • Allow partial match - Allow partial name matching when searching. For example, enable this setting to allow the value "cust" (when specified in "Table Names to Include/Exclude") to match the SQL table named "Customer_Data".

*Windows-only setting. Mac/Linux are excluded.

Include/Exclude Table (check to exclude)

  • Enabled (checked) - Check this box to exclude table names entered in the field below (one per line) from search
  • Disabled (unchecked) - Uncheck this box to include table names entered in the field below (one per line) in search

*Windows-only setting. Mac/Linux are excluded.

Table Names to Include/Exclude

  • By default, all tables in a specified database are searched.
  • To search only specific tables, enter those table names, one per line.
  • The table name list applies to all databases configured to be searched.
  • By default, the comparison of table names requires an exact match.
  • To enable a partial match, set "Scan Table Names" above, to "Allow partial match."
  • To use this list as a list of tables to exclude from the search, enable (check the box) the setting above: "Include/Exclude Tables (check to exclude)"

*Windows-only setting. Mac/Linux are excluded.
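
The interaction between the name list, the exclude checkbox, and the match mode can be sketched as follows. This is an illustrative Python sketch, not the product's logic. The function name is hypothetical, and the case-insensitive comparison is an assumption inferred from the "cust" and "Customer_Data" example.

```python
def should_search(table, names, exclude=False, allow_partial=False):
    """Decide whether a table is searched, given the name list and options."""
    if not names:
        return True  # empty list: all tables are searched by default
    t = table.lower()  # assumption: names are compared case-insensitively
    if allow_partial:
        listed = any(n.lower() in t for n in names)
    else:
        listed = any(n.lower() == t for n in names)
    # With the exclude box checked, listed tables are skipped;
    # otherwise only listed tables are searched.
    return not listed if exclude else listed
```

The same decision would apply to the column name settings, which are described in the same terms.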

Scan Table Types

Specify whether tables or views are included when searching databases.

  • When using this setting outside of the console, note that the value for this setting is a bitmask of the logical OR of any of these values.
  • When created in the Windows Registry, they are of type REG_DWORD.
  • When entered into the Windows Registry or a configuration XML file, they need to be entered as hexadecimal values.
  • When entered into a security template (.inf) file, they need to be entered in decimal.
  • Default: Include Tables

*Windows-only setting. Mac/Linux are excluded.

Database Preview Length

The number of characters before and after a database match to send to the Spirion console. To provide context to matches when viewing results on the console, endpoints and console version 10.7 and later can send the specified number of characters from before and after the match itself.

  • By default, no characters preceding or following database matches will be sent to the console.
  • The maximum allowed number of characters is 1000. A value of 0 disables sending preview information to the console.

Valid values:

  • 0: Default value. Disabled (no preview data will be sent to the console)
  • 1-1000: The specified number of characters from before and after the database match are sent to the console
  • >1000: Invalid (the value will be set back to the default of 20)

Note: When the "Console\sendMatch" setting is set to Disable (0), preview information is not sent to the console.

*Windows-only setting. Mac/Linux are excluded.
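
To make the preview length concrete, the following Python sketch extracts the context around a match. The function name and arguments are hypothetical. Unlike the product, which resets out-of-range values, the sketch simply rejects them.

```python
def match_preview(text, start, end, length):
    """Return (before, match, after) with up to `length` context characters
    on each side of text[start:end]; a length of 0 yields no context."""
    if not 0 <= length <= 1000:
        raise ValueError("preview length must be between 0 and 1000")
    before = text[max(0, start - length):start]
    after = text[end:end + length]
    return before, text[start:end], after
```

For the text "SSN: 123-45-6789 on file" with the match at positions 5 to 16, a length of 4 returns the four characters on either side of the match.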

Database Preview Match Max

Specify the maximum number of instances of a database match for which to include preview data.

  • When the setting "Database Preview Length" above is set to a value greater than 0, preview information (characters before and after the match) is sent to the console to provide context around the matches.
  • When there are multiple instances of a match in a location, it is often sufficient to review the preview context for just a few of the matches to determine if the matches are true or false positives.
  • By default, preview data is not sent to the console for any database matches.

Valid values:

  • Maximum: 1000000
  • 0: Disabled. No preview information is sent to the console.
  • 1-1000000: Preview data is sent to the console for the specified number of instances of each match.
     - For example, with a value of 5, if a location has 5 unique matches (3 SSNs and 2 CCNs) and one of those CCNs appears 500 times, the contextual preview information is only sent for the first 5 instances of that CCN.
  • >1000000: Invalid

Note: When the setting "Console\sendMatch" is set to Disable (0), preview information is not sent to the console.

*Windows-only setting. Mac/Linux are excluded.
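
The per-match cap can be sketched in Python as follows. The function name and the list-of-values input are assumptions for illustration; the sketch only shows which instances would carry preview data.

```python
from collections import defaultdict

def instances_with_preview(matches, max_instances):
    """Given match values in the order found, return the indexes for which
    preview data would be sent: the first `max_instances` of each value."""
    seen = defaultdict(int)
    selected = []
    for index, value in enumerate(matches):
        if seen[value] < max_instances:
            selected.append(index)
        seen[value] += 1
    return selected
```

With a value of 5, a match that appears 500 times contributes preview data for its first 5 instances only, and a value of 0 selects nothing.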

Select Website Options

The "Select website options" page only appears when configuring website Targets.

Search linked webpage content

Search web pages in the Website search.

  • No - Do not search web pages in the Website search
  • Yes (default) - Search web pages in the Website search

*This setting applies to Windows Agents only. Mac and Linux platforms are excluded.

Search file content

Search files linked from web pages in the Website search.

  • Yes (default) - Search files linked from web pages in the Website search.
  • No - Do not search files linked from web pages in the Website search.

*This setting applies to Windows Agents only. Mac and Linux platforms are excluded.

Website Options - Advanced Options

Exclude

Enter any website URL to prevent the website from being scanned.

Set Search Depth

Specify the maximum depth to crawl (search).

  • Default: 3

*This setting applies to Windows Agents only. Mac and Linux platforms are excluded.

Set User Agent String

The user agent used during the Website search.

  • By default, the user agent string used by the endpoint application during the Website search is "Spirion Web Crawler Agent".
  • To set a custom value for the web crawler, change this value.

*This setting applies to Windows Agents only. Mac and Linux platforms are excluded.

Specify the behavior when externally linked sites are encountered.

  • Ignore External Links - Default
  • Search externally linked files but do not follow external page links - To disable the searching of files linked from web pages in the Website search, set this value to "False" (0)
  • Follow External Links

*This setting applies to Windows Agents only. Mac and Linux platforms are excluded.

Restricted to Selected URL

Only search pages and files in folders that are sub-folders of the specified folder(s). 

  • For example, if the URL http://www.website.com/folder is specified in the list, the default behavior ('No' setting) is to read all of the pages in that folder and follow all of their links (subject to other settings such as link depth and redirect policy), including a link to a page outside that folder such as http://www.website.com/folder2/page1.html.
  • No (default) - Do NOT restrict searches of website pages and files
  • Yes - Restrict searches to only those website pages and files in folders that are sub-folders of URLs specified in the URL list.

*This setting applies to Windows Agents only. Mac and Linux platforms are excluded.
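
A rough Python sketch of the 'Yes' behavior follows. It is an approximation under stated assumptions: the function name is hypothetical, the host comparison ignores case, and the scheme is not compared. The product's actual rules may differ.

```python
from urllib.parse import urlsplit

def is_within(url, allowed_urls):
    """True if `url` is on the same host as, and at or below the folder of,
    one of the allowed URLs."""
    target = urlsplit(url)
    for allowed in allowed_urls:
        base = urlsplit(allowed)
        # Normalize the folder so that /folder does not match /folder2.
        folder = base.path.rstrip("/") + "/"
        same_host = target.netloc.lower() == base.netloc.lower()
        if same_host and (target.path + "/").startswith(folder):
            return True
    return False
```

Using the URLs from the example above, a page under /folder passes the check, while /folder2/page1.html does not.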

Advanced Options, Second Screen

The additional (second) screen of Advanced options applies to the following Target types:

  • Cloud, Files & Folders (Local & Remote), Email, Collaboration Tools (SharePoint & Bitbucket), Database, and Website Target types.

Scan Only Changed Files

  • Under Search History is the option Scan Only Changed Files. See the image above.
  • This is the Differential Scanning setting. Differential Scanning is a feature introduced in version 13.4.

Differential Scanning:
- Is enabled by default for new scans using v13.4 or later agents
- Scans only files that have changed since the last scan
- Scans all the files in your Target when the initial scan is performed
- Marks skipped (unscanned) files with an icon on the Scan Results screen when scans are complete. See the image "Scan Results with Skipped Locations" below.
- Provides details of skipped locations: open the skipped location from the Scan Results page for additional details. See the image "Skipped Location Details" below.

Gmail and Exchange: Email Drafts and Attachments

  • Gmail and Exchange sources: With Differential Scanning enabled (it is enabled by default), when scanning either Exchange or Gmail locations, emails in draft form as well as attachments to such emails are always scanned, regardless of their state of change.
    • Emails and attachments are never marked to be skipped.

Impact of Classification

  • Important! When Differential Scanning is enabled (it is enabled by default), locations/files that SDP classifies during a scan are not marked for rescanning because the location/file has not been altered. They can therefore be skipped during the next scan, assuming no other changes are made.

Impact of Redaction

  • Important! When Differential Scanning is enabled (it is enabled by default) and sensitive information in locations/files is redacted by SDP, the locations/files are still marked for rescanning, regardless of any other changes being made.
    • Marking redacted files to be skipped by subsequent scans would result in playbook rules being unenforced.

Impact of AnyFind Definition Change

  • Important! If your Agent is updated and the update includes a change to AnyFind logic (the AnyFind definition file changes), all existing Search History is invalidated.
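
The skip rules described in this section can be summarized in a short Python sketch. This is a model of the documented behavior only, not of how Spirion stores Search History: the HistoryEntry fields, the "fingerprint" used to detect change, and the function name are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class HistoryEntry:
    fingerprint: str        # hypothetical change marker, such as a hash or timestamp
    anyfind_version: str    # AnyFind definition version used for the previous scan
    redacted: bool = False  # SDP redacted this location in the previous scan

def should_scan(item_fingerprint, entry, current_anyfind, is_email=False):
    """Return True if the location must be scanned again."""
    if entry is None:
        return True  # never scanned: the initial scan covers everything
    if is_email:
        return True  # Gmail/Exchange emails and attachments are never skipped
    if entry.anyfind_version != current_anyfind:
        return True  # a definition change invalidates existing Search History
    if entry.redacted:
        return True  # redacted locations stay marked for rescanning
    return item_fingerprint != entry.fingerprint
```

Classification does not appear in the sketch because, as noted above, it does not alter the location and so does not trigger a rescan on its own.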

Scan Results with Skipped Locations

  • Skipped locations on the Scan Results page are designated with a yellow, circular icon between the info icon and scan name in the "Scan Name" column.
  • Spirion Sensitive Data Platform did not detect any changes to these locations since the last scan.

Skipped Location Details

  • Note the Location Details in the screenshot below and the details under the column Last Action Taken

Supported Sources

  • Amazon S3
  • Box
  • Dropbox
  • Exchange
  • Exchange Online
  • Gmail
  • Google Drive
  • OneDrive
  • Local files and folders
  • Remote files and folders
  • SharePoint
  • SharePoint Online

Unsupported Sources

  • Any Database
  • Bitbucket
  • OLEDB
  • ODBC
  • Website

Global Ignore Lists

The Global Ignore List feature enables you to instruct Spirion Agents (version 13.6 or higher) to ignore data, such as sample or fictitious data, when scanning. The data to ignore is specified in a list called a Global Ignore List. Global Ignore Lists can be created only by Admin users.

To ignore specific scan results, see How to Ignore Sensitive Data Matches.

The Global Ignore Lists section on the "Select advanced options" page appears to all users. To apply a Global Ignore List (which contains the specific data or type of data to ignore) to your scan, click the 'v' to expand the section and select one or more Global Ignore Lists from the list of available options.

Note: If your scan uses Spirion Agents earlier than version 13.6, these older Agents cannot apply any Global Ignore Lists you select.

Global Ignore Lists can be viewed, created, and managed by users with Admin rights on the Scans Settings page (Settings > Application Settings > Scans Settings > Global Ignore Lists).

See Global Ignore Lists.

Allow Mismatched Bookends

  • Allow special characters before and after a match to differ.
  • When using AnyFind, the endpoint application requires that a special character before a match (such as an open parenthesis or open square bracket) be followed by the corresponding closing character after the match.
  • To disable this check and allow the leading and trailing characters to be any valid delimiter, set this value to "Allow" (1).
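
The bookend check can be sketched in Python as follows. The delimiter set, the pair table, and the function name are hypothetical, since the actual delimiters AnyFind accepts are not listed here. The sketch also assumes that both surrounding characters must be delimiters in either mode.

```python
# Hypothetical delimiter pairs and delimiter set, for illustration only.
PAIRS = {"(": ")", "[": "]", "{": "}", "<": ">"}
DELIMITERS = set(PAIRS) | set(PAIRS.values()) | set(" \t\n,;:\"'")

def bookends_ok(before, after, allow_mismatched=False):
    """Check the single characters immediately before and after a match."""
    if before not in DELIMITERS or after not in DELIMITERS:
        return False
    if allow_mismatched or before not in PAIRS:
        return True  # any valid delimiter is accepted
    return after == PAIRS[before]  # an opening character needs its partner
```

With the default behavior, "(" followed by "]" is rejected; with the setting at "Allow" (1), the same pair is accepted.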

Memory Trigger Application

  • The number of bytes allocated to the application before the search is paused.
  • By default, the endpoint pauses the search if the number of bytes allocated to the application reaches 1000000000 (approximately 1 GB).
  • Because of the type and amount of memory required to conduct the search, searching generally fails when this allocation of memory has been reached.

Memory Trigger System Pagefile

  • Percentage of the system page file remaining before the search is paused.
  • By default, the endpoint pauses the search if the system page file has 10% or less space available.
  • Because of the type and amount of memory required to conduct the search, searching will generally fail when the page file is low.
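
Taken together, the two memory triggers amount to a simple threshold check. The Python sketch below uses the default values stated above; the function and constant names are made up for this example.

```python
APP_BYTES_LIMIT = 1_000_000_000  # default: pause at about 1 GB allocated
PAGEFILE_MIN_FREE_PERCENT = 10   # default: pause at 10% or less remaining

def should_pause(allocated_bytes, pagefile_free_percent):
    """Return True if either memory trigger would pause the search."""
    return (allocated_bytes >= APP_BYTES_LIMIT
            or pagefile_free_percent <= PAGEFILE_MIN_FREE_PERCENT)
```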

Prevent Suspension During

  • Prevent automatic suspension while searching.
  • By default, the endpoint respects the Windows power settings and therefore a search may become paused if the computer enters sleep during a search.
  • To prevent the computer from sleeping while a search is in progress, set this to "Prevent Suspension" (1).
Note: This setting only prevents sleep caused by power plans. It does not prevent screen savers, and it does not prevent suspensions initiated by the user, such as closing a laptop lid or pressing the power button.

Run Low I/O Priority

  • Run the endpoint application with a lower I/O priority.
  • To lower the I/O priority of the endpoint application to give preference to other running, foreground applications, set this value to "Enable" (1).

Run Low Process Priority

  • Run the endpoint application with a lower priority.
  • To lower the priority of the endpoint application to give preference to other running, foreground applications, set this value to Enable (1).

Number of Cores

  • Use all available or a specified maximum number of processor cores during the search.
  • By default, the search uses all available CPU cores when searching for AnyFind and OnlyFind information within a location.
  • The valid values are:
    • 0: Use only a single CPU core
    • 1: Default. Use all available cores
    • >1: Use a maximum of this many processor cores.
      • For example, on a system with 8 cores, set this value to 4 to limit the search to a maximum of 4 cores.
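
The mapping from the setting value to an actual core count can be sketched in Python. The function name is hypothetical, and capping the value at the number of available cores is an assumption.

```python
import os

def worker_count(setting, available=None):
    """Map the Number of Cores value to a core count."""
    available = available or os.cpu_count() or 1
    if setting < 0:
        raise ValueError("setting must be 0 or greater")
    if setting == 0:
        return 1                    # use only a single core
    if setting == 1:
        return available            # default: use all available cores
    return min(setting, available)  # use at most this many cores
```

On a system with 8 cores, a value of 4 yields 4, and a value of 1 yields 8.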

Sensitive Data Engine Results Display

  • Specify which results are displayed when a Sensitive Data Definition is matched.
  • When displaying results from a Sensitive Data Definition, the default behavior is to show only the definition name itself in the results.
  • To display all the matching types, select "Display All Matching Result Types" (2).
Note: If only Display Sensitive Data Definition Name is selected, no match preview information is sent to the console.
- One or both of the options must be selected.
- If no option is selected, no results are displayed.

Display Sensitive Data Definition Name

  • Specify which results are displayed when a Sensitive Data Definition is matched.
  • When sensitive data definitions have been created on the Console and applied to an endpoint via policy, it is possible to control the data types that the endpoint can search.
  • By default, the Sensitive Data Engine ribbon button will be visible and selectable.
  • Selecting the button will cause the search to use only the sensitive data definitions applied via policy and will disable any AnyFind or Custom Types.
  • Deselecting the ribbon button will allow the interactive user to disable the sensitive data definitions and enable AnyFind or Custom Types via the UI (or use those set via policy).
  • To force the Sensitive Data Engine to run and only use the sensitive data definitions, set this to "Enabled" (1).
  • To hide the Sensitive Data Engine button and prevent the use of sensitive data definitions, even if they have been applied via policy, set this to "Hidden" (2).

Display All Matching Result Types

  • Specify which results are displayed when a Sensitive Data definition is matched.
  • When displaying results from a Sensitive Data Definition, the default behavior is to show only the definition name itself in the results.
  • To display all the matching types, select Display All Matching Result Types (2).
Note: If only Display Sensitive Data Definition Name is selected, no match preview information is sent to the console.
- One or both of the options must be selected.
- If no option is selected, no results are displayed.

Restore Original Modified Timestamp

  • Reset the timestamps on files after performing actions.
  • When the endpoint application performs actions on a location, it can affect the timestamps of those files.
  • For example, redacting a document updates the Last Modified Date.
  • To have the endpoint record the timestamp before performing an action on the file and then reset it back to those values after the file is modified, select one or more actions.
  • It is important to note that this is only reliable on local, NTFS file systems.
  • It is likely to work on remote NTFS file systems; however, for non-Windows file systems, this reset can be unreliable and there is no indication that the reset was not correct.
  • It is known that some versions of SAMBA incorrectly report the values of the remote files causing their reset to be inaccurate.
Note: This setting does not affect cloud storage locations.
This setting does affect locations that are manually redacted or manually classified.
  • Specify the timestamp types to use.
  • When selected the settings in the corresponding category are applied when they are enabled.

Example

  • If Modified Date is selected then the file restriction, older than file restriction, e-mail restriction, and older than e-mail restriction settings in the ModifiedDate category are used when their corresponding Enable flag (EnableFileRestriction, EnableFileRestrictionOlderThan, EnableEmailRestriction, EnableEmailRestrictionOlderThan) is set to (1).
  • Restore Classification Modified Timestamp
  • Restore Redact Modified Timestamp
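
The record-then-reset approach described in this section can be illustrated with a generic Python sketch. It uses the standard os.stat and os.utime calls and is not the endpoint's own code. The function name and the `action` callback are hypothetical.

```python
import os

def with_restored_mtime(path, action):
    """Run `action(path)`, then reset the file's access and modified times
    to the values recorded before the action."""
    before = os.stat(path)
    action(path)  # for example, a redaction that rewrites the file
    os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))
```

As the section notes, a reset like this depends on the file system reporting and accepting timestamps accurately, which is why it is most reliable on local NTFS volumes.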

Access Bitness

Specify the bitness of the installed version of Access.

  • When a 64-bit version of Microsoft Office is installed, a specific value is written into the Windows registry to indicate this.
  • Under normal circumstances, the bitness of Office is the same as the bitness of Access.
  • However, if Office is 64-bit but Access is 32-bit, the registry value is read and it is assumed that Access is 64-bit (because Office is) and there is a failure to load the proper resources to search within Access database files.
  • Similarly, if Office is 32-bit but Access is 64-bit, it is assumed that Access is 32-bit (because the registry value does not exist) and the Access database file search does not operate properly.

If it is known that the bitness of Access differs from the bitness of Office, set this to "Force 32-bit" (1) or "Force 64-bit" (2), as appropriate.

  • Auto Detect
  • Force 32-bit
  • Force 64-bit

Match Preview Length

  • The number of characters before and after a database match to send to console.
  • To provide context to matches when viewing results on the console, version 10.7 and later of the endpoints and console can send the specified number of characters from before and after the match itself.
  • By default, no characters preceding or following database matches will be sent to the console.
  • The maximum allowed number of characters is 1000. A value of 0 disables sending preview information to the console.

Valid values:

  • 0: Disabled (no preview data will be sent to the console)
  • 1-1000: The specified number of characters from before and after the database match will be sent to the console
  • >1000: Invalid (the value will be set back to the default of 20)
Note: When Console\sendMatch is set to Disable (0), preview information is not sent to the console.

Preview Match Maximum Instances

  • The maximum number of instances of a database match for which to include preview data.
  • For a full explanation, see Match Preview Length.
  • Note: When Console\sendMatch is set to Disable (0), preview information is not sent to the console.

Send Only Last Four Characters

Send only the last four characters of the match string to the console.

  • By default, the entire match string is sent to the console.
  • When the setting Send Match is disabled, this setting has no effect.

To send only the last four characters (or all characters if the match string is four characters or fewer), set this value to Enable (1).

  • Disabled/Entire match (Default)
  • Last four only
  • Last four only (and first six for credit card numbers)
  • Note: The Send Match and Send Only Last Four Characters options are displayed only if the Sensitive Data Finder feature is disabled.
  • For information on specific options for a specific target type, see How to Create a New Sensitive Data Scan.
  • These options also apply to Discovery Scans.
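
The three options can be sketched in Python as follows. The mode names and the function name are hypothetical. How the first six and last four characters of a credit card number are joined is also an assumption.

```python
def value_to_send(match, mode="entire", is_credit_card=False):
    """Return the part of the match string that would be sent to the console.

    mode is one of "entire", "last4", or "last4_first6_ccn" (hypothetical names).
    """
    if mode == "entire" or len(match) <= 4:
        return match  # short strings are sent whole
    if mode == "last4_first6_ccn" and is_credit_card and len(match) > 10:
        return match[:6] + match[-4:]  # joining the two parts is an assumption
    return match[-4:]
```

For example, with the "last four only" option, the match "123-45-6789" is reduced to "6789".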

Compressed Files

Scan email and compressed files.

Include Dumpster Folder

  • Microsoft Exchange Server uses a special folder to facilitate discovery efforts.
  • Prior to Exchange Server 2013, this was called the dumpster; starting with Exchange Server 2013, the folder is called the Recoverable Items folder.
  • To include this folder in the Exchange Server search, set this to "Include dumpster folder" (1).
  • The Dumpster / Recoverable Items folder is used by these Exchange features:
    • Deleted item retention
    • Single item recovery
    • In-Place Hold
    • Litigation hold
    • Mailbox audit logging
    • Calendar logging

Search All Mailboxes

  • To enable the searching of only specific Exchange Server mailboxes, set this value to "Search specified mailboxes" (1).

Compressed Files

  • Enable this setting (check this checkbox) to search compressed files.
  • Disable this setting (uncheck this checkbox) to disable searching compressed files.

MBox

Search by extension

  • To enable the searching of files with the extensions specified in the MBOXFiles value as MBOX mail files, set this value to "True" (1).

Search specific files / folders

  • To enable the searching of files and/or folders specified in the MBOXLocationList as MBOX mail files, set this value to "True" (1).

Scan Microsoft Outlook

  • To disable the inclusion of Outlook and Exchange in the e-mail search, set this value to "False" (0).

Scan Windows Mail

  • To enable the inclusion of Outlook Express or Windows Mail in the e-mail search, set this value to "True" (1).

Thunderbird

  • To enable the inclusion of Mozilla Thunderbird in the e-mail search, set this value to "True" (1).

The search method to use for Thunderbird mbox files:

  • Only use MSF file (if MSF does not exist, skip mail folder)
  • Try MSF, if MSF does not exist, directly read mbox file (Default)
  • Ignore MSF and always read directly from mbox file

Exchange / Outlook Options

Set Outlook Bitness

  • When a 64-bit version of Microsoft Office is installed, a specific value is written into the Windows registry to indicate this.
  • Under normal circumstances, the bitness of Office is the same as the bitness of Outlook.
  • However, if Office is 64-bit but Outlook is 32-bit, the registry value will be read and it will be assumed that Outlook is 64-bit (because Office is) and there will be a failure to load the proper resources to search within Outlook.
  • Similarly, if Office is 32-bit but Outlook is 64-bit, it will be assumed that Outlook is 32-bit (because the registry value does not exist) and the Outlook search will not operate properly.
  • If it is known that the bitness of Outlook differs from the bitness of Office, set this to "Force 32 bit" (1) or "Force 64 bit" (2), as appropriate.

PST

  • Specify when to search unattached PST files.

Search Detached

  • To enable the ability to search PST files that are not attached to an existing profile, set this value to "True."

Skip PST on remote drive

  • This setting only applies to Outlook stores attached to an active profile; it is not applicable to detached PSTs.
  • By default, when configured to search Outlook, all of the stores will be searched.
  • If the PST for one of those stores is on a remote network drive and it is not desirable to allow that connection, that store can be skipped by setting this value to Skip.

Search detached Zimbra

Note: This setting is highly dependent on the configuration and format of Zimbra mail files and is not guaranteed to work.

Troubleshooting assistance and support for this setting are not available.

Info: When Settings\Locations\Email\Microsoft\SearchDetachedPST is enabled (set to True), it is possible to attempt to search .zdb files, by treating them as PST files and automatically attaching them to the current Outlook profile.

To attempt to attach .zdb files to the current Outlook profile and search them as PST files, set this to "Include in search."

Search Selected Outlook Folders

  • System generated list of GUIDs for all available Outlook and Exchange e-mail folders.

Exclude Exchange Public Folders

  • To exclude Exchange public folders from the Outlook/Exchange search, set this value to "True" (1).

Exclude IMAP Folders

  • To exclude IMAP folders from the Outlook/Exchange search, set this value to "True" (1).

Search only Cached Exchange stores

  • By default, the endpoint application attempts to search all Outlook e-mail stores connected to an Exchange Server when the setting "Search Remote Mail Folders" is set to "True".
  • To only search those stores that are configured to use cached mode and skip all other stores, change this setting to "Search only cached stores."

Discovery Team Settings

Discovery Team Settings become available when you select more than one Agent to perform your scan.

The "Discovery Team Settings" page is approximately the 11th step in your scan wizard.

Distributed scans use the assigned discovery agent to conduct location discovery and provide a queue in which all other Agents are assigned locations to scan. While the Discovery Agent can be manually chosen, Spirion recommends you use the preferred Discovery Agent (preferred Agent) - this is the default Agent shown in the field "Discovery Agent" on the page.

Note: Only in special use cases are static discovery Agents used.

Cloud Storage Analysis Type

  • The type of analysis to perform when analyzing Cloud Storage for Discovery Team searches.
  • Specify the method to be used when analyzing Cloud Storage locations for inclusion in a Discovery Team search.
  • Count by Bytes (0)
    • Default
    • Count and report by bytes in each cloud storage folder.
    • The workload will be divided by folders based on the size of the files/objects stored within them.
  • Count by Items (1)
    • Count and report by number of items in each cloud storage folder.
    • The workload will be divided by folders based on the number of files/objects stored within them.

Exchange Analysis Type

  • The type of analysis to perform when analyzing Exchange for Discovery Team searches.
  • Specify the method to be used when analyzing Microsoft Exchange locations for inclusion in a Discovery Team search.
  • Count by Bytes (0)
    • Count and report by bytes in each Microsoft Exchange mailbox folder.
    • The workload will be divided by folders based on the size of the files/objects stored within them.
  • Count by Items (1):
    • Default
    • Count and report by number of items in each Microsoft Exchange mailbox folder.
    • The workload will be divided by folders based on the number of files/objects stored within them.
  • Count by Mailboxes (2):
    • Count and report by the number of Microsoft Exchange mailbox users.
    • The workload will be divided by number of mailboxes.

File System Analysis Type

The type of analysis to perform when analyzing File Systems for Discovery Team searches.

  • Specify the method to be used when analyzing File System locations for inclusion in a Discovery Team search.
  • Count by Bytes (0):
    • Default
    • Count and report by bytes in each File System folder.
    • The workload will be divided by folders based on the size of the files/objects stored within them.
  • Count by Items (1):
    • Count and report by number of items in each File System folder.
    • The workload will be divided by folders based on the number of files/objects stored within them.

Gmail Analysis Type

The type of analysis to perform when analyzing Gmail for Discovery Team searches.

  • Specify the method to be used when analyzing Gmail locations for inclusion in a Discovery Team search.
  • Count by Bytes (0):
    • Count and report by bytes in each Gmail folder.
    • The workload will be divided by folders based on the size of the files/objects stored within them.
  • Count by Items (1):
    • Default
    • Count and report by number of items in each Gmail folder.
    • The workload will be divided by folders based on the number of files/objects stored within them.
  • Count by Users (2):
    • Count and report by the number of Gmail users.
    • The workload will be divided by user account.

SharePoint Analysis Type

The type of analysis to perform when analyzing SharePoint for Discovery Team searches.

  • Specify the method to be used when analyzing SharePoint locations for inclusion in a Discovery Team search.
  • Count by Bytes (0):
    • Default
    • Count and report by bytes in each SharePoint folder.
    • The workload will be divided by folders based on the size of the files/objects stored within them.
  • Count by Items (1):
    • Count and report by number of items in each SharePoint folder.
    • The workload will be divided by folders based on the number of files/objects stored within them.
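
For quick reference, the defaults and accepted values listed in the five Analysis Type sections can be collected into one lookup. The Python sketch below is only a summary aid: the option names, values, and defaults come from this topic, but the dictionary layout and the helper `is_valid_analysis_type` are hypothetical and do not represent a Spirion API or configuration file format.

```python
# Option name -> (default value, label of the default), as listed in this topic.
DEFAULT_ANALYSIS_TYPES = {
    "Cloud Storage Analysis Type": (0, "Count by Bytes"),
    "Exchange Analysis Type": (1, "Count by Items"),
    "File System Analysis Type": (0, "Count by Bytes"),
    "Gmail Analysis Type": (1, "Count by Items"),
    "SharePoint Analysis Type": (0, "Count by Bytes"),
}

# Option name -> values the option accepts.
# Value 2 is "Count by Mailboxes" for Exchange and "Count by Users" for Gmail.
ALLOWED_VALUES = {
    "Cloud Storage Analysis Type": {0, 1},
    "Exchange Analysis Type": {0, 1, 2},
    "File System Analysis Type": {0, 1},
    "Gmail Analysis Type": {0, 1, 2},
    "SharePoint Analysis Type": {0, 1},
}


def is_valid_analysis_type(option, value):
    """Hypothetical helper: check a value against the options documented here."""
    return value in ALLOWED_VALUES.get(option, set())
```

For example, `is_valid_analysis_type("Exchange Analysis Type", 2)` is true, while the same value is rejected for "SharePoint Analysis Type", which documents only values 0 and 1.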

