Tips: What are the Best Ways to Handle Duplicate Results from Scans of (Email) Public Folders?

The most effective way to handle duplicate results from Public Folder scans is to prevent them at the source. Because Public Folders are shared, every user endpoint that scans them reports the same items again, and this repeated scanning is the primary cause of "result bloat" in the Spirion Console.

Whether you are already facing a mountain of duplicate data from earlier scans or are planning a scan and want to avoid it, follow these best practices:

1. The "Scan Once" Rule (Prevention)

The best way to handle duplicates is to ensure the data is scanned by only one Spirion Agent.

  • The Strategy: Set "Exclude Exchange Public Folders" to Yes for all general user policies.
  • The Execution: Create a single, dedicated scan task that runs from a centralized server (not a workstation) using a Service Account. This ensures that each unique piece of sensitive data in the Public Folders is reported exactly once.

2. Use "Location-Based" Filtering in the Console

If your database is already full of duplicates, use the Spirion Console's filtering tools to group results by their Location.

  • How to do it: In the Results or Data Finder tab, sort or group by the "Path" or "Location" column.
  • The Benefit: Duplicates become easy to spot, because the same file path (for example, \\Public Folders\Finance\Budget.xlsx) may appear under 50 different users. You can then select all but one of those entries and "Ignore" or "Delete" them to clean up your reporting.
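
The same "keep one entry per location" idea can be sketched offline, for instance against results exported from the console. This is a minimal illustration, not a Spirion API: the row layout and the field names "endpoint" and "location" are assumptions, and the sample rows are made up.

```python
from collections import defaultdict

# Hypothetical result rows, as if exported from the console and parsed.
# The field names "endpoint" and "location" are assumptions for illustration.
rows = [
    {"endpoint": "PC-ALICE", "location": r"\\Public Folders\Finance\Budget.xlsx"},
    {"endpoint": "PC-BOB", "location": r"\\Public Folders\Finance\Budget.xlsx"},
    {"endpoint": "PC-CAROL", "location": r"\\Public Folders\HR\Roster.msg"},
]


def split_keepers_and_duplicates(rows):
    """Group rows by location; keep the first row of each group."""
    groups = defaultdict(list)
    for row in rows:
        # Compare paths case-insensitively so "finance" and "Finance" match.
        groups[row["location"].casefold()].append(row)

    keepers, duplicates = [], []
    for group in groups.values():
        keepers.append(group[0])       # the single entry to retain
        duplicates.extend(group[1:])   # candidates to "Ignore" or "Delete"
    return keepers, duplicates


keepers, duplicates = split_keepers_and_duplicates(rows)
for row in duplicates:
    print("duplicate:", row["endpoint"], row["location"])
```

Which entry to keep is a policy choice; this sketch simply keeps the first one it encounters for each location.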

3. Leverage "Global Ignore" Lists

If you find a specific Public Folder that contains "known" sensitive data that is authorized to be there (e.g., a legacy archive that has already been audited):

  • The Action: Add that specific Folder GUID or Path to the Global Ignore List in the Spirion Console.
  • The Result: Future scans still "see" the folder, but the Agent automatically drops the results before they are sent to the console, preventing new duplicates from appearing.

4. Enable "Result De-duplication" (If available)

Depending on your version of the Spirion Console (SDP/SDM), there may be internal mechanisms that help manage duplicates:

  • Data Aggregation: The console is designed to recognize when the same "Identity" (for example, a specific Social Security number) is found in the same "Location" (the Public Folder path) and to group those results together.
  • The Limitation: Even if the console "groups" them, the raw database still grows with every scan. This is why Step 1 (Prevention) is still the most important.
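
To make the limitation concrete, the sketch below counts raw rows against unique (match, location) pairs. It only illustrates the idea of grouping and is not the console's actual aggregation logic; the tuple layout and the placeholder match tokens are assumptions.

```python
from collections import Counter

# Hypothetical raw rows: (endpoint, match token, location).
# Match tokens are placeholders, not real identities.
raw = [
    ("PC-ALICE", "ssn-0001", "Public Folders/Finance/Budget.xlsx"),
    ("PC-BOB", "ssn-0001", "Public Folders/Finance/Budget.xlsx"),
    ("PC-CAROL", "ssn-0001", "Public Folders/Finance/Budget.xlsx"),
    ("PC-ALICE", "ssn-0002", "Public Folders/HR/Roster.msg"),
]


def aggregate(raw):
    """Count how many raw rows collapse into each (match, location) pair."""
    return Counter((match, location) for _endpoint, match, location in raw)


grouped = aggregate(raw)
raw_rows = len(raw)             # what is stored
unique_findings = len(grouped)  # what a grouped view would show

print(f"{raw_rows} raw rows, {unique_findings} unique findings")
```

Every additional endpoint that scans the same folder adds to the raw row count while the number of unique findings stays the same.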

5. Remediation via a Single Account

If you need to "Clean" or "Quarantine" sensitive data found in a Public Folder:

  • The Risk: If 10 different users try to "Shred" the same file in a Public Folder at the same time, it can cause file corruption or Exchange errors.
  • The Fix: Perform all remediation actions from the Console using a single administrative account. Do not allow end-users to perform remediation on shared Public Folder data.

6. Use "Discovery" Mode for Public Folders

If you are unsure which Public Folders even contain sensitive data:

  • The Action: Run a scan with "Match Count" enabled but "Preview" disabled.
  • The Benefit: This tells you how much sensitive data is in a folder without downloading the actual snippets of data for every single user. This keeps the database size manageable while still giving you a "Risk Map" of the Public Folders.
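
A "Risk Map" of this kind can be approximated by summing match counts per folder. The sketch below assumes a single centralized scan (so counts are not inflated by duplicates) and a hypothetical export with "location" and "match_count" fields; both names and the sample numbers are illustrative.

```python
from collections import Counter

# Hypothetical count-only rows (no data previews), one per scanned item.
scan_rows = [
    {"location": "Public Folders/Finance/Budget.xlsx", "match_count": 12},
    {"location": "Public Folders/Finance/Payroll.msg", "match_count": 30},
    {"location": "Public Folders/HR/Roster.msg", "match_count": 5},
]


def risk_map(rows, depth=2):
    """Total match counts per folder, truncating paths to `depth` segments."""
    totals = Counter()
    for row in rows:
        folder = "/".join(row["location"].split("/")[:depth])
        totals[folder] += row["match_count"]
    # Highest-risk folders first.
    return totals.most_common()


for folder, total in risk_map(scan_rows):
    print(f"{folder}: {total} matches")
```

Raising or lowering `depth` changes how coarse the map is, from top-level folders down to individual subfolders.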

Summary Table: Handling Duplicates

Approach                    | Effort | Effectiveness | Recommended For
----------------------------|--------|---------------|-------------------------------------------
Exclude from User Scans     | Low    | Highest       | Every organization.
Centralized Server Scan     | Medium | High          | Auditing shared organizational data.
Console Grouping/Filtering  | Medium | Medium        | Cleaning up existing "bloated" databases.
Global Ignore Lists         | High   | High          | Known-safe shared archives.

Final Recommendation

Do not try to "manage" duplicates after they happen. Exclude Public Folders from all endpoint policies and scan them once from a central location to keep your data clean and your Exchange server healthy.