How can I optimize scan policies for Online Archives?
Because Online Archives are network-dependent and often contain massive amounts of historical data, optimization is focused on reducing the volume of data transferred and the number of items inspected.
Here are the most effective optimization strategies:
1. Decouple Archives from Primary Mailboxes
The most common mistake is scanning the Primary Mailbox and Online Archive in the same task.
- The Strategy: Create two separate policies.
- Policy A (Primary): Scans local/cached mailboxes daily.
- Policy B (Archive): Scans Online Archives once a month or once a quarter.
- The Benefit: This prevents the slow Online Archive scan from delaying the reporting of high-risk findings in the user's active, daily mailbox.
2. Use "Date-Based" Filtering
Online Archives often contain data going back 10+ years. If you haven't changed your business processes in that time, scanning 2014 data every month is redundant.
- The Strategy: In the Policy settings, use the "Date Modified" or "Date Created" filter.
- The Execution: Set the policy to only scan items modified within the last 365 days.
- The Benefit: This drastically reduces the item count, focusing the Agent only on "new" data moved into the archive.
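The effect of a 365-day window can be sketched in a few lines of Python. This only illustrates the filtering logic and is not Spirion's implementation: the `within_window` helper and the sample items are hypothetical, and in practice the filter is set in the policy UI.

```python
from datetime import datetime, timedelta

def within_window(modified, now, days=365):
    """Return True if the item was modified within the last `days` days."""
    return now - modified <= timedelta(days=days)

# Hypothetical archive items; a real scan reads these dates from the mailbox.
now = datetime(2024, 6, 1)
items = [
    {"subject": "Q1 payroll export", "modified": datetime(2024, 3, 15)},
    {"subject": "2014 vendor list", "modified": datetime(2014, 8, 2)},
]

# Only items inside the window stay in scope for the scan.
in_scope = [i["subject"] for i in items if within_window(i["modified"], now)]
print(in_scope)  # ['Q1 payroll export']
```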
3. Implement "Discovery-First" Workflows
Don't perform a deep "Sensitive Data" scan (which opens every file) on an archive until you know it's worth the time.
- The Strategy: Run a Discovery Scan (Metadata only) on the archives first.
- The Benefit: This gives you a report of which archives are the largest and which contain specific file types (like `.pst` or `.zip`). You can then target your deep, time-consuming scans only on the "high-risk" archives.
4. Optimize Attachment Handling
Attachments are the primary cause of "hangs" during archive scans.
- The Strategy: Limit the size and type of attachments scanned in the archive.
- The Execution:
- Set a Maximum File Size (for example, don't scan attachments over 20 MB).
- Exclude known-safe or high-latency file types (e.g., `.iso`, `.exe`, `.mp4`).
- Disable "Deep Archive Inspection" (scanning zips within zips) for the archive policy unless strictly required.
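As a rough illustration of the size and type rules above, the decision for each attachment reduces to two checks. The function name and constants below are hypothetical and do not correspond to Spirion settings or APIs; they mirror the 20 MB limit and the example extensions only.

```python
from pathlib import PurePath

MAX_BYTES = 20 * 1024 * 1024                    # 20 MB ceiling from the example
EXCLUDED_EXTENSIONS = {".iso", ".exe", ".mp4"}  # example exclusions from the list

def should_scan_attachment(name, size_bytes):
    """Return True if the attachment passes both the size and the type check."""
    extension = PurePath(name).suffix.lower()   # case-insensitive match
    return size_bytes <= MAX_BYTES and extension not in EXCLUDED_EXTENSIONS

print(should_scan_attachment("report.pdf", 1_000_000))          # True
print(should_scan_attachment("installer.EXE", 1_000))           # False
print(should_scan_attachment("backup.pdf", 25 * 1024 * 1024))   # False
```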
5. Adjust Throttling and Threading
Since Online Archives are server-side, you need to be "gentle" to avoid being blocked by Exchange/M365.
- The Strategy: Reduce the "aggressiveness" of the Agent.
- The Execution:
- Lower the Thread Count for the archive scan. While it sounds counter-intuitive, fewer threads often result in faster total scan times because you avoid triggering server-side throttling.
- Set "Search only Cached Exchange Stores" to No for this policy only.
6. Use "Search History" to Skip Unchanged Files
Spirion Agents can track which files they have already scanned.
- The Strategy: Enable "Use Search History" in the policy.
- The Benefit: On the second run of the archive scan, the Agent compares the "Last Modified" date of each archive item against its local database. If the item hasn't changed since the last scan, the Agent skips it entirely. When most of the archive is unchanged, this can reduce subsequent scan times by as much as 90%.
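The skip logic described above amounts to comparing each item's modification date with the date recorded at the previous scan. The sketch below assumes a plain dictionary as the history store and uses hypothetical item IDs; the Agent's actual database format is not shown in this article.

```python
from datetime import datetime

def select_changed(items, history):
    """Return IDs of items that are new or modified since the recorded scan."""
    changed = []
    for item_id, modified in items.items():
        last_seen = history.get(item_id)
        if last_seen is None or modified > last_seen:
            changed.append(item_id)
    return changed

# Hypothetical history from a previous run: item ID -> last-modified date seen.
history = {"msg-001": datetime(2023, 1, 10), "msg-002": datetime(2023, 2, 5)}
items = {
    "msg-001": datetime(2023, 1, 10),  # unchanged, will be skipped
    "msg-002": datetime(2023, 9, 1),   # modified since the last scan
    "msg-003": datetime(2024, 1, 20),  # new to the archive
}

to_scan = select_changed(items, history)
history.update({item_id: items[item_id] for item_id in to_scan})
print(to_scan)  # ['msg-002', 'msg-003']
```

Running `select_changed(items, history)` again after the update returns an empty list, which is the situation where the time savings come from.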
7. Target Specific Folders
Users often use specific folders in their archive for "sensitive" items (for example, a folder named "Tax Returns" or "Old Payroll").
- The Strategy: If you have a specific risk area, use Folder Includes/Excludes.
- The Benefit: Instead of scanning the entire `\\Online Archive\`, you can target `\\Online Archive\Inbox\Important\` or exclude `\\Online Archive\Deleted Items\`.
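To show how include and exclude rules might combine, here is a minimal sketch of prefix-based folder matching. It rests on two assumptions that this article does not confirm for Spirion: matching is case-insensitive, and an exclude wins over an include. All function names are hypothetical.

```python
def normalize(path):
    """Drop trailing backslashes and lowercase the path for comparison."""
    return path.rstrip("\\").lower()

def is_under(path, folder):
    """Return True if `path` is `folder` itself or lies somewhere beneath it."""
    path, folder = normalize(path), normalize(folder)
    return path == folder or path.startswith(folder + "\\")

def in_scope(path, includes, excludes):
    """Excludes take priority; an empty include list means 'everything'."""
    if any(is_under(path, folder) for folder in excludes):
        return False
    return not includes or any(is_under(path, folder) for folder in includes)

INCLUDES = [r"\\Online Archive\Inbox\Important"]
EXCLUDES = [r"\\Online Archive\Deleted Items"]

print(in_scope(r"\\Online Archive\Inbox\Important\2019", INCLUDES, EXCLUDES))  # True
print(in_scope(r"\\Online Archive\Deleted Items\Old", INCLUDES, EXCLUDES))     # False
print(in_scope(r"\\Online Archive\Sent Items", INCLUDES, EXCLUDES))            # False
```

Matching on whole path segments (the `folder + "\\"` check) keeps a folder such as `Importantish` from matching the `Important` rule by accident.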
Summary Checklist for Archive Optimization:
| Setting | Recommended Value | Why? |
|---|---|---|
| Frequency | Monthly/Quarterly | Archives don't change as fast as primary mail. |
| Date Filter | Last 1 Year | Reduces item count by ignoring stale history. |
| Search History | Enabled | Skips millions of previously scanned, unchanged items. |
| Max File Size | 10-20 MB | Prevents the agent from choking on large legacy files. |
| Thread Count | Low (1-2) | Avoids triggering M365/Exchange throttling. |
Final Tip
Always monitor the "Items Scanned per Minute" metric in your first Archive scan.
If it's below 10 items per minute, you are likely being throttled and should further reduce your thread count or increase your file size exclusions.
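If you track the totals yourself, the check is simple arithmetic. The sketch below assumes you can read the item count and the elapsed time from the scan results; the function names are hypothetical and the threshold of 10 comes from the tip above.

```python
THROTTLE_THRESHOLD = 10  # items per minute, from the tip above

def items_per_minute(items_scanned, elapsed_seconds):
    """Average scan rate over the whole run."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return items_scanned / (elapsed_seconds / 60)

def likely_throttled(items_scanned, elapsed_seconds):
    """Return True when the average rate falls below the threshold."""
    return items_per_minute(items_scanned, elapsed_seconds) < THROTTLE_THRESHOLD

# Example: 450 items in one hour is 7.5 items per minute.
print(items_per_minute(450, 3600))   # 7.5
print(likely_throttled(450, 3600))   # True
print(likely_throttled(1200, 3600))  # False
```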