Can I monitor the job queue status automatically?
Because archTIS creates a unique search_queue_<GUID> table for every scan, any automation must first identify the correct table name before it can query the status counts.
1. Identify the Active Table Name
You can automate the discovery of the active scan table by querying the PostgreSQL information schema.
A script would first run:
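SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public'
-- Backslashes escape the underscores, which LIKE otherwise treats as single-character wildcards.
AND table_name LIKE 'search\_queue\_%'
AND table_name != 'search_queue';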
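Because the table name is only known at run time, a script has to interpolate it into later queries as an identifier, so it is worth validating first. The following is a minimal Python sketch of that check (the same logic ports to PowerShell with -match). The helper names and the pattern are hypothetical, and the pattern assumes the <GUID> suffix contains only hex digits, hyphens, or underscores.

```python
import re

# Assumption: the <GUID> suffix consists only of hex digits, hyphens, or
# underscores. Adjust the pattern to the names your agent actually creates.
SCAN_TABLE_PATTERN = re.compile(r"search_queue_[0-9A-Fa-f_-]+")


def pick_scan_tables(table_names):
    """Return only the names that look like per-scan queue tables."""
    return [name for name in table_names if SCAN_TABLE_PATTERN.fullmatch(name)]


def quote_identifier(name):
    """Double-quote a validated table name for use in a SQL statement."""
    if not SCAN_TABLE_PATTERN.fullmatch(name):
        raise ValueError(f"unexpected table name: {name!r}")
    return '"' + name + '"'
```

If the discovery query returns more than one matching table, the script still has to decide which scan it cares about; this sketch only filters out names that do not fit the expected shape.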
2. Automation via PowerShell (Recommended)
Since these are Windows Agents, PowerShell is the most effective tool for local automation.
You can use the Npgsql library or the psql command-line tool (if available in the agent path) to export metrics.
Example Logic for a Monitoring Script:
- Retrieve Credentials: Use the local agent configuration or environment variables to get the Postgres username/password.
- Connect to Port 5433: Ensure the script targets the Spirion-specific Postgres port.
- Loop the Status Query: Run the following query every 60 seconds to track progress:
SELECT status, COUNT(*)
FROM public.<DYNAMIC_TABLE_NAME>
GROUP BY status;
- Output to Logs/Monitoring Tool: Pipe these results into a local log file, a SIEM agent, or a performance monitoring tool (such as Datadog or Zabbix).
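The steps above can be sketched as a small polling loop. This is an illustration in Python rather than a definitive implementation, and the same structure ports to PowerShell. It assumes psql is on the path, that the password is supplied through the PGPASSWORD environment variable, and that the status column holds integers. The function names, the user, and the database name are placeholders.

```python
import subprocess
import time


def parse_status_counts(psql_output):
    """Parse 'status|count' lines (psql -A -t -F '|') into {status: count}."""
    counts = {}
    for line in psql_output.splitlines():
        if not line.strip():
            continue
        status, count = line.strip().split("|")
        counts[int(status)] = int(count)
    return counts


def fetch_status_counts(table, user, dbname, host="localhost", port=5433):
    """Run the status query through psql; psql reads the password from PGPASSWORD."""
    # The table name is interpolated here, so it must already be validated.
    sql = f'SELECT status, COUNT(*) FROM public."{table}" GROUP BY status;'
    result = subprocess.run(
        ["psql", "-h", host, "-p", str(port), "-U", user, "-d", dbname,
         "-A", "-t", "-F", "|", "-c", sql],
        capture_output=True, text=True, check=True,
    )
    return parse_status_counts(result.stdout)


def monitor(fetch, log, interval_seconds=60, max_polls=None, sleep=time.sleep):
    """Call fetch() every interval_seconds and pass each snapshot to log()."""
    polls = 0
    while max_polls is None or polls < max_polls:
        log(fetch())
        polls += 1
        if max_polls is None or polls < max_polls:
            sleep(interval_seconds)
```

A caller would wire the pieces together with something like monitor(lambda: fetch_status_counts(table, user, dbname), print), replacing print with a function that writes to a log file or forwards to the monitoring tool.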
3. Key Metrics to Watch (Automation Triggers)
If you are building an automated alerting system, set thresholds for these conditions:
- Stuck "Processing" (Status 1): If the count of rows in status 1 does not change for several minutes, it may indicate a hung worker thread.
- High "Failed" (Status 3): Automated retries happen up to 3 times (5 minutes apart). If this count grows rapidly, it indicates a transient environment issue (e.g., a network share becoming unreachable).
- "PermFailed" (Status 4) > 0: This should trigger an immediate alert, as it means the agent has given up on those specific files/locations.
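These three conditions can be checked against the snapshots a polling script collects. The Python sketch below is one possible way to do it. The function name, the alert labels, and the default thresholds (5 identical polls, a jump of 50 failed rows between polls) are illustrative assumptions, not values taken from the product.

```python
def evaluate_alerts(snapshots, stuck_polls=5, failed_jump=50):
    """Return alert labels for a list of {status: count} snapshots, oldest first."""
    alerts = []
    if not snapshots:
        return alerts
    latest = snapshots[-1]

    # Stuck "Processing": the status 1 count is non-zero and identical
    # across the last `stuck_polls` snapshots.
    recent = [snap.get(1, 0) for snap in snapshots[-stuck_polls:]]
    if len(recent) == stuck_polls and recent[0] > 0 and len(set(recent)) == 1:
        alerts.append("stuck_processing")

    # High "Failed": the status 3 count grew by at least `failed_jump`
    # since the previous poll.
    if len(snapshots) >= 2:
        growth = latest.get(3, 0) - snapshots[-2].get(3, 0)
        if growth >= failed_jump:
            alerts.append("failed_spike")

    # "PermFailed": any status 4 row at all is worth an immediate alert.
    if latest.get(4, 0) > 0:
        alerts.append("perm_failed")

    return alerts
```

With a 60-second polling interval, stuck_polls=5 corresponds to roughly five minutes without change. An unchanged count is only a heuristic, since rows can turn over while the total stays the same, so treat that alert as a prompt to investigate rather than proof of a hang.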
4. Built-in Background Maintenance
Spirion already includes an internal automated process for queue health called the Background Queue Maintenance Job. You don't need to automate these specific cleanup tasks yourself:
- Orphan Recovery: Automatically moves "Processing" (1) jobs back to "Pending" (0) within ~1 minute if a worker agent crashes (detected via missing heartbeats).
- Retry Logic: Automatically retries "Failed" (3) jobs up to 3 times.
- Auto-Cleanup: Deletes "Processed" (2) rows after 15 minutes and "PermFailed" (4) rows after 4 hours.
- Database Optimization: Runs VACUUM ANALYZE every 24 hours to maintain performance.
5. Potential Roadblocks
- Credential Rotation: Credentials retrieved via the Test Runner portal are intended for support sessions. For long-term automated monitoring, you would need to ensure your script can dynamically access the current credentials used by the Spirion Endpoint Service.
- Port Access: Ensure local firewalls allow your monitoring service to talk to port 5433.
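A monitoring script can rule out the port-access problem before it attempts to authenticate. The following is a minimal Python sketch of a TCP reachability check. The function name and the default host are assumptions, and a successful connection only shows that something is listening on the port, not that the credentials are valid.

```python
import socket


def port_is_open(host="127.0.0.1", port=5433, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

Running this check at the start of each polling cycle makes it possible to log a firewall or service-down condition separately from a credential failure.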