How do I create a custom regex for medical record numbers?

To create a custom regex for Medical Record Numbers (MRNs), you need to define a Custom Data Type in the Spirion Console.

Because MRN formats vary significantly between healthcare providers (for example, Epic vs. Cerner vs. legacy systems), a custom RegEx is the most accurate way to detect them.

Step 1: Define the Pattern

First, identify the exact structure of your MRNs.

  • Example A (Simple Numeric): 12345678 (8 digits)
    • Regex: \b\d{8}\b
  • Example B (Alphanumeric with Prefix): MRN-12345
    • Regex: \bMRN-\d{5}\b
  • Example C (Complex): AB-123-45
    • Regex: \b[A-Z]{2}-\d{3}-\d{2}\b
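
Before entering a pattern in the console, it can help to prototype it locally. The sketch below runs the three example patterns above through Python's re module; the sample text and the find_mrns helper are illustrative assumptions, and Spirion's regex engine may differ from Python's in some details.

```python
import re

# The three example patterns from Step 1. Python's re module is used only
# for prototyping; the scanning engine may behave differently in edge cases.
PATTERNS = {
    "simple_numeric": re.compile(r"\b\d{8}\b"),
    "prefixed": re.compile(r"\bMRN-\d{5}\b"),
    "complex": re.compile(r"\b[A-Z]{2}-\d{3}-\d{2}\b"),
}


def find_mrns(text):
    """Return a dict mapping each pattern name to its matches in text."""
    return {name: rx.findall(text) for name, rx in PATTERNS.items()}


# Hypothetical sample line containing one MRN in each format.
sample = "Chart 12345678, ref MRN-12345, legacy id AB-123-45."
print(find_mrns(sample))
```

Each pattern should pick up only its own format in the sample line. If one pattern also catches another format's MRNs, tighten it before moving on to Step 2.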

Step 2: Create the Custom Data Type in the Console

Procedure:

  1. From the left-side navigation menu, go to Settings > Global Data Types > CUSTOM DATA TYPES tab.
  2. Click Actions > Add Custom Data Type.
  3. Select Regular Expression from the drop-down menu.
  4. Name: Give it a clear name like Custom MRN - [Hospital Name].
  5. Regex Pattern: Enter your pattern (for example, \b\d{8}\b).
    • Tip: Always use \b (word boundaries) at the start and end to prevent matching an 8-digit string that is actually part of a longer run of digits (such as a 10-digit phone number written without separators).
  6. Validation (Optional but Recommended):
    • If your MRN has a checksum, you can use a SearchAPI script for advanced validation.
    • Otherwise, rely on Keywords (see Step 3).
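
To see what the word-boundary tip does and does not cover, here is a minimal sketch in Python. It shows that \b rejects an 8-digit run inside a longer digit string but still accepts one that follows a hyphen, because a hyphen is not a word character. The STRICT variant uses lookarounds to close that gap; whether Spirion's engine supports lookarounds is an assumption to verify before relying on it.

```python
import re

# Pattern from step 5 of the procedure.
WORD_BOUNDARY = re.compile(r"\b\d{8}\b")

# Stricter variant (assumption: the target engine supports lookarounds).
# It refuses matches that touch another digit or a hyphen on either side.
STRICT = re.compile(r"(?<![\d-])\d{8}(?![\d-])")


def find_all(rx, text):
    """Return every match of the compiled pattern rx in text."""
    return rx.findall(text)


print(find_all(WORD_BOUNDARY, "id 123456789"))      # [] (9 digits, no boundary)
print(find_all(WORD_BOUNDARY, "id 12345678"))       # ['12345678']
print(find_all(WORD_BOUNDARY, "ref 555-12345678"))  # ['12345678'] (hyphen counts as a boundary)
print(find_all(STRICT, "ref 555-12345678"))         # []
```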

Step 3: Add Context with Keywords (Reduce False Positives)

An 8-digit regex on its own matches any standalone 8-digit number, including dates written as YYYYMMDD, serial numbers, and other identifiers. Add keywords to cut down on these false positives.

  1. In the same Custom Data Type definition, look for the Keywords section.
  2. Add terms such as the following: MRN, Medical Record, Patient ID, Patient Number, Chart Number.
  3. Set the Proximity: Require one of these keywords to be within 50 to 100 characters of the RegEx match.
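
The effect of a keyword proximity rule can be approximated outside the console. The sketch below is not how Spirion implements proximity; it is a rough Python model, with a hypothetical mrns_near_keywords helper, that keeps an 8-digit match only when one of the keywords above appears within a given number of characters on either side.

```python
import re

MRN_RX = re.compile(r"\b\d{8}\b")

# Keywords from step 2 of this list, matched case-insensitively as whole words.
KEYWORD_RX = re.compile(
    r"\b(?:MRN|Medical Record|Patient ID|Patient Number|Chart Number)\b",
    re.IGNORECASE,
)


def mrns_near_keywords(text, window=100):
    """Return 8-digit matches that have a keyword within `window` characters."""
    hits = []
    for m in MRN_RX.finditer(text):
        # Slice out the text surrounding the match, clamped to the string.
        start = max(0, m.start() - window)
        end = min(len(text), m.end() + window)
        if KEYWORD_RX.search(text[start:end]):
            hits.append(m.group())
    return hits


print(mrns_near_keywords("Patient ID: 12345678 admitted."))  # ['12345678']
print(mrns_near_keywords("Invoice date 20240115, paid."))    # []
```

A smaller window gives fewer false positives but can miss MRNs in tables or forms where the label sits far from the value, so try a few window sizes against real sample documents.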

Step 4: Test and Deploy

  1. Test: Use the "Test Regex" tool in the console (if available) or run a scan against a single test file containing known MRNs.
  2. Add to Policy: Go to your Scan Policy, navigate to Sensitive Data Types, and check the box for your new Custom MRN type.
  3. Run Scan: Execute the scan and review the Match Evidence to ensure it is picking up the MRNs correctly without too much noise.
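
If a console test tool is not available, a small local check can serve as a first pass before scanning a test file. The sketch below assumes you have a handful of strings that should and should not match; the evaluate helper and the sample strings are hypothetical.

```python
import re


def evaluate(pattern, should_match, should_not_match):
    """Return (missed, false_hits) for a pattern against labeled samples."""
    rx = re.compile(pattern)
    missed = [s for s in should_match if not rx.search(s)]
    false_hits = [s for s in should_not_match if rx.search(s)]
    return missed, false_hits


missed, false_hits = evaluate(
    r"\b\d{8}\b",
    should_match=["MRN 12345678", "Chart Number: 87654321"],
    should_not_match=["Phone 5551234567", "DOB 20240115"],
)
print("missed:", missed)          # missed: []
print("false hits:", false_hits)  # false hits: ['DOB 20240115']
```

The date string is reported as a false hit because the bare 8-digit pattern cannot tell a date from an MRN, which is the gap the keywords in Step 3 are meant to close.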


Best Practice: The "SDD" Upgrade

Once your regex is working, wrap it in a Sensitive Data Definition (SDD).

  • Create a Sensitive Data Definition called "Confirmed PHI".
  • Logic: (Custom MRN Regex) NEAR (Patient Name AnyFind).
  • Why: This limits alerts to files where an MRN appears alongside a patient name, which better reflects HIPAA's definition of PHI as individually identifiable health information.

Summary

Use a Custom Regex Data Type with word boundaries (\b), add Keywords for context, and then combine it with AnyFinds in a Sensitive Data Definition to further reduce false positives.