How do I create a custom regex for medical record numbers?
To create a custom regex for Medical Record Numbers (MRNs), you need to define a Custom Data Type in the Spirion Console.
Because MRN formats vary significantly between healthcare providers (for example, Epic vs. Cerner vs. legacy systems), a custom RegEx is the most accurate way to detect them.
Step 1: Define the Pattern
First, identify the exact structure of your MRNs.
- Example A (Simple Numeric): 12345678 (8 digits)
  - Regex: \b\d{8}\b
- Example B (Alphanumeric with Prefix): MRN-12345
  - Regex: \bMRN-\d{5}\b
- Example C (Complex): AB-123-45
  - Regex: \b[A-Z]{2}-\d{3}-\d{2}\b
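Before entering a pattern in the console, it can help to sanity-check it offline. The following is a minimal sketch that runs the three example patterns against a made-up sample string. It uses Python's `re` module purely as a stand-in; the regex engine behind the console may differ in details, and the names `PATTERNS`, `find_mrns`, and `sample` are illustrative only.

```python
import re

# The three patterns from Examples A-C above.
# Assumption: Python's `re` is only a stand-in for the console's regex engine.
PATTERNS = {
    "A_numeric": re.compile(r"\b\d{8}\b"),
    "B_prefixed": re.compile(r"\bMRN-\d{5}\b"),
    "C_complex": re.compile(r"\b[A-Z]{2}-\d{3}-\d{2}\b"),
}


def find_mrns(text: str) -> dict[str, list[str]]:
    """Return every match in `text`, grouped by pattern name."""
    return {name: rx.findall(text) for name, rx in PATTERNS.items()}


# Made-up sample text; the 10-digit phone number should NOT match pattern A.
sample = "Chart 12345678, ref MRN-12345, code AB-123-45, phone 5551234567."
results = find_mrns(sample)
print(results)
```

In this sample each pattern finds exactly one value, and the 10-digit phone number is skipped because no word boundary falls inside a continuous run of digits.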
Step 2: Create the Custom Data Type in the Console
Procedure:
- From the left-side navigation menu, go to Settings > Global Data Types > CUSTOM DATA TYPES tab.
- Click Actions > Add Custom Data Type.
- Select Regular Expression from the drop-down menu.
- Name: Give it a clear name like Custom MRN - [Hospital Name].
- Regex Pattern: Enter your pattern (for example, \b\d{8}\b).
  - Tip: Always use \b (word boundaries) at the start and end to prevent matching an 8-digit string that is actually part of a longer number (like a phone number).
- Validation (Optional but Recommended):
- If your MRN has a checksum, you can use a SearchAPI script for advanced validation.
- Otherwise, rely on Keywords (see Step 3).
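To illustrate what checksum validation adds on top of a regex match, here is a minimal sketch in Python. It is not a SearchAPI script and does not use the SearchAPI interface. It assumes, purely for illustration, that the last digit of the MRN is a Luhn (mod 10) check digit; your MRN scheme may use a different algorithm or no checksum at all. The function name `luhn_valid` is hypothetical.

```python
def luhn_valid(candidate: str) -> bool:
    """Return True if the digit string passes the Luhn (mod 10) check.

    Assumption: the MRN uses a Luhn check digit. Substitute your
    provider's actual checksum algorithm if it differs.
    """
    if not (candidate.isascii() and candidate.isdigit()):
        return False
    total = 0
    # Walk right to left; double every second digit, starting with the
    # digit immediately to the left of the check digit.
    for i, ch in enumerate(reversed(candidate)):
        d = int(ch)
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


print(luhn_valid("12345674"))  # regex match that also passes the checksum
print(luhn_valid("12345678"))  # regex match that fails the checksum
```

Both strings match \b\d{8}\b, but only the first passes the check, which is how a validation step can discard random 8-digit numbers that a regex alone would report.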
Step 3: Add Context with Keywords (Reduce False Positives)
On its own, an 8-digit regex will match any standalone 8-digit number, such as dates written as YYYYMMDD, serial numbers, and other 8-digit identifiers. You must add keywords to ensure accuracy.
- In the same Custom Data Type definition, look for the Keywords section.
- Add terms such as the following: MRN, Medical Record, Patient ID, Patient Number, Chart Number.
- Set the Proximity: Require one of these keywords to be within 50 to 100 characters of the RegEx match.
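The effect of keyword proximity can be sketched offline as follows. This is an approximation of the idea, not the product's implementation: how the console measures distance (for example, from which edge of the match) is an assumption here, and `mrns_near_keyword`, `PROXIMITY`, and the sample text are hypothetical.

```python
import re

MRN_RX = re.compile(r"\b\d{8}\b")
KEYWORD_RX = re.compile(
    r"\b(?:MRN|Medical Record|Patient ID|Patient Number|Chart Number)\b",
    re.IGNORECASE,
)
PROXIMITY = 50  # characters; illustrative value from the 50-100 range


def mrns_near_keyword(text: str, window: int = PROXIMITY) -> list[str]:
    """Keep only regex matches with a keyword within `window` characters."""
    keyword_spans = [m.span() for m in KEYWORD_RX.finditer(text)]
    hits = []
    for m in MRN_RX.finditer(text):
        start, end = m.span()
        for k_start, k_end in keyword_spans:
            # Gap between keyword and number, whichever comes first.
            gap = start - k_end if k_end <= start else k_start - end
            if 0 <= gap <= window:
                hits.append(m.group())
                break
    return hits


# Made-up sample: one number next to a keyword, one far from any keyword.
text = "Patient ID: 12345678 admitted. " + "x" * 200 + " Invoice 20240131 paid."
hits = mrns_near_keyword(text)
print(hits)
```

Here the date-like number 20240131 is dropped because no keyword appears within 50 characters of it, while the number following "Patient ID" is kept.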
Step 4: Test and Deploy
- Test: Use the "Test Regex" tool in the console (if available) or run a scan against a single test file containing known MRNs.
- Add to Policy: Go to your Scan Policy, navigate to Sensitive Data Types, and check the box for your new Custom MRN type.
- Run Scan: Execute the scan and review the Match Evidence to ensure it is picking up the MRNs correctly without too much noise.
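If you want a rough picture of how noisy a pattern is before running a scan, you can compare its matches against a list of MRNs you know are in a test file. The sketch below does this in Python as a stand-in for the console's matching; the `evaluate` helper, the sample text, and the known-MRN set are all made up for illustration.

```python
import re

MRN_RX = re.compile(r"\b\d{8}\b")


def evaluate(text: str, known_mrns: set[str]) -> dict[str, set[str]]:
    """Compare regex matches in `text` with the MRNs known to be present."""
    found = set(MRN_RX.findall(text))
    return {
        "true_positives": found & known_mrns,   # real MRNs the regex found
        "false_positives": found - known_mrns,  # matches that are not MRNs
        "missed": known_mrns - found,           # real MRNs the regex missed
    }


# Made-up test content and expected MRNs.
test_text = "MRN 12345678 seen on 20240131. Chart Number: 87654321. Old id 1234567."
report = evaluate(test_text, {"12345678", "87654321", "11112222"})
print(report)
```

A non-empty "false_positives" set suggests tightening the pattern or relying more on keywords, and a non-empty "missed" set suggests the pattern does not cover every MRN format in use.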
Best Practice: The "SDD" Upgrade
Once your regex is working, wrap it in a Sensitive Data Definition (SDD).
- Create a Sensitive Data Definition called "Confirmed PHI".
- Logic: (Custom MRN Regex) NEAR (Patient Name AnyFind).
- Why: This ensures that you only alert on files where an MRN is actually associated with a person, which matches HIPAA's notion of PHI as health information tied to an identifiable individual.
Summary
Use a Custom Regex Data Type with word boundaries (\b), add Keywords for context, and eventually combine it with AnyFinds in a Sensitive Data Definition for the highest possible accuracy.