RegEx Example: ABN Including Mod-89 Check

The following SearchDLL.cpp content finds Australian Business Numbers (ABNs) and validates them with a modulus-89 (mod-89) check.

An Australian Business Number (ABN) is a unique 11-digit identifier used to identify businesses to the government, suppliers, and the public. It consists of a 9-digit identifier plus 2 leading check digits.

The "mod-89 check" is an algorithm that verifies that the two leading digits are correctly derived from the following nine digits. It confirms that an ABN is well formed, which helps catch common data entry errors such as digit transposition.
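
To illustrate how the two check digits relate to the other nine digits, here is a minimal sketch that derives them. ComputeCheckDigits is a hypothetical helper, not part of SearchDLL.cpp. It is worked out from the validation rule described in this article (weights 3 to 19 apply to the nine identifier digits), not taken from official documentation.

```cpp
#include <string>

// Hypothetical helper (not part of SearchDLL.cpp): derive the two leading
// check digits for a 9-digit identifier so that the full 11-digit number
// passes the mod-89 check described in this article.
// Returns an empty string if the input is not exactly 9 digits.
std::wstring ComputeCheckDigits(const std::wstring& identifier)
{
    if (identifier.length() != 9)
        return L"";

    // Weights for positions 3 to 11 of the full ABN
    static const int weights[9] = { 3, 5, 7, 9, 11, 13, 15, 17, 19 };

    int sum = 0;
    for (size_t i = 0; i < 9; ++i)
    {
        wchar_t ch = identifier[i];
        if (ch < L'0' || ch > L'9')
            return L"";
        sum += static_cast<int>(ch - L'0') * weights[i];
    }

    // Let C be the two check digits read as one number. In the validation
    // sum they contribute (first digit - 1) * 10 + second digit = C - 10,
    // so C - 10 + sum must be a multiple of 89.
    // Note: when sum % 89 is 0, both 10 and 99 satisfy the check; this
    // sketch returns 10.
    int check = 10 + (89 - sum % 89) % 89;
    return std::to_wstring(check);
}
```

For the identifier 004085616, the weighted sum is 402, 402 % 89 is 46, and 10 + (89 - 46) gives the check digits 53, so the full number is 53004085616.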

SearchDLL.cpp uses the following RegEx:

pData->data =
    _T("(?i)"
    "(?:"
    "\\bABN\\b[^\\r\\n]*?(?<!\\d)[1-9]\\d\\s?\\d{3}\\s?\\d{3}\\s?\\d{3}(?!\\d)"
    "|(?<!\\d)[1-9]\\d\\s?\\d{3}\\s?\\d{3}\\s?\\d{3}(?!\\d)[^\\r\\n]*?\\bABN\\b"
    ")");

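A note on escaping: in C++ source, each regex backslash has to be escaped with a second backslash, so the regex token \d is written as \\d inside a string literal. The sketch below is an illustration only. It assumes a C++11 compiler, and the two function names are hypothetical. It shows the number part of the pattern written both ways, and both produce the same text.

```cpp
#include <string>

// Escaped form: every regex backslash is doubled for the C++ compiler.
std::wstring EscapedNumberPattern()
{
    return L"(?<!\\d)[1-9]\\d\\s?\\d{3}\\s?\\d{3}\\s?\\d{3}(?!\\d)";
}

// Raw-string form (C++11): the text between R"( and )" is taken literally,
// so the backslashes appear exactly as the regex engine will see them.
std::wstring RawNumberPattern()
{
    return LR"((?<!\d)[1-9]\d\s?\d{3}\s?\d{3}\s?\d{3}(?!\d))";
}
```

Either form hands the regex engine the same pattern text. The listing in this article uses the escaped form.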

In simple terms:

  • (?i) – Match is case-insensitive (abn, ABN, Abn, etc.).
  • It looks for a line that contains the word “ABN” and an ABN-shaped number (11 digits, first digit not 0).
  • The number can be written with or without spaces:
    • Example: 53004085616 or 53 004 085 616.
  • It allows “ABN” to appear either before or after the number on the same line:
    • ABN: 53 004 085 616
    • 53 004 085 616 (ABN)
  • (?<!\d) and (?!\d) around the number make sure it’s not part of a longer digit string.
  • [^\r\n]*? in the middle just means “any characters, but don’t cross a newline”, so the label and number have to be on the same line.

Then, the following code performs the Mod-89 check:

// Validate an 11-digit Australian Business Number (ABN) using the official mod-89 algorithm.
static bool IsValidABN(const std::wstring& digits)
{
    // ABN must be exactly 11 digits
    if (digits.length() != 11)
        return false;

    // Weights for each digit
    static const int weights[11] = { 10, 1, 3, 5, 7, 9, 11, 13, 15, 17, 19 };
    long long sum = 0;
    for (size_t i = 0; i < 11; ++i)
    {
        wchar_t ch = digits[i];
        if (ch < L'0' || ch > L'9')
            return false; // safety guard; should never happen after cleaning

        int d = static_cast<int>(ch - L'0');

        // For the first digit, subtract 1 as per the ABN rules
        if (i == 0)
            d -= 1;

        sum += static_cast<long long>(d) * weights[i];
    }

    // Valid ABNs have a sum that is exactly divisible by 89
    return (sum % 89) == 0;
}

In simple terms:
    • This function expects a string of exactly 11 digits (e.g. 53004085616).
    • If it’s not 11 digits, it’s immediately rejected.
    • Each position in the ABN has a fixed weight:
       Position:   1   2   3   4   5   6   7   8   9  10  11
       Weight:    10   1   3   5   7   9  11  13  15  17  19
    • The algorithm then:
      1. Converts each character to its numeric value d.
      2. For the first digit only, subtracts 1 (d = d - 1).
      3. Multiplies each digit by its corresponding weight.
      4. Adds all these products together into a running sum.
    • At the end, it checks:
      sum % 89 == 0
      • If the sum is divisible by 89, the ABN is valid → returns true.
      • If not, the ABN is invalid → returns false.
    • In the detector pipeline:
      • The RegEx finds a candidate ABN … 11-digit number.
      • Spaces, labels, and other non-digit characters are stripped (ExtractDigits) so that only the digits remain.
      • IsValidABN is called on those 11 digits.
      • If the mod-89 check fails, the match is rejected and never shown in the results.
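
The steps above can be traced by hand for 53004085616. The sketch below is an illustration only. AbnWeightedSum is a hypothetical helper that exposes the running sum so the arithmetic can be inspected. It uses the same weights and first-digit adjustment as IsValidABN.

```cpp
#include <string>

// Hypothetical helper: compute the weighted sum used by the mod-89 check.
// Returns false (and leaves sum at 0) if the input is not exactly 11 digits.
//
// Worked example for 53004085616:
//   digits (first minus 1):   4   3   0   0   4   0   8   5   6   1    6
//   weights:                 10   1   3   5   7   9  11  13  15  17   19
//   products:                40   3   0   0  28   0  88  65  90  17  114
//   sum = 445 = 5 * 89, so the remainder is 0 and the number passes.
bool AbnWeightedSum(const std::wstring& digits, long long& sum)
{
    static const int weights[11] = { 10, 1, 3, 5, 7, 9, 11, 13, 15, 17, 19 };

    sum = 0;
    if (digits.length() != 11)
        return false;

    long long total = 0;
    for (size_t i = 0; i < 11; ++i)
    {
        wchar_t ch = digits[i];
        if (ch < L'0' || ch > L'9')
            return false;

        int d = static_cast<int>(ch - L'0');
        if (i == 0)
            d -= 1; // first digit only

        total += static_cast<long long>(d) * weights[i];
    }

    sum = total;
    return true;
}
```

Changing only the last digit to give 53004085617 raises the sum to 464, which leaves a remainder of 19 when divided by 89, so that number would be rejected.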

Example SearchDLL.cpp File - Used to Discover and Validate ABNs

The SearchDLL.cpp code below is provided as an example.

Although it is tailored to find and validate 11-digit Australian Business Numbers (ABNs) using the official mod-89 algorithm, it can be customized for other purposes, such as discovering other data types.

// SearchDLL.cpp : Defines the exported functions for the Identity Finder client application.
//
#include "stdafx.h"
#include <tchar.h>
#include <string>
#include <vector>
#include <cwctype>
#include "SearchItemData.h"

/*
Define the custom name, result type, and icon index
CUSTOM_SEARCH_NAME - The custom name that will be displayed in the Identity Finder client and on the console.
RESULT_TYPE - The unique number between 12001 and 14000 that is mapped on the console to this custom search DLL
*/
#define CUSTOM_SEARCH_NAME _T("FindsABNincludingMod89check")
#define RESULT_TYPE 12548

using namespace std;

/*
Validate an 11-digit Australian Business Number (ABN) using the official mod-89 algorithm.

Algorithm:
- ABN is 11 digits.
- Subtract 1 from the first digit.
- Multiply each digit by its corresponding weight:
Weights: 10, 1, 3, 5, 7, 9, 11, 13, 15, 17, 19
- Sum the products and check: sum % 89 == 0
*/
static bool IsValidABN(const std::wstring& digits)
{
    // ABN must be exactly 11 digits
    if (digits.length() != 11)
        return false;

    static const int weights[11] = { 10, 1, 3, 5, 7, 9, 11, 13, 15, 17, 19 };

    long long sum = 0;
    for (size_t i = 0; i < 11; ++i)
    {
        wchar_t ch = digits[i];
        if (ch < L'0' || ch > L'9')
            return false; // safety guard

        int d = static_cast<int>(ch - L'0');

        // For the first digit, subtract 1
        if (i == 0)
            d -= 1;

        sum += static_cast<long long>(d) * weights[i];
    }

    return (sum % 89) == 0;
}

/*
Helper: extract only digit characters from a matched string
*/
static std::wstring ExtractDigits(const std::wstring& x)
{
    std::wstring digitsOnly;
    // Wide literal (L"...") so this also compiles when _T() expands to a narrow string
    const std::wstring allowed = L"0123456789";

    for (std::wstring::size_type i = 0; i < x.length(); ++i)
    {
        if (allowed.find(x.at(i)) != std::wstring::npos)
            digitsOnly += x.at(i);
    }
    return digitsOnly;
}

/*
Helper: does the raw match string contain "ABN" (case-insensitive)?
*/
static bool ContainsABNLabel(const std::wstring& x)
{
    std::wstring upper = x;
    for (size_t i = 0; i < upper.length(); ++i)
    {
        upper[i] = static_cast<wchar_t>(std::towupper(upper[i]));
    }
    return upper.find(L"ABN") != std::wstring::npos;
}

extern "C" __declspec(dllexport)
const TCHAR* GetDisplayName(void)
{
    return CUSTOM_SEARCH_NAME;
}

extern "C" __declspec(dllexport)
unsigned int GetResultType(void)
{
    return RESULT_TYPE;
}

extern "C" __declspec(dllexport)
void GetSearchItemData(SearchItemData*& pData)
{
    /*********************************************************/
    /****************** Begin Search Item ********************/

    /*
    Create a new Search Item - this line should not be edited
    */
    pData = new SearchItemData();

    /*
    The custom search name and result type will be set below
    based on the above settings - these lines should not be edited
    */
    pData->searchInfo.displayName = CUSTOM_SEARCH_NAME;
    pData->resultType = RESULT_TYPE;

    /*
    Specify the base regular expression or keyword on which to match

    We use a single REGEX that:
    - requires a same-line "ABN" label before or after the 11-digit
      Australian Business Number
    - prevents starting/ending the match inside a larger digit run: (?<!\d) / (?!\d)
    - keeps all content on the SAME LINE between label and number via [^\r\n]*?
    */

    pData->data =
        _T("(?i)"
        "(?:"
        "\\bABN\\b[^\\r\\n]*?(?<!\\d)[1-9]\\d\\s?\\d{3}\\s?\\d{3}\\s?\\d{3}(?!\\d)"
        "|(?<!\\d)[1-9]\\d\\s?\\d{3}\\s?\\d{3}\\s?\\d{3}(?!\\d)[^\\r\\n]*?\\bABN\\b"
        ")");

    /*
    For a regular expression, use type 1
    For a simple keyword, use type 2
    */
    pData->dataType = 1;

    /******************* End Search Item *********************/
    /*********************************************************/

    /*********************************************************/
    /****************** Dependencies (None) ******************/
    /*
    Intentionally left empty: all label logic is inside the regex itself.
    */

    /****************** End Dependencies *********************/
    /*********************************************************/

    return;
}

extern "C" __declspec(dllexport)
void DeleteSearchItemData(SearchItemData* pData)
{
    /*
    This function serves as the destructor for the search item
    and should not be edited
    */
    if (NULL != pData)
    {
        delete pData;
    }
}

extern "C" __declspec(dllexport)
bool DoTest(const std::wstring& x, const std::wstring* /*fileDataPtr*/)
{
    /*
    Additional validation layer:
    - If this looks like an ABN (11 digits + ABN label), enforce the mod-89 check.
    - If invalid, reject the match entirely.
    */
    std::wstring digitsOnly = ExtractDigits(x);

    if (digitsOnly.length() == 11 && ContainsABNLabel(x))
    {
        // ABN candidate: must pass checksum
        if (!IsValidABN(digitsOnly))
        {
            return false; // reject this match completely
        }
    }

    // Valid ABN, or a match that is not an 11-digit ABN candidate: accept the match
    return true;
}

extern "C" __declspec(dllexport)
bool DoTestEx(const std::wstring& x, const std::wstring* fileDataPtr, std::wstring::size_type /*location*/)
{
    /*
    Extended test: mirror DoTest so invalid ABNs are rejected
    no matter which path the engine uses.
    */
    return DoTest(x, fileDataPtr);
}

extern "C" __declspec(dllexport)
bool Clean(const std::wstring& x, std::wstring *&result)
{
    /*
    Cleaner: return only the digits of the match (strip spaces/tabs/labels/etc.)
    ABN validity is handled in DoTest/DoTestEx.
    */
    try
    {
        std::wstring digitsOnly = ExtractDigits(x);
        result = new std::wstring(digitsOnly);
    }
    catch(...)
    {
        return false;
    }
    return (result != NULL);
}

extern "C" __declspec(dllexport)
bool FreeCleanedResult(std::wstring *&result)
{
    /*
    The FreeCleanedResult function is used to free memory allocated
    during the clean function by the DLL.
    This function is automatically called and must not be altered.
    */
    bool ok = false;
    try
    {
        if (result != NULL)
        {
            delete result;
            result = NULL;
        }
        ok = true;
    }
    catch(...) { }
    return ok;
}
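
One case worth noting in the listing above: if other digits appear between the label and the number (for example "ABN 2024 53 004 085 616"), ExtractDigits returns more than 11 digits, so DoTest skips the checksum and accepts the match. The sketch below shows one possible way to isolate the 11 digits of the number itself. ExtractAbnCandidate is a hypothetical helper, not part of the product. It assumes the string passed in is the full RegEx match, so the number sits at the very start or the very end.

```cpp
#include <cwctype>
#include <string>

// Hypothetical helper (not part of the listing above): return the 11 digits
// of the ABN-shaped number at the start or end of a match, ignoring any
// other digits between the label and the number.
// Returns an empty string if 11 digits cannot be collected.
std::wstring ExtractAbnCandidate(const std::wstring& match)
{
    auto isDigit = [](wchar_t ch) { return ch >= L'0' && ch <= L'9'; };

    std::wstring digits;
    if (match.empty())
        return digits;

    if (isDigit(match.front()))
    {
        // Number-first form, e.g. "53 004 085 616 (ABN)": read forwards.
        for (wchar_t ch : match)
        {
            if (isDigit(ch))
                digits += ch;
            else if (!std::iswspace(ch))
                break; // reached the text after the number
            if (digits.length() == 11)
                break;
        }
    }
    else
    {
        // Label-first form, e.g. "ABN: 53 004 085 616": read backwards.
        for (auto it = match.rbegin(); it != match.rend(); ++it)
        {
            if (isDigit(*it))
                digits.insert(digits.begin(), *it);
            else if (!std::iswspace(*it))
                break; // reached the text before the number
            if (digits.length() == 11)
                break;
        }
    }

    return digits.length() == 11 ? digits : std::wstring();
}
```

If this approach suits your data, the result could be passed to IsValidABN in place of the output of ExtractDigits. Whether that is appropriate depends on how the engine hands matches to DoTest, which this sketch does not verify.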
