SORMAS-Project

Normalization of Automatic Processing for Name, Birthdate, and Address Matching

Open • markusmann-vg opened this issue 4 months ago • 0 comments

Problem Description

When processing new messages automatically, the system currently requires an exact match on name, birthdate, and address. Minor variations in accents, letter case, or spacing therefore cause unnecessary mismatches. To improve the success rate of automatic processing, the following changes should be implemented:

Proposed Change

Task: Normalize Name, Birthdate, and Address Matching:

Ignore Accents: Automatic processing should treat names as equivalent regardless of accents (e.g., "Élise" and "Elise").

Ignore Case: The matching process should be case-insensitive, so names like "DUPONT" and "Dupont" are treated as a match.

Ignore Spaces: The system should ignore spaces, whether they appear between words or at the beginning/end (e.g., "Marie Dupont" and "MarieDupont" should be considered the same).

Update the Automatic Matching Algorithm:

Implement normalization rules in the matching algorithm to accommodate variations in name, birthdate, and address.

Test and Validate: Ensure that the new normalization rules work as expected and do not introduce false positives in the matching process.
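The three rules above could be sketched in Java roughly as follows. This is only an illustration, not the existing SORMAS matching code: the class `MatchNormalizer` and its methods are hypothetical names, and the choice to treat empty or missing values as a non-match is an assumption made here to limit false positives.

```java
import java.text.Normalizer;
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper; the class and method names are illustrative, not part of SORMAS.
final class MatchNormalizer {

    // Unicode combining marks, e.g. the acute accent that NFD splits off from "É".
    private static final Pattern COMBINING_MARKS = Pattern.compile("\\p{M}+");

    // ASCII whitespace plus Unicode separators such as the non-breaking space.
    private static final Pattern WHITESPACE = Pattern.compile("[\\s\\p{Z}]+");

    private MatchNormalizer() {
    }

    // Returns a comparison key: accents removed, all spaces removed, lower-cased.
    static String normalize(String value) {
        if (value == null) {
            return "";
        }
        // NFD decomposes "É" into "E" followed by a combining accent.
        String decomposed = Normalizer.normalize(value, Normalizer.Form.NFD);
        String withoutAccents = COMBINING_MARKS.matcher(decomposed).replaceAll("");
        String withoutSpaces = WHITESPACE.matcher(withoutAccents).replaceAll("");
        // Locale.ROOT keeps the result independent of the server's default locale.
        return withoutSpaces.toLowerCase(Locale.ROOT);
    }

    // Assumption: two empty or missing values are NOT treated as a match.
    static boolean matches(String left, String right) {
        String a = normalize(left);
        String b = normalize(right);
        return !a.isEmpty() && a.equals(b);
    }
}
```

With this sketch, `MatchNormalizer.matches("Élise", "Elise")` and `MatchNormalizer.matches("Marie Dupont", "MarieDupont")` both return `true`. Letters that have no canonical decomposition, such as "ø" or "ł", are not folded to "o" or "l" by this approach and would need an explicit mapping if they should also match.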

Added Value/Benefit

Increased Match Rate: By allowing for minor variations (accents, case differences, and spaces), the system can match records that would otherwise be missed. This significantly improves the likelihood of correctly processing data even when user input is inconsistent.

Reduced Manual Intervention: Automating this process reduces the need for manual corrections and re-entry of data when minor variations are present. This decreases the workload for administrators and ensures that more cases can be processed without human involvement.

Acceptance Criteria

Names should be matched regardless of accents, letter case, or extra spaces. For example:

  • [ ] "Élise" should match "Elise".
  • [ ] "DUPONT" should match "Dupont".
  • [ ] "Marie Dupont" should match "MarieDupont".

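The matching process should accept these variations without introducing false positives: values that differ in more than accents, case, or spacing (for example, "Marie Dupont" and "Maria Dupont") should still not match.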

The system should successfully process new messages even if names or addresses have minor differences in spelling, casing, or spacing.
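One way to check these criteria at the record level is to compare a normalized key built from all three fields. The sketch below is hypothetical: `PersonMatchKey` and its fields are illustrative names rather than SORMAS types, and it assumes the birthdate is already parsed into a `LocalDate` and compared exactly, so only the name and address are folded.

```java
import java.text.Normalizer;
import java.time.LocalDate;
import java.util.Locale;
import java.util.Objects;

// Hypothetical value object; field names are illustrative, not taken from SORMAS.
final class PersonMatchKey {

    private final String name;
    private final LocalDate birthdate;
    private final String address;

    PersonMatchKey(String name, LocalDate birthdate, String address) {
        this.name = fold(name);
        this.birthdate = birthdate; // assumption: compared exactly, no tolerance
        this.address = fold(address);
    }

    // Removes accents and all whitespace, then lower-cases the result.
    private static String fold(String value) {
        if (value == null) {
            return "";
        }
        return Normalizer.normalize(value, Normalizer.Form.NFD)
                .replaceAll("\\p{M}+", "")
                .replaceAll("[\\s\\p{Z}]+", "")
                .toLowerCase(Locale.ROOT);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) {
            return true;
        }
        if (!(o instanceof PersonMatchKey)) {
            return false;
        }
        PersonMatchKey other = (PersonMatchKey) o;
        return name.equals(other.name)
                && Objects.equals(birthdate, other.birthdate)
                && address.equals(other.address);
    }

    @Override
    public int hashCode() {
        return Objects.hash(name, birthdate, address);
    }
}
```

Two keys built from "Élise DUPONT" and "elise dupont" with the same birthdate and an address that differs only in accents, case, or spacing compare as equal, while a one-day difference in birthdate keeps them apart. Because `hashCode` is consistent with `equals`, such keys could also be used for map-based lookups of existing persons.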

Implementation Details

No response

Mockups

No response

Additional Information

No response

markusmann-vg, Oct 18 '24 08:10