pydatastructs

feat: Implement Bitap Algorithm for approximate string matching

Open asmit27rai opened this issue 9 months ago • 1 comment

Description

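The Bitap algorithm (also known as the Shift-Or algorithm) is a bit-parallel algorithm for exact and approximate string matching. It tracks the match state of every prefix of the query as bits in an integer and updates all of them with a few bitwise operations per text character, which makes it well suited to fuzzy searches that tolerate a bounded number of errors (e.g., up to max_errors mismatches).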

This issue involves implementing the Bitap Algorithm in the pydatastructs.strings.algorithms module.

Proposed Implementation

  • Add a function bitap(text: str, query: str, max_errors: int = 0) -> DynamicOneDimensionalArray to the algorithms.py file.
  • The function should return the starting positions of all matches of the query in the text, allowing up to max_errors mismatches.
  • Include unit tests for the new function.
  • Update the __all__ list to include the bitap function.
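
To make the proposal concrete, here is a minimal sketch of the matching loop, not the final pydatastructs implementation. It assumes "errors" means character substitutions only (Hamming distance), uses the Shift-And formulation of Bitap, and returns a plain Python list instead of a DynamicOneDimensionalArray. The name bitap_sketch is hypothetical.

```python
def bitap_sketch(text, query, max_errors=0):
    """Return start indices where query matches text with at most
    max_errors substituted characters (illustrative sketch only)."""
    if max_errors < 0:
        raise ValueError("max_errors must be non-negative")
    m = len(query)
    if m == 0:
        return []  # assumption: an empty query yields no matches

    # masks[ch] has bit i set when query[i] == ch.
    masks = {}
    for i, ch in enumerate(query):
        masks[ch] = masks.get(ch, 0) | (1 << i)

    full = (1 << m) - 1      # keeps every state within m bits
    accept = 1 << (m - 1)    # bit set when the whole query has matched

    # states[d] has bit i set when query[:i + 1] matches the text ending
    # at the current position with at most d mismatches.
    states = [0] * (max_errors + 1)
    positions = []

    for j, ch in enumerate(text):
        char_mask = masks.get(ch, 0)
        prev_old = 0  # value of states[d - 1] before this step
        for d in range(max_errors + 1):
            old = states[d]
            # Extend each prefix by one exactly matching character.
            new = ((old << 1) | 1) & char_mask
            if d > 0:
                # Or extend a prefix that had d - 1 mismatches and
                # spend one more mismatch on the current character.
                new |= ((prev_old << 1) | 1) & full
            states[d] = new
            prev_old = old
        if states[max_errors] & accept:
            positions.append(j - m + 1)
    return positions
```

With these assumptions, bitap_sketch("abxabc", "abc", 1) returns [0, 3]: the window "abx" differs from the query in one character and "abc" matches exactly. Supporting insertions and deletions would need the extra transitions of the Wu-Manber extension, which this sketch leaves out.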

asmit27rai, Mar 03 '25 16:03

@Kishan-Ved I am working on this issue.

asmit27rai, Mar 03 '25 16:03