Reference regex not matching keyword at beginning of line [was: External reference not found]
I'm trying to use the new references behaviour as described in the wiki: https://doorstop.readthedocs.io/en/latest/reference/item/#references-new-array-behavior
My requirement is:
active: true
derived: false
header: ''
level: 1.1.1
links:
- PRJ02: ZhhHI6cybMRVdVtgPToHqiAPERFx2O6FtdTjT4JIV_4=
normative: true
ref: ''
references:
- keyword: '#F001'
  path: F_allocation.csv
  type: file
reviewed: Ge91zi0XKSRDFfr7HqBE1EPQnSQZsiXz0aC0GzTEUBU=
safety: 0
text: |
  SOMETHING
The file F_allocation.csv is in the git repository root (I have tried both / and \ for relative paths).
Running doorstop raises no errors, and draws the requirement tree.
However, running doorstop publish F F.md crashes:
$ doorstop publish F F.md -v
looking for documents in C:\Users\lz\Documents\git\abms...
found document: PRJ
found document: FSR
found document: VS_FSR
found document: FR
found document: SG
found document: VS_SG
found document: F
found document: VS_F
building tree...
root of the tree: PRJ
added to tree: F
added to tree: VS_F
added to tree: FR
added to tree: SG
added to tree: VS_SG
added to tree: FSR
added to tree: VS_FSR
built tree: PRJ <- [ F <- [ FR, SG <- [ FSR, VS_FSR ], VS_SG ], VS_F ]
Deleting contents of assets directory C:\Users\lz\Documents\git\abms\assets
Copying c:\users\lz\appdata\local\programs\python\python37\lib\site-packages\doorstop\core\files\assets to C:\Users\lz\Documents\git\abms\assets
publishing to C:\Users\lz\Documents\git\abms\F.md...
loading document F's items...
Traceback (most recent call last):
  File "c:\users\lz\appdata\local\programs\python\python37\lib\runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "c:\users\lz\appdata\local\programs\python\python37\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\lz\AppData\Local\Programs\Python\Python37\Scripts\doorstop.exe\__main__.py", line 7, in <module>
  File "c:\users\lz\appdata\local\programs\python\python37\lib\site-packages\doorstop\cli\main.py", line 176, in main
    success = function(args, os.getcwd(), parser.error)
  File "c:\users\lz\appdata\local\programs\python\python37\lib\site-packages\doorstop\cli\commands.py", line 587, in run_publish
    document, path, ext, template=args.template, **kwargs
  File "c:\users\lz\appdata\local\programs\python\python37\lib\site-packages\doorstop\core\publisher.py", line 100, in publish
    common.write_lines(lines, path2)
  File "c:\users\lz\appdata\local\programs\python\python37\lib\site-packages\doorstop\common.py", line 157, in write_lines
    for line in lines:
  File "c:\users\lz\appdata\local\programs\python\python37\lib\site-packages\doorstop\core\publisher.py", line 240, in publish_lines
    yield from gen(obj, **kwargs)
  File "c:\users\lz\appdata\local\programs\python\python37\lib\site-packages\doorstop\core\publisher.py", line 382, in _lines_markdown
    yield _format_md_references(item)
  File "c:\users\lz\appdata\local\programs\python\python37\lib\site-packages\doorstop\core\publisher.py", line 474, in _format_md_references
    references = item.find_references()
  File "c:\users\lz\appdata\local\programs\python\python37\lib\site-packages\doorstop\core\item.py", line 90, in wrapped
    return func(self, *args, **kwargs)
  File "c:\users\lz\appdata\local\programs\python\python37\lib\site-packages\doorstop\core\item.py", line 668, in find_references
    path, self.root, self.tree, path, keyword
  File "c:\users\lz\appdata\local\programs\python\python37\lib\site-packages\doorstop\core\reference_finder.py", line 101, in find_file_reference
    raise DoorstopError(msg)
doorstop.common.DoorstopError: external reference not found: F_allocation.csv
I found the root issue. The CSV file had the lines arranged like this:
Requirement reference,Requirement version,Design element
#F001,f1cf87320d9ddf5da0c58212c0fa4c4b3cfd84ad,root_system
But this does not match the regex defined in doorstop\core\reference_finder.py:
pattern = r"(\b|\W){}(\b|\W)".format(re.escape(keyword))
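The keyword starts with a non-word character (#), so \b cannot match at the very start of a line, and \W needs an actual character to consume. In effect, the regex requires some character before the keyword. See example: https://regex101.com/r/2LryAs/2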
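The mismatch can be reproduced outside doorstop with a few lines of Python. This is only a sketch: it assumes the file is searched one line at a time with re.search, and the second sample line (keyword in the second column) is made up for comparison.

```python
import re

keyword = "#F001"
# Pattern copied from the line quoted above (doorstop/core/reference_finder.py).
pattern = r"(\b|\W){}(\b|\W)".format(re.escape(keyword))
regex = re.compile(pattern)

# First line mirrors the CSV row from this report; the second is a
# hypothetical row with the keyword moved to the second column.
line_at_start = "#F001,f1cf87320d9ddf5da0c58212c0fa4c4b3cfd84ad,root_system"
line_in_middle = "root_system,#F001,f1cf87320d9ddf5da0c58212c0fa4c4b3cfd84ad"

# Assumption: each line is searched on its own, so nothing precedes the
# keyword on the first line.
match_at_start = regex.search(line_at_start)
match_in_middle = regex.search(line_in_middle)

print(match_at_start)   # no match object when the keyword opens the line
print(match_in_middle)  # a match object when a comma precedes the keyword
```

The first search finds nothing, while the second succeeds because \W consumes the comma in front of the keyword.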
A correct regex might be this one, with a quantifier:
pattern = r"(\b|\W)*{}(\b|\W)".format(re.escape(keyword))
Example: https://regex101.com/r/AOYY4E/1
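A quick way to check the proposed pattern locally, as a sketch rather than a tested patch for doorstop. The second and third sample lines are made up for comparison, and the variable names are only illustrative.

```python
import re

keyword = "#F001"
# Proposed pattern from this report, with the quantifier on the leading group.
proposed = re.compile(r"(\b|\W)*{}(\b|\W)".format(re.escape(keyword)))

# Row from the CSV in this report: keyword at the beginning of the line.
at_start = proposed.search("#F001,f1cf87320d9ddf5da0c58212c0fa4c4b3cfd84ad,root_system")
# Hypothetical row with the keyword in the second column.
in_middle = proposed.search("root_system,#F001")
# Hypothetical longer identifier that should not count as a hit for #F001.
longer_id = proposed.search("#F0012,root_system")

print(at_start is not None)
print(in_middle is not None)
print(longer_id is not None)
```

With this pattern the keyword is found both at the start and in the middle of a line, and the trailing group still rejects the longer identifier. Note that with the * quantifier the leading group may match zero times, so it no longer constrains what comes before the keyword.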
As a workaround, I'm rearranging the CSV columns so that the keyword appears in the second column (not at the beginning of the line). But I believe this is a bug, and the documentation does not mention that the keyword must not be at the beginning of a line.
I'm editing the bug title since the original one was misleading.