DAWG
DAWG copied to clipboard
Fix cython build error and expose three more methods
I would like to add a fix for this issue: https://github.com/pytries/DAWG/issues/31 Also I would like to expose 3 more methods, they are has_value(), root() and follow(). They will be useful if we want to scan a string then highlight every matched keywords, for example:
Target string: "One two three!" List of keywords: ["one", "three"]
import dawg
import re
targetedStr = 'one two three!'
targetLen = len(targetedStr)
dic = dawg.DAWG(['one', 'three'])
index = dic.root()
hasMore = False
reportedStr = ""
bufferedStr = ""
isContainingKeywords = False
for i, chr in enumerate(targetedStr):
willRedoInNextLoop = True
while (willRedoInNextLoop):
willRedoInNextLoop = False
bufferedStr += targetedStr[i]
hasMore, index = dic.follow(chr, index)
isEgdeCase = (hasMore == True and i + 1 >= targetLen)
if (hasMore == False or isEgdeCase == True):
if (dic.has_value(index)):
# Keyword detected!
keyword = bufferedStr if isEgdeCase else bufferedStr[0:(
len(bufferedStr) - 1)]
reportedStr += "{" + keyword + "}"
isContainingKeywords = True
if not isEgdeCase:
willRedoInNextLoop = True
else:
# Not a keyword
reportedStr += bufferedStr
bufferedStr = ""
index = dic.root()
if (isContainingKeywords == True):
reportedStr += bufferedStr
print(reportedStr)
Output:
{one} two {three}!
HI.
I'm trying to compile your branch of DAWG with python3.7 and Cython 0.27.3.
sudo python3.7 -m pip install git+https://github.com/Meigyoku-Thmn/DAWG --upgrade
Can u guide me how to compile? Thank you.
Collecting git+https://github.com/Meigyoku-Thmn/DAWG
Cloning https://github.com/Meigyoku-Thmn/DAWG to /tmp/pip-req-build-76e2ndpc
running install
running build
running build_ext
building 'dawg' extension
creating build
creating build/temp.linux-x86_64-3.7
creating build/temp.linux-x86_64-3.7/src
creating build/temp.linux-x86_64-3.7/lib
creating build/temp.linux-x86_64-3.7/lib/b64
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -Ilib -I/usr/local/include/python3.7m -c src/_dawg.cpp -o build/temp.linux-x86_64-3.7/src/_dawg.o
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -Ilib -I/usr/local/include/python3.7m -c src/iostream.cpp -o build/temp.linux-x86_64-3.7/src/iostream.o
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -Ilib -I/usr/local/include/python3.7m -c src/_dictionary.cpp -o build/temp.linux-x86_64-3.7/src/_dictionary.o
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -Ilib -I/usr/local/include/python3.7m -c src/_completer.cpp -o build/temp.linux-x86_64-3.7/src/_completer.o
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -Ilib -I/usr/local/include/python3.7m -c src/_guide_unit.cpp -o build/temp.linux-x86_64-3.7/src/_guide_unit.o
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -Ilib -I/usr/local/include/python3.7m -c src/_guide.cpp -o build/temp.linux-x86_64-3.7/src/_guide.o
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -Ilib -I/usr/local/include/python3.7m -c src/b64_decode.cpp -o build/temp.linux-x86_64-3.7/src/b64_decode.o
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -Ilib -I/usr/local/include/python3.7m -c src/dawg.cpp -o build/temp.linux-x86_64-3.7/src/dawg.o
In file included from src/dawg.cpp:266:0:
src/../lib/dawgdic/dictionary-builder.h: In member function ‘bool dawgdic::DictionaryBuilder::BuildDictionary(dawgdic::BaseType, dawgdic::BaseType)’:
src/../lib/dawgdic/dictionary-builder.h:138:5: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation]
if (dawg_.is_merging(dawg_child_index))
^~
src/../lib/dawgdic/dictionary-builder.h:139:53: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘if’
link_table_.Insert(dawg_child_index, offset); {
^
src/dawg.cpp: In function ‘PyObject* __pyx_f_4dawg_9BytesDAWG_items(__pyx_obj_4dawg_BytesDAWG*, int, __pyx_opt_args_4dawg_9BytesDAWG_items*)’:
src/dawg.cpp:11011:37: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (__pyx_t_10 = 0; __pyx_t_10 < __pyx_t_9; __pyx_t_10+=1) {
~~~~~~~~~~~^~~~~~~~~~~
src/dawg.cpp: In function ‘PyObject* __pyx_gb_4dawg_9BytesDAWG_24generator2(__pyx_GeneratorObject*, PyObject*)’:
src/dawg.cpp:11485:35: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (__pyx_t_6 = 0; __pyx_t_6 < __pyx_t_5; __pyx_t_6+=1) {
~~~~~~~~~~^~~~~~~~~~~
src/dawg.cpp: In function ‘PyObject* __pyx_f_4dawg_9BytesDAWG_keys(__pyx_obj_4dawg_BytesDAWG*, int, __pyx_opt_args_4dawg_9BytesDAWG_keys*)’:
src/dawg.cpp:11814:37: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (__pyx_t_10 = 0; __pyx_t_10 < __pyx_t_9; __pyx_t_10+=1) {
~~~~~~~~~~~^~~~~~~~~~~
src/dawg.cpp: In function ‘PyObject* __pyx_gb_4dawg_9BytesDAWG_29generator3(__pyx_GeneratorObject*, PyObject*)’:
src/dawg.cpp:12222:35: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
for (__pyx_t_6 = 0; __pyx_t_6 < __pyx_t_5; __pyx_t_6+=1) {
~~~~~~~~~~^~~~~~~~~~~
src/dawg.cpp: In function ‘int __Pyx_GetException(PyObject**, PyObject**, PyObject**)’:
src/dawg.cpp:21523:24: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’; did you mean ‘curexc_type’?
tmp_type = tstate->exc_type;
^~~~~~~~
curexc_type
src/dawg.cpp:21524:25: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’; did you mean ‘curexc_value’?
tmp_value = tstate->exc_value;
^~~~~~~~~
curexc_value
src/dawg.cpp:21525:22: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
tmp_tb = tstate->exc_traceback;
^~~~~~~~~~~~~
curexc_traceback
src/dawg.cpp:21526:13: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’; did you mean ‘curexc_type’?
tstate->exc_type = local_type;
^~~~~~~~
curexc_type
src/dawg.cpp:21527:13: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’; did you mean ‘curexc_value’?
tstate->exc_value = local_value;
^~~~~~~~~
curexc_value
src/dawg.cpp:21528:13: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
tstate->exc_traceback = local_tb;
^~~~~~~~~~~~~
curexc_traceback
src/dawg.cpp: In function ‘void __Pyx_ExceptionSwap(PyObject**, PyObject**, PyObject**)’:
src/dawg.cpp:21550:24: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’; did you mean ‘curexc_type’?
tmp_type = tstate->exc_type;
^~~~~~~~
curexc_type
src/dawg.cpp:21551:25: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’; did you mean ‘curexc_value’?
tmp_value = tstate->exc_value;
^~~~~~~~~
curexc_value
src/dawg.cpp:21552:22: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
tmp_tb = tstate->exc_traceback;
^~~~~~~~~~~~~
curexc_traceback
src/dawg.cpp:21553:13: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’; did you mean ‘curexc_type’?
tstate->exc_type = *type;
^~~~~~~~
curexc_type
src/dawg.cpp:21554:13: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’; did you mean ‘curexc_value’?
tstate->exc_value = *value;
^~~~~~~~~
curexc_value
src/dawg.cpp:21555:13: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
tstate->exc_traceback = *tb;
^~~~~~~~~~~~~
curexc_traceback
src/dawg.cpp: In function ‘void __Pyx_ExceptionSave(PyObject**, PyObject**, PyObject**)’:
src/dawg.cpp:21568:21: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’; did you mean ‘curexc_type’?
*type = tstate->exc_type;
^~~~~~~~
curexc_type
src/dawg.cpp:21569:22: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’; did you mean ‘curexc_value’?
*value = tstate->exc_value;
^~~~~~~~~
curexc_value
src/dawg.cpp:21570:19: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
*tb = tstate->exc_traceback;
^~~~~~~~~~~~~
curexc_traceback
src/dawg.cpp: In function ‘void __Pyx_ExceptionReset(PyObject*, PyObject*, PyObject*)’:
src/dawg.cpp:21582:24: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’; did you mean ‘curexc_type’?
tmp_type = tstate->exc_type;
^~~~~~~~
curexc_type
src/dawg.cpp:21583:25: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’; did you mean ‘curexc_value’?
tmp_value = tstate->exc_value;
^~~~~~~~~
curexc_value
src/dawg.cpp:21584:22: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
tmp_tb = tstate->exc_traceback;
^~~~~~~~~~~~~
curexc_traceback
src/dawg.cpp:21585:13: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_type’; did you mean ‘curexc_type’?
tstate->exc_type = type;
^~~~~~~~
curexc_type
src/dawg.cpp:21586:13: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_value’; did you mean ‘curexc_value’?
tstate->exc_value = value;
^~~~~~~~~
curexc_value
src/dawg.cpp:21587:13: error: ‘PyThreadState {aka struct _ts}’ has no member named ‘exc_traceback’; did you mean ‘curexc_traceback’?
tstate->exc_traceback = tb;
^~~~~~~~~~~~~
curexc_traceback
error: command 'gcc' failed with exit status 1
----------------------------------------
Can't roll back DAWG; was not uninstalled
Command "/usr/local/bin/python3.7 -u -c "import setuptools, tokenize;__file__='/tmp/pip-req-build-76e2ndpc/setup.py';f=getattr(tokenize, 'open', open)(__file__);code=f.read().replace('\r\n', '\n');f.close();exec(compile(code, __file__, 'exec'))" install --record /tmp/pip-record-6iiqpsev/install-record.txt --single-version-externally-managed --compile" failed with error code 1 in /tmp/pip-req-build-76e2ndpc/