link-grammar
link-grammar copied to clipboard
Dictionary definitions for user API
dictionary_lookup_list()
and dictionary_lookup_wild()
are part of the API, and words that they return, if subscripted, have SUBSCRIPT_MARK
in them. However, this symbol is currently not part of the API.
Similarly, returned parses may include LEFT-WORD
and RIGHT-WORD
, which may need to be handled by the user.
These symbols are defined in dict-defines.h
, which is not part of the user API.
I encountered this problem when I added the rest of the API functions (those that are in dict-api.h
and dict-structures.h
) to the Python bindings. When I wanted to test dictionary_lookup_list()
I found that SUBSCRIPT_MARK
needs to be hard-coded.
I already had such a problem in link-generator
:
https://github.com/opencog/link-grammar/blob/5cb9e68e553b6ba39f17b021cdd13b6c6105a414/link-parser/generator-utilities.c#L13-L15
Possible solutions:
- Continue to hard-code these constants.
- Provide a public
.h
file with definitions for constants that are used by the API. The filedict-api.h
is not appropriate unless we would eventually like to use another include file for non-dictionary constants. It also contains irrelevant (to the API) definitions. - Put the constants in the public API file that contains functions that use them, with a priority to
link-includes.h
. - Implement a new API call
linkgrammar_get_library_definition(const char *)
, when the argument may be strings likeleft-wall
, 'max-word`, etc. But I think this is less desirable than (3).
I propose to implement (3). It is already implemented in link-includes.h
for the lg_error_severity
and ConstituentDisplayStyle
enums, and also for LG_PANIC_DISJUNCT_COST
. Similarly, LEFT_WALL_WORD
and RIGHT_WALL_WORD
can be moved to it. The SUBSCRIPT_*
definitions can be moved to dict-api.h
. It is not clear to me what to do with constants like MAX_WORD
.
However, if (3) is implemented, there will be a need to include link-includes.h
in files that need e.g. `LEFT_WORD
why not just make dict-common/dict-defines.h
part of the public API? That is, just install it, and let users use it.
why not just make
dict-common/dict-defines.h
part of the public API? That is, just install it, and let users use it.
It currently also contains internal-only definitions:
static const double UNINITIALIZED_MAX_DISJUNCT_COST = -10000.0;
static const double DEFAULT_MAX_DISJUNCT_COST = 2.7;
/* We need some of these as literal strings. */
#define LG_DISJUNCT_COST "max-disjunct-cost"
#define LG_DICTIONARY_VERSION_NUMBER "dictionary-version-number"
#define LG_DICTIONARY_LOCALE "dictionary-locale"
However, I guess an added comment can just indicate that (or maybe they can be moved to dict-common/dict-common.h
).
In any case, I will include its move to the public API in the PR I am preparing (started as adding lg_exp_resolve()
, but got extended, due to the need to test this function, to adding all the rest of the API functions to Python and adding tests).
Yes, the internal-only definitions should be moved to dict-common.h
! That would be perfect,
In order to reduce the size of the said PR, I will first send a PR for moving dict-defines.h
to the public API.
There is another definition that is part of the user API but is not exposed: WILDCARD_WORD.
I would like to move it from tok-structures.h
to dict-defines.h
, even though it is only dictionary-related (and not a dictionary definition proper).
It may also be a good idea to expose MAX_SENTENCE
. However, it is not dict-related.
However, it is related to the max. effective value for word-count arguments of API calls that appear in link-includes.h
(parse_options_set_short_length()
, parse_options_set_*_null_count(()
, functions with WordIdx
parameter).
So can I just add it to link-includes.h
?
Yes, MAX_SENTENCE
belongs in link-includes.h