Termset Expansion Macros
NLPQL supports a set of macros for termset generation. The macros provide a compact syntax for representing lists of synonyms and lexical variants (plurals and verb inflections). The macros also support the concept of a “namespace”, so that terms can be generated from different sources.
The use of termset expansion macros is optional. They are provided purely for convenience, as a means to generate and suggest additional synonyms.
Syntax
The macro syntax is namespace.function(args), where the namespace is either Clarity or OHDSI. The argument is either a single term in double quotes or a comma-separated list of terms surrounded by brackets:
namespace.function("term")
namespace.function(["term1", "term2", ..., "termN"])
If the namespace is omitted, it defaults to Clarity. The supported macros are:
Macro | Meaning |
---|---|
Clarity.Synonyms | Generate a list of synonyms from WordNet |
Clarity.Plurals | Generate a list of plural forms |
Clarity.VerbInflections | Generate inflections for the verb in base form |
OHDSI.Synonyms | Generate a list of OHDSI synonyms for the concept |
OHDSI.Ancestors | Generate all OHDSI ancestor concepts |
OHDSI.Descendants | Generate all OHDSI descendant concepts |
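As a rough illustration of the syntax described above, the following Python sketch parses a single, non-nested macro call into its namespace, function, and term list. It is not ClarityNLP's parser: the function name parse_macro, the regular expression, and the use of JSON to read the argument are all assumptions made for this example. The set of known macros is copied from the table.

```python
import json
import re

# Macros listed in the table above, as (namespace, function) pairs.
KNOWN_MACROS = {
    ("Clarity", "Synonyms"), ("Clarity", "Plurals"), ("Clarity", "VerbInflections"),
    ("OHDSI", "Synonyms"), ("OHDSI", "Ancestors"), ("OHDSI", "Descendants"),
}

# Optional "namespace." prefix, then "function(args)".
MACRO_RE = re.compile(r'^\s*(?:(\w+)\.)?(\w+)\(\s*(.*?)\s*\)\s*$', re.DOTALL)

def parse_macro(text):
    """Parse one non-nested macro call (hypothetical helper, for illustration)."""
    match = MACRO_RE.match(text)
    if match is None:
        raise ValueError(f"not a macro call: {text!r}")
    namespace, function, raw_args = match.groups()
    namespace = namespace or "Clarity"  # an omitted namespace defaults to Clarity
    if (namespace, function) not in KNOWN_MACROS:
        raise ValueError(f"unknown macro: {namespace}.{function}")
    # A double-quoted term or a bracketed list of them happens to be valid JSON.
    args = json.loads(raw_args)
    terms = [args] if isinstance(args, str) else list(args)
    if not all(isinstance(term, str) for term in terms):
        raise ValueError("macro arguments must be double-quoted terms")
    return namespace, function, terms

print(parse_macro('Synonyms("heart attack")'))
print(parse_macro('OHDSI.Descendants(["fever", "cough"])'))
```

The sketch rejects nested macro calls, because the inner call is not a quoted term.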
The synonym finder examines the macro argument(s) and attempts to find the nouns, adjectives, and adverbs. It generates synonyms for each one that it finds and returns the Cartesian product [1] of all possibilities. This process can cause a combinatorial explosion in the number of results. To illustrate, consider this example:
The human walks the pet.
If the synonyms for human are man, woman, boy, girl and the synonyms for pet are dog, cat, then 4*2 = 8 results will be generated, in addition to the original:
The human walks the pet.
The man walks the dog.
The woman walks the dog.
The boy walks the dog.
The girl walks the dog.
The man walks the cat.
The woman walks the cat.
The boy walks the cat.
The girl walks the cat.
Expanding terms with many synonyms could generate hundreds or perhaps thousands of result strings. We therefore recommend using synonym generation with caution and limiting it to single terms or short strings.
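The growth in the number of results can be sketched in Python with itertools.product. This is only an illustration of the Cartesian product idea, not the ClarityNLP synonym finder: the synonym lists are the ones from the example above, and the helper names (options_per_token, count_expansions, expand) are made up for this sketch.

```python
from itertools import product
from math import prod

def options_per_token(tokens, synonyms):
    """For each token, list its substitutes; tokens without synonyms stay as-is."""
    return [synonyms.get(token, [token]) for token in tokens]

def count_expansions(tokens, synonyms):
    """Number of generated strings, computed without building them."""
    return prod(len(options) for options in options_per_token(tokens, synonyms))

def expand(tokens, synonyms):
    """Generate every combination of substitutes (the Cartesian product)."""
    combos = product(*options_per_token(tokens, synonyms))
    return [" ".join(combo) + "." for combo in combos]

synonyms = {
    "human": ["man", "woman", "boy", "girl"],
    "pet": ["dog", "cat"],
}
tokens = ["The", "human", "walks", "the", "pet"]

print(count_expansions(tokens, synonyms))  # 4 * 2 combinations
for sentence in expand(tokens, synonyms):
    print(sentence)
```

Because the count is a product of the per-term list sizes, checking it before generating anything is a cheap way to spot an expansion that would be too large to review by hand.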
Both single and multiword terms can be included in a macro, and a macro can be applied to only selected terms in a list:
Synonyms(["heart", "heart attack", "heart disease"])
"heart", Synonyms("heart attack"), "heart disease",
IMPORTANT NOTE: the VerbInflections macro requires that the verb be given in base form (also called “raw infinitive” form, “dictionary” form, or “bare” form). This is because it is not possible to unambiguously determine the base form of a verb from an arbitrary inflection, and the ClarityNLP verb inflector requires the base form as input. See the documentation for the verb inflector for more on this topic.
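The ambiguity mentioned in the note can be seen with a few English verbs. The Python sketch below uses a tiny hand-written inflection table (illustrative data, not taken from ClarityNLP) and a hypothetical helper, candidate_bases, that looks up every base form that could have produced a given word.

```python
# Tiny hand-written table for illustration only: base form -> inflected forms.
INFLECTIONS = {
    "find": ["finds", "finding", "found"],
    "found": ["founds", "founding", "founded"],
    "see": ["sees", "seeing", "saw", "seen"],
    "saw": ["saws", "sawing", "sawed", "sawn"],
}

def candidate_bases(word):
    """Return every base form in the table that could have produced `word`."""
    bases = set()
    for base, forms in INFLECTIONS.items():
        if word == base or word in forms:
            bases.add(base)
    return sorted(bases)

print(candidate_bases("found"))   # past tense of "find", or the verb "found"
print(candidate_bases("saw"))     # past tense of "see", or the verb "saw"
print(candidate_bases("seeing"))  # only one candidate
```

A word such as “found” maps back to more than one base form, which is why the macro expects the base form to be supplied directly.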
Macro Nesting
Macros can also be nested:
Clarity.LexicalVariants(OHDSI.Synonyms(["myocardial infarction"]))
Plurals(Synonyms("neoplasm"))
The nesting depth is limited to two, as these examples illustrate.
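One way to picture nested expansion is inner-first evaluation with a depth check. The Python sketch below is an assumption-laden illustration, not the ClarityNLP implementation: a macro call is modeled as a (name, argument) tuple, the expanders are toy stand-ins with made-up data, and it is assumed that each expander returns the original terms along with the generated ones.

```python
MAX_DEPTH = 2  # nesting limit stated above

# Toy data and expanders, hypothetical stand-ins for the real macros.
TOY_SYNONYMS = {"neoplasm": ["tumor"]}

def toy_synonyms(terms):
    """Return each term followed by its toy synonyms."""
    expanded = []
    for term in terms:
        expanded.append(term)
        expanded.extend(TOY_SYNONYMS.get(term, []))
    return expanded

def toy_plurals(terms):
    """Return the terms plus naive plurals formed by appending 's'."""
    return list(terms) + [term + "s" for term in terms]

EXPANDERS = {"Synonyms": toy_synonyms, "Plurals": toy_plurals}

def evaluate(call, expanders, depth=1):
    """Evaluate a (name, argument) call; the argument is a term list or another call."""
    if depth > MAX_DEPTH:
        raise ValueError(f"macro nesting deeper than {MAX_DEPTH} is not supported")
    name, argument = call
    if isinstance(argument, tuple):
        # Nested macro: expand the inner call first, then feed its terms outward.
        terms = evaluate(argument, expanders, depth + 1)
    else:
        terms = list(argument)
    return expanders[name](terms)

# Models Plurals(Synonyms("neoplasm")) with the toy expanders.
print(evaluate(("Plurals", ("Synonyms", ["neoplasm"])), EXPANDERS))
```

With real expanders the inner result can already be long, so the outer macro multiplies the size of the list that needs review.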
API
The API endpoint nlpql_expander allows users to view the results of macro expansion. For instance, to expand macros in the NLPQL file macros.nlpql, HTTP POST the file to the nlpql_expander API endpoint with this cURL [2] command:
curl -i -X POST http://localhost:5000/nlpql_expander -H "Content-Type: text/plain" --data-binary "@macros.nlpql"
Another HTTP client, such as Postman [3], could also be used to POST the file.
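The same request can also be sent from Python using only the standard library. The sketch below mirrors the cURL command above (same URL, same Content-Type header, raw file bytes as the body). The helper names build_expander_request and expand_file are made up for this example, and the response is simply returned as text because its format is not described here.

```python
import urllib.request

def build_expander_request(nlpql_bytes, base_url="http://localhost:5000"):
    """Build the POST request for the nlpql_expander endpoint without sending it."""
    return urllib.request.Request(
        url=f"{base_url}/nlpql_expander",
        data=nlpql_bytes,
        headers={"Content-Type": "text/plain"},
        method="POST",
    )

def expand_file(path, base_url="http://localhost:5000"):
    """POST an NLPQL file and return the response body as text."""
    # Read raw bytes so the body matches what curl --data-binary would send.
    with open(path, "rb") as nlpql_file:
        request = build_expander_request(nlpql_file.read(), base_url)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")

# Usage, assuming a ClarityNLP instance is listening on localhost:5000:
#     print(expand_file("macros.nlpql"))
```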
Examples
Here is an example that illustrates the use of the NLPQL macros.
Consider this termset for symptoms related to influenza:
termset FluTermset: [
"coughing",
OHDSI.Synonyms("fever"),
Synonyms("body ache"),
VerbInflections("have fever"),
];
After macro expansion, the termset becomes:
termset FluTermset: [
"coughing",
"febrile", "fever", "fever (finding)", "pyrexia", "pyrexial",
"body ache", "body aching", ... "torso aching", "trunk ache", "trunk aching",
"had fever", "has fever", "have fever", "having fever",
];
Some synonyms for “body ache” have been omitted. The result will require editing to remove irrelevant synonyms. One could use the macros as part of an iterative development process for termsets: the macros generate initial lists of terms, which are then pruned and refined.