Rule Materialization ==================== The ``c_clause.RulesHandler`` can be used to calculate the materialization of given input rules and to calculate their statistics, that is, the support and the number of predictions (body groundings in data). Note that the rules do not need to be loaded with the loader but are directly passed as handler input. The loader only needs to load data. The materialization will be based on the **data** argument of the loader. .. code-block:: python from c_clause import Loader from clause import Options data = [ ("anna", "livesIn", "london"), ("anna", "learns", "english"), ("bernd", "speaks", "french") ] opts = Options() loader = Loader(opts.get("loader")) loader.load_data(data=data) Materialize a rule set ~~~~~~~~~~~~~~~~~~~~~~ First, the handler option ``rules_handler.collect_predictions`` needs to be turned on (default: True) and the handler is created. Other handler options can be found in the `config-default.yaml `_. .. note:: Calculating the materialization of large rule sets, especially for cyclical rules, can be memory and runtime expensive. If you only want to calculate statistics (see next section), turn off ``rules_handler.collect_predictions`` for reducing memory footprint and ensure that all threads are used (default: all). .. code-block:: python from c_clause import Loader, RulesHandler from clause import Options opts.set("rules_handler.collect_predictions", True) rh = RulesHandler(options=opts.get("rules_handler")) Materialization is performed with ``RulesHandler.calculate_predictions(rules, loader)``. The calculations are based on the **data** argument of the loader. The argument **rules** is either a list with rule strings or a file path. If a file path is specified the file can either contain a rule string on every line or each line is tab separated as the standard rule file syntax containing statistics. In the latter case, the statistics do not have any effect. .. code-block:: python # input from python rules = [ "speaks(X,Y) <= learns(X,Y)", "speaks(X,english) <= livesIn(X,london)", "speaks(X,english) <= speaks(X,A)" ] # input from file # each line contains (no spaces) # either 'num_preds\t support\t conf\t rulestring' or just 'rulestring' rules = "my-rules.txt" rh.calculate_predictions(rules=rules, loader=loader) Obtaining the outputs ~~~~~~~~~~~~~~~~~~~~~ Outputs can be retrieved directly: .. code-block:: python # obtain directly as string preds_str = rh.get_predictions(as_string=True) # obtain directly as idx preds_idx = rh.get_predictions(as_string=False) Here **preds_str[i]** and **preds_idx[i]** are a list containing the materialized triples for the **i'th** input rule. Outputs can also be written to a file: .. code-block:: python # write to file as flat KG # the file is a standard tab separated file containing triples # duplicates are removed, can be directly loaded as an input set with the loader rh.write_predictions(path="mat.txt", flat=True, as_string=True) # file is in jsonl format, each line can be read and dumped with Python json module # each dict contains as key the rule string and as value the materialized triples rh.write_predictions(path="mat.txt", flat=False, as_string=True)