yosys/passes/pmgen
Clifford Wolf 64bd414e54 Minor bugfix in "test_pmgen -generate"
Signed-off-by: Clifford Wolf <clifford@clifford.at>
2019-08-16 14:35:13 +02:00
..
.gitignore Refactor demo_reduce into test_pmgen 2019-08-16 11:47:51 +02:00
Makefile.inc Refactor demo_reduce into test_pmgen 2019-08-16 11:47:51 +02:00
README.md Add pmgen finish statement, return number of matches 2019-08-16 14:16:35 +02:00
ice40_dsp.cc pmgen progress 2019-04-30 10:51:51 +02:00
ice40_dsp.pmg Change pmgen default rule to reject, switch peepopt behavior to accept 2019-08-15 22:47:59 +02:00
peepopt.cc Fix typo 2019-05-03 08:25:30 +02:00
peepopt_muldiv.pmg Change pmgen default rule to reject, switch peepopt behavior to accept 2019-08-15 22:47:59 +02:00
peepopt_shiftmul.pmg Change pmgen default rule to reject, switch peepopt behavior to accept 2019-08-15 22:47:59 +02:00
pmgen.py Add pmgen finish statement, return number of matches 2019-08-16 14:16:35 +02:00
test_pmgen.cc Minor bugfix in "test_pmgen -generate" 2019-08-16 14:35:13 +02:00
test_pmgen.pmg Add pmgen finish statement, return number of matches 2019-08-16 14:16:35 +02:00

README.md

Pattern Matcher Generator

The program pmgen.py reads a .pmg (Pattern Matcher Generator) file and writes a header-only C++ library that implements that pattern matcher.

The "patterns" in this context are subgraphs in a Yosys RTLIL netlist.

The algorithm used in the generated pattern matcher is a simple recursive search with backtracking. It is left to the author of the .pmg file to determine an efficient cell order for the search that allows for maximum use of indices and early backtracking.

API of Generated Matcher

When pmgen.py reads a foobar.pmg file, it writes foobar_pm.h containing a class foobar_pm. That class is instantiated with an RTLIL module and a list of cells from that module:

foobar_pm pm(module, module->selected_cells());

The caller must make sure that none of the cells in the 2nd argument are deleted for as long as the patter matcher instance is used.

At any time it is possible to disable cells, preventing them from showing up in any future matches:

pm.blacklist(some_cell);

The .run_<pattern_name>(callback_function) method searches for all matches for the pattern<pattern_name> and calls the callback function for each found match:

pm.run_foobar([&](){
    log("found matching 'foo' cell: %s\n", log_id(pm.st.foo));
    log("          with 'bar' cell: %s\n", log_id(pm.st.bar));
});

The .pmg file declares matcher state variables that are accessible via the .st_<pattern_name>.<state_name> members. (The .st_<pattern_name> member is of type foobar_pm::state_<pattern_name>_t.)

Similarly the .pmg file declares user data variables that become members of .ud_<pattern_name>, a struct of type foobar_pm::udata_<pattern_name>_t.

There are three versions of the run_<pattern_name>() method: Without callback, callback without arguments, and callback with reference to pm. All versions of the run_<pattern_name>() method return the number of found matches.

The .pmg File Format

The .pmg file format is a simple line-based file format. For the most part lines consist of whitespace-separated tokens.

Lines in .pmg files starting with // are comments.

Declaring a pattern

A .pmg file contains one or more patterns. Each pattern starts with a line with the pattern keyword followed by the name of the pattern.

Declaring state variables

One or more state variables can be declared using the state statement, followed by a C++ type in angle brackets, followed by a whitespace separated list of variable names. For example:

state <bool> flag1 flag2 happy big
state <SigSpec> sigA sigB sigY

State variables are automatically managed by the generated backtracking algorithm and saved and restored as needed.

They are automatically initialized to the default constructed value of their type when .run_<pattern_name>(callback_function) is called.

Declaring udata variables

Udata (user-data) variables can be used for example to configure the matcher or the callback function used to perform actions on found matches.

There is no automatic management of udata variables. For this reason it is recommended that the user-supplied matcher code treats them as read-only variables.

They are declared like state variables, just using the udata statement:

udata <int> min_data_width max_data_width
udata <IdString> data_port_name

They are automatically initialized to the default constructed value of their type when the pattern matcher object is constructed.

Embedded C++ code

Many statements in a .pmg file contain C++ code. However, there are some slight additions to regular C++/Yosys/RTLIL code that make it a bit easier to write matchers:

  • Identifiers starting with a dollar sign or backslash are automatically converted to special IdString variables that are initialized when the matcher object is constructed.

  • The port(<cell>, <portname>) function is a handy alias for sigmap(<cell>->getPort(<portname>)).

  • Similarly param(<cell>, <paramname>) looks up a parameter on a cell.

  • The function nusers(<sigspec>) returns the number of different cells connected to any of the given signal bits, plus one if any of the signal bits is also a primary input or primary output.

  • In code..endcode blocks there exist accept, reject, branch, finish, and subpattern statements.

  • In index statements there is a special === operator for the index lookup.

Matching cells

Cells are matched using match..endmatch blocks. For example:

match mul
    if ff
    select mul->type == $mul
    select nusers(port(mul, \Y) == 2
    index <SigSpec> port(mul, \Y) === port(ff, \D)
    filter some_weird_function(mul) < other_weird_function(ff)
    optional
endmatch

A match block starts with match <statevar> and implicitly generates a state variable <statevar> of type RTLIL::Cell*.

All statements in the match block are optional. (An empty match block would simply match each and every cell in the module.)

The if <expression> statement makes the match block conditional. If <expression> evaluates to false then the match block will be ignored and the corresponding state variable is set to nullptr. In our example we only try to match the mul cell if the ff state variable points to a cell. (Presumably ff is provided by a prior match block.)

The select lines are evaluated once for each cell when the matcher is initialized. A match block will only consider cells for which all select expressions evaluated to true. Note that the state variable corresponding to the match (in the example mul) is the only state variable that may be used in select lines.

Index lines are using the index <type> expr1 === expr2 syntax. expr1 is evaluated during matcher initialization and the same restrictions apply as for select expressions. expr2 is evaluated when the match is calulated. It is a function of any state variables assigned to by previous blocks. Both expression are converted to the given type and compared for equality. Only cells for which all index statements in the block pass are considered by the match.

Note that select and index are fast operations. Thus select and index should be used whenever possible to create efficient matchers.

Finally, filter <expression> narrows down the remaining list of cells. For performance reasons filter statements should only be used for things that can't be done using select and index.

The optional statement marks optional matches. That is, the matcher will also explore the case where mul is set to nullptr. Without the optional statement a match may only be assigned nullptr when one of the if expressions evaluates to false.

The semioptional statement marks matches that must match if at least one matching cell exists, but if no matching cell exists it is set to nullptr.

Additional code

Interleaved with match..endmatch blocks there may be code..endcode blocks. Such a block starts with the keyword code followed by a list of state variables that the block may modify. For example:

code addAB sigS
    if (addA) {
        addAB = addA;
        sigS = port(addA, \B);
    }
    if (addB) {
        addAB = addB;
        sigS = port(addB, \A);
    }
endcode

The special keyword reject can be used to reject the current state and backtrack. For example:

code
    if (ffA && ffB) {
        if (port(ffA, \CLK) != port(ffB, \CLK))
            reject;
        if (param(ffA, \CLK_POLARITY) != param(ffB, \CLK_POLARITY))
            reject;
    }
endcode

Similarly, the special keyword accept can be used to accept the current state. (accept will not backtrack. This means it continues with the current branch and may accept a larger match later.)

The special keyword branch can be used to explore different cases. Note that each code block has an implicit branch at the end. So most use-cases of the branch keyword need to end the block with reject to avoid the implicit branch at the end. For example:

state <int> mode

code mode
    for (mode = 0; mode < 8; mode++)
        branch;
    reject;
endcode

But in some cases it is more natural to utilize the implicit branch statement:

state <IdString> portAB

code portAB
    portAB = \A;
    branch;
    portAB = \B;
endcode

There is an implicit code..endcode block at the end of each (sub)pattern that just rejects.

A code..finally..endcode block executes the code after finally during back-tracking. This is useful for maintaining user data state or printing debug messages. For example:

udata <vector<Cell*>> stack

code
    stack.push_back(addAB);
    ...
finally
    stack.pop_back();
endcode

accept and finish statements can be used inside the finally section, but not reject, branch, or subpattern.

Declaring a subpattern

A subpattern starts with a line containing the subpattern keyword followed by the name of the subpattern. Subpatterns can be called from a code block using a subpattern(<subpattern_name>); C statement.

Arguments may be passed to subpattern via state variables. The subpattern line must be followed by a arg <arg1> <arg2> ... line that lists the state variables used to pass arguments.

state <IdString> foobar_type
state <bool> foobar_state

code foobar_type foobar_state
    foobar_state = false;
    foobar_type = $add;
    subpattern(foo);
    foobar_type = $sub;
    subpattern(bar);
endcode

subpattern foo
arg foobar_type foobar_state

match addsub
    index <IdString> addsub->type === foobar_type
    ...
endmatch

code
    if (foobar_state) {
        subpattern(tail);
    } else {
        foobar_state = true;
        subpattern(bar);
    }
endcode

subpattern bar
arg foobar_type foobar_state

match addsub
    index <IdString> addsub->type === foobar_type
    ...
endmatch

code
    if (foobar_state) {
        subpattern(tail);
    } else {
        foobar_state = true;
        subpattern(foo);
    }
endcode

subpattern tail
...

Subpatterns cann be called recursively.

Generate Blocks

Match blocks may contain an optional generate section that is used for automatic test-case generation. For example:

match mul
    ...
generate 10
    SigSpec Y = port(ff, \D);
    SigSpec A = module->addWire(NEW_ID, GetSize(Y) - rng(GetSize(Y)/2));
    SigSpec B = module->addWire(NEW_ID, GetSize(Y) - rng(GetSize(Y)/2));
    module->addMul(NEW_ID, A, B, Y, rng(2));
endmatch

The expression rng(n) returns a non-negative integer less than n.

The argument to generate is the chance of this generate block being executed when the match block did not match anything, in percent.

The special statement finish can be used within generate blocks to terminate the current pattern matcher run.