mirror of https://github.com/YosysHQ/yosys.git
385 lines
13 KiB
Markdown
385 lines
13 KiB
Markdown
Pattern Matcher Generator
|
|
=========================
|
|
|
|
The program `pmgen.py` reads a `.pmg` (Pattern Matcher Generator) file and
|
|
writes a header-only C++ library that implements that pattern matcher.
|
|
|
|
The "patterns" in this context are subgraphs in a Yosys RTLIL netlist.
|
|
|
|
The algorithm used in the generated pattern matcher is a simple recursive
|
|
search with backtracking. It is left to the author of the `.pmg` file to
|
|
determine an efficient cell order for the search that allows for maximum
|
|
use of indices and early backtracking.
|
|
|
|
|
|
API of Generated Matcher
|
|
========================
|
|
|
|
When `pmgen.py` reads a `foobar.pmg` file, it writes `foobar_pm.h` containing
|
|
a class `foobar_pm`. That class is instantiated with an RTLIL module and a
|
|
list of cells from that module:
|
|
|
|
foobar_pm pm(module, module->selected_cells());
|
|
|
|
The caller must make sure that none of the cells in the 2nd argument are
|
|
deleted for as long as the patter matcher instance is used.
|
|
|
|
At any time it is possible to disable cells, preventing them from showing
|
|
up in any future matches:
|
|
|
|
pm.blacklist(some_cell);
|
|
|
|
The `.run_<pattern_name>(callback_function)` method searches for all matches
|
|
for the pattern`<pattern_name>` and calls the callback function for each found
|
|
match:
|
|
|
|
pm.run_foobar([&](){
|
|
log("found matching 'foo' cell: %s\n", log_id(pm.st.foo));
|
|
log(" with 'bar' cell: %s\n", log_id(pm.st.bar));
|
|
});
|
|
|
|
The `.pmg` file declares matcher state variables that are accessible via the
|
|
`.st_<pattern_name>.<state_name>` members. (The `.st_<pattern_name>` member is
|
|
of type `foobar_pm::state_<pattern_name>_t`.)
|
|
|
|
Similarly the `.pmg` file declares user data variables that become members of
|
|
`.ud_<pattern_name>`, a struct of type `foobar_pm::udata_<pattern_name>_t`.
|
|
|
|
There are three versions of the `run_<pattern_name>()` method: Without callback,
|
|
callback without arguments, and callback with reference to `pm`. All versions
|
|
of the `run_<pattern_name>()` method return the number of found matches.
|
|
|
|
|
|
The .pmg File Format
|
|
====================
|
|
|
|
The `.pmg` file format is a simple line-based file format. For the most part
|
|
lines consist of whitespace-separated tokens.
|
|
|
|
Lines in `.pmg` files starting with `//` are comments.
|
|
|
|
Declaring a pattern
|
|
-------------------
|
|
|
|
A `.pmg` file contains one or more patterns. Each pattern starts with a line
|
|
with the `pattern` keyword followed by the name of the pattern.
|
|
|
|
Declaring state variables
|
|
-------------------------
|
|
|
|
One or more state variables can be declared using the `state` statement,
|
|
followed by a C++ type in angle brackets, followed by a whitespace separated
|
|
list of variable names. For example:
|
|
|
|
state <bool> flag1 flag2 happy big
|
|
state <SigSpec> sigA sigB sigY
|
|
|
|
State variables are automatically managed by the generated backtracking algorithm
|
|
and saved and restored as needed.
|
|
|
|
They are automatically initialized to the default constructed value of their type
|
|
when `.run_<pattern_name>(callback_function)` is called.
|
|
|
|
Declaring udata variables
|
|
-------------------------
|
|
|
|
Udata (user-data) variables can be used for example to configure the matcher or
|
|
the callback function used to perform actions on found matches.
|
|
|
|
There is no automatic management of udata variables. For this reason it is
|
|
recommended that the user-supplied matcher code treats them as read-only
|
|
variables.
|
|
|
|
They are declared like state variables, just using the `udata` statement:
|
|
|
|
udata <int> min_data_width max_data_width
|
|
udata <IdString> data_port_name
|
|
|
|
They are automatically initialized to the default constructed value of their type
|
|
when the pattern matcher object is constructed.
|
|
|
|
Embedded C++ code
|
|
-----------------
|
|
|
|
Many statements in a `.pmg` file contain C++ code. However, there are some
|
|
slight additions to regular C++/Yosys/RTLIL code that make it a bit easier to
|
|
write matchers:
|
|
|
|
- Identifiers starting with a dollar sign or backslash are automatically
|
|
converted to special IdString variables that are initialized when the
|
|
matcher object is constructed.
|
|
|
|
- The `port(<cell>, <portname>)` function is a handy alias for
|
|
`sigmap(<cell>->getPort(<portname>))`.
|
|
|
|
- Similarly `param(<cell>, <paramname>)` looks up a parameter on a cell.
|
|
|
|
- The function `nusers(<sigspec>)` returns the number of different cells
|
|
connected to any of the given signal bits, plus one if any of the signal
|
|
bits is also a primary input or primary output.
|
|
|
|
- In `code..endcode` blocks there exist `accept`, `reject`, `branch`,
|
|
`finish`, and `subpattern` statements.
|
|
|
|
- In `index` statements there is a special `===` operator for the index
|
|
lookup.
|
|
|
|
Matching cells
|
|
--------------
|
|
|
|
Cells are matched using `match..endmatch` blocks. For example:
|
|
|
|
match mul
|
|
if ff
|
|
select mul->type == $mul
|
|
select nusers(port(mul, \Y) == 2
|
|
index <SigSpec> port(mul, \Y) === port(ff, \D)
|
|
filter some_weird_function(mul) < other_weird_function(ff)
|
|
optional
|
|
endmatch
|
|
|
|
A `match` block starts with `match <statevar>` and implicitly generates
|
|
a state variable `<statevar>` of type `RTLIL::Cell*`.
|
|
|
|
All statements in the match block are optional. (An empty match block
|
|
would simply match each and every cell in the module.)
|
|
|
|
The `if <expression>` statement makes the match block conditional. If
|
|
`<expression>` evaluates to `false` then the match block will be ignored
|
|
and the corresponding state variable is set to `nullptr`. In our example
|
|
we only try to match the `mul` cell if the `ff` state variable points
|
|
to a cell. (Presumably `ff` is provided by a prior `match` block.)
|
|
|
|
The `select` lines are evaluated once for each cell when the matcher is
|
|
initialized. A `match` block will only consider cells for which all `select`
|
|
expressions evaluated to `true`. Note that the state variable corresponding to
|
|
the match (in the example `mul`) is the only state variable that may be used
|
|
in `select` lines.
|
|
|
|
Index lines are using the `index <type> expr1 === expr2` syntax. `expr1` is
|
|
evaluated during matcher initialization and the same restrictions apply as for
|
|
`select` expressions. `expr2` is evaluated when the match is calulated. It is a
|
|
function of any state variables assigned to by previous blocks. Both expression
|
|
are converted to the given type and compared for equality. Only cells for which
|
|
all `index` statements in the block pass are considered by the match.
|
|
|
|
Note that `select` and `index` are fast operations. Thus `select` and `index`
|
|
should be used whenever possible to create efficient matchers.
|
|
|
|
Finally, `filter <expression>` narrows down the remaining list of cells. For
|
|
performance reasons `filter` statements should only be used for things that
|
|
can't be done using `select` and `index`.
|
|
|
|
The `optional` statement marks optional matches. That is, the matcher will also
|
|
explore the case where `mul` is set to `nullptr`. Without the `optional`
|
|
statement a match may only be assigned nullptr when one of the `if` expressions
|
|
evaluates to `false`.
|
|
|
|
The `semioptional` statement marks matches that must match if at least one
|
|
matching cell exists, but if no matching cell exists it is set to `nullptr`.
|
|
|
|
Slices and choices
|
|
------------------
|
|
|
|
Cell matches can contain "slices" and "choices". Slices can be used to
|
|
create matches for different sections of a cell. For example:
|
|
|
|
state <int> pmux_slice
|
|
|
|
match pmux
|
|
select pmux->type == $pmux
|
|
slice idx GetSize(port(pmux, \S))
|
|
index <SigBit> port(pmux, \S)[idx] === port(eq, \Y)
|
|
set pmux_slice idx
|
|
endmatch
|
|
|
|
The first argument to `slice` is the local variable name used to identify the
|
|
slice. The second argument is the number of slices that should be created for
|
|
this cell. The `set` statement can be used to copy that index into a state
|
|
variable so that later matches and/or code blocks can refer to it.
|
|
|
|
A similar mechanism is "choices", where a list of options is given as
|
|
second argument, and the matcher will iterate over those options:
|
|
|
|
state <SigSpec> foo bar
|
|
state <IdString> eq_ab eq_ba
|
|
|
|
match eq
|
|
select eq->type == $eq
|
|
choice <IdString> AB {\A, \B}
|
|
define <IdString> BA (AB == \A ? \B : \A)
|
|
index <SigSpec> port(eq, AB) === foo
|
|
index <SigSpec> port(eq, BA) === bar
|
|
set eq_ab AB
|
|
set eq_ba BA
|
|
endmatch
|
|
|
|
Notice how `define` can be used to define additional local variables similar
|
|
to the loop variables defined by `slice` and `choice`.
|
|
|
|
Additional code
|
|
---------------
|
|
|
|
Interleaved with `match..endmatch` blocks there may be `code..endcode` blocks.
|
|
Such a block starts with the keyword `code` followed by a list of state variables
|
|
that the block may modify. For example:
|
|
|
|
code addAB sigS
|
|
if (addA) {
|
|
addAB = addA;
|
|
sigS = port(addA, \B);
|
|
}
|
|
if (addB) {
|
|
addAB = addB;
|
|
sigS = port(addB, \A);
|
|
}
|
|
endcode
|
|
|
|
The special keyword `reject` can be used to reject the current state and
|
|
backtrack. For example:
|
|
|
|
code
|
|
if (ffA && ffB) {
|
|
if (port(ffA, \CLK) != port(ffB, \CLK))
|
|
reject;
|
|
if (param(ffA, \CLK_POLARITY) != param(ffB, \CLK_POLARITY))
|
|
reject;
|
|
}
|
|
endcode
|
|
|
|
Similarly, the special keyword `accept` can be used to accept the current
|
|
state. (`accept` will not backtrack. This means it continues with the current
|
|
branch and may accept a larger match later.)
|
|
|
|
The special keyword `branch` can be used to explore different cases. Note that
|
|
each code block has an implicit `branch` at the end. So most use-cases of the
|
|
`branch` keyword need to end the block with `reject` to avoid the implicit
|
|
branch at the end. For example:
|
|
|
|
state <int> mode
|
|
|
|
code mode
|
|
for (mode = 0; mode < 8; mode++)
|
|
branch;
|
|
reject;
|
|
endcode
|
|
|
|
But in some cases it is more natural to utilize the implicit branch statement:
|
|
|
|
state <IdString> portAB
|
|
|
|
code portAB
|
|
portAB = \A;
|
|
branch;
|
|
portAB = \B;
|
|
endcode
|
|
|
|
There is an implicit `code..endcode` block at the end of each (sub)pattern
|
|
that just rejects.
|
|
|
|
A `code..finally..endcode` block executes the code after `finally` during
|
|
back-tracking. This is useful for maintaining user data state or printing
|
|
debug messages. For example:
|
|
|
|
udata <vector<Cell*>> stack
|
|
|
|
code
|
|
stack.push_back(addAB);
|
|
...
|
|
finally
|
|
stack.pop_back();
|
|
endcode
|
|
|
|
`accept` and `finish` statements can be used inside the `finally` section,
|
|
but not `reject`, `branch`, or `subpattern`.
|
|
|
|
Declaring a subpattern
|
|
----------------------
|
|
|
|
A subpattern starts with a line containing the `subpattern` keyword followed
|
|
by the name of the subpattern. Subpatterns can be called from a `code` block
|
|
using a `subpattern(<subpattern_name>);` C statement.
|
|
|
|
Arguments may be passed to subpattern via state variables. The `subpattern`
|
|
line must be followed by a `arg <arg1> <arg2> ...` line that lists the
|
|
state variables used to pass arguments.
|
|
|
|
state <IdString> foobar_type
|
|
state <bool> foobar_state
|
|
|
|
code foobar_type foobar_state
|
|
foobar_state = false;
|
|
foobar_type = $add;
|
|
subpattern(foo);
|
|
foobar_type = $sub;
|
|
subpattern(bar);
|
|
endcode
|
|
|
|
subpattern foo
|
|
arg foobar_type foobar_state
|
|
|
|
match addsub
|
|
index <IdString> addsub->type === foobar_type
|
|
...
|
|
endmatch
|
|
|
|
code
|
|
if (foobar_state) {
|
|
subpattern(tail);
|
|
} else {
|
|
foobar_state = true;
|
|
subpattern(bar);
|
|
}
|
|
endcode
|
|
|
|
subpattern bar
|
|
arg foobar_type foobar_state
|
|
|
|
match addsub
|
|
index <IdString> addsub->type === foobar_type
|
|
...
|
|
endmatch
|
|
|
|
code
|
|
if (foobar_state) {
|
|
subpattern(tail);
|
|
} else {
|
|
foobar_state = true;
|
|
subpattern(foo);
|
|
}
|
|
endcode
|
|
|
|
subpattern tail
|
|
...
|
|
|
|
Subpatterns can be called recursively.
|
|
|
|
If a `subpattern` statement is preceded by a `fallthrough` statement, this is
|
|
equivalent to calling the subpattern at the end of the preceding block.
|
|
|
|
Generate Blocks
|
|
---------------
|
|
|
|
Match blocks may contain an optional `generate` section that is used for automatic
|
|
test-case generation. For example:
|
|
|
|
match mul
|
|
...
|
|
generate 10 0
|
|
SigSpec Y = port(ff, \D);
|
|
SigSpec A = module->addWire(NEW_ID, GetSize(Y) - rng(GetSize(Y)/2));
|
|
SigSpec B = module->addWire(NEW_ID, GetSize(Y) - rng(GetSize(Y)/2));
|
|
module->addMul(NEW_ID, A, B, Y, rng(2));
|
|
endmatch
|
|
|
|
The expression `rng(n)` returns a non-negative integer less than `n`.
|
|
|
|
The first argument to `generate` is the chance of this generate block being
|
|
executed when the match block did not match anything, in percent.
|
|
|
|
The second argument to `generate` is the chance of this generate block being
|
|
executed when the match block did match something, in percent.
|
|
|
|
The special statement `finish` can be used within generate blocks to terminate
|
|
the current pattern matcher run.
|