467 lines
18 KiB
Plaintext
467 lines
18 KiB
Plaintext
|
|
||
|
**************************************************************************
|
||
|
* *
|
||
|
* The SubCircuit C++11 library *
|
||
|
* *
|
||
|
* An implementation of a modified Ullmann Subgraph Isomorphism Algorithm *
|
||
|
* for coarse grain logic networks. by Clifford Wolf *
|
||
|
* *
|
||
|
**************************************************************************
|
||
|
|
||
|
============
|
||
|
Introduction
|
||
|
============
|
||
|
|
||
|
This is a library that implements a modified Ullmann Subgraph Isomorphism
|
||
|
Algorithm with additional features aimed at working with coarse grain logic
|
||
|
networks. It also contains a simple frequent subcircuit mining algorithm.
|
||
|
|
||
|
A simple command line tool that exposes the features of the library is also
|
||
|
included.
|
||
|
|
||
|
|
||
|
C++11 Warning
|
||
|
-------------
|
||
|
|
||
|
This project is written in C++11. Use appropriate compiler switches to compile
|
||
|
it. Tested with clang version 3.0 and option -std=c++11. Also tested with gcc
|
||
|
version 4.6.3 and option -std=c++0x.
|
||
|
|
||
|
|
||
|
========
|
||
|
Features
|
||
|
========
|
||
|
|
||
|
The input is two graphs (needle and haystack) that represent coarse grain
|
||
|
logic networks. The algorithm identifies all subgraphs of haystack that are
|
||
|
isomorphic to needle.
|
||
|
|
||
|
The following additional features over the regular Ullmann Subgraph Isomorphism
|
||
|
Algorithm are provided by the library.
|
||
|
|
||
|
* The graphs are attributed hypergraphs capable of representing netlists:
|
||
|
|
||
|
- Nodes represent the logic cells:
|
||
|
- Nodes have types and only match compatible types
|
||
|
- Nodes have ports with variable bit-width
|
||
|
|
||
|
- Hyperedges represent the signals:
|
||
|
- Each hyperedge connects one to many bits on ports on nodes
|
||
|
|
||
|
- Callback functions for advanced attributes and compatibility rules:
|
||
|
Any set of node-node compatibility rules and edge-edge
|
||
|
compatibility rules can be implemented by providing
|
||
|
the necessary callback functions.
|
||
|
|
||
|
* The algorithm is very efficient when all or many bits of one port are
|
||
|
connected to bits of the same other port. This is usually the case
|
||
|
in coarse grain logic networks. But the algorithm does not add any
|
||
|
restrictions in this area; it is just optimized for this scenario.
|
||
|
|
||
|
* The algorithm can be configured to allow larger ports in needle cells to
|
||
|
match smaller ports in haystack cells in certain situations. This way it
|
||
|
is possible to e.g. have a 32-bit adder cell in the needle match a
|
||
|
16-bit adder cell in the haystack.
|
||
|
|
||
|
* The algorithm can be configured to perform port-swapping on certain
|
||
|
ports on certain cell types to match commutative operations properly.
|
||
|
|
||
|
This is, however, not implemented very efficiently when a larger number
|
||
|
of permutations is possible on a cell type. Therefore it is recommended
|
||
|
to only use swap groups with only a few members and a few such groups
|
||
|
on one cell type type.
|
||
|
|
||
|
Also note, that the algorithm can not resolve complex dependencies
|
||
|
between the port swappings of different cells. Therefore it is
|
||
|
recommended to only use port swapping on input pins of commutative
|
||
|
operations, where such complex dependencies can not emerge.
|
||
|
|
||
|
* The algorithm can be configured to distinguish between internal signals
|
||
|
of the needle and externally visible signals. The needle will only
|
||
|
match a subgraph of the haystack if that subgraph does not expose the
|
||
|
internal signal to nodes in the haystack outside the matching subgraph.
|
||
|
|
||
|
* The algorithm can recognize a subcircuit even if some or all of its
|
||
|
inputs and/or outputs are shorted together.
|
||
|
|
||
|
* Explicit fast support for constant signals without extra nodes for
|
||
|
constant drivers.
|
||
|
|
||
|
* Support for finding only non-overlapping matches.
|
||
|
|
||
|
* A simple miner for frequent subcircuts that operates on the same circuit
|
||
|
description format.
|
||
|
|
||
|
* The public API of the library is using std::string identifiers for
|
||
|
nodes, node types and ports. Internally the costly part of the
|
||
|
algorithm is only using integer values, thus speeding up the
|
||
|
algorithm without exposing complex internal encodings to the caller.
|
||
|
|
||
|
|
||
|
=================
|
||
|
API Documentation
|
||
|
=================
|
||
|
|
||
|
This section gives a brief overview of the API. For a working example, have a
|
||
|
look at the demo.cc example program in this directory.
|
||
|
|
||
|
|
||
|
Setting up graphs
|
||
|
-----------------
|
||
|
|
||
|
Instanciate the SubCircuit::Graph class and use the methods of this class to
|
||
|
set up the circuit.
|
||
|
|
||
|
SubCircuit::Graph myGraph;
|
||
|
|
||
|
For each node in the circuit call the createNode() method. Specify the
|
||
|
identifier for the node and also the type of function implemented by the node.
|
||
|
Then call createPort() for each port of this node.
|
||
|
|
||
|
E.g. the following code adds a node "myAdder" of type "add" with three 32 bit
|
||
|
wide ports "A", "B" and "Y". Note that SubCircuit does not care which port is
|
||
|
an input and which port is an output. The last (and optional) argument to
|
||
|
createPort() specifies the minimum number of bits required for this port in the
|
||
|
haystack (this field is only used in the needle graph). So in this example the
|
||
|
node would e.g. also match any adder with a bit width smaller 32.
|
||
|
|
||
|
myGraph.createNode("myAdder", "add");
|
||
|
myGraph.createPort("myAdder", "A", 32, 1);
|
||
|
myGraph.createPort("myAdder", "B", 32, 1);
|
||
|
myGraph.createPort("myAdder", "Y", 32, 1);
|
||
|
|
||
|
The createConnection() method can be used to connect the nodes. It internally
|
||
|
creates a hypergraph. So the following code does not only connect cell1.Y with
|
||
|
cell2.A and cell3.A but also implicitly cell2.A with cell3.A.
|
||
|
|
||
|
myGraph.createConnection("cell1", "Y", "cell2", "A");
|
||
|
myGraph.createConnection("cell1", "Y", "cell3", "A");
|
||
|
|
||
|
Redundent calls to createConnection() are ignored. As long as the method is
|
||
|
called after the relevant nodes and ports are created, the order in which the
|
||
|
createConnection() calls are performed is irrelevant.
|
||
|
|
||
|
The createConnection() method can also be used to connect single bit signals.
|
||
|
In this case the start bit for both ports must be provided as well as an
|
||
|
optional width (which defaults to 1). E.g. the following calls can be used to
|
||
|
connect the 32 bit port cell4.Y to the 32 bit port cell5.A with a one bit left
|
||
|
rotate shift,
|
||
|
|
||
|
myGraph.createConnection("cell4", "Y", 0, "cell5", "A", 1, 31);
|
||
|
myGraph.createConnection("cell4", "Y", 31, "cell5", "A", 0);
|
||
|
|
||
|
The method createConstant() can be used to add a constant driver to a signal.
|
||
|
The signal value is encoded as one char by bit, allowing for multi-valued
|
||
|
logic matching. The follwoing command sets the lowest bit of cell6.A to a
|
||
|
logic 1:
|
||
|
|
||
|
myGraph.createConnection("cell6", "A", 0, '1');
|
||
|
|
||
|
It is also possible to set an entire port to a integer value, using the
|
||
|
encodings '0' and '1' for the binary digits:
|
||
|
|
||
|
myGraph.createConnection("cell6", "A", 42);
|
||
|
|
||
|
The method markExtern() can be used to mark a signal as externally visible. In
|
||
|
a needle graph this means, this signal may match a signal in the haystack that
|
||
|
is used outside the matching subgraph. In a haystack graph this means, this
|
||
|
signal is used outside the haystack graph. I.e. an internal signal of the
|
||
|
needle won't match an external signal of the haystack regardless where the
|
||
|
signal is used in the haystack.
|
||
|
|
||
|
In some application one may disable this extern/intern checks. This can easily
|
||
|
be achieved by marking all signals in the needle as extern. This can be done
|
||
|
using the Graph::markAllExtern() method.
|
||
|
|
||
|
|
||
|
Setting up and running solvers
|
||
|
------------------------------
|
||
|
|
||
|
To actually run the subgraph isomorphism algorithm, an instance of
|
||
|
SubCircuit::Solver must be created.
|
||
|
|
||
|
SubCircuit::Solver mySolver;
|
||
|
|
||
|
The addGraph() method can be used to register graphs with the solver:
|
||
|
|
||
|
mySolver.addGraph("graph1", myGraph);
|
||
|
mySolver.addGraph("graph2", myOtherGraph);
|
||
|
|
||
|
Usually nodes in the needle and the haystack must have the same type identifier
|
||
|
to match each other. Additionally pairs of compatible needle and haystack node
|
||
|
pairs can be registered using the addCompatibleTypes() method:
|
||
|
|
||
|
mySolver.addCompatibleTypes("alu", "add");
|
||
|
mySolver.addCompatibleTypes("alu", "sub");
|
||
|
mySolver.addCompatibleTypes("alu", "and");
|
||
|
mySolver.addCompatibleTypes("alu", "or");
|
||
|
mySolver.addCompatibleTypes("alu", "xor");
|
||
|
|
||
|
Note that nodes in needle and haystack must also use the same naming convention
|
||
|
for their ports in order to be considered compatible by the algorithm.
|
||
|
|
||
|
Similarly the method addCompatibleConstants() can be used the specify which
|
||
|
constant values in the needle should match which constant value in the haystack.
|
||
|
Equal values always do match.
|
||
|
|
||
|
mySolver.addCompatibleConstants('x', '0');
|
||
|
mySolver.addCompatibleConstants('x', '1');
|
||
|
|
||
|
Some cells implement commutative operations that don't care if their input
|
||
|
operands are swapped. For this cell types it is possible to register groups
|
||
|
of swappable ports. Let's consider a cell "macc23" that implements the
|
||
|
function Y = (A * B) + (C * D * E):
|
||
|
|
||
|
mySolver.addSwappablePorts("macc23", "A", "B");
|
||
|
mySolver.addSwappablePorts("macc23", "C", "D", "E");
|
||
|
|
||
|
Sometimes the rules for port swapping are a more complicated and the swapping
|
||
|
of one port is related to the swapping of another port. Let's consider a cell
|
||
|
"macc22" that implements the function Y = (A * B) + (C * D):
|
||
|
|
||
|
mySolver.addSwappablePorts("macc22", "A", "B");
|
||
|
mySolver.addSwappablePorts("macc22", "C", "D");
|
||
|
|
||
|
std::map<std::string, std::string> portMapping;
|
||
|
portMapping["A"] = "C";
|
||
|
portMapping["B"] = "D";
|
||
|
portMapping["C"] = "A";
|
||
|
portMapping["D"] = "B";
|
||
|
mySolver.addSwappablePortsPermutation("macc22", portMapping);
|
||
|
|
||
|
I.e. the method mySolver.addSwappablePortsPermutation() can be used to register
|
||
|
additional permutations for a node type of which one or none is applied on top
|
||
|
of the permutations yielded by the permutations generated by the swap groups.
|
||
|
|
||
|
Note that two solutions that differ only in the applied port swapping are not
|
||
|
reported as separate solutions. Instead only one of them is selected (in most
|
||
|
cases the one with less port swapping as it is usually identified first).
|
||
|
|
||
|
Once everything has been set up, the solve() method can be used to actually
|
||
|
search for isomorphic subgraphs. The first argument to solve() is an
|
||
|
std::vector<SubCircuit::Solver::Result> objects to which all found solutions
|
||
|
are appended. The second argument is the identifier under which the needle
|
||
|
graph has been registered and the third argument is the identifier under which
|
||
|
the haystack graph has been registered:
|
||
|
|
||
|
std::vector<SubCircuit::Solver::Result> results;
|
||
|
mySolver.solve(results, "graph1", "graph2");
|
||
|
|
||
|
The SubCircuit::Solver::Result object is a simple data structure that contains
|
||
|
the mappings between needle and haystack nodes, port mappings after the port
|
||
|
swapping and some additional metadata. See "subcircuit.h" and "demo.cc" for
|
||
|
details.
|
||
|
|
||
|
The solve() method has a third optional boolean argument. If it is set to
|
||
|
false, solve will not return any solutions that contain haystack nodes that
|
||
|
have been part of a previously found solution. This way it is e.g. easy
|
||
|
to implement a greedy macro cell matching algorithm:
|
||
|
|
||
|
std::vector<SubCircuit::Solver::Result> results;
|
||
|
mySolver.solve(results, "macroCell1", "circuit", false);
|
||
|
mySolver.solve(results, "macroCell2", "circuit", false);
|
||
|
mySolver.solve(results, "macroCell3", "circuit", false);
|
||
|
|
||
|
After this code has been executed, the results vector contains all
|
||
|
non-overlapping matches of the three macrocells. The method
|
||
|
clearOverlapHistory() can be used to reset the internal state used
|
||
|
for this feature. The default value for the third argument to solve()
|
||
|
is true (allow overlapping). The optional boolean fourth argument to the
|
||
|
Graph::createNode() method can be used to mark a node as shareable even
|
||
|
in non-overlapping solver mode.
|
||
|
|
||
|
The solve() method also has a fourth optional integer argument. If it is set to
|
||
|
a positive integer, this integer specifies the maximum number of solutions to
|
||
|
be appended to the results vector, i.e. to terminate the algorithm early when
|
||
|
the set number of matches is found. When this fourth argument is negative or
|
||
|
omitted all matches are found and appended.
|
||
|
|
||
|
An alternative version of the solve() method supports an additional argument
|
||
|
after they haystack graph identifier that specifies initial mappings for
|
||
|
the algorithm. In the following example only the haystack nodes cell_1 and
|
||
|
cell_2 are considered as mappings for the needle node cell_A:
|
||
|
|
||
|
std::map<std::string, std::set<std::string>> initialMappings;
|
||
|
initialMappings["cell_A"].insert("cell_1");
|
||
|
initialMappings["cell_A"].insert("cell_2");
|
||
|
|
||
|
std::vector<SubCircuit::Solver::Result> results;
|
||
|
mySolver.solve(results, "graph1", "graph2", initialMappings);
|
||
|
|
||
|
The clearConfig() method can be used to clear all data registered using
|
||
|
addCompatibleTypes(), addCompatibleConstants(), addSwappablePorts() and
|
||
|
addSwappablePortsPermutation() but retaining the graphs and the overlap state.
|
||
|
|
||
|
|
||
|
Using user callback function
|
||
|
----------------------------
|
||
|
|
||
|
For more complex tasks it is possible to derive a class from SubCircuit::Solver
|
||
|
that overloads one or more of the following virtual methods. The userData
|
||
|
arguments to the following methods are void pointers that can be passed as
|
||
|
third argument to Graph::createNode() and are simly passed thru to the user
|
||
|
callback functions together with the node id whenever a node is referenced.
|
||
|
|
||
|
bool userCompareNodes(needleGraphId, needleNodeId, needleUserData, haystackGraphId, haystackNodeId, haystackUserData):
|
||
|
|
||
|
Perform additional checks on a pair of nodes (one from the needle, one
|
||
|
from the haystack) to determine if the nodes are compatible. The default
|
||
|
implementation always returns true.
|
||
|
|
||
|
|
||
|
bool userCompareEdge(needleGraphId, needleFromNodeId, needleFromUserData, needleToNodeId, needleToUserData,
|
||
|
haystackGraphId, haystackFromNodeId, haystackFromUserData, haystackToNodeId, haystackToUserData):
|
||
|
|
||
|
Perform additional checks on a pair of a pair of adjacent nodes (one
|
||
|
adjacent pair from the needle and one adjacent pair from the haystack)
|
||
|
to determine wheter this edge from the needle is compatible with
|
||
|
that edge from the haystack. The default implementation always
|
||
|
returns true.
|
||
|
|
||
|
bool userCheckSolution(result):
|
||
|
|
||
|
Perform additional checks on a solution before appending it to the
|
||
|
results vector. When this function returns false, the solution is
|
||
|
ignored. The default implementation always returns true.
|
||
|
|
||
|
|
||
|
Mining for frequent SubCircuits
|
||
|
-------------------------------
|
||
|
|
||
|
The solver also contains a miner for frequent subcircuits. The following code
|
||
|
fragment will find all frequent subcircuits with at least minNodes nodes and
|
||
|
at most maxNodes nodes that occurs at least minMatches times:
|
||
|
|
||
|
std::vector<SubCircuit::Solver::MineResult> results;
|
||
|
mySolver.mine(results, minNodes, maxNodes, minMatches);
|
||
|
|
||
|
The mine() method has an optional fifth parameter that limits the number of
|
||
|
matches counted in one graph. This can be useful when mining for circuits that
|
||
|
are found in at least a number of graphs. E.g. the following call would find
|
||
|
all subcircuits with 5 nodes that are found in at least 7 of the registered
|
||
|
graphs:
|
||
|
|
||
|
mySolver.mine(results, 5, 5, 7, 1);
|
||
|
|
||
|
Note that this miner is not very efficient and therefore its use is not
|
||
|
recommended for large circuits. Also note that the miner is working under the
|
||
|
assumption that subgraph isomorphism is bidirectional. This is not the case in
|
||
|
circuits with gates with shorted pins. This can result in undetected frequent
|
||
|
subcircuits in some corner cases.
|
||
|
|
||
|
|
||
|
Debugging
|
||
|
---------
|
||
|
|
||
|
For debugging purposes the SubCircuit::Solver class implements a setVerbose()
|
||
|
method. When called once, all further calls to the solve() method cause the
|
||
|
algorithm to dump out a lot of debug information to stdout.
|
||
|
|
||
|
In conjunction with setVerbose() one can also overload the userAnnotateEdge()
|
||
|
method in order to add additional information about the edges to the debug
|
||
|
output.
|
||
|
|
||
|
|
||
|
===================
|
||
|
Shell Documentation
|
||
|
===================
|
||
|
|
||
|
This package also contains a small command-line tool called "scshell" that can
|
||
|
be used for experimentation with the algorithm. This program reads a series of
|
||
|
commands from stdin and reports its findings to stdout on exit.
|
||
|
|
||
|
$ ./scshell < test_macc22.txt
|
||
|
|
||
|
...
|
||
|
|
||
|
Match #3: (macc22 in macc4x2)
|
||
|
add_1 -> add_2 A:B B:A Y:Y
|
||
|
mul_1 -> mul_4 A:A B:B Y:Y
|
||
|
mul_2 -> mul_3 A:A B:B Y:Y
|
||
|
|
||
|
The following commands can be used in scshell to specify graphs:
|
||
|
|
||
|
graph <graph_name>
|
||
|
...
|
||
|
endgraph
|
||
|
|
||
|
Used to specify a graph with the given name. Only the commands
|
||
|
"node", "connect" and "extern" may be used within the graph ...
|
||
|
endgraph block.
|
||
|
|
||
|
node <node_name> [<port_name> [<bits> [<min_bits>]]]+
|
||
|
|
||
|
Used to create a node and ports. This command is a direct frontend
|
||
|
to the Graph::createNode() and Graph::createPort() methods.
|
||
|
|
||
|
connect <from_node> <from_port> <to_node> <to_port>
|
||
|
connect <from_node> <from_port> <from_bit> <to_node> <to_port> <to_bit>
|
||
|
connect <from_node> <from_port> <from_bit> <to_node> <to_port> <to_bit> <width>
|
||
|
|
||
|
Used to connect the nodes in the graph via Graph::createConnection().
|
||
|
|
||
|
constant <node> <port> [<bit>] <value>
|
||
|
|
||
|
Call Graph::createConstant().
|
||
|
|
||
|
extern <node> [<port> [<bit>]]+
|
||
|
|
||
|
Mark signals as extern via Graph::markExtern().
|
||
|
|
||
|
allextern
|
||
|
|
||
|
Mark all signals as extern via Graph::markAllExtern().
|
||
|
|
||
|
The following commands can be used in scshell outside a graph ... endgraph block:
|
||
|
|
||
|
compatible <needle_type> <haystack_type>
|
||
|
|
||
|
Call Solver::addCompatibleTypes().
|
||
|
|
||
|
constcompat <needle_value> <haystack_value>
|
||
|
|
||
|
Call Solver::addCompatibleConstants().
|
||
|
|
||
|
swapgroup <needle_type> <port>+
|
||
|
|
||
|
Call Solver::addSwappablePorts().
|
||
|
|
||
|
swapperm <needle_type> <ports>+ : <ports>+
|
||
|
|
||
|
Call Solver::addSwappablePortsPermutation(). Both port lists must
|
||
|
have the same length and the second one must be a permutation of the
|
||
|
first one.
|
||
|
|
||
|
initmap <needle_node> <haystack_node>+
|
||
|
|
||
|
Add an entry to the initial mappings for the next solve command.
|
||
|
This mappings are automatically reset after the solve command.
|
||
|
|
||
|
solve <needle_graph> <haystack_graph> [<allow_overlap> [<max_solutions>]]
|
||
|
|
||
|
Call Solver::solve(). The <allow_overlap> must be "1" or "true"
|
||
|
for true and "0" or "false" for false.
|
||
|
|
||
|
mine <min_nodes> <max_nodes> <min_matches> [<limit_matches_per_graph>]
|
||
|
|
||
|
Call Solver::mine().
|
||
|
|
||
|
expect <number>
|
||
|
|
||
|
Print all results so far since the last call to expect. Expect
|
||
|
<number> results and exit with error code 1 if a different number
|
||
|
of results have been found.
|
||
|
|
||
|
clearoverlap
|
||
|
|
||
|
Call Solver::clearOverlapHistory().
|
||
|
|
||
|
clearconfig
|
||
|
|
||
|
Call Solver::clearConfig().
|
||
|
|
||
|
verbose
|
||
|
|
||
|
Call Solver::setVerbose().
|
||
|
|