This change makes it possible to delete rules after inserting them,
without needing to query the rules first. Additionally, this allows
positioning a new rule next to an existing rule.
There are two ways to refer to a rule: Either by ID or by handle. The ID
is assigned by userspace, and is only valid within a transaction, so it
can only be used before the flush. The handle is assigned by the kernel
when the transaction is committed, and can thus only be used after the
flush. We thus need to set an ID on each newly created rule, and
retrieve the handle of the rule during the flush.
I extended the message struct with a pointer to the Rule which the
message creates. This allows calling the reply handler callback which
sets the handle.
I updated tests to add a handle to generated replies for the
NFT_MSG_NEWRULE messages.
Commit 0d9bfa4d18 added code to handle "overrun", but the commit is
very misleading. NLMSG_OVERRUN is in fact not a flag, but a complete
message type, so the (re&netlink.Overrun) masking makes no sense. Even
better, NLMSG_OVERRUN is never actually used by Linux.
The actual bug which the commit was attempting to fix is that Flush was
not receiving replies which the kernel sent for messages with the echo
flag. This change reverts that commit and instead adds code in Flush to
receive the replies.
I updated tests which simulate the kernel to generate replies.
There was an existing mechanism to allocate IDs for sets, but this was
using a global counter without any synchronization to prevent data
races. I replaced this by a new mechanism which uses a connection-scoped
counter, protected by the Conn.mu Mutex. This can then also be used in
other places where IDs need to be allocated.
As an additional safeguard, it will panic instead of allocating the same
ID twice in a transaction. Most likely, your program will run out of
memory before reaching this point.
* fix: resolve deadlock in `Flush` function when handling ENOBUFS error
* Simulate deadlock issue using reduced read/write buffers to verify the fix and ensure no regressions
Related: #242
After 7879d7ecf6, it seems that
any multi-message operation performed without CAP_SYS_ADMIN will
leads to forever block inside nftables.Conn.Flush.
For example:
```go
package main
import "github.com/google/nftables"
func main() {
conn, err := nftables.New()
if err != nil {
panic(err)
}
t := conn.AddTable(&nftables.Table{})
err = conn.AddSet(&nftables.Set{Table: t}, []nftables.SetElement{})
if err != nil {
panic(err)
}
conn.AddSet(&nftables.Set{Table: t}, []nftables.SetElement{})
if err != nil {
panic(err)
}
err = conn.Flush()
if err != nil {
panic(err)
}
return
}
```
That's because that although we send multiple messages on netlink
socket, kernel will only sends one permission error message as reply.
Signed-off-by: black-desk <me@black-desk.cn>
When you flush multiple messages/ops on a connection, and if flush fails
to apply, the netlink connection returns errors per command. Since we
are returning on noticing the first error, the rest of the errors are
buffered and leaks into the result of next flush.
This pull request invokes `conn.Receive()` * number of messages to drain
any buffered errors in the connection.
* Close receiver for lasting netlink connections while defaulting to existing temporary netlink connection usage
* add unit test for New lasting connection, Close and correct default connection handling behavior
* refactor tests to use New constructor
* make Conn mutex un-exported (#159)
fixes issue #157