Commit Graph

26 Commits

Author SHA1 Message Date
rjl493456442 5adf4adc8e
eth/protocols/snap: cleanup dangling account trie nodes due to incomplete storage (#30258)
This pull request fixes #30229.
 
During snap sync, large storage will be split into several pieces and
synchronized concurrently. Unfortunately, the tradeoff is that the respective
merkle trie of each storage chunk will be incomplete due to the incomplete
boundaries. The trie nodes on these boundaries will be discarded, and any
dangling nodes on disk will also be removed if they fall on these paths,
ensuring the state healer won't be blocked.

However, the dangling account trie nodes on the path from the root to the
associated account are left untouched. This means the dangling account trie
nodes could potentially stop the state healing and break the assumption that the
entire subtrie should exist if the subtrie root exists. We should consider the
account trie node as the ancestor of the corresponding storage trie node.

In the scenarios described in the above ticket, the state corruption could occur
if there is a dangling account trie node while some storage trie nodes are
removed due to synchronization redo.

The fixing idea is pretty straightforward, the trie nodes on the path from root
to account should all be explicitly removed if an incomplete storage trie
occurs. Therefore, a `delete` operation has been added into `gentrie` to
explicitly clear the account along with all nodes on this path. The special
thing is that it's a cross-trie clearing. In theory, there may be a dangling
node at any position on this account key and we have to clear all of them.
2024-08-12 10:43:54 +02:00
rjl493456442 d3c4466edd
core, eth/protocols/snap, trie: fix cause for snap-sync corruption, implement gentrie (#29313)
This pull request defines a gentrie for snap sync purpose.

The stackTrie is used to generate the merkle tree nodes upon receiving a state batch. Several additional options have been added into stackTrie to handle incomplete states (either missing states before or after).

In this pull request, these options have been relocated from stackTrie to genTrie, which serves as a wrapper for stackTrie specifically for snap sync purposes.

Further, the logic for managing incomplete state has been enhanced in this change. Originally, there are two cases handled:

-    boundary node filtering
-    internal (covered by extension node) node clearing

This changes adds one more:
 
- Clearing leftover nodes on the boundaries.

This feature is necessary if there are leftover trie nodes in database, otherwise node inconsistency may break the state healing.
2024-04-16 09:05:36 +02:00
Martin Holst Swende 96b75033c0
trie: use explicit errors in stacktrie (instead of panic) (#28361)
This PR removes panics from stacktrie (mostly), and makes the Update return errors instead. While adding tests for this, I also found that one case of possible corruption was not caught, which is now fixed.
2023-10-25 14:53:50 +02:00
rjl493456442 ab04aeb855
core, eth, trie: filter out boundary nodes and remove dangling nodes in stacktrie (#28327)
* core, eth, trie: filter out boundary nodes in stacktrie

* eth/protocol/snap: add comments

* Update trie/stacktrie.go

Co-authored-by: Martin Holst Swende <martin@swende.se>

* eth, trie: remove onBoundary callback

* eth/protocols/snap: keep complete boundary nodes

* eth/protocols/snap: skip healing if the storage trie is already complete

* eth, trie: add more metrics

* eth, trie: address comment

---------

Co-authored-by: Martin Holst Swende <martin@swende.se>
2023-10-23 18:31:56 +03:00
rjl493456442 1b1611b8d0
core, trie, eth: refactor stacktrie constructor (#28350)
This change enhances the stacktrie constructor by introducing an option struct. It also simplifies the `Hash` and `Commit` operations, getting rid of the special handling round root node.
2023-10-17 14:09:25 +02:00
Martin Holst Swende 8976a0c97a
trie: remove owner and binary marshaling from stacktrie (#28291)
This change
  - Removes the owner-notion from a stacktrie; the owner is only ever needed for comitting to the database, but the commit-function, the `writeFn` is provided by the caller, so the caller can just set the owner into the `writeFn` instead of having it passed through the stacktrie.
  - Removes the `encoding.BinaryMarshaler`/`encoding.BinaryUnmarshaler` interface from stacktrie. We're not using it, and it is doubtful whether anyone downstream is either.
2023-10-11 06:12:45 +02:00
Martin Holst Swende 08326794e8
trie: refactor stacktrie (#28233)
This change refactors stacktrie to separate the stacktrie itself from the
internal representation of nodes: a stacktrie is not a recursive structure
of stacktries, rather, a framework for representing and operating upon a set of nodes.

---------

Co-authored-by: Gary Rong <garyrong0905@gmail.com>
2023-10-10 08:28:56 +02:00
Marius van der Wijden 5976e58415
trie: reduce allocs in recHash (#27770) 2023-08-18 22:41:19 +02:00
rjl493456442 bbcb5ea37b
core, trie: rework trie database (#26813)
* core, trie: rework trie database

* trie: fix comment
2023-04-24 10:38:52 +03:00
rjl493456442 99f81d2724
all: refactor trie API (#26995)
In this PR, all TryXXX(e.g. TryGet) APIs of trie are renamed to XXX(e.g. Get) with an error returned.

The original XXX(e.g. Get) APIs are renamed to MustXXX(e.g. MustGet) and does not return any error -- they print a log output. A future PR will change the behaviour to panic on errorrs.
2023-04-20 06:57:24 -04:00
Darioush Jalali b7bfbc1e64
trie, accounts/abi: add error-checks (#26914) 2023-03-17 06:19:51 -04:00
rjl493456442 fe01a2f63b
all: use unified emptyRootHash and emptyCodeHash (#26718)
The EmptyRootHash and EmptyCodeHash are defined everywhere in the codebase, this PR replaces all of them with unified one defined in core/types package, and also defines constants for TxRoot, WithdrawalsRoot and UncleRoot
2023-02-21 06:12:27 -05:00
rjl493456442 743e404906
core, eth, les, tests, trie: abstract node scheme (#25532)
This PR introduces a node scheme abstraction. The interface is only implemented by `hashScheme` at the moment, but will be extended by `pathScheme` very soon.

Apart from that, a few changes are also included which is worth mentioning:

-  port the changes in the stacktrie, tracking the path prefix of nodes during commit
-  use ethdb.Database for constructing trie.Database. This is not necessary right now, but it is required for path-based used to open reverse diff freezer
2022-11-28 14:31:28 +01:00
Martin Holst Swende 5a02b2d6d0
all: fix spelling mistakes (#25961) 2022-10-11 09:37:00 +02:00
Felix Lange b628d72766
build: upgrade to go 1.19 (#25726)
This changes the CI / release builds to use the latest Go version. It also
upgrades golangci-lint to a newer version compatible with Go 1.19.

In Go 1.19, godoc has gained official support for links and lists. The
syntax for code blocks in doc comments has changed and now requires a
leading tab character. gofmt adapts comments to the new syntax
automatically, so there are a lot of comment re-formatting changes in this
PR. We need to apply the new format in order to pass the CI lint stage with
Go 1.19.

With the linter upgrade, I have decided to disable 'gosec' - it produces
too many false-positive warnings. The 'deadcode' and 'varcheck' linters
have also been removed because golangci-lint warns about them being
unmaintained. 'unused' provides similar coverage and we already have it
enabled, so we don't lose much with this change.
2022-09-10 13:25:40 +02:00
Martin Holst Swende d79bd2f0f6
trie: better error reporting (#25645) 2022-09-01 08:41:10 +02:00
rjl493456442 22defa5af7
all: introduce trie owner notion (#24750)
* cmd, core/state, light, trie, eth: add trie owner notion

* all: refactor

* tests: fix goimports

* core/state/snapshot: fix ineffasigns

Co-authored-by: Martin Holst Swende <martin@swende.se>
2022-06-06 17:14:55 +02:00
Qian Bin 65ed1a6871
rlp, trie: faster trie node encoding (#24126)
This change speeds up trie hashing and all other activities that require
RLP encoding of trie nodes by approximately 20%. The speedup is achieved by
avoiding reflection overhead during node encoding.

The interface type trie.node now contains a method 'encode' that works with
rlp.EncoderBuffer. Management of EncoderBuffers is left to calling code.
trie.hasher, which is pooled to avoid allocations, now maintains an
EncoderBuffer. This means memory resources related to trie node encoding
are tied to the hasher pool.

Co-authored-by: Felix Lange <fjl@twurst.com>
2022-03-09 14:45:17 +01:00
Paweł Bylica 86fe359a56
trie: simplify StackTrie implementation (#23950)
Trim the search key from head as it's being pushed deeper into the trie. Previously the search key was never modified but each node kept information how to slice and compare it in keyOffset. Now the keyOffset is not needed as this information is included in the slice of the search key. This way the keyOffset can be removed and key manipulation
simplified.
2021-11-29 11:02:40 +01:00
Martin Holst Swende 581539c6ee
trie: make stacktrie support binary marshal/unmarshal (#22685) 2021-04-20 10:42:02 +02:00
Martin Holst Swende 4f3ba6742f
trie: make stacktrie not mutate input values (#22673)
The stacktrie is a bit un-untuitive, API-wise: since it mutates input values.
Such behaviour is dangerous, and easy to get wrong if the calling code 'forgets' this quirk. The behaviour is fixed by this PR, so that the input values are not modified by the stacktrie. 

Note: just as with the Trie, the stacktrie still references the live input objects, so it's still _not_ safe to mutate the values form the callsite.
2021-04-16 14:21:01 +02:00
gary rong f566dd305e
all: bloom-filter based pruning mechanism (#21724)
* cmd, core, tests: initial state pruner

core: fix db inspector

cmd/geth: add verify-state

cmd/geth: add verification tool

core/rawdb: implement flatdb

cmd, core: fix rebase

core/state: use new contract code layout

core/state/pruner: avoid deleting genesis state

cmd/geth: add helper function

core, cmd: fix extract genesis

core: minor fixes

contracts: remove useless

core/state/snapshot: plugin stacktrie

core: polish

core/state/snapshot: iterate storage concurrently

core/state/snapshot: fix iteration

core: add comments

core/state/snapshot: polish code

core/state: polish

core/state/snapshot: rebase

core/rawdb: add comments

core/rawdb: fix tests

core/rawdb: improve tests

core/state/snapshot: fix concurrent iteration

core/state: run pruning during the recovery

core, trie: implement martin's idea

core, eth: delete flatdb and polish pruner

trie: fix import

core/state/pruner: add log

core/state/pruner: fix issues

core/state/pruner: don't read back

core/state/pruner: fix contract code write

core/state/pruner: check root node presence

cmd, core: polish log

core/state: use HEAD-127 as the target

core/state/snapshot: improve tests

cmd/geth: fix verification tool

cmd/geth: use HEAD as the verification default target

all: replace the bloomfilter with martin's fork

cmd, core: polish code

core, cmd: forcibly delete state root

core/state/pruner: add hash64

core/state/pruner: fix blacklist

core/state: remove blacklist

cmd, core: delete trie clean cache before pruning

cmd, core: fix lint

cmd, core: fix rebase

core/state: fix the special case for clique networks

core/state/snapshot: remove useless code

core/state/pruner: capping the snapshot after pruning

cmd, core, eth: fixes

core/rawdb: update db inspector

cmd/geth: polish code

core/state/pruner: fsync bloom filter

cmd, core: print warning log

core/state/pruner: adjust the parameters for bloom filter

cmd, core: create the bloom filter by size

core: polish

core/state/pruner: sanitize invalid bloomfilter size

cmd: address comments

cmd/geth: address comments

cmd/geth: address comment

core/state/pruner: address comments

core/state/pruner: rename homedir to datadir

cmd, core: address comments

core/state/pruner: address comment

core/state: address comments

core, cmd, tests: address comments

core: address comments

core/state/pruner: release the iterator after each commit

core/state/pruner: improve pruner

cmd, core: adjust bloom paramters

core/state/pruner: fix lint

core/state/pruner: fix tests

core: fix rebase

core/state/pruner: remove atomic rename

core/state/pruner: address comments

all: run go mod tidy

core/state/pruner: avoid false-positive for the middle state roots

core/state/pruner: add checks for middle roots

cmd/geth: replace crit with error

* core/state/pruner: fix lint

* core: drop legacy bloom filter

* core/state/snapshot: improve pruner

* core/state/snapshot: polish concurrent logs to report ETA vs. hashes

* core/state/pruner: add progress report for pruning and compaction too

* core: fix snapshot test API

* core/state: fix some pruning logs

* core/state/pruner: support recovering from bloom flush fail

Co-authored-by: Péter Szilágyi <peterke@gmail.com>
2021-02-08 13:16:30 +02:00
Martin Holst Swende 81678971db
trie, tests/fuzzers: implement a stacktrie fuzzer + stacktrie fixes (#21799)
* trie: fix error in stacktrie not committing small roots

* fuzzers: make trie-fuzzer use correct returnvalues

* trie: improved tests

* tests/fuzzers: fuzzer for stacktrie vs regular trie

* test/fuzzers: make stacktrie fuzzer use 32-byte keys

* trie: fix error in stacktrie with small nodes

* trie: add (skipped) testcase for stacktrie

* tests/fuzzers: address review comments for stacktrie fuzzer

* trie: fix docs in stacktrie
2020-11-09 15:08:12 +01:00
Martin Holst Swende 348c3bc47d
trie: fix flaw in stacktrie pool reuse (#21699) 2020-10-13 13:21:25 +02:00
gary rong 86dd005544
trie: polish commit function (#21692)
* trie: polish commit function

* trie: fix typo
2020-10-12 12:08:04 +02:00
Guillaume Ballet 6c8310ebb4
trie: use stacktrie for Derivesha operation (#21407)
core/types: use stacktrie for derivesha

trie: add stacktrie file

trie: fix linter

core/types: use stacktrie for derivesha

rebased: adapt stacktrie to the newer version of DeriveSha

Co-authored-by: Martin Holst Swende <martin@swende.se>

More linter fixes

review feedback: no key offset for nodes converted to hashes

trie: use EncodeRLP for full nodes

core/types: insert txs in order in derivesha

trie: tests for derivesha with stacktrie

trie: make stacktrie use pooled hashers

trie: make stacktrie reuse tmp slice space

trie: minor polishes on stacktrie

trie/stacktrie: less rlp dancing

core/types: explain the contorsions in DeriveSha

ci: fix goimport errors

trie: clear mem on subtrie hashing

squashme: linter fix

stracktrie: use pooling, less allocs (#3)

trie: in-place hex prefix, reduce allocs and add rawNode.EncodeRLP

Reintroduce the `[]node` method, add the missing `EncodeRLP` implementation for `rawNode` and calculate the hex prefix in place.

Co-authored-by: Martin Holst Swende <martin@swende.se>

Co-authored-by: Martin Holst Swende <martin@swende.se>
2020-09-29 17:38:13 +02:00