Rikard A. Hjort
Jan 8, 2019


Well done!

I got excited and started thinking about a mitigation.

It should be possible to change the Bloom filter function dynamically, with little overhead in block size. Then, if the current Bloom filter fills up (or exceeds some load threshold) under an attack like yours, the nodes could switch to a different function for that block. The attacker would then need to generate twice as many events to fill up the second filter as well, in which case the nodes could switch to a third function, and so on, driving up the cost of the attack. The nodes just build new Bloom filters, discarding them until they reach one that is sufficiently sparse.
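Roughly, in Python (a sketch only: `filter_indices` here is a placeholder for the family of filter functions I sketch in the next few paragraphs, and the 50% load threshold is an arbitrary choice of mine):

```python
import hashlib

BLOOM_BITS = 2048  # size of Ethereum's log bloom

def filter_indices(entry: bytes, variant: int) -> list[int]:
    # Placeholder family of filter functions, one per variant code: mix the
    # variant into the hash. The concrete family I have in mind (byte shifts,
    # bit windows, re-hashing) is fleshed out in the paragraphs below.
    h = hashlib.sha3_256(bytes([variant]) + entry).digest()
    return [((h[2 * i] << 8) | h[2 * i + 1]) % BLOOM_BITS for i in range(3)]

def build_bloom(entries: list[bytes], max_load: float = 0.5) -> tuple[int, set[int]]:
    """Build filters, discarding them until one is sparse enough; return the
    variant code that worked together with the set bits."""
    for variant in range(256):
        bits: set[int] = set()
        for entry in entries:
            bits.update(filter_indices(entry, variant))
        if len(bits) / BLOOM_BITS <= max_load:
            return variant, bits
    raise RuntimeError("attacker outlasted all 256 variants")
```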

According to the yellow paper (Equations (26)–(30), Byzantium version 69351d5), the Bloom filter function takes the (256-bit, or 32-byte) hash of the log event (seen as a byte sequence), then looks at the first 3 pairs of bytes of the hash. Well, what if we shift by 1 byte, so that instead of looking at bytes 0, 1,…,5 we look at bytes 1,…,6? That's a new function that would give a completely different Bloom filter. And if that filter fills up, we can look at bytes 6,…,11, producing a new filter, and then move on to 7,…,12. That's 4 different filters, multiplying the attack cost by 4, and we could even reuse the hashes already computed, which should be the majority of the work of building these filters.
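Here is how I picture the shifted variant, as a rough Python sketch (hashlib's sha3_256 stands in for Ethereum's Keccak-256, which differs only in its padding rule; the names and the `shift` parameter are mine):

```python
import hashlib

BLOOM_BITS = 2048

def keccak(data: bytes) -> bytes:
    # Stand-in: Ethereum uses the original Keccak-256, which differs from
    # NIST SHA3-256 only in padding.
    return hashlib.sha3_256(data).digest()

def bloom_indices(entry: bytes, shift: int = 0) -> list[int]:
    """The 3 bit indices an entry sets. shift=0 reads byte pairs (0,1),
    (2,3), (4,5) of the hash, as the yellow paper prescribes; shift=1
    reads (1,2), (3,4), (5,6); and so on."""
    h = keccak(entry)
    indices = []
    for i in range(3):
        pair = (h[shift + 2 * i] << 8) | h[shift + 2 * i + 1]
        indices.append(pair % BLOOM_BITS)  # low-order 11 bits
    return indices
```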

If we are pushed to the limit (having used bytes up to the 29th byte and produced 8 different filters), we can do another trick. The Bloom filter function takes a pair of bytes and reduces it modulo 2048 (i.e., keeps the low-order 11 bits) to get the index of the bit to set in the filter. We could instead take the high-order 11 bits, giving a pretty independent result (although 6 bits overlap, so not completely independent), and then redo the whole process from the paragraph above. That's 16 different Bloom filters, with no new hashing needed.
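Continuing the sketch, the high-order window is a one-line change; the comment notes the 6-bit overlap:

```python
def bloom_indices_high(entry: bytes, shift: int = 0) -> list[int]:
    """Like bloom_indices, but keeping the high-order 11 of each pair's 16
    bits. Bits 5..10 are shared with the low-order window, so the two
    filters are mostly, though not completely, independent."""
    h = keccak(entry)  # keccak() as defined in the previous sketch
    return [((h[shift + 2 * i] << 8) | h[shift + 2 * i + 1]) >> 5
            for i in range(3)]
```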

Say the attacker is really tenacious and blasts through those 16 filters. Then we need to up our game and hash the hash. We can then go again from the top, producing 16 new filters. If the attacker blasts through all of those, we triple hash the events, then quadruple hash them, and so on.
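All of the variants so far fit in one parameterized function (again just a sketch; `rounds` counts total hash applications and is my name for it):

```python
def bloom_indices_v(entry: bytes, rounds: int = 1,
                    shift: int = 0, high: bool = False) -> list[int]:
    """Every variant in one place: hash `rounds` times, read 3 byte pairs
    starting at byte `shift`, keep the low- or high-order 11 bits of each."""
    h = entry
    for _ in range(rounds):
        h = keccak(h)  # keccak() as defined above
    indices = []
    for i in range(3):
        pair = (h[shift + 2 * i] << 8) | h[shift + 2 * i + 1]
        indices.append(pair >> 5 if high else pair % BLOOM_BITS)
    return indices
```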

So how do we indicate which of these algorithms to use when querying the Bloom filter, i.e., how many filters we discarded? A single byte suffices: in the general case it would be 0, but during an attack it would start ticking up, indicating the number of discarded filters. The four lowest bits indicate the shift in the hashed bytes we look at, and the four highest indicate how many times we hashed each event. If this byte is 00110010, the attacker blew through a lot of filters, but for lookup we just need to know to hash the event 3 times (0011) and then look at bytes 6,…,11 (0010).
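One possible decoding of that byte (the shift table is my guess at the enumeration above, and treating a zero high nibble as a single hash is likewise an assumption):

```python
SHIFTS = [0, 1, 6, 7, 12, 13, 18, 19]  # guessed enumeration of the 8 shifts

def decode_selector(selector: int) -> tuple[int, int, bool]:
    """Split the one-byte selector into (hash rounds, byte shift, high bits?).
    0x00 is the normal case: single hash, bytes 0..5, low-order bits."""
    rounds = max(1, selector >> 4)   # high nibble; 0 still means one hash
    variant = selector & 0x0F        # low nibble: which of the 16 filters
    return rounds, SHIFTS[variant & 0b0111], bool(variant >> 3)

# The example from the text: 00110010 -> triple hash, bytes 6..11.
assert decode_selector(0b00110010) == (3, 6, False)
```

Lookup is then just bloom_indices_v(entry, *decode_selector(header_byte)), using the sketch functions from above.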

This got me thinking; I'll see if I can come up with other mitigation strategies!
