SETI — Peer to peer set intersections
The SETI service implements efficient set intersection between two peers over a CADET tunnel. Elements of a set consist of an element type and arbitrary binary data. The size of an element’s data is limited to around 62 KB.
Intersection Sets
Sets created by a local client can be modified (by adding additional elements) and reused for multiple operations. If elements are to be removed, a fresh set must be created by the client.
Set Intersection Modifications
Even when set operations are active, one can add elements to a set. However, these changes will only be visible to operations that have been created after the changes have taken place. That is, every set operation only sees a snapshot of the set from the time the operation was started. This mechanism is not implemented by copying the whole set, but by attaching generation information to each element and operation.
Set Intersection Operations
Set operations can be started in two ways: Either by accepting an operation request from a remote peer, or by requesting a set operation from a remote peer. Set operations are uniquely identified by the involved peers, an application id and the operation type.
The client is notified of incoming set operations by set listeners. A set listener listens for incoming operations of a specific operation type and application id. Once notified of an incoming set request, the client can accept the set request (providing a local set for the operation) or reject it.
Intersection Result Elements
The SET service has two result modes that determine how an operation’s result set is delivered to the client:
Return intersection. All elements of set resulting from the set intersection are returned to the client.
Removed Elements. Only elements that are in the local peer’s initial set but not in the intersection are returned.
libgnunetseti libgnunetseti ————-
Intersection Set API
New sets are created with GNUNET_SETI_create
. Only the local peer’s
configuration (as each set has its own client connection) must be
provided. The set exists until either the client calls
GNUNET_SET_destroy
or the client’s connection to the service is
disrupted. In the latter case, the client is notified by the return
value of functions dealing with sets. This return value must always be
checked.
Elements are added with GNUNET_SET_add_element
.
Intersection Listeners
Listeners are created with GNUNET_SET_listen
. Each time time a
remote peer suggests a set operation with an application id and
operation type matching a listener, the listener’s callback is invoked.
The client then must synchronously call either GNUNET_SET_accept
or
GNUNET_SET_reject
. Note that the operation will not be started until
the client calls GNUNET_SET_commit
(see Section "Supplying a
Set").
Intersection Operations
Operations to be initiated by the local peer are created with
GNUNET_SET_prepare
. Note that the operation will not be started
until the client calls GNUNET_SET_commit
(see Section "Supplying a
Set").
Supplying a Set for Intersection
To create symmetry between the two ways of starting a set operation
(accepting and initiating it), the operation handles returned by
GNUNET_SET_accept
and GNUNET_SET_prepare
do not yet have a set
to operate on, thus they can not do any work yet.
The client must call GNUNET_SET_commit
to specify a set to use for
an operation. GNUNET_SET_commit
may only be called once per set
operation.
The Intersection Result Callback
Clients must specify both a result mode and a result callback with
GNUNET_SET_accept
and GNUNET_SET_prepare
. The result callback
with a status indicating either that an element was received, or the
operation failed or succeeded. The interpretation of the received
element depends on the result mode. The callback needs to know which
result mode it is used in, as the arguments do not indicate if an
element is part of the full result set, or if it is in the difference
between the original set and the final set.
The SETI Client-Service Protocol
Creating Intersection Sets
For each set of a client, there exists a client connection to the
service. Sets are created by sending the GNUNET_SERVICE_SETI_CREATE
message over a new client connection. Multiple operations for one set
are multiplexed over one client connection, using a request id supplied
by the client.
Listeners for Intersection
Each listener also requires a separate client connection. By sending the
GNUNET_SERVICE_SETI_LISTEN
message, the client notifies the service
of the application id and operation type it is interested in. A client
rejects an incoming request by sending GNUNET_SERVICE_SETI_REJECT
on
the listener’s client connection. In contrast, when accepting an
incoming request, a GNUNET_SERVICE_SETI_ACCEPT
message must be sent
over the set that is supplied for the set operation.
Initiating Intersection Operations
Operations with remote peers are initiated by sending a
GNUNET_SERVICE_SETI_EVALUATE
message to the service. The client
connection that this message is sent by determines the set to use.
Modifying Intersection Sets
Sets are modified with the GNUNET_SERVICE_SETI_ADD
message.
Intersection Results and Operation Status
The service notifies the client of result elements and success/failure
of a set operation with the GNUNET_SERVICE_SETI_RESULT
message.
The SETI Intersection Peer-to-Peer Protocol
The intersection protocol operates over CADET and starts with a GNUNET_MESSAGE_TYPE_SETI_P2P_OPERATION_REQUEST being sent by the peer initiating the operation to the peer listening for inbound requests. It includes the number of elements of the initiating peer, which is used to decide which side will send a Bloom filter first.
The listening peer checks if the operation type and application identifier are acceptable for its current state. If not, it responds with a GNUNET_MESSAGE_TYPE_SETI_RESULT and a status of GNUNET_SETI_STATUS_FAILURE (and terminates the CADET channel).
If the application accepts the request, the listener sends back a
GNUNET_MESSAGE_TYPE_SETI_P2P_ELEMENT_INFO
if it has more elements in
the set than the client. Otherwise, it immediately starts with the Bloom
filter exchange. If the initiator receives a
GNUNET_MESSAGE_TYPE_SETI_P2P_ELEMENT_INFO
response, it beings the
Bloom filter exchange, unless the set size is indicated to be zero, in
which case the intersection is considered finished after just the
initial handshake.
The Bloom filter exchange in SETI
In this phase, each peer transmits a Bloom filter over the remaining
keys of the local set to the other peer using a
GNUNET_MESSAGE_TYPE_SETI_P2P_BF
message. This message additionally
includes the number of elements left in the sender’s set, as well as the
XOR over all of the keys in that set.
The number of bits ‘k’ set per element in the Bloom filter is calculated based on the relative size of the two sets. Furthermore, the size of the Bloom filter is calculated based on ‘k’ and the number of elements in the set to maximize the amount of data filtered per byte transmitted on the wire (while avoiding an excessively high number of iterations).
The receiver of the message removes all elements from its local set that
do not pass the Bloom filter test. It then checks if the set size of the
sender and the XOR over the keys match what is left of its own set. If
they do, it sends a GNUNET_MESSAGE_TYPE_SETI_P2P_DONE
back to
indicate that the latest set is the final result. Otherwise, the
receiver starts another Bloom filter exchange, except this time as the
sender.
Intersection Salt
Bloom filter operations are probabilistic: With some non-zero probability the test may incorrectly say an element is in the set, even though it is not.
To mitigate this problem, the intersection protocol iterates exchanging Bloom filters using a different random 32-bit salt in each iteration (the salt is also included in the message). With different salts, set operations may fail for different elements. Merging the results from the executions, the probability of failure drops to zero.
The iterations terminate once both peers have established that they have sets of the same size, and where the XOR over all keys computes the same 512-bit value (leaving a failure probability of 2-511).