.. index::
   double: subsystem; REGEX

.. _REGEX-Subsystem:

REGEX — Service discovery using regular expressions
===================================================

Using the REGEX subsystem, you can discover peers that offer a
particular service using regular expressions. Peers that offer a
service describe it with a regular expression. Peers that want to
patronize a service search for it using a string. The REGEX subsystem
then uses the DHT to return a set of matching offerers to the patrons.

For the technical details, we have Max's defense talk and Max's Master's
thesis.

.. note:: An additional publication is under preparation and available to
   team members (in Git).

.. todo:: Missing links to Max's talk and Master's thesis

.. _How-to-run-the-regex-profiler:

How to run the regex profiler
-----------------------------

The gnunet-regex-profiler can be used to profile the usage of mesh/regex
for a given set of regular expressions and strings. Mesh/regex allows
you to announce your peer ID under a certain regex and to search for
peers matching a particular regex using a string. See szengel2012ms for
a full introduction.

The regex profiler uses the GNUnet testbed, so all requirements of the
testbed also apply to the regex profiler (for example, you need
password-less SSH login to the machines listed in your hosts file).

**Configuration**

You also need an appropriate configuration file. The important settings
are described below.

The regular expressions are announced by the
gnunet-daemon-regexprofiler, so you have to make sure it is started by
adding it to the START_ON_DEMAND set of ARM:

::

   [regexprofiler]
   START_ON_DEMAND = YES

Furthermore, you have to specify the location of the binary:

::

   [regexprofiler]
   # Location of the gnunet-daemon-regexprofiler binary.
   BINARY = /home/szengel/gnunet/src/mesh/.libs/gnunet-daemon-regexprofiler
   # Regex prefix that will be applied to all regular expressions and
   # search strings.
   REGEX_PREFIX = "GNVPN-0001-PAD"

When running the profiler in a large-scale deployment, you probably want
to reduce the workload of each peer. Use the following options to do
this.

::

   [dht]
   # Force network size estimation
   FORCE_NSE = 1

   [dhtcache]
   DATABASE = heap
   # Disable RC-file for Bloom filter? (for benchmarking with limited IO
   # availability)
   DISABLE_BF_RC = YES
   # Disable Bloom filter entirely
   DISABLE_BF = YES

   [nse]
   # Minimize proof-of-work CPU consumption by NSE
   WORKBITS = 1

**Options**

Finally, to run the profiler, some options and the input data need to be
specified on the command line.

::

   gnunet-regex-profiler -c config-file -d log-file -n num-links \
   -p path-compression-length -s search-delay -t matching-timeout \
   -a num-search-strings hosts-file policy-dir search-strings-file

Where:

-  ``config-file`` is the configuration file created earlier.

-  ``log-file`` is the file to which the statistics output is written.

-  ``num-links`` is the number of random links between started peers.

-  ``path-compression-length`` is the maximum path compression length in
   the DFA.

-  ``search-delay`` is the time to wait between the peers finishing
   linking and the start of string matching.

-  ``matching-timeout`` is the timeout after which the search is
   cancelled.

-  ``num-search-strings`` is the number of strings in the
   search-strings-file.

-  ``hosts-file`` contains a list of hosts for the testbed, one per
   line, in the following format:

   -  ``user@host_ip:port``

-  ``policy-dir`` is a folder of text files, each containing one or more
   regular expressions. A peer is started for each file in that folder,
   and the regular expressions in the corresponding file are announced
   by this peer.

-  ``search-strings-file`` is a text file containing search strings, one
   per line.

You can create regular expressions and search strings for every
autonomous system (AS) in the Internet using the attached scripts. You
need one of the CAIDA routeviews prefix2as data files for this. Run

::

   create_regex.py

to create the regular expressions and

::

   create_strings.py

to create a search strings file from the previously created regular
expressions.
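
For a small trial run without the CAIDA data, the input files can also
be written by hand. The following is only a sketch of the formats
described under **Options**: every file name, user name, address,
regular expression and search string shown here is hypothetical, and the
regular expressions are assumed to use only syntax that the regex
library accepts.

A hosts file (here called ``hosts.txt``) with two testbed machines, one
per line:

::

   alice@192.0.2.10:22
   alice@192.0.2.11:22

A policy directory (here called ``policies``) with one file per peer to
be started. The first file, ``policies/peer0``, might contain:

::

   ab(c|d)*

and the second file, ``policies/peer1``:

::

   xyz*

A search strings file (here called ``search-strings.txt``) with one
string per line:

::

   abcd
   xyzz

Since this search strings file holds two strings, ``num-search-strings``
would be 2. The remaining options are left as placeholders because
suitable values depend on the deployment:

::

   gnunet-regex-profiler -c config-file -d log-file -n num-links \
   -p path-compression-length -s search-delay -t matching-timeout \
   -a 2 hosts.txt policies search-strings.txt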