=== NAME ===

blockHashMake - Make a block hash file for Deadwood 

=== DESCRIPTION ===

blockHashMake is a standalone command line tool which converts a list
of host names into a block hash file. Deadwood can read this file to
block a large number of hosts quickly while using a minimum amount of
memory to store the list of blocked hosts.

A block hash file uses a special binary format for storing a list of 
blocked host names. 

blockHashMake reads the list of host names from the standard input and 
generates a binary file. 

=== COMMAND LINE ARGUMENTS ===

blockHashMake can be invoked without command line arguments. If invoked
without arguments, blockHashMake reads the list of host names to block
from standard input and outputs the block hash to a file named
"bigBlock.bin".

blockHashMake can be invoked with a single "--help" or "--version" 
command line argument (e.g. "blockHashMake --version") which will 
output the version number of blockHashMake and provide basic usage 
information. 

The command line arguments are as follows:

blockHashMake [filename] [sip hash key] [hash bucket count]

The filename is the name of the file the block hash is written to.
If not specified, blockHashMake will output to the file named
"bigBlock.bin". blockHashMake should not overwrite an already existing
file; if a file named "bigBlock.bin" (or the filename specified on the
command line) already exists, be sure to delete the file before
invoking blockHashMake to recreate the file.
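
The delete-then-recreate step described above can be sketched in
Python. This is only an illustration: it assumes blockHashMake is in
the PATH and exits with a non-zero status on failure, and the function
name rebuild_block_hash is made up for this example.

```python
import os
import subprocess

def rebuild_block_hash(host_list_path, filename="bigBlock.bin"):
    """Delete any old block hash file, then recreate it from a host list."""
    # blockHashMake should not overwrite an existing file, so remove it first.
    if os.path.exists(filename):
        os.remove(filename)
    # The host list is fed to blockHashMake on standard input.
    with open(host_list_path, "rb") as hosts:
        # Assumption: a non-zero exit status signals failure.
        subprocess.run(["blockHashMake", filename], stdin=hosts, check=True)
```

For example, rebuild_block_hash("hosts.txt") would replace an old
"bigBlock.bin" with one built from the hypothetical file "hosts.txt".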

The sip hash key is usually set by the blockHashMake program, which, by
default, uses /dev/urandom to generate a random 64-bit key for the
block hash file (the Windows port of blockHashMake uses the
CryptGenRandom function to get a random 64-bit key). Giving the sip
hash key a value of 0 makes a block hash file which can be shared on
the internet.

Warning: For security purposes, please set the sip hash key to 0 if 
sharing a block hash file on the internet! 

Deadwood will only load a block hash file with a sip hash key of 0 if 
allow_block_hash_zero_key has a value of 1. 

A user-specified sip hash key only has up to 16 bits of entropy. The
sip hash key argument should not be used if a secret key for the hash
compression algorithm is desired.

The hash bucket count is the number of hash buckets the resulting block
hash file will have. Having more hash buckets makes the block hash file
larger, but sometimes makes searching for a string in a block hash a
little faster. The default value, which is 125% of the number of host
names given to blockHashMake, is a reasonable compromise between speed
and size.
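
As a rough illustration of the bucket count arithmetic, the following
Python sketch computes a bucket count as a percentage of the number of
host names and builds a full argument list. The helper names are
hypothetical, rounding down is an assumption (this page does not say
how blockHashMake rounds), and passing the key and count as decimal
strings is likewise an assumption.

```python
def bucket_count(host_count, percent=125):
    """Bucket count as a percentage of the host count, rounded down."""
    # 125 percent mirrors the documented default; rounding is assumed.
    return host_count * percent // 100

def build_argv(filename, sip_key, buckets):
    """Arguments in the order: filename, sip hash key, hash bucket count."""
    # The arguments are positional, so a bucket count can only be
    # given after a filename and a sip hash key.
    return ["blockHashMake", filename, str(sip_key), str(buckets)]
```

With 1000 host names, bucket_count(1000) gives 1250, while
bucket_count(1000, 200) gives 2000 for a larger file with more buckets.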

=== HOST LIST FORMAT ===

After being invoked, blockHashMake reads a list of host names from the 
standard input. The format is a single host name per line of input, 
such as the following:

porn.example.com 
naughty.foo 
evil.host.invalid

Each line is a host name. Should there be a duplicate host name,
blockHashMake will only store one instance of the host name in
question. Host names are case insensitive; upper case ASCII letters are
converted into lower case letters before adding the host name to the
block hash generated by blockHashMake.

To allow notes in files that blockHashMake reads, blockHashMake has
simple support for comments: any line which begins with the #
character is ignored by the blockHashMake program.

For example:

# Porn sites 
porn.example.com 
fetish.example.net 
# Phishing sites 
naughty.foo 
evil.host.invalid

This will add porn.example.com, fetish.example.net, naughty.foo, 
and evil.host.invalid to the block hash file, while ignoring the two 
lines which start with #. 
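
The rules above (comment lines ignored, upper case ASCII letters made
lower case, duplicates stored once) can be mirrored in a short Python
sketch, for example to preview a host list before running
blockHashMake. This is not blockHashMake's own code; the function name
is hypothetical, and skipping blank lines and trimming trailing
whitespace are assumptions.

```python
def clean_host_list(lines):
    """Return unique, lower-cased host names, skipping # comment lines."""
    seen = set()
    hosts = []
    for line in lines:
        name = line.rstrip()  # assumption: trailing whitespace is dropped
        if not name or name.startswith("#"):
            continue  # comment line (blank lines skipped by assumption)
        # Convert only the ASCII letters A-Z to lower case.
        name = "".join(
            chr(ord(c) + 32) if "A" <= c <= "Z" else c for c in name
        )
        if name not in seen:  # keep one instance of each host name
            seen.add(name)
            hosts.append(name)
    return hosts
```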

blockHashMake has no support for Punycode. Please use another program
to convert international domain names with non-ASCII characters into
their Punycode representation before adding them to a block hash with
blockHashMake.
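
One possible way to do this conversion is Python's built-in "idna"
codec, sketched below. Note that this codec implements the older IDNA
2003 rules, so it may not match every modern registry's handling of a
name; the function name to_punycode is made up for this example.

```python
def to_punycode(host):
    """Convert a host name to its ASCII (Punycode) form."""
    # The built-in "idna" codec encodes each non-ASCII label as xn--...
    return host.encode("idna").decode("ascii")
```

For example, to_punycode("bücher.example") returns
"xn--bcher-kva.example", which can then be given to blockHashMake.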

=== LIMITATIONS ===

The block hash format that blockHashMake uses is a 32-bit format, and
the resulting block hash file should be under 2,147,483,648 bytes in
size. In practice, this limits a block hash file to around 30 million
host names.
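
A simple size check after generating a file can be sketched in Python
as follows. The function name is hypothetical, and the check covers
only the documented size limit, not whether the file is otherwise
valid.

```python
import os

# 2,147,483,648 bytes (2 to the power of 31), the documented size limit.
BLOCK_HASH_SIZE_LIMIT = 2_147_483_648

def fits_block_hash_limit(path):
    """Return True if the file at path is under the documented limit."""
    return os.path.getsize(path) < BLOCK_HASH_SIZE_LIMIT
```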

=== LEGAL DISCLAIMERS ===

THIS SOFTWARE IS PROVIDED BY THE AUTHORS ''AS IS'' AND ANY EXPRESS OR 
IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED 
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE 
DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE FOR 
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL 
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS 
OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) 
HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, 
STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING 
IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE 
POSSIBILITY OF SUCH DAMAGE. 

This is a project developed on a strictly volunteer, non-commercial 
basis. It has been developed outside the course of a commercial 
activity, developed entirely in the Americas (i.e. outside of Europe) 
and therefore is not subject to the restrictions or conditions of the 
proposed EU Cyber Resilience Act. Someone selling a product that uses 
any component of this may be subject to this act and may need to handle 
any and all necessary compliance. 

=== AUTHORS ===

Sam Trenholme (https://www.samiam.org) is responsible for this program 
and man page.  

