######################################################################
## INTRODUCTION

  This is the C++ implementation of the folded hierarchy of
  classifiers for cat detection described in

     F. Fleuret and D. Geman, "Stationary Features and Cat Detection",
     Journal of Machine Learning Research (JMLR), 2008, to appear.

  Please cite this paper when referring to this software.

######################################################################
## INSTALLATION

  This program was developed on Debian GNU/Linux computers with the
  following main tool versions:

   * GNU bash, version 3.2.39
   * g++ 4.3.2
   * gnuplot 4.2 patchlevel 4

  If you have installed the RateMyKitten images provided on

    http://www.idiap.ch/folded-ctf

  in the source directory, invoking the ./run.sh script should run
  everything seamlessly. It will:

   * Compile the entire source code.

   * Generate the "pool file" containing the uncompressed images
     converted to gray levels, labeled with the ground truth.

   * Run 20 rounds of training / test (ten rounds for each of the HB
     and H+B detectors, with different random seeds).

  If you have wget installed, you can also download and run
  everything with the following commands:

   > wget http://www.idiap.ch/folded-ctf/not-public-yet/data/folding-gpl.tgz
   > tar zxvf folding-gpl.tgz
   > cd folding
   > wget http://www.idiap.ch/folded-ctf/not-public-yet/data/rmk.tgz
   > tar zxvf rmk.tgz
   > ./run.sh

  Note that each of the twenty rounds of training/testing takes more
  than three days on a powerful PC. However, the script detects
  rounds that are already running by checking for the presence of the
  corresponding result directory. Hence, it can be run in parallel on
  several machines as long as they share the same result directory.
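
  The idea of using the result directory as a marker can be sketched
  in C++ as follows. This is only an illustration, not the code of
  run.sh (which is a shell script): the function name claim_round is
  hypothetical, and the sketch assumes C++17 with std::filesystem. It
  treats the creation of the directory itself as the claim, so that a
  round is run by the first process that manages to create its
  directory.

```cpp
#include <cassert>
#include <filesystem>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (not part of the distributed code). Returns true
// if this call created result_dir, meaning the caller should run the
// corresponding round. Returns false if the directory already existed
// (another process took the round) or could not be created.
bool claim_round(const fs::path &result_dir) {
  assert(!result_dir.empty());
  std::error_code ec;
  // create_directory reports false when the directory is already there.
  const bool created = fs::create_directory(result_dir, ec);
  return created && !ec;
}
```

  A caller would loop over the round directories and run only the
  rounds for which claim_round returns true.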

  When all or some of the experimental rounds are over, you can
  generate the ROC curves by invoking the ./graph.sh script.

  You are welcome to send bug reports and comments to fleuret@idiap.ch

######################################################################
## PARAMETERS

  To set the value of a parameter during an experiment, just add an
  argument of the form --parameter-name=value before the commands that
  should take into account that value.
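
  The left-to-right behavior described above can be sketched as
  follows. This is a minimal illustration and not the actual parser
  of the program: the names ParameterMap, CommandCall and
  process_arguments are hypothetical, and values are kept as plain
  strings. Each command is recorded together with the parameter
  values set before it on the command line.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

typedef std::map<std::string, std::string> ParameterMap;

// One command together with the parameter values visible to it.
struct CommandCall {
  std::string name;
  ParameterMap parameters;
};

// Hypothetical helper: scans the arguments from left to right. An
// argument of the form --name=value updates the current parameters,
// any other argument is taken as a command that sees the values set
// so far.
std::vector<CommandCall> process_arguments(const std::vector<std::string> &args) {
  ParameterMap current;
  std::vector<CommandCall> calls;
  for (const std::string &arg : args) {
    if (arg.compare(0, 2, "--") == 0) {
      const std::string::size_type eq = arg.find('=');
      assert(eq != std::string::npos);  // the sketch requires --name=value
      current[arg.substr(2, eq - 2)] = arg.substr(eq + 1);
    } else {
      calls.push_back(CommandCall{arg, current});
    }
  }
  return calls;
}
```

  With the arguments --nb-levels=2 train-detector
  --wanted-true-positive-rate=0.75 compute-thresholds, the first
  command sees only nb-levels, while the second sees both values.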

  For every parameter below, the default value is given in
  parentheses.

  * niceness (5)

    Process priority

  * random-seed (0)

    Global random seed

  * pictures-for-article ("no")

    Should the pictures be generated for printing in black and white.

  * pool-name (no default)

    The pool file containing the data to use.

  * test-pool-name (no default)

    Separate pool file to use for testing. If it is set,
    proportion-for-test is ignored.

  * detector-name ("default.det")

    Where to write or from where to read the detector.

  * result-path ("/tmp/")

    In what directory should we save all the produced files during the
    computation.

  * loss-type ("exponential")

    What kind of loss to use for the boosting. While different losses
    are implemented in the code, only the exponential has been
    thoroughly tested.

  * nb-images (-1)

    How many images to process in list_to_pool or when using the
    write-pool-images command.

  * tree-depth-max (1)

    Maximum depth of the decision trees used as weak learners in the
    classifier. The default value corresponds to stumps.

  * proportion-negative-cells-for-training (0.025)

    Overall proportion of negative cells to use during learning (we
    sample among them)

  * nb-negative-samples-per-positive (10)

    How many negative cells to sample for every positive cell during
    training.

  * nb-features-for-boosting-optimization (10000)

    How many pose-indexed features to use at every step of boosting.

  * force-head-belly-independence ("no")

    Should we force the independence between the two levels of the
    detector (i.e. make an H+B detector)

  * nb-weak-learners-per-classifier (10)

    This parameter corresponds to the value U in the JMLR paper, and
    should be set to 100.

  * nb-classifiers-per-level (25)

    This parameter corresponds to the value B in the JMLR paper.

  * nb-levels (1)

    How many levels in the hierarchy. This should be 2 for the JMLR
    paper experiments.

  * proportion-for-train (0.5)

    The proportion of scenes from the pool to use for training.

  * proportion-for-validation (0.25)

    The proportion of scenes from the pool to use for estimating the
    thresholds.

  * proportion-for-test (0.25)

    The proportion of scenes from the pool to use to test the
    detector.

  * write-validation-rocs ("no")

    Should we compute and save the ROC curves estimated on the
    validation set during training.

  * write-parse-images ("no")

    Should we save one image for every test scene with the resulting
    alarms.

  * write-tag-images ("no")

    Should we save the (very large) tag images when saving the
    materials.

  * wanted-true-positive-rate (0.5)

    What is the target true positive rate. Note that this is the rate
    without post-processing and without pose tolerance in the
    definition of a true positive.

  * nb-wanted-true-positive-rates (10)

    How many true positive rates to visit to generate the pseudo-ROC.

  * min-head-radius (25)

    What is the radius of the smallest heads we are looking for.

  * max-head-radius (200)

    What is the radius of the largest heads we are looking for.

  * root-cell-nb-xy-per-radius (5)

    What is the size of a (x,y) square cell with respect to the radius
    of the head.

  * pi-feature-window-min-size (0.1)

    What is the minimum size of the pose-indexed feature windows with
    respect to the frame they are defined in.

  * nb-scales-per-power-of-two (5)

    How many scales do we visit between two powers of two.

  * progress-bar ("yes")

    Should we display a progress bar.
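
  As an illustration of how min-head-radius, max-head-radius and
  nb-scales-per-power-of-two may interact, the sketch below lists the
  head radii visited under the assumption that scales follow a
  geometric progression with nb-scales-per-power-of-two steps for
  each doubling of the radius. This reading of the parameter and the
  function name head_radii are assumptions; the actual enumeration in
  the source code may differ.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper: radii min_radius * 2^(k / nb_scales_per_power_of_two)
// for k = 0, 1, 2, ... as long as they do not exceed max_radius.
std::vector<double> head_radii(double min_radius, double max_radius,
                               int nb_scales_per_power_of_two) {
  assert(min_radius > 0 && max_radius >= min_radius);
  assert(nb_scales_per_power_of_two > 0);
  std::vector<double> radii;
  for (int k = 0;; ++k) {
    const double r =
        min_radius * std::pow(2.0, double(k) / nb_scales_per_power_of_two);
    // Small tolerance so that rounding does not drop the last radius.
    if (r > max_radius * (1.0 + 1e-9)) break;
    radii.push_back(r);
  }
  return radii;
}
```

  Under this assumption, the default values (25, 200 and 5) give 16
  radii, since 200 / 25 = 8 spans three powers of two.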

######################################################################
## COMMANDS

   * open-pool

     Open the pool of scenes.

   * train-detector

     Create a new detector from the training scenes.

   * compute-thresholds

     Compute the thresholds of the detector classifiers to obtain the
     true positive rate given by wanted-true-positive-rate.

   * test-detector

     Run the detector on the test scenes.

   * sequence-test-detector

     Visit nb-wanted-true-positive-rates rates between 0 and
     wanted-true-positive-rate. For each rate, compute the detector
     thresholds on the validation set, then estimate the error rate on
     the test set.

   * write-detector

     Write the current detector to the file given by detector-name.

   * read-detector

     Read a detector from the file given by detector-name.

   * write-pool-images

     Write PNG images of the scenes in the pool.
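
  The principle behind compute-thresholds and sequence-test-detector
  can be sketched for a single classifier as follows. This is a
  simplified illustration and not the code of the detector, which
  handles several classifiers. The function names are hypothetical,
  and the even spacing of the visited rates is an assumption. The
  threshold is chosen on the responses of the validation positives so
  that the requested proportion of them is at or above it.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical helper: largest threshold such that at least target_rate
// of the positive responses are greater than or equal to it.
double threshold_for_rate(std::vector<double> positive_responses,
                          double target_rate) {
  assert(!positive_responses.empty());
  assert(target_rate >= 0.0 && target_rate <= 1.0);
  // Sort the responses from the largest to the smallest.
  std::sort(positive_responses.begin(), positive_responses.end(),
            std::greater<double>());
  const std::size_t n = positive_responses.size();
  std::size_t nb_to_keep =
      static_cast<std::size_t>(std::ceil(target_rate * n));
  nb_to_keep = std::max<std::size_t>(1, std::min(nb_to_keep, n));
  return positive_responses[nb_to_keep - 1];
}

// Hypothetical helper: nb_rates evenly spaced rates in (0, max_rate].
std::vector<double> rates_to_visit(int nb_rates, double max_rate) {
  assert(nb_rates > 0);
  std::vector<double> rates;
  for (int k = 1; k <= nb_rates; ++k) {
    rates.push_back(max_rate * k / nb_rates);
  }
  return rates;
}
```

  A pseudo-ROC would then be obtained by calling threshold_for_rate
  on the validation responses for each rate returned by
  rates_to_visit, and measuring the errors on the test scenes with
  the resulting threshold.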

--
Francois Fleuret
October 2008