INTRODUCTION
------------

This is the documentation for the open-source C++ implementation of
the folded hierarchy of classifiers for cat detection described in

  F. Fleuret and D. Geman, "Stationary Features and Cat Detection",
  Journal of Machine Learning Research (JMLR), 9, 2549-2578, 2008.

Please use that citation and the URL

  http://www.idiap.ch/folded-ctf/

when referring to this software.

Contact Francois Fleuret at francois.fleuret@idiap.ch for comments and
bug reports.

INSTALLATION
------------

If you have installed the RateMyKitten images, available on the same
web page as the source code, in the same directory as the source code,
everything should work seamlessly by invoking the ./run.sh script.

It will

  * Compile the source code entirely

  * Generate the "pool file" containing the uncompressed images
    converted to gray levels, labelled with the ground truth

  * Run 20 rounds of training / test (ten rounds for each of the HB
    and H+B detectors, with different random seeds)

You can run the full thing with the following commands if you have
wget installed

  > wget http://www.idiap.ch/folded-ctf/data/folding-v1.0.tgz
  > tar zxvf folding-v1.0.tgz
  > cd folding
  > wget http://www.idiap.ch/folded-ctf/data/rmk-v1.0.tgz
  > tar zxvf rmk-v1.0.tgz
  > ./run.sh

Note that for every round, we have to fully train a detector and run
the test through all the test scenes at 10 different thresholds,
including very conservative thresholds for which the computational
effort is very high. As a result, each round takes more than three
days on a powerful PC. However, the script detects already running
computations by checking for the presence of the corresponding result
directories, so it can be run in parallel on several machines as long
as they see the same result directory.

When all or some of the experimental rounds are over, you can generate
ROC curves by invoking the ./graph.sh script. You need a fairly recent
version of Gnuplot.
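
The directory-based detection of already running rounds mentioned
above can be pictured with the following minimal sketch. It is not
the actual content of run.sh: the function name claim_round, the
round-N directory names and the temporary result root are all
hypothetical, chosen only to illustrate the idea of using a result
directory as a "this round is taken" marker.

```shell
#!/bin/bash
# Hypothetical sketch, NOT the actual content of run.sh.

# claim_round DIR: succeed only if DIR did not exist before the call.
claim_round() {
    # mkdir without -p fails when the directory already exists, so
    # the existence check and the creation are done in one step.
    mkdir "$1" 2>/dev/null
}

# Stand-in for the result directory shared by all the machines.
RESULT_ROOT=$(mktemp -d)

for round in 0 1 2; do
    dir="$RESULT_ROOT/round-$round"
    if claim_round "$dir"; then
        echo "round $round: claimed, training and test would run here"
    else
        echo "round $round: result directory exists, skipping"
    fi
done
```

With such a scheme, a machine that finds the directory of a round
already present simply moves on to the next one, which is why all the
machines have to see the same result directory.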
If you pass the argument "pics" to the ./graphs.sh script, it will
save images from the data set with the ground truth plotted on them,
the pose-indexed referential, and examples of the pose-indexed feature
windows.

This program was developed on Debian GNU/Linux computers with the
following main tool versions

  * GNU bash, version 3.2.39

  * g++ 4.3.2

  * gnuplot 4.2 patchlevel 4

Due to approximations in the optimized arithmetic operations with g++,
results may vary with different versions of the compiler and/or
different levels of optimization.

EXECUTING THE PROGRAM
---------------------

The main command has to be invoked with a list of parameter values,
followed by commands to execute.

A parameter value is modified by adding an argument of the form
--parameter-name=value.

For instance, to open a scene pool ./something.pool, train a detector
and save it, with all other parameters kept at their default values,
you would do

  ./folding --pool-name=./something.pool open-pool train-detector write-detector

PARAMETERS
----------

For every parameter below, the default value is given in parentheses.

* niceness (5)
  Process priority.
* random-seed (0)
  Global random seed.
* pictures-for-article ("no")
  Should the pictures be generated for printing in black and white?
* pool-name (none)
  The scene pool file name.
* test-pool-name (none)
  Name of a separate test pool file, if any. If none is given, the
  test scenes are taken at random from the main pool file according
  to proportion-for-test.
* detector-name ("default.det")
  Where to write the detector, or where to read it from.
* result-path ("/tmp/")
  In what directory should we save all the files produced during the
  computation?
* loss-type ("exponential")
  What kind of loss to use for the boosting. While different losses
  are implemented in the code, only the exponential loss has been
  thoroughly tested.
* nb-images (-1)
  How many images to process in list_to_pool or when using the
  write-pool-images command.
* tree-depth-max (1)
  Maximum depth of the decision trees used as weak learners in the
  classifier. The default value of 1 corresponds to stumps.
* proportion-negative-cells-for-training (0.025)
  Overall proportion of negative cells to use during learning (we
  sample among them for boosting).
* nb-negative-samples-per-positive (10)
  How many negative cells to sample for every positive cell during
  training.
* nb-features-for-boosting-optimization (10000)
  How many pose-indexed features to look at for optimization at every
  step of boosting.
* force-head-belly-independence ("no")
  Should we force the independence between the two levels of the
  detector (i.e. make an H+B detector)?
* nb-weak-learners-per-classifier (100)
  This parameter corresponds to the value U in the article.
* nb-classifiers-per-level (25)
  This parameter corresponds to the value B in the article.
* nb-levels (2)
  How many levels in the hierarchy.
* proportion-for-train (0.75)
  The proportion of scenes from the pool to use for training.
* proportion-for-validation (0.25)
  The proportion of scenes from the pool to use for estimating the
  thresholds.
* proportion-for-test (0.25)
  The proportion of scenes from the pool to use to test the detector.
* write-validation-rocs ("no")
  Should we compute and save the ROC curves estimated on the
  validation set during training?
* write-parse-images ("no")
  Should we save one image for every test scene with the resulting
  alarms? This option generates a lot of images for every round and
  is switched off by default. Switch it on to produce images such as
  the full page of results in the paper.
* write-tag-images ("no")
  Should we save the (very large) tag images when saving the
  materials?
* wanted-true-positive-rate (0.75)
  What is the target true positive rate? Note that this is the rate
  without post-processing and without pose tolerance in the
  definition of a true positive.
* nb-wanted-true-positive-rates (10)
  How many true positive rates to visit to generate the pseudo-ROC.
* min-head-radius (25)
  What is the radius of the smallest heads we are looking for?
* max-head-radius (200)
  What is the radius of the largest heads we are looking for?
* root-cell-nb-xy-per-radius (5)
  What is the size of an (x,y) square cell with respect to the radius
  of the head?
* pi-feature-window-min-size (0.1)
  What is the minimum size of the pose-indexed feature windows with
  respect to the frame they are defined in?
* nb-scales-per-power-of-two (5)
  How many scales do we visit between two powers of two?
* progress-bar ("yes")
  Should we display a progress bar during long computations?

COMMANDS
--------

* open-pool
  Open the pool of scenes.
* train-detector
  Create a new detector from the training scenes.
* compute-thresholds
  Compute the thresholds of the detector classifiers from the
  validation set to obtain the required wanted-true-positive-rate.
* test-detector
  Run the detector on the test scenes.
* sequence-test-detector
  Visit nb-wanted-true-positive-rates rates between 0 and
  wanted-true-positive-rate. For each of them, compute the detector
  thresholds on the validation set and estimate the error rate on the
  test set.
* write-detector
  Write the current detector to the file detector-name.
* read-detector
  Read a detector from the file detector-name.
* write-pool-images
  For each of the first nb-images images of the pool, save one PNG
  image with the ground truth, one with the corresponding referential
  at the reference scale, and one with the feature
  material-feature-nb from the detector. This last image is not saved
  if no detector has been read or trained, or if no feature number
  has been specified.

--
Francois Fleuret
October 2008