This is the documentation for the open-source C++ implementation of
the folded hierarchy of classifiers for cat detection described in

F. Fleuret and D. Geman, "Stationary Features and Cat Detection",
Journal of Machine Learning Research (JMLR), 2008, to appear.

Please use that citation and the URL

http://www.idiap.ch/folded-ctf/

when referring to this software.

Contact Francois Fleuret at francois.fleuret@idiap.ch for comments

If you have installed the RateMyKitten images, available on the same
web page as the source code, in the same directory as the source
code, everything should work seamlessly by invoking the ./run.sh
script, which will

* Compile the source code entirely

* Generate the "pool file" containing the uncompressed images
converted to gray levels, labelled with the ground truth.

* Run 20 rounds of training / test (ten rounds for each of HB and
H+B detectors with different random seeds)

You can run the full thing with the following commands if you have

> wget http://www.idiap.ch/folded-ctf/data/folding-v1.0.tgz
> tar zxvf folding-v1.0.tgz
> wget http://www.idiap.ch/folded-ctf/data/rmk-v1.0.tgz
> tar zxvf rmk-v1.0.tgz

Note that for every round, we have to fully train a detector and run
the test through all the test scenes at 10 different thresholds,
including very conservative thresholds for which the computational
effort is very high. Hence, each round takes more than three days on
a powerful PC. However, the script detects already running
computations by checking for the presence of the corresponding
result directories. Hence, it can be run in parallel on several
machines as long as they see the same result directory.

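The directory check described above can be sketched as follows. This
is only an illustration and not the code of run.sh: the function
names claim_round and run_all_rounds and the directory naming are
hypothetical. The sketch relies on mkdir failing when the directory
already exists, so that only one process claims a given round.

```shell
# Hypothetical helper: claim a round by creating its result directory.
# mkdir (without -p) fails if the directory already exists, so only one
# caller can succeed for a given directory.
claim_round() {
    mkdir "$1" 2>/dev/null
}

# Hypothetical layout: one result directory per detector type and seed.
run_all_rounds() {
    local base="$1"
    local type seed dir
    for type in hb h+b; do
        for seed in 0 1 2 3 4 5 6 7 8 9; do
            dir="${base}/${type}-${seed}"
            if claim_round "${dir}"; then
                echo "running ${dir}"
                # The training / test command for this round would go here.
            else
                echo "skipping ${dir}"
            fi
        done
    done
}
```

Calling run_all_rounds with the same base directory from several
machines would make each of them pick only the rounds nobody has
claimed yet.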
When all or some of the experimental rounds are over, you can
generate ROC curves by invoking the ./graph.sh script. You need a
fairly recent version of Gnuplot.

If you pass the argument "pics" to the ./graph.sh script, it will
save images from the data set with the ground truth plotted on them,
the pose-indexed referential, and examples of the pose-indexed

This program was developed on Debian GNU/Linux computers with the
following main tool versions

* GNU bash, version 3.2.39

* gnuplot 4.2 patchlevel 4

Due to approximations in the optimized arithmetic operations with
g++, results may vary with different versions of the compiler and/or
different levels of optimization.

The main command has to be invoked with a list of parameter values,
followed by commands to execute. A parameter value is modified by
adding an argument of the form --parameter-name=value.

For instance, to open a scene pool ./something.pool, train a
detector and save it with all other parameters kept at their default
values:

./folding --pool-name=./something.pool open-pool train-detector write-detector

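As a further illustration of the same syntax, the sketch below wraps
an invocation that sets several parameters before the commands. It
only uses parameter and command names that appear in this document.
The wrapper name train_hplusb and the FOLDING variable, which allows
printing the command line instead of running it, are hypothetical.

```shell
# Hypothetical override: set FOLDING=echo to print the arguments
# instead of running the program.
FOLDING="${FOLDING:-./folding}"

# Train an H+B detector from a pool and save it.
# Arguments: pool file name, detector file name.
train_hplusb() {
    local pool="$1" det="$2"
    # Parameters come first, commands come last.
    ${FOLDING} \
        --pool-name="${pool}" \
        --detector-name="${det}" \
        --force-head-belly-independence=yes \
        open-pool train-detector write-detector
}
```

For instance, after setting FOLDING=echo, calling
train_hplusb ./something.pool hplusb.det prints the arguments that
would be passed to ./folding.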
For every parameter below, the default value is given between
parentheses.

* pictures-for-article ("no")

Should the pictures be generated for printing in black and white.

The scene pool file name.

* test-pool-name (none)

Should we use a separate test pool file. If none is given, then
the test scenes are taken at random from the main pool file
according to proportion-for-test.

* detector-name ("default.det")

Where to write or from where to read the detector.

* result-path ("/tmp/")

In what directory should we save all the produced files during the

* loss-type ("exponential")

What kind of loss to use for the boosting. While different losses
are implemented in the code, only the exponential has been

How many images to process in list_to_pool or when using the
write-pool-images command.

Maximum depth of the decision trees used as weak learners in the
classifier. The default value of 1 corresponds to stumps.

* proportion-negative-cells-for-training (0.025)

Overall proportion of negative cells to use during learning (we
sample among them for boosting).

* nb-negative-samples-per-positive (10)

How many negative cells to sample for every positive cell during

* nb-features-for-boosting-optimization (10000)

How many pose-indexed features to look at for optimization at
every step of boosting.

* force-head-belly-independence ("no")

Should we force the independence between the two levels of the
detector (i.e. make an H+B detector)

* nb-weak-learners-per-classifier (100)

This parameter corresponds to the value U in the article.

* nb-classifiers-per-level (25)

This parameter corresponds to the value B in the article.

How many levels in the hierarchy.

* proportion-for-train (0.5)

The proportion of scenes from the pool to use for training.

* proportion-for-validation (0.25)

The proportion of scenes from the pool to use for estimating the

* proportion-for-test (0.25)

The proportion of scenes from the pool to use to test the

* write-validation-rocs ("no")

Should we compute and save the ROC curves estimated on the
validation set during training.

* write-parse-images ("no")

Should we save one image for every test scene with the resulting
alarms. This option generates a lot of images for every round and
is switched off by default. Switch it on to produce images such as
the full page of results in the paper.

* write-tag-images ("no")

Should we save the (very large) tag images when saving the

* wanted-true-positive-rate (0.75)

What is the target true positive rate. Note that this is the rate
without post-processing and without pose tolerance in the
definition of a true positive.

* nb-wanted-true-positive-rates (10)

How many true positive rates to visit to generate the pseudo-ROC.

* min-head-radius (25)

What is the radius of the smallest heads we are looking for.

* max-head-radius (200)

What is the radius of the largest heads we are looking for.

* root-cell-nb-xy-per-radius (5)

What is the size of a (x,y) square cell with respect to the radius

* pi-feature-window-min-size (0.1)

What is the minimum pose-indexed feature window size with respect
to the frame they are defined in.

* nb-scales-per-power-of-two (5)

How many scales do we visit between two powers of two.

* progress-bar ("yes")

Should we display a progress bar during long computations.

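As an illustration of how min-head-radius, max-head-radius and
nb-scales-per-power-of-two may interact, the sketch below lists head
radii spaced geometrically, with nb-scales-per-power-of-two steps
for every doubling of the radius. The geometric spacing, the
inclusion of both end points and the helper name list_radii are
assumptions, not taken from the program.

```shell
# Hypothetical helper: list_radii MIN_RADIUS MAX_RADIUS SCALES_PER_POWER_OF_TWO
# Prints one radius per line, r_k = MIN_RADIUS * 2^(k / SCALES_PER_POWER_OF_TWO),
# for k = 0, 1, ... as long as r_k does not exceed MAX_RADIUS.
list_radii() {
    awk -v rmin="$1" -v rmax="$2" -v n="$3" 'BEGIN {
        k = 0
        r = rmin
        # The small relative tolerance keeps the last radius despite rounding.
        while (r <= rmax * (1 + 1e-9)) {
            printf "%.2f\n", r
            k++
            r = rmin * exp(log(2) * k / n)
        }
    }'
}
```

With the default values given above, list_radii 25 200 5 covers three
doublings of the radius in steps of one fifth of a doubling.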
Open the pool of scenes.

Create a new detector from the training scenes.

Compute the thresholds of the detector classifiers from the
validation set to obtain the required wanted-true-positive-rate.

Run the detector on the test scenes.

* sequence-test-detector

Visit nb-wanted-true-positive-rates rates between 0 and
wanted-true-positive-rate. For each, compute the detector
thresholds on the validation set and estimate the error rate on

Write the current detector to the file detector-name.

Read a detector from the file detector-name.

For each of the first nb-images images of the pool, save one PNG
image with the ground truth, one with the corresponding referential
at the reference scale, and one with the feature
material-feature-nb from the detector. This last image is not saved
if either no detector has been read/trained or if no feature number
has been specified.

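The two steps performed by sequence-test-detector, visiting several
true positive rates and computing a threshold for each, can be
pictured with the sketch below. It works on a plain list of numbers
rather than on a detector, and everything in it is an assumption
made for illustration: the helper names visited_rates and
threshold_for_rate, the evenly spaced rates, and the use of a single
threshold on a list of responses to positive samples, whereas the
detector has one threshold per classifier.

```shell
# Hypothetical helper: visited_rates WANTED_RATE NB_RATES
# Prints NB_RATES evenly spaced rates, the last one being WANTED_RATE.
visited_rates() {
    awk -v wanted="$1" -v n="$2" 'BEGIN {
        for (i = 1; i <= n; i++) printf "%.3f\n", wanted * i / n
    }'
}

# Hypothetical helper: threshold_for_rate RATE < responses
# Reads one response per line and prints the largest threshold t such
# that at least a proportion RATE of the responses are >= t.
threshold_for_rate() {
    sort -rn | awk -v rate="$1" '
        { s[NR] = $1 }
        END {
            k = int(rate * NR)
            if (k < rate * NR) k++    # round up
            if (k < 1) k = 1
            print s[k]
        }'
}
```

For example, with the four responses 0.1, 0.9, 0.5 and 0.7 on the
standard input, threshold_for_rate 0.75 keeps three of them above the
threshold.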