# Introduction #

This is a port of the Synthetic Visual Reasoning Test (SVRT) problems to the
PyTorch framework, with an implementation of two convolutional
networks to solve them.

# Installation and test #

Executing

```
make -j -k
./test-svrt.py
```

should generate an image `example.png` in the current directory.

Note that image generation does not take advantage of GPUs or
multiple cores. On a 4GHz i7-6700K, it can be as fast as 10,000
vignettes per second and as slow as 40.
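
To get a feel for what these rates mean in practice, the following back-of-the-envelope sketch computes how long it would take to generate one million vignettes at each extreme. The set size of one million is an arbitrary choice for illustration, and `generation_time_seconds` is a hypothetical helper, not part of this package.

```python
def generation_time_seconds(nb_vignettes, vignettes_per_second):
    """Time needed to generate nb_vignettes at a constant rate."""
    return nb_vignettes / vignettes_per_second


NB_VIGNETTES = 1_000_000  # arbitrary set size, chosen for illustration

fastest = generation_time_seconds(NB_VIGNETTES, 10_000)  # fastest rate quoted above
slowest = generation_time_seconds(NB_VIGNETTES, 40)      # slowest rate quoted above

print(f"fastest: {fastest:.0f} s")
print(f"slowest: {slowest / 3600:.1f} h")
```

At the slow end this is close to seven hours on a single core.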

# Vignette generation and compression #

## Vignette sets ##

The file `svrtset.py` implements the classes `VignetteSet` and
`CompressedVignetteSet` with the following constructor

```
__init__(problem_number, nb_samples, batch_size, cuda = False, logger = None)
```

and the following method to return one batch

```
(torch.FloatTensor, torch.LongTensor) get_batch(b)
```

as a pair composed of a 4d 'input' Tensor (i.e. single-channel 128x128
images), and a 1d 'target' Tensor (i.e. Boolean labels).
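
A loop over such a set might look like the sketch below. Since the real classes need the compiled `svrt` module, the snippet uses `StandInVignetteSet`, a hypothetical pure-Python stand-in that only mimics the interface described above, with nested lists in place of tensors. The `nb_batches` attribute, the placeholder pixel values, and the helper `count_positive_labels` are assumptions made for illustration.

```python
class StandInVignetteSet:
    """Hypothetical stand-in for VignetteSet (not the real class).

    It only mimics the documented interface: the constructor arguments and
    get_batch(b) returning an (input, target) pair. Nested lists replace
    the tensors so the sketch runs without torch or the compiled module.
    """

    def __init__(self, problem_number, nb_samples, batch_size,
                 cuda=False, logger=None):
        self.batch_size = batch_size
        # Assumption: nb_samples is a multiple of batch_size.
        self.nb_batches = nb_samples // batch_size

    def get_batch(self, b):
        # Input: batch_size x 1 x 128 x 128, filled with placeholder values.
        images = [[[[255.0] * 128 for _ in range(128)]]
                  for _ in range(self.batch_size)]
        # Target: one 0/1 label per image, alternating here for illustration.
        labels = [i % 2 for i in range(self.batch_size)]
        return images, labels


def count_positive_labels(vignette_set, nb_batches):
    """Visit every batch once and count the samples labeled 1."""
    nb_positive = 0
    for b in range(nb_batches):
        images, labels = vignette_set.get_batch(b)
        nb_positive += sum(labels)
    return nb_positive


train_set = StandInVignetteSet(problem_number=1, nb_samples=40, batch_size=10)
nb_positive = count_positive_labels(train_set, train_set.nb_batches)
print(nb_positive)
```

With the real classes, `images` and `labels` would be tensors rather than lists, so the body of the loop would typically feed them to a model instead of summing labels.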

## Low-level functions ##

The main function for generating vignettes is

```
torch.ByteTensor svrt.generate_vignettes(int problem_number, torch.LongTensor labels)
```

where

 * `problem_number` indicates which of the 23 problems to use
 * `labels` indicates the Boolean labels of the vignettes to generate

The returned ByteTensor has three dimensions:

 * Vignette index
 * Pixel row
 * Pixel column

The two additional functions

```
torch.ByteStorage svrt.compress(torch.ByteStorage x)
```

and

```
torch.ByteStorage svrt.uncompress(torch.ByteStorage x)
```

provide a lossless compression scheme adapted to the ByteStorage of
the vignette ByteTensor (i.e. expecting a lot of 255s, a few 0s, and
no other value).
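
The encoding itself is not described here. To illustrate why data made almost entirely of 255s with a few 0s compresses so well, here is a minimal run-length coding sketch in pure Python. It is a stand-in technique chosen for illustration, not the scheme implemented by `svrt.compress`; the names `rle_compress` and `rle_uncompress` are hypothetical.

```python
def rle_compress(pixels):
    """Run-length encode a sequence of 0/255 values as [value, count] pairs."""
    runs = []
    for value in pixels:
        if value not in (0, 255):
            raise ValueError("expected only 0 or 255")
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1  # extend the current run
        else:
            runs.append([value, 1])  # start a new run
    return runs


def rle_uncompress(runs):
    """Invert rle_compress, returning the original list of values."""
    pixels = []
    for value, count in runs:
        pixels.extend([value] * count)
    return pixels


# One 128-pixel row: white background with a short black segment.
row = [255] * 60 + [0] * 8 + [255] * 60
encoded = rle_compress(row)

assert rle_uncompress(encoded) == row  # the coding is lossless
print(len(row), "values ->", len(encoded), "runs")
```

Long uniform runs collapse to a single pair, which is the property any scheme tuned to this kind of data can exploit.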

This compression reduces the memory footprint by a factor of about 50, and may
be useful for dealing with very large data sets and avoiding re-generating
images at every batch. It induces a small overhead for decompression
and for moving data from CPU to GPU memory.

See `svrtset.py` for the class `CompressedVignetteSet`, which uses it.
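
As a rough check of what this factor means, the sketch below estimates the storage needed for a set of vignettes stored at one byte per pixel, with and without compression. The set size of one million is an arbitrary choice, the factor of 50 is the approximate figure quoted above, and `footprint_bytes` is a hypothetical helper.

```python
BYTES_PER_VIGNETTE = 128 * 128  # one byte per pixel, 128x128 images
COMPRESSION_FACTOR = 50         # approximate factor quoted above


def footprint_bytes(nb_vignettes, compressed=False):
    """Approximate storage needed for nb_vignettes, in bytes."""
    raw = nb_vignettes * BYTES_PER_VIGNETTE
    return raw / COMPRESSION_FACTOR if compressed else raw


NB_VIGNETTES = 1_000_000  # arbitrary set size, chosen for illustration

raw_gb = footprint_bytes(NB_VIGNETTES) / 1e9
compressed_gb = footprint_bytes(NB_VIGNETTES, compressed=True) / 1e9

print(f"raw: {raw_gb:.1f} GB, compressed: {compressed_gb:.2f} GB")
```

Under these assumptions, a million uncompressed vignettes would need about 16.4 GB, against roughly 0.33 GB once compressed.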

# Testing convolutional networks #

The file `cnn-svrt.py` provides the implementation of two deep
networks designed by Afroze Baqapuri during an internship at Idiap,
and allows training them with several million vignettes on a PC with
16GB of memory and a GPU with 8GB.