This is a simple profiler for Torch that estimates the processing time of each module, broken down by function.

It seems to work, but it has not been heavily tested so far. See test-profiler.lua for a short example.


A simple example would be

require 'torch'
require 'nn'
require 'profiler'

local input = torch.Tensor(100000, 20):uniform()
local target = torch.Tensor(input:size(1), 2):uniform()
local criterion = nn.MSECriterion()

local model = nn.Sequential()
   :add(nn.Linear(input:size(2), 1000))
   :add(nn.ReLU())
   :add(nn.Linear(1000, target:size(2)))

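-- Disable colored output
profiler.color = nil
-- Instrument the model before running any computation
profiler.decorate(model)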

local output = model:forward(input)
criterion:forward(output, target)
local dloss = criterion:backward(output, target)
model:backward(input, dloss)

profiler.print(model, input:size(1))

which prints

 nn.Sequential 3.03s [100.00%] (30.3mus/sample)
  :backward 1.94s [64.16%] (19.4mus/sample)
  :updateOutput 1.08s [35.84%] (10.8mus/sample)
  * nn.Linear 1.22s [40.41%] (12.2mus/sample)
    :backward 0.64s [21.24%] (6.4mus/sample)
    :updateOutput 0.58s [19.17%] (5.8mus/sample)
  * nn.ReLU 1.05s [34.62%] (10.5mus/sample)
    :backward 0.74s [24.30%] (7.4mus/sample)
    :updateOutput 0.31s [10.32%] (3.1mus/sample)
  * nn.Linear 0.76s [24.97%] (7.6mus/sample)
    :backward 0.56s [18.62%] (5.6mus/sample)
    :updateOutput 0.19s [6.34%] (1.9mus/sample)



profiler.color

This is a Boolean flag that controls whether the output is printed in color. It is true by default.

profiler.decorate(model, [functionsToDecorate])

This function should be called before starting the computation.

It replaces the functions specified in functionsToDecorate with instrumented versions that keep track of computation times. If functionsToDecorate is not provided, it decorates updateOutput and backward by default.

It also resets the accumulated timings to zero.
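
To illustrate the idea of decoration described above, here is a minimal, self-contained sketch in plain Lua. It is not the profiler's actual code: the names timings, record, decorate and fakeModule are hypothetical, and timing with os.clock() is an assumption made for the sake of the example.

```lua
-- Generic sketch of function decoration for timing.
-- All names below are hypothetical and not part of the profiler API.

local timings = {}

-- Add the elapsed time to the accumulator and pass the wrapped
-- function's return values through unchanged.
local function record(name, start, ...)
   timings[name] = timings[name] + (os.clock() - start)
   return ...
end

-- Replace obj[name] with an instrumented wrapper and reset its timing.
local function decorate(obj, name)
   local original = obj[name]
   timings[name] = 0
   obj[name] = function(...)
      local start = os.clock()
      return record(name, start, original(...))
   end
end

-- A stand-in for a module, with a deliberately slow method.
local fakeModule = {
   updateOutput = function(self, x)
      local s = 0
      for i = 1, 100000 do s = s + x * i end
      return s
   end
}

decorate(fakeModule, 'updateOutput')
local result = fakeModule:updateOutput(2)
print(result, timings.updateOutput)
```

The caller still invokes fakeModule:updateOutput(...) exactly as before; only the table entry has been swapped for the wrapper, which is why decoration has to happen before the computation starts.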

profiler.print(model, [nbSamples], [totalTime])

Prints the measured processing times. If nbSamples is provided, the time per sample is also printed. If totalTime is not provided, the total time of the top-level module is used.
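
The figures on each line of the example output are consistent with the arithmetic sketched below. This is inferred from that output only, not from the profiler's source, and formatTiming is a hypothetical helper that is not part of the profiler API.

```lua
-- Hypothetical helper reproducing the layout of one output line.
-- seconds:   time accumulated for this entry
-- totalTime: reference time used for the percentage
-- nbSamples: optional number of samples for the per-sample figure
local function formatTiming(label, seconds, totalTime, nbSamples)
   local line = string.format('%s %.2fs [%.2f%%]',
                              label, seconds, 100 * seconds / totalTime)
   if nbSamples then
      -- Seconds per sample, converted to microseconds
      line = line .. string.format(' (%.1fmus/sample)',
                                   1e6 * seconds / nbSamples)
   end
   return line
end

print(formatTiming('nn.Sequential', 3.03, 3.03, 100000))
-- prints: nn.Sequential 3.03s [100.00%] (30.3mus/sample)
```

Under this reading, passing a larger totalTime (for instance the duration of a whole training loop) would make the percentages express the share of that total rather than of the model's own time.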

Non-container modules are highlighted with a '*' or in red. The example output above, printed with color disabled, uses '*'.