Trainer

Todo

This part of the tutorial is currently a work in progress.

The trainer is perhaps the module that is most particular to tensorflow and differs the most from theano. Tensorflow uses tf.Session() to run computational graphs, whereas in theano we would use the theano.function() method. For a detailed tutorial on how tensorflow processes and runs graphs, refer to this page.
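
As a quick illustration of this difference, the following minimal sketch (not part of the lenet code, and assuming the tensorflow 1.x API used throughout this tutorial) builds a tiny graph and then runs it inside a session:

import tensorflow as tf

# Building the graph only adds nodes; nothing is computed yet.
x = tf.placeholder(tf.float32, shape=[None], name='x')
y = 2.0 * x + 1.0

# The session is what actually executes the graph. feed_dict supplies
# values for placeholders, much like givens does for theano.function().
with tf.Session() as session:
    print(session.run(y, feed_dict={x: [1.0, 2.0, 3.0]}))   # prints [3. 5. 7.]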

The lenet.trainer.trainer class takes as input an object of lenet.network.lenet5 and an object of lenet.dataset.mnist. After storing them as attributes, it initializes a new tensorflow session in which to run the computational graph and initializes all the variables in the graph.

self.network = network                    # lenet.network.lenet5 object supplied to the trainer.
self.dataset = dataset                    # lenet.dataset.mnist object supplied to the trainer.
self.session = tf.InteractiveSession()    # Session in which the graph will be run.
tf.global_variables_initializer().run()   # Initialize all variables in the graph.

The initializer also calls the lenet.trainer.trainer.summaries() method, which sets up the summary writer (tf.summary.FileWriter()) so that any processing on this computational graph can be monitored in tensorboard.

self.summary = tf.summary.merge_all()                 # Merge every summary added while building the network.
self.tensorboard = tf.summary.FileWriter("tensorboard")
self.tensorboard.add_graph(self.session.graph)        # Also write out the graph itself.
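
Once this writer is in place, running the tensorboard command line tool with --logdir pointed at the "tensorboard" directory lets us monitor the graph and all the merged summaries in a browser while training runs.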

The mnist example from tf.examples.tutorials.mnist.input_data() that we use here as self.dataset is written in such a way that given a mini_batch_size, we can easily query and retrieve the next batch as follows:

x, y = self.dataset.train.next_batch(mini_batch_size)

While in theano we would use the theano.function() method to produce the function that runs the back prop updates, here we can use the minimizer that we created in lenet.network.lenet5 (self.network.back_prop in lenet.trainer.trainer) to run one update step. We also want to collect (fetch, in tensorflow terminology) self.network.obj and self.network.cost (see definitions at lenet.network.lenet5) so that we can monitor the network training. All of this can be done using the following code:

_, obj, cost = self.session.run(
                    fetches=[self.network.back_prop, self.network.obj, self.network.cost],
                    feed_dict={self.network.images: x, self.network.labels: y,
                               self.network.dropout_prob: DROPOUT_PROBABILITY})

This is similar to how we would run a theano.function(). The givens argument, which is used in theano to feed values to placeholders, is replaced here by feed_dict, which takes a dictionary whose key-value pairs are a node and the value to feed into it. Here we assign to self.network.images the images we just retrieved, to self.network.labels the y we just queried, and to self.network.dropout_prob, the node controlling the dropout Bernoulli probability, the globally defined dropout value. We use this dropout value because this call runs back prop. If we were only feeding forward without updating the weights (such as during inference or testing), we would not use this probability; instead we would use,

acc = self.session.run(self.network.accuracy,
                       feed_dict={self.network.images: x,
                                  self.network.labels: y,
                                  self.network.dropout_prob: 1.0})

as is done in the lenet.trainer.trainer.accuracy() method. The same lenet.trainer.trainer.accuracy() method, fed with different data, can be used to measure both testing and training accuracies.

# Testing
x = self.dataset.test.images
y = self.dataset.test.labels
acc = self.accuracy(images=x, labels=y)

# Training
x, y = self.dataset.train.next_batch(mini_batch_size)
acc = self.accuracy(images=x, labels=y)
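
For reference, the accuracy method itself is essentially a thin wrapper around the feed-forward run call shown above; a minimal sketch of it (the body here is assumed, not taken from the lenet source) could be:

def accuracy(self, images, labels):
    """Run a feed-forward pass and return the accuracy on the supplied data."""
    # Dropout is switched off (probability 1.0) since no weights are updated here.
    acc = self.session.run(self.network.accuracy,
                           feed_dict={self.network.images: images,
                                      self.network.labels: labels,
                                      self.network.dropout_prob: 1.0})
    return acc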

After a desired number of iterations, we might want to update the tensorboard summaries or print out the cost as a reference for how well we are training. We can use self.session, the same session used previously, to evaluate all the summaries. This run call to the session evaluates everything we added to the summaries while building the network, and the result is written out through our self.tensorboard writer.

x = self.dataset.test.images
y = self.dataset.test.labels
s = self.session.run(self.summary, feed_dict = {self.network.images: x,
                                                self.network.labels: y,
                                                self.network.dropout_prob: 1.0})
self.tensorboard.add_summary(s, iter)

The last thing we have to define is the lenet.trainer.trainer.train() method. This method runs the training loop for the network that we have defined, taking the input arguments iter = 10000, mini_batch_size = 500, update_after_iter = 1000 and summarize = True, with obviously named variables.
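
Based only on the defaults listed above, the method's signature is a sketch like the following (the loop body comes next):

def train(self,
          iter=10000,              # Total number of back prop iterations.
          mini_batch_size=500,     # Samples fed in per back prop step.
          update_after_iter=1000,  # How often to print and summarize.
          summarize=True):         # Whether to write tensorboard summaries.
    """Train the network; the loop body is shown below."""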

The trainer loop can be coded as:

# Iterate over iter
for it in range(iter):
    obj, cost = self.bp_step(mini_batch_size)  # Run a step of back prop minimizer
    if it % update_after_iter == 0:            # Check if it is time to flush out summaries.
        train_acc = self.training_accuracy(mini_batch_size = 50000)
        acc = self.test()                      # Measure training and testing accuracies.
        print(  " Iter " + str(it) +           # Print them on terminal.
                " Objective " + str(obj) +
                " Cost " + str(cost) +
                " Test Accuracy " + str(acc) +
                " Training Accuracy " + str(train_acc)
                )
        if summarize is True:                  # Write summaries to tensorboard.
            self.write_summary(iter = it, mini_batch_size = mini_batch_size)
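
The loop above also relies on a few small helper methods on the trainer that are not reproduced here. Assuming they simply wrap the session.run calls shown earlier in this section, bp_step and write_summary might look roughly like the following sketch (bodies assumed):

def bp_step(self, mini_batch_size):
    """Run one step of back prop on the next mini batch and return (obj, cost)."""
    x, y = self.dataset.train.next_batch(mini_batch_size)
    # DROPOUT_PROBABILITY is the globally defined dropout value used earlier.
    _, obj, cost = self.session.run(
            fetches=[self.network.back_prop, self.network.obj, self.network.cost],
            feed_dict={self.network.images: x, self.network.labels: y,
                       self.network.dropout_prob: DROPOUT_PROBABILITY})
    return obj, cost

def write_summary(self, iter, mini_batch_size):
    """Evaluate the merged summaries and flush them to tensorboard."""
    # mini_batch_size is accepted to mirror the call in the loop;
    # this sketch evaluates the summaries on the full test set.
    x = self.dataset.test.images
    y = self.dataset.test.labels
    s = self.session.run(self.summary,
                         feed_dict={self.network.images: x,
                                    self.network.labels: y,
                                    self.network.dropout_prob: 1.0})
    self.tensorboard.add_summary(s, iter)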

The train() method thus iterates over the iter value supplied to it, running one step of the self.network.back_prop minimizer, which we cooked in lenet.network.lenet5.cook(), and flushing out summaries whenever it is time to do so. Finally, once the training is complete, we can call the lenet.trainer.trainer.test() method to produce the testing accuracy.

acc = self.test()
print ("Final Test Accuracy: " + str(acc))

Since everything else, including the first layer filters and the confusion matrices, was stored in summaries, it will all have been adequately written out to tensorboard.

The trainer class documentation can be found in Trainer.