Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting /tmp/tensorflow/mnist/input_data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting /tmp/tensorflow/mnist/input_data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting /tmp/tensorflow/mnist/input_data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting /tmp/tensorflow/mnist/input_data/t10k-labels-idx1-ubyte.gz
0.9147
So what's going on in the code? The high-level view is:
- import libraries
- define a main method
- import MNIST data
- define the model
- some stuff that seems important but also seems like a bit of a distraction at the moment
- training happens
- testing happens
- and then we actually execute the main method defined earlier to get our output of roughly 92% accuracy (the 0.9147 printed above)
If I'm not focused on the distracting stuff, what does have my attention? I know about importing libraries, and I get the idea of the MNIST data, so my first stop is the model definition. What is going on in that section of code?
```python
x = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
y = tf.matmul(x, W) + b
```
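Before digging into TensorFlow itself, the shapes here can be sanity-checked with a plain NumPy sketch (my own analogue, not the tutorial's graph): the `None` in the placeholder's shape stands for whatever batch size gets fed in, which NumPy simply infers.

```python
import numpy as np

# A batch of 5 flattened 28x28 images, standing in for the
# placeholder's [None, 784] shape (None = batch size).
x = np.random.rand(5, 784).astype(np.float32)

# The same zero-initialized weights and biases the tutorial uses.
W = np.zeros((784, 10), dtype=np.float32)
b = np.zeros(10, dtype=np.float32)

# y = xW + b: one row of 10 class scores per image.
y = np.matmul(x, W) + b
print(y.shape)  # (5, 10)
```

With all-zero weights every score starts at zero, which is why training has to happen before the model is any good.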
Okay, I'm a bit confused, because I tend to think of variables as placeholders, so I need a new way to think about variables if they're different from placeholders, which they clearly are in TensorFlow. In TensorFlow, a placeholder is a node whose value is fed in at execution time. The docs say: "While you can replace any Tensor with feed data, including variables and constants, the best practice is to use a placeholder op node". In other words, placeholders exist specifically to be fed; variables and constants can technically be fed too, but they shouldn't be.
What is a variable in TensorFlow anyway? Well, unlike a placeholder, a variable is created with an initial value rather than having a value fed to it later, and the graph can update it as it runs. That's why the weights W and biases b are variables: training adjusts them.
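A loose analogy in plain Python helped me here (this is my own illustration, not TensorFlow code): a variable is like an object created with an initial value that training later mutates, while a placeholder is like a function parameter that has no value until the function is actually called.

```python
import numpy as np

# Like a tf.Variable: created up front with an initial value,
# and updated in place as training proceeds.
W = np.zeros((784, 10), dtype=np.float32)

def run_model(x):
    # x plays the role of the placeholder: it has no value until
    # this call happens, much like feeding data at execution time.
    return np.matmul(x, W)

batch = np.ones((2, 784), dtype=np.float32)
out = run_model(batch)  # x is "fed" here
```

The analogy is imperfect (TensorFlow builds a graph first and runs it later), but it captures the initialized-now versus fed-later distinction.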
It's easier to see what happens to the placeholder 'x' later during the training and testing of the model. When the training phase of the program executes, it feeds data into the 'x' and 'y_' placeholders. Obviously a little more is going on; I plan to read more about the cross_entropy section of the tutorial, but this was interesting. Far more is going on here than in a simple Hello World application you might write when learning a new language.
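From what I can tell so far, the cross_entropy part boils down to two pieces of math: squash the scores into probabilities with softmax, then penalize the model for assigning low probability to the true label fed in through 'y_'. A rough NumPy sketch of that math (my own illustration of the formula, not the tutorial's tf.nn call):

```python
import numpy as np

def softmax(logits):
    # Subtract the row max before exponentiating, for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    # labels are one-hot rows, like the data fed into the y_ placeholder.
    return -np.sum(labels * np.log(probs), axis=1)

logits = np.array([[2.0, 1.0, 0.1]])
labels = np.array([[1.0, 0.0, 0.0]])   # the true class is class 0
loss = cross_entropy(softmax(logits), labels)
print(loss)  # ≈ [0.417]
```

The loss is small when the softmax puts most of its probability mass on the true class, and large when it doesn't, which is exactly what training pushes W and b to fix.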
Speaking of learning a new language, I find it exceptionally helpful that TensorFlow and Python work so well out of the box.