TensorFlow’s eager execution is an imperative programming environment that evaluates operations immediately, without building graphs: operations return concrete values instead of constructing a computational graph to run later.
A TensorFlow variable is created and manipulated via the tf.Variable class. A tf.Variable represents a tensor whose value can be changed by running ops on it.
```python
my_variable = tf.Variable(tf.zeros([1, 2, 3]))
```
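For example, a minimal sketch of "running ops on it": assign and assign_add are methods of tf.Variable that change the value in place.

```python
import tensorflow as tf

my_variable = tf.Variable(tf.zeros([1, 2, 3]))
my_variable.assign(tf.ones([1, 2, 3]))      # overwrite the value in place
my_variable.assign_add(tf.ones([1, 2, 3]))  # element-wise add, like +=
print(my_variable.numpy())                  # eager execution: a concrete value
```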
Operations on tensors include math operations (such as tf.add and tf.reduce_mean), array operations (such as tf.concat and tf.tile), and string manipulation ops (such as tf.substr).
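A short eager-mode sketch of a few of these ops (toy values, just to show the shapes):

```python
import tensorflow as tf

a = tf.constant([[1.0, 2.0], [3.0, 4.0]])
b = tf.constant([[5.0, 6.0], [7.0, 8.0]])

print(tf.add(a, b))               # element-wise addition
print(tf.reduce_mean(a))          # mean over all elements -> 2.5
print(tf.concat([a, b], axis=0))  # stack along rows -> shape (4, 2)
print(tf.tile(a, [1, 2]))         # repeat columns -> shape (2, 4)
```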
See also the ragged tensor guide: https://www.tensorflow.org/guide/ragged_tensor
```python
import tensorflow as tf
```
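Continuing from the import above, a minimal ragged tensor example in the spirit of the linked guide (rows may have different lengths):

```python
rt = tf.ragged.constant([[3.0, 1.0, 4.0, 1.0], [5.0, 9.0], [2.0]])
print(rt.shape)                    # (3, None): the second dimension is ragged
print(rt.row_lengths())            # [4, 2, 1]
print(tf.reduce_mean(rt, axis=1))  # per-row means: [2.25, 7.0, 2.0]
```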
Convolutional Neural Network
Sparse Connectivity
In a fully connected network, every hidden unit is connected to every input. A 1000×1000 image already has 10^6 input pixels, so with 10^6 hidden units a single layer needs on the order of 10^12 parameters. Such a huge parameter count is expensive to train and encourages overfitting, which is why we use sparsely (locally) connected layers instead.
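As a rough illustration (the layer sizes are made up just to make the counts concrete), compare the parameter counts of a fully connected layer and a small convolutional layer in Keras:

```python
import tensorflow as tf

inputs = tf.keras.Input(shape=(1000, 1000, 1))  # a 1000x1000 grayscale image

# Fully connected: every hidden unit sees every pixel.
flat = tf.keras.layers.Flatten()(inputs)
dense = tf.keras.layers.Dense(100)(flat)        # ~10^8 weights for only 100 units

# Sparsely connected: each unit sees a 3x3 patch, weights are shared.
conv = tf.keras.layers.Conv2D(100, 3)(inputs)   # 3*3*1*100 + 100 = 1,000 params

model = tf.keras.Model(inputs, [dense, conv])
model.summary()  # compare the two layers' parameter counts
```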
Parameter Sharing
The same kernel (set of weights) is reused at every position in the image, so many hidden units share the same feature detector instead of each learning its own.
Equivariance
Equivariance means that when a feature shifts in the input image, the convolution's response shifts with it, so the network can detect a feature regardless of where it appears.
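A small sketch of this property, assuming a random image and kernel: shifting the input shifts the convolution output by the same amount (away from the borders, where padding and the circular shift interfere).

```python
import tensorflow as tf

img = tf.random.normal([1, 8, 8, 1])      # batch, height, width, channels
shifted = tf.roll(img, shift=1, axis=2)   # shift the image right by 1 pixel
kernel = tf.random.normal([3, 3, 1, 1])

out = tf.nn.conv2d(img, kernel, strides=1, padding="SAME")
out_shifted = tf.nn.conv2d(shifted, kernel, strides=1, padding="SAME")

# Away from the borders, conv(shifted input) == shift(conv(input)).
diff = tf.roll(out, shift=1, axis=2) - out_shifted
print(tf.reduce_max(tf.abs(diff[:, :, 2:-2, :])))  # ~0
```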
Input Layer
The input layer has width, height, and depth. Before training we usually preprocess the data with normalization, centralization (mean subtraction), and whitening.
Normalization puts all features on the same "range". Common choices are max-min normalization and variance normalization (useful when a feature has no clear fixed range). When we search for the best solution with gradient descent, normalized data converge faster, because no single large-scale feature dominates the gradient and stretches the loss surface.
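A small sketch of both schemes on one toy feature column (the values are made up):

```python
import numpy as np

x = np.array([10.0, 20.0, 15.0, 40.0, 5.0])  # one feature column

# Max-min normalization: squeeze the feature into [0, 1].
x_minmax = (x - x.min()) / (x.max() - x.min())

# Variance (z-score) normalization: zero mean, unit variance,
# useful when the feature has no clear fixed range.
x_zscore = (x - x.mean()) / x.std()

print(x_minmax)
print(x_zscore)
```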
Centralization subtracts each feature's mean so that every feature is centered at 0. Without centering, the data may take large values and the weights W may be forced to become very small to compensate, which can lead to overfitting.
Whitening decorrelates the features and gives each of them variance 1. A common recipe is PCA whitening: rotate the data onto its principal components, then divide each component by its standard deviation.
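A sketch of PCA whitening on a toy data matrix (rows are samples; the mixing matrix is made up to introduce correlations). Note that it starts with the centralization step from the previous paragraph:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) @ np.array([[2.0, 0.5, 0.0],
                                          [0.0, 1.0, 0.3],
                                          [0.0, 0.0, 0.5]])

X -= X.mean(axis=0)                        # centralization first
cov = X.T @ X / len(X)
eigvals, eigvecs = np.linalg.eigh(cov)     # principal components
X_pca = X @ eigvecs                        # rotate onto the components
X_white = X_pca / np.sqrt(eigvals + 1e-5)  # divide by the standard deviation

# After whitening the covariance is (approximately) the identity:
print(np.round(X_white.T @ X_white / len(X), 2))
```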
Gradient Descent
The loss surface of a neural network is non-convex and has many local minima. Initializing the weights at random points means different runs start in different basins, which helps gradient descent avoid getting stuck in one particular poor local minimum.
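A minimal sketch of this idea, assuming a made-up one-dimensional non-convex loss: each run starts from a random point, and restarting from several points reduces the chance of ending in a poor minimum.

```python
import tensorflow as tf

w = tf.Variable(tf.random.normal([]))   # random initial point
opt = tf.keras.optimizers.SGD(learning_rate=0.1)

for _ in range(100):
    with tf.GradientTape() as tape:
        loss = tf.sin(3.0 * w) + w ** 2  # non-convex: several local minima
    grads = tape.gradient(loss, [w])
    opt.apply_gradients(zip(grads, [w]))

# Different random starts can converge to different minima.
print(w.numpy())
```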