
Training Routines in TF-Slim
Training a TensorFlow model requires:
- a model represented as a computational graph.
- a loss function to minimize.
- the gradients of the model weights with respect to the loss, used to backpropagate the error signal.
- a training routine that iteratively computes all of the above and updates the weights accordingly.
images, labels = LoadData(...)
predictions = MyModel(images)

# Define the loss and get the total loss.
slim.losses.log_loss(predictions, labels)
total_loss = slim.losses.get_total_loss()

# Create the train op that computes the loss and applies the gradients.
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
train_op = slim.learning.create_train_op(total_loss, optimizer)
logdir = "/logdir/path"

slim.learning.train(
    train_op,
    logdir,
    number_of_steps=1000,
    save_summaries_secs=60,
    save_interval_secs=300)
Evaluation Routines in TF-Slim
It is important to monitor the 'health' of training, because optimization can stop functioning properly. Typical causes include:
overfitting (use early stopping, regularization, more data)
vanishing or exploding gradients (clip the gradient norm as sketched below, change the activation function, add residual skip connections)
non-converging learning (bad initialization, too large a learning rate, tune the optimizer, bug in the network)
getting stuck in local minima (adjust the learning rate, dropout)
covariate shift in very deep networks (batch normalization)
low performance, high bias (modify the model architecture, use a larger network)
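Gradient-norm clipping, one of the remedies above, is directly supported by slim.learning.create_train_op via its clip_gradient_norm argument. A minimal sketch, assuming total_loss and learning_rate are defined as in the training snippet above; the threshold 4.0 is an arbitrary choice:

optimizer = tf.train.GradientDescentOptimizer(learning_rate)
# clip_gradient_norm > 0 clips gradients to that norm before they are applied,
# which guards against exploding gradients.
train_op = slim.learning.create_train_op(
    total_loss, optimizer, clip_gradient_norm=4.0)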
Evaluation for a Single Run
In the simplest use case, we use a model to create the predictions, then specify the metrics and finally call the evaluation method. slim.evaluation.evaluation() performs a single evaluation run:
images, labels = LoadData(...)
predictions = MyModel(images)

# Choose the metrics to compute:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    "accuracy": slim.metrics.streaming_accuracy(predictions, labels),
    "mse": slim.metrics.streaming_mean_squared_error(predictions, labels),
})

initial_op = tf.group(
    tf.global_variables_initializer(),
    tf.local_variables_initializer())

with tf.Session() as sess:
    metric_values = slim.evaluation.evaluation(
        sess,
        num_evals=10,
        initial_op=initial_op,
        eval_op=list(names_to_updates.values()),
        final_op=list(names_to_values.values()))
    for metric, value in zip(names_to_values.keys(), metric_values):
        tf.logging.info('Metric %s has value: %f', metric, value)
Evaluating a Checkpointed Model with Metrics
Often, one wants to evaluate a model checkpoint saved on disk. The evaluation can be performed periodically during training on a set schedule. Instead of calling the evaluation() method, we now call the evaluation_loop() method, and additionally provide the logging and checkpoint directories as well as an evaluation time interval.
images, labels = load_data(...)
predictions = MyModel(images)

# Choose the metrics to compute:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    'accuracy': slim.metrics.streaming_accuracy(predictions, labels),
    'precision': slim.metrics.streaming_precision(predictions, labels),
    'recall': slim.metrics.streaming_recall(predictions, labels),
})

# Create a summary op for every metric; further summaries such as
# tf.summary.histogram(...) can be appended in the same way.
summary_ops = []
for metric_name, metric_value in names_to_values.items():
    summary_ops.append(tf.summary.scalar(metric_name, metric_value))

checkpoint_dir = '/tmp/my_model_dir/'
log_dir = '/tmp/my_model_eval/'
num_evals = 1000

slim.get_or_create_global_step()

slim.evaluation.evaluation_loop(
    master='',
    checkpoint_dir=checkpoint_dir,
    logdir=log_dir,
    num_evals=num_evals,
    eval_op=list(names_to_updates.values()),
    summary_op=tf.summary.merge(summary_ops),
    eval_interval_secs=600)
Evaluating at a Given Checkpoint
When a model has already been trained and we only wish to evaluate it from its last checkpoint, TF-Slim provides a method called evaluate_once(). It evaluates the model at the given checkpoint path just once.
logits, nodes = CNN_model(inputs, dropout=0.5, is_training=False)
predictions = tf.argmax(logits, 1)

# Choose the metrics to compute:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    'eval/Accuracy': slim.metrics.streaming_accuracy(predictions, targets),
    'eval/Recall@3': slim.metrics.streaming_sparse_recall_at_k(
        tf.to_float(logits), tf.expand_dims(targets, 1), 3),
    'eval/Precision': slim.metrics.streaming_precision(predictions, targets),
    'eval/Recall': slim.metrics.streaming_recall(predictions, targets),
})

print('Running evaluation loop...')
checkpoint_path = tf.train.latest_checkpoint(checkpoint_dir)
metric_values = slim.evaluation.evaluate_once(
    master='',
    checkpoint_path=checkpoint_path,
    logdir=checkpoint_dir,
    num_evals=num_evals,
    eval_op=list(names_to_updates.values()),
    final_op=list(names_to_values.values()))

names_to_values = dict(zip(names_to_values.keys(), metric_values))
for name in names_to_values:
    print('%s: %f' % (name, names_to_values[name]))
TensorFlow-Slim
TF-Slim is a lightweight library for defining, training and evaluating complex models in TensorFlow. Components of tf-slim can be freely mixed with native tensorflow, as well as with other frameworks, such as tf.contrib.learn.
import tensorflow as tf
import tensorflow.contrib.slim as slim
1. Why TF-Slim?
TF-Slim is a library that makes building, training and evaluating neural networks simple:
Allows the user to define models much more compactly by eliminating boilerplate code. This is accomplished through the use of argument scoping and numerous high level layers and variables. These tools increase readability and maintainability, reduce the likelihood of an error from copy-and-pasting hyperparameter values and simplify hyperparameter tuning.
Makes developing models simple by providing commonly used regularizers.
Several widely used computer vision models (e.g., VGG, AlexNet) have been developed in slim, and are available to users. These can either be used as black boxes, or can be extended in various ways, e.g., by adding "multiple heads" to different internal layers.
Slim makes it easy to extend complex models, and to warm start training
algorithms by using pieces of pre-existing model checkpoints.
2. What are the various components of TF-Slim?
TF-Slim is composed of several parts which were designed to exist independently. These include the following main pieces (explained in detail below):
arg_scope: provides a new scope named arg_scope that allows a user to define default arguments for specific operations within that scope.
data: contains TF-Slim's dataset definition, data providers, parallel_reader, and decoding utilities.
evaluation: contains routines for evaluating models.
layers: contains high level layers for building models using tensorflow.
learning: contains routines for training models.
losses: contains commonly used loss functions.
metrics: contains popular evaluation metrics.
nets: contains popular network definitions such as VGG and AlexNet models.
queues: provides a context manager for easily and safely starting and closing QueueRunners.
regularizers: contains weight regularizers.
variables: provides convenience wrappers for variable creation and manipulation.
3. Defining Models
Models can be succinctly defined using TF-Slim by combining its variables, layers and scopes. Each of these elements is defined below.
3.1 Variables
Creating Variables in native tensorflow requires either a predefined value or an initialization mechanism (e.g. random sampling from a Gaussian). Furthermore, if a variable needs to be created on a specific device, such as a GPU, the specification must be made explicit. To alleviate the code required for variable creation, TF-Slim provides a set of thin wrapper functions in variables.py which allow callers to easily define variables.
For example, to create a weight variable, initialize it using a truncated normal distribution, regularize it with an l2_loss and place it on the CPU, one need only declare the following:
weights = slim.variable('weights',
                        shape=[10, 10, 3, 3],
                        initializer=tf.truncated_normal_initializer(stddev=0.1),
                        regularizer=slim.l2_regularizer(0.05),
                        device='/CPU:0')
Note that in native TensorFlow, there are two types of variables: regular variables and local (transient) variables. The vast majority of variables are regular variables: once created, they can be saved to disk using a saver. Local variables are those variables that only exist for the duration of a session and are not saved to disk.
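As a minimal illustration of the distinction (plain TensorFlow, not a TF-Slim API; the variable names are made up for the example):

# A regular variable; a default tf.train.Saver() saves it.
weights = tf.Variable(tf.zeros([10]), name='weights')

# A local (transient) variable: it lives in the LOCAL_VARIABLES collection
# instead of GLOBAL_VARIABLES, so a default Saver ignores it.
scratch = tf.Variable(0, name='scratch', trainable=False,
                      collections=[tf.GraphKeys.LOCAL_VARIABLES])

saver = tf.train.Saver()  # covers 'weights' but not 'scratch'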
TF-Slim further differentiates variables by defining model variables, which are variables that represent parameters of a model. Model variables are trained or fine-tuned during learning and are loaded from a checkpoint during evaluation or inference. Examples include the variables created by a slim.fully_connected or slim.conv2d layer. Non-model variables are all other variables that are used during learning or evaluation but are not required for actually performing inference. For example, the global_step is a variable used during learning and evaluation but it is not actually part of the model. Similarly, moving average variables might mirror model variables, but the moving averages themselves are not model variables.
Both model variables and regular variables can be easily created and retrieved via TF-Slim:
# Model variables:
weights = slim.model_variable('weights',
                              shape=[10, 10, 3, 3],
                              initializer=tf.truncated_normal_initializer(stddev=0.1),
                              regularizer=slim.l2_regularizer(0.05),
                              device='/CPU:0')
model_variables = slim.get_model_variables()

# Regular variables:
my_var = slim.variable('my_var',
                       shape=[20, 1],
                       initializer=tf.zeros_initializer())
regular_variables_and_model_variables = slim.get_variables()
How does this work? When you create a model variable via TF-Slim's layers or directly via the slim.model_variable function, TF-Slim adds the variable to the tf.GraphKeys.MODEL_VARIABLES collection. What if you have your own custom layers or variable creation routines, but still want TF-Slim to manage or be aware of your model variables? TF-Slim provides a convenience function for adding the model variable to its collection:
my_model_variable = CreateViaCustomCode()
slim.add_model_variable(my_model_variable)
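A quick sanity check of this bookkeeping (a sketch; CreateViaCustomCode is the hypothetical helper from the snippet above):

# After slim.add_model_variable(...), the variable appears both in the
# MODEL_VARIABLES collection and in slim.get_model_variables().
assert my_model_variable in tf.get_collection(tf.GraphKeys.MODEL_VARIABLES)
assert my_model_variable in slim.get_model_variables()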
3.2 Layers
Although the set of TensorFlow operations is quite extensive, developers of neural networks typically think of models in terms of higher level concepts like 'layers', 'losses', 'metrics', and 'networks'. A layer, such as a convolutional layer, a fully connected layer or a BatchNorm layer, is more abstract than a single TensorFlow operation and typically involves several operations. Furthermore, a layer usually (but not always) has variables (tunable parameters) associated with it, unlike more primitive operations. For example, a convolutional layer in a neural network is composed of several low level operations:
Creating the weight and bias variables.
Convolving the weights with the input from the previous layer.
Adding the biases to the result of the convolution.
Applying an activation function.
Doing this with only plain TensorFlow code can be rather laborious:
input = ...
with tf.name_scope('conv1_1') as scope:
    kernel = tf.Variable(tf.truncated_normal([3, 3, 64, 128], dtype=tf.float32,
                                             stddev=1e-1), name='weights')
    conv = tf.nn.conv2d(input, kernel, [1, 1, 1, 1], padding='SAME')
    biases = tf.Variable(tf.constant(0.0, shape=[128], dtype=tf.float32),
                         trainable=True, name='biases')
    bias = tf.nn.bias_add(conv, biases)
    conv1 = tf.nn.relu(bias, name=scope)
To alleviate the need to duplicate this code repeatedly, TF-Slim provides a number of convenient operations defined at the more abstract level of neural network layers. For example, compare the code above to the invocation of the corresponding TF-Slim code:
input = ...
net = slim.conv2d(input, 128, [3, 3], scope='conv1_1')
TF-Slim provides standard implementations for numerous components of neural networks. These include, among others:
Conv2dInPlane (slim.conv2d_in_plane)
Conv2dTranspose / Deconv (slim.conv2d_transpose)
FullyConnected (slim.fully_connected)
OneHotEncoding (slim.one_hot_encoding)
SeparableConv2 (slim.separable_conv2d)
TF-Slim also provides two meta-operations called repeat and stack that allow users to repeatedly perform the same operation. For example, consider the following snippet of a network whose layers perform several convolutions in a row between pooling layers:
net = slim.conv2d(net, 256, [3, 3], scope='conv3_1')
net = slim.conv2d(net, 256, [3, 3], scope='conv3_2')
net = slim.conv2d(net, 256, [3, 3], scope='conv3_3')
net = slim.max_pool2d(net, [2, 2], scope='pool2')
One way to reduce this code duplication would be a for loop:
for i in range(3):
    net = slim.conv2d(net, 256, [3, 3], scope='conv3_%d' % (i+1))
net = slim.max_pool2d(net, [2, 2], scope='pool2')
This can be made even cleaner by using TF-Slim's repeat operation:
net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
net = slim.max_pool2d(net, [2, 2], scope='pool2')
Notice that slim.repeat not only applies the same argument in-line, it is also smart enough to unroll the scopes such that the scopes assigned to each subsequent call of slim.conv2d are appended with an underscore and iteration number. More concretely, the scopes in the example above would be named 'conv3/conv3_1', 'conv3/conv3_2' and 'conv3/conv3_3'. A small sketch for verifying this is given below.
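A sketch that prints the unrolled scope names (the 256-channel placeholder shape is an assumption chosen to match the conv3 snippet above):

inputs = tf.placeholder(tf.float32, [1, 32, 32, 256])
net = slim.repeat(inputs, 3, slim.conv2d, 256, [3, 3], scope='conv3')

# Prints weights/biases under conv3/conv3_1, conv3/conv3_2 and conv3/conv3_3.
for var in slim.get_model_variables():
    print(var.op.name)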
In addition, TF-Slim's slim.stack operator allows a caller to repeatedly apply the same operation with different arguments to create a stack or tower of layers. slim.stack also creates a new tf.variable_scope for each operation created. For example, a simple way to create a Multi-Layer Perceptron (MLP):
# Verbose way:
x = slim.fully_connected(x, 32, scope='fc/fc_1')
x = slim.fully_connected(x, 64, scope='fc/fc_2')
x = slim.fully_connected(x, 128, scope='fc/fc_3')

# Equivalent, TF-Slim way using slim.stack:
slim.stack(x, slim.fully_connected, [32, 64, 128], scope='fc')
In this example, slim.stack calls slim.fully_connected three times, passing the output of one invocation of the function to the next, while the number of hidden units per invocation changes from 32 to 64 to 128. Similarly, one can use stack to simplify a tower of multiple convolutions:
# Verbose way:
x = slim.conv2d(x, 32, [3, 3], scope='core/core_1')
x = slim.conv2d(x, 32, [1, 1], scope='core/core_2')
x = slim.conv2d(x, 64, [3, 3], scope='core/core_3')
x = slim.conv2d(x, 64, [1, 1], scope='core/core_4')

# Using stack:
slim.stack(x, slim.conv2d, [(32, [3, 3]), (32, [1, 1]), (64, [3, 3]), (64, [1, 1])], scope='core')
3.3 Scopes
In addition to the types of scope mechanisms in TensorFlow (name_scope, variable_scope), TF-Slim adds a new scoping mechanism called arg_scope. This new scope allows a user to specify one or more operations and a set of arguments which will be passed to each of the operations defined in the arg_scope. This functionality is best illustrated by example. Consider the following code snippet:
net = slim.conv2d(inputs, 64, [11, 11], 4, padding='SAME',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv1')
net = slim.conv2d(net, 128, [11, 11], padding='VALID',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv2')
net = slim.conv2d(net, 256, [11, 11], padding='SAME',
                  weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                  weights_regularizer=slim.l2_regularizer(0.0005), scope='conv3')
It should be clear that these three convolution layers share many of the same hyperparameters. Two have the same padding, and all three have the same weights_initializer and weights_regularizer. This code is hard to read and contains a lot of repeated values that should be factored out. One solution would be to specify default values using variables:
padding = 'SAME'
initializer = tf.truncated_normal_initializer(stddev=0.01)
regularizer = slim.l2_regularizer(0.0005)
net = slim.conv2d(inputs, 64, [11, 11], 4,
                  padding=padding,
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv1')
net = slim.conv2d(net, 128, [11, 11],
                  padding='VALID',
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv2')
net = slim.conv2d(net, 256, [11, 11],
                  padding=padding,
                  weights_initializer=initializer,
                  weights_regularizer=regularizer,
                  scope='conv3')
This solution ensures that all three convolutions share the exact same parameter values, but it doesn't fully reduce the code clutter. By using an arg_scope, we can both ensure that each layer uses the same values and simplify the code:
with slim.arg_scope([slim.conv2d], padding='SAME',
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.0005)):
    net = slim.conv2d(inputs, 64, [11, 11], scope='conv1')
    net = slim.conv2d(net, 128, [11, 11], padding='VALID', scope='conv2')
    net = slim.conv2d(net, 256, [11, 11], scope='conv3')
As the example illustrates, the use of arg_scope makes the code cleaner, simpler and easier to maintain. Notice that while argument values are specified in the arg_scope, they can be overridden locally. In particular, while the padding argument has been set to 'SAME', the second convolution overrides it with the value of 'VALID'.
One can also nest arg_scopes and use multiple operations in the same scope. For example:
with slim.arg_scope([slim.conv2d, slim.fully_connected],
                    activation_fn=tf.nn.relu,
                    weights_initializer=tf.truncated_normal_initializer(stddev=0.01),
                    weights_regularizer=slim.l2_regularizer(0.0005)):
    with slim.arg_scope([slim.conv2d], stride=1, padding='SAME'):
        net = slim.conv2d(inputs, 64, [11, 11], 4, padding='VALID', scope='conv1')
        net = slim.conv2d(net, 256, [5, 5],
                          weights_initializer=tf.truncated_normal_initializer(stddev=0.03),
                          scope='conv2')
        net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc')
In this example, the first arg_scope applies the same weights_initializer and weights_regularizer arguments to the conv2d and fully_connected layers in its scope. In the second arg_scope, additional default arguments are specified for conv2d only.
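A common pattern built on this, used for instance by the model definitions in slim's nets, is to package an arg_scope in a helper function so the same defaults can be reused across models. A sketch, where the name my_arg_scope is made up for the example:

def my_arg_scope(weight_decay=0.0005):
    # Returns the arg_scope so callers can re-enter it later.
    with slim.arg_scope([slim.conv2d, slim.fully_connected],
                        activation_fn=tf.nn.relu,
                        weights_regularizer=slim.l2_regularizer(weight_decay)) as sc:
        return sc

with slim.arg_scope(my_arg_scope()):
    net = slim.conv2d(inputs, 64, [3, 3], scope='conv1')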
3.4 Working Example: Specifying the VGG16 Layers
By combining TF-Slim Variables, Operations and Scopes, a normally very complex network can be written with very few lines of code. For example, the entire VGG architecture can be defined with just the following snippet:
def vgg16(inputs):
    with slim.arg_scope([slim.conv2d, slim.fully_connected],
                        activation_fn=tf.nn.relu,
                        weights_initializer=tf.truncated_normal_initializer(0.0, 0.01),
                        weights_regularizer=slim.l2_regularizer(0.0005)):
        net = slim.repeat(inputs, 2, slim.conv2d, 64, [3, 3], scope='conv1')
        net = slim.max_pool2d(net, [2, 2], scope='pool1')
        net = slim.repeat(net, 2, slim.conv2d, 128, [3, 3], scope='conv2')
        net = slim.max_pool2d(net, [2, 2], scope='pool2')
        net = slim.repeat(net, 3, slim.conv2d, 256, [3, 3], scope='conv3')
        net = slim.max_pool2d(net, [2, 2], scope='pool3')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv4')
        net = slim.max_pool2d(net, [2, 2], scope='pool4')
        net = slim.repeat(net, 3, slim.conv2d, 512, [3, 3], scope='conv5')
        net = slim.max_pool2d(net, [2, 2], scope='pool5')
        net = slim.fully_connected(net, 4096, scope='fc6')
        net = slim.dropout(net, 0.5, scope='dropout6')
        net = slim.fully_connected(net, 4096, scope='fc7')
        net = slim.dropout(net, 0.5, scope='dropout7')
        net = slim.fully_connected(net, 1000, activation_fn=None, scope='fc8')
    return net
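A minimal usage sketch for the function above (the 224x224 RGB input shape is the standard VGG16 assumption, not stated in the snippet, and relies on slim.fully_connected flattening inputs of rank greater than 2):

images = tf.placeholder(tf.float32, shape=[None, 224, 224, 3])
logits = vgg16(images)  # unscaled class scores, one per ImageNet class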
4. Training Models
Training TensorFlow models requires a model, a loss function, the gradient computation and a training routine that iteratively computes the gradients of the model weights relative to the loss and updates the weights accordingly. TF-Slim provides both common loss functions and a set of helper functions that run the training and evaluation routines.
4.1 Losses
A loss function defines a quantity that we want to minimize. For classification problems, this is typically the cross entropy between the true distribution and the predicted probability distribution across classes. For regression problems, this is often the sum-of-squares difference between the predicted and true values.
Certain models, such as multi-task learning models, require the use of multiple loss functions simultaneously. In other words, the loss function ultimately being minimized is the sum of various other loss functions. For example, consider a model that predicts both the type of scene in an image as well as the depth from the camera of each pixel. This model's loss function would be the sum of the classification loss and the depth prediction loss.
TF-Slim provides an easy-to-use mechanism for defining and keeping track of loss functions via the losses module. Consider the simple case where we want to train the VGG network:
import tensorflow as tf
import tensorflow.contrib.slim as slim
import tensorflow.contrib.slim.nets as nets

vgg = nets.vgg

# Load the images and labels.
images, labels = ...

# Create the model.
predictions, _ = vgg.vgg_16(images)

# Define the loss function and get the total loss.
loss = slim.losses.softmax_cross_entropy(predictions, labels)
In this example, we start by creating the model (using TF-Slim's VGG implementation), and add the standard classification loss. Now let's turn to the case where we have a multi-task model that produces multiple outputs:
# Load the images and labels.
images, scene_labels, depth_labels = ...

# Create the model.
scene_predictions, depth_predictions = CreateMultiTaskModel(images)

# Define the loss functions.
classification_loss = slim.losses.softmax_cross_entropy(scene_predictions, scene_labels)
sum_of_squares_loss = slim.losses.sum_of_squares(depth_predictions, depth_labels)

# The following two lines have the same effect:
total_loss = classification_loss + sum_of_squares_loss
total_loss = slim.losses.get_total_loss(add_regularization_losses=False)
In this example, we have two losses, which we add by calling slim.losses.softmax_cross_entropy and slim.losses.sum_of_squares. We can obtain the total loss by adding them together (total_loss) or by calling slim.losses.get_total_loss(). How did this work? When you create a loss function via TF-Slim, TF-Slim adds the loss to a special TensorFlow collection of loss functions. This enables you to either manage the total loss manually, or allow TF-Slim to manage it for you.
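The collection in question is tf.GraphKeys.LOSSES; a quick check (a sketch, reusing the losses defined above):

# Both slim-created losses are tracked in the LOSSES collection:
print(tf.get_collection(tf.GraphKeys.LOSSES))  # [classification_loss, sum_of_squares_loss]
print(slim.losses.get_losses())                # the same list, via the slim wrapper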
What if you want to let TF-Slim manage the losses for you but have a custom loss function? There is also a function that adds such a loss to TF-Slim's collection. For example:
# Load the images and labels.
images, scene_labels, depth_labels, pose_labels = ...

# Create the model.
scene_predictions, depth_predictions, pose_predictions = CreateMultiTaskModel(images)

# Define the loss functions.
classification_loss = slim.losses.softmax_cross_entropy(scene_predictions, scene_labels)
sum_of_squares_loss = slim.losses.sum_of_squares(depth_predictions, depth_labels)
pose_loss = MyCustomLossFunction(pose_predictions, pose_labels)
slim.losses.add_loss(pose_loss)  # Letting TF-Slim know about the additional loss.

# The following two ways to compute the total loss are equivalent:
regularization_loss = tf.add_n(slim.losses.get_regularization_losses())
total_loss1 = classification_loss + sum_of_squares_loss + pose_loss + regularization_loss

# (Regularization loss is included in the total loss by default.)
total_loss2 = slim.losses.get_total_loss()
In this example, we can again either produce the total loss manually or let TF-Slim know about the additional loss and let TF-Slim handle the losses for us.
4.2 Training Loop
TF-Slim provides a simple but powerful set of tools for training models. These include a train function that repeatedly measures the loss, computes gradients and saves the model to disk, as well as several convenience functions for manipulating gradients. For example, once we've specified the model, the loss function and the optimization scheme, we can call slim.learning.create_train_op and slim.learning.train to perform the optimization:
g = tf.Graph()

# Create the model and specify the losses...

total_loss = slim.losses.get_total_loss()
optimizer = tf.train.GradientDescentOptimizer(learning_rate)

# create_train_op ensures that each time we ask for the loss, the update_ops
# are run and the gradients being computed are applied too.
train_op = slim.learning.create_train_op(total_loss, optimizer)
logdir = ...  # Where checkpoints are stored.

slim.learning.train(
    train_op,
    logdir,
    number_of_steps=1000,
    save_summaries_secs=300,
    save_interval_secs=600)
In this example, train_op is provided to slim.learning.train, which uses it to (a) compute the loss and (b) apply the gradient step. logdir specifies the directory where checkpoints and event files are stored. We can limit the number of gradient steps taken to any number; here we've asked for 1000 steps. Finally, save_summaries_secs=300 indicates that summaries are computed every 5 minutes and save_interval_secs=600 indicates that a model checkpoint is saved every 10 minutes.
4.3 Working Example: Training the VGG16 Model
To illustrate this, let's examine the following sample of training the VGG network:
import tensorflow as tf
import tensorflow.contrib.slim.nets as nets

slim = tf.contrib.slim
vgg = nets.vgg

train_log_dir = ...
if not tf.gfile.Exists(train_log_dir):
    tf.gfile.MakeDirs(train_log_dir)

with tf.Graph().as_default():
    # Set up the data loading:
    images, labels = ...

    # Define the model:
    predictions, _ = vgg.vgg_16(images, is_training=True)

    # Specify the loss function:
    slim.losses.softmax_cross_entropy(predictions, labels)

    total_loss = slim.losses.get_total_loss()
    tf.summary.scalar('losses/total_loss', total_loss)

    # Specify the optimization scheme:
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=.001)

    # create_train_op ensures that when the loss is evaluated, the
    # update_ops are run and the gradients are applied too.
    train_tensor = slim.learning.create_train_op(total_loss, optimizer)

    # Actually runs training.
    slim.learning.train(train_tensor, train_log_dir)
5. Fine-Tuning Existing Models
5.1 Brief Recap: Restoring Variables from a Checkpoint
After a model has been trained, it can be restored using tf.train.Saver(), which restores Variables from a given checkpoint. For many cases, tf.train.Saver() provides a simple mechanism to restore all or just a few variables.
# Create some variables.
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")

# Add ops to restore all the variables.
restorer = tf.train.Saver()

# Add ops to restore some variables.
restorer = tf.train.Saver([v1, v2])

# Later, launch the model, use the saver to restore variables from disk,
# and do some work with the model.
with tf.Session() as sess:
    # Restore variables from disk.
    restorer.restore(sess, "/tmp/model.ckpt")
    print("Model restored.")
See the Restoring Variables and Choosing which Variables to Save and Restore sections of the TensorFlow Variables documentation for more details.
5.2 Partially Restoring Models
It is often desirable to fine-tune a pre-trained model on an entirely new dataset, or even on a new task. In these situations, one can use TF-Slim's helper functions to select a subset of variables to restore:
# Create some variables.
v1 = slim.variable(name="v1", ...)
v2 = slim.variable(name="nested/v2", ...)

# Get the list of variables to restore (which contains only 'v2').
# These are all equivalent methods:
variables_to_restore = slim.get_variables_by_name("v2")
# or
variables_to_restore = slim.get_variables_by_suffix("2")
# or
variables_to_restore = slim.get_variables(scope="nested")
# or
variables_to_restore = slim.get_variables_to_restore(include=["nested"])
# or
variables_to_restore = slim.get_variables_to_restore(exclude=["v1"])

# Create the saver which will be used to restore the variables.
restorer = tf.train.Saver(variables_to_restore)

with tf.Session() as sess:
    # Restore variables from disk.
    restorer.restore(sess, "/tmp/model.ckpt")
    print("Model restored.")
5.3 Restoring Models with Different Variable Names
When restoring variables from a checkpoint, the Saver locates the variable names in the checkpoint file and maps them to variables in the current graph. Above, we created a saver by passing it a list of variables. In this case, the names of the variables to locate in the checkpoint file were implicitly obtained from each provided variable's var.op.name.
This works well when the variable names in the checkpoint file match those in the graph. However, sometimes we want to restore a model from a checkpoint whose variables have different names than those in the current graph. In this case, we must provide the Saver with a dictionary that maps from each checkpoint variable name to each graph variable. Consider the following example, where the checkpoint variable names are obtained via a simple function:
# Assuming that 'conv1/weights' should be restored from 'vgg16/conv1/weights':
def name_in_checkpoint(var):
    return 'vgg16/' + var.op.name

# Assuming that 'conv1/weights' and 'conv1/bias' should be restored from
# 'conv1/params1' and 'conv1/params2':
def name_in_checkpoint(var):
    if "weights" in var.op.name:
        return var.op.name.replace("weights", "params1")
    if "bias" in var.op.name:
        return var.op.name.replace("bias", "params2")

variables_to_restore = slim.get_model_variables()
variables_to_restore = {name_in_checkpoint(var): var for var in variables_to_restore}
restorer = tf.train.Saver(variables_to_restore)

with tf.Session() as sess:
    # Restore variables from disk.
    restorer.restore(sess, "/tmp/model.ckpt")
5.4 Fine-Tuning a Model on a Different Task
Consider the case where we have a pre-trained VGG16 model. The model was trained on ImageNet, a dataset with 1000 classes, but we would like to apply it to the Pascal VOC dataset, which has only 20 classes. To do so, we can initialize our new model using the values of the pre-trained model, excluding the final layers:
# Load the Pascal VOC data.
image, label = MyPascalVocDataLoader(...)
images, labels = tf.train.batch([image, label], batch_size=32)

# Create the model.
predictions = vgg.vgg_16(images)

train_op = slim.learning.create_train_op(...)

# Specify where the model, trained on ImageNet, was saved.
model_path = '/path/to/pre_trained_on_imagenet.checkpoint'

# Specify where the new model will live.
log_dir = '/path/to/my_pascal_model_dir/'

# Restore only the convolutional layers:
variables_to_restore = slim.get_variables_to_restore(exclude=['fc6', 'fc7', 'fc8'])
init_fn = slim.assign_from_checkpoint_fn(model_path, variables_to_restore)

# Start training.
slim.learning.train(train_op, log_dir, init_fn=init_fn)
6. Evaluating Models
Once we've trained a model (or even while the model is busy training), we'd like to see how well it performs in practice. This is accomplished by picking a set of evaluation metrics, which will grade the model's performance, and writing the evaluation code, which actually loads the data, performs inference, compares the results to the ground truth and records the evaluation scores. This step may be performed once or repeated periodically.
We define a metric to be a performance measure that is not a loss function (losses are directly optimized during training) but which we are still interested in for the purpose of evaluating our model. For example, we might want to minimize log loss, while our metric of interest might be the F1 score (test accuracy) or the Intersection over Union (IoU) score, which is not differentiable and therefore cannot be used as a loss.
TF-Slim provides a set of metric operations that make evaluating models easy. Abstractly, computing the value of a metric can be divided into three parts:
Initialization: initialize the variables used to compute the metric.
Aggregation: perform operations (sums, counts, etc.) used to compute the metric.
Finalization: (optionally) perform any final operation to compute the metric value, for example computing means, mins, maxes, etc.
For example, to compute mean_absolute_error, two variables, count and total, are initialized to zero. During aggregation, we observe some set of predictions and labels, compute their absolute differences and add the sum to total; each time we observe another batch, count is incremented. Finally, during finalization, total is divided by count to obtain the mean. The sketch below writes these three phases out by hand.
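To make the phases concrete, here is a hand-rolled streaming mean absolute error built from plain TensorFlow local variables. It is an illustration of the mechanism, not TF-Slim's actual implementation (which slim.metrics.streaming_mean_absolute_error provides ready-made):

def my_streaming_mae(predictions, labels):
    # Initialization: two local (transient) counters, starting at zero.
    total = tf.Variable(0.0, trainable=False, name='total',
                        collections=[tf.GraphKeys.LOCAL_VARIABLES])
    count = tf.Variable(0.0, trainable=False, name='count',
                        collections=[tf.GraphKeys.LOCAL_VARIABLES])

    # Aggregation: each run of update_op folds in one batch.
    new_total = tf.assign_add(total, tf.reduce_sum(tf.abs(predictions - labels)))
    new_count = tf.assign_add(count, tf.to_float(tf.size(labels)))
    update_op = new_total / new_count  # running mean after the update

    # Finalization: value_op reads the current mean without updating it.
    value_op = total / count
    return value_op, update_op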
The following example demonstrates the API for declaring metrics. Because metrics are often evaluated on a test set, which is different from the training set, we assume we're using test data:
images, labels = LoadTestData(...)
predictions = MyModel(images)

mae_value_op, mae_update_op = slim.metrics.streaming_mean_absolute_error(predictions, labels)
mre_value_op, mre_update_op = slim.metrics.streaming_mean_relative_error(predictions, labels)
# mean_relative_errors stands in for a tensor of per-example relative errors:
pl_value_op, pl_update_op = slim.metrics.percentage_less(mean_relative_errors, 0.3)
As the example illustrates, the creation of a metric returns two values: a value_op and an update_op. The value_op is an idempotent operation that returns the current value of the metric. The update_op is an operation that performs the aggregation step described above and also returns the value of the metric.
Keeping track of each value_op and update_op can be laborious. To deal with this, TF-Slim provides two convenience functions:
# Aggregates the value and update ops in two lists:
value_ops, update_ops = slim.metrics.aggregate_metrics(
    slim.metrics.streaming_mean_absolute_error(predictions, labels),
    slim.metrics.streaming_mean_squared_error(predictions, labels))

# Aggregates the value and update ops in two dictionaries:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    "eval/mean_absolute_error": slim.metrics.streaming_mean_absolute_error(predictions, labels),
    "eval/mean_squared_error": slim.metrics.streaming_mean_squared_error(predictions, labels),
})
6.2 Working example: Tracking Multiple Metrics
Putting it all together:
import tensorflow as tf
import tensorflow.contrib.slim.nets as nets

slim = tf.contrib.slim
vgg = nets.vgg

# Load the data.
images, labels = load_data(...)

# Define the network.
predictions = vgg.vgg_16(images)

# Choose the metrics to compute:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    "eval/mean_absolute_error": slim.metrics.streaming_mean_absolute_error(predictions, labels),
    "eval/mean_squared_error": slim.metrics.streaming_mean_squared_error(predictions, labels),
})

# Evaluate the model using 1000 batches of data:
num_batches = 1000

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(tf.local_variables_initializer())

    for batch_id in range(num_batches):
        sess.run(list(names_to_updates.values()))

    metric_values = sess.run(list(names_to_values.values()))
    for metric, value in zip(names_to_values.keys(), metric_values):
        print('Metric %s has value: %f' % (metric, value))
Note that the metrics module can be used in isolation, without using either TF-Slim's layers or losses.
6.3 Evaluation Loop
TF-Slim provides an evaluation module (evaluation.py) which contains helper functions for writing model evaluation scripts using metrics from the metrics module. These include functions for periodically running evaluations, evaluating metrics over batches of data, and printing and summarizing metric results. For example:
import math
import tensorflow as tf

slim = tf.contrib.slim

# Load the data.
images, labels = load_data(...)

# Define the network.
predictions = MyModel(images)

# Choose the metrics to compute:
names_to_values, names_to_updates = slim.metrics.aggregate_metric_map({
    'accuracy': slim.metrics.streaming_accuracy(predictions, labels),
    'precision': slim.metrics.streaming_precision(predictions, labels),
    'recall': slim.metrics.streaming_recall(predictions, labels),
})

# Create the summary ops such that they also print out to std output:
summary_ops = []
for metric_name, metric_value in names_to_values.items():
    op = tf.summary.scalar(metric_name, metric_value)
    op = tf.Print(op, [metric_value], metric_name)
    summary_ops.append(op)

num_examples = 10000
batch_size = 32
num_batches = math.ceil(num_examples / float(batch_size))

# Set up the global step.
slim.get_or_create_global_step()

checkpoint_dir = ...  # Where the checkpoints are stored.
output_dir = ...  # Where the summaries are stored.
eval_interval_secs = ...  # How often to run the evaluation.

slim.evaluation.evaluation_loop(
    master='',
    checkpoint_dir=checkpoint_dir,
    logdir=output_dir,
    num_evals=num_batches,
    eval_op=list(names_to_updates.values()),
    summary_op=tf.summary.merge(summary_ops),
    eval_interval_secs=eval_interval_secs)
