
Speeding Up TensorFlow on the Raspberry Pi: Model Warm-Up


Please credit the source when reposting: http://www.codelast.com/

Hardware and software environment for this article:

Raspberry Pi: 3 Model B V1.2, 1 GB RAM

OS: Arch Linux ARM

In the previous article, I described a deep-learning (image recognition) experiment with TensorFlow on a Raspberry Pi, but as noted there, a prediction that takes 50 seconds to run has zero practical value. Some measure is therefore needed to speed TensorFlow up, and one feasible approach is a "warm-up". Sam Abrahams, who ported TensorFlow to the Raspberry Pi, has published fairly detailed performance test results on GitHub. Following his description, I ran my own test to see how far that miserable 50-second time could be reduced.

『1』 What is a warm-up?

First, note that this article is again based on TensorFlow's Python image-classification program classify_image.py.

Warming up means calling Session.run() several times before the real prediction, so that the single prediction we actually care about executes faster.
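The idea can be illustrated with a small stand-in that needs no TensorFlow at all. The SlowFirstRunModel class below is purely hypothetical, simulating a session whose first call pays a one-time initialization cost (graph optimization, memory allocation, caches):

```python
import time

class SlowFirstRunModel:
    """Toy stand-in for a TensorFlow session: the first call pays a
    one-time setup cost, after which calls are cheap."""
    def __init__(self):
        self._initialized = False

    def run(self, data):
        if not self._initialized:
            time.sleep(0.2)           # simulate expensive lazy initialization
            self._initialized = True
        return sum(data) / len(data)  # the "prediction" itself is cheap

model = SlowFirstRunModel()

# Warm-up: absorb the one-time cost before the timed prediction.
for _ in range(3):
    model.run([0.0])

# Time only the real prediction, after the warm-up.
start = time.time()
result = model.run([1.0, 2.0, 3.0])
elapsed = time.time() - start
print("warmed-up prediction took {:.4f} s".format(elapsed))
```

Without the warm-up loop, the timed call would include the 0.2-second setup cost; with it, only the cheap steady-state work is measured.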


『2』 Code changes

The code change is quite simple. To measure the program's running time we need Python's time module, so import it at the top:

import time
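As a standalone illustration of this timing approach (the summation below is an arbitrary stand-in workload), wrap the code to be measured between two time.time() calls; note that on Python 3, time.perf_counter() is generally the preferred clock for measuring intervals:

```python
import time

start = time.time()
total = sum(range(1000000))  # stand-in for the work being measured
elapsed = time.time() - start
print("work took {:.4f} s (result: {})".format(elapsed, total))
```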

Then modify the run_inference_on_image function as follows:

def run_inference_on_image(image):
  """Runs inference on an image.

  Args:
    image: Image file name.

  Returns:
    Nothing
  """
  if not tf.gfile.Exists(image):
    tf.logging.fatal('File does not exist %s', image)
  image_data = tf.gfile.FastGFile(image, 'rb').read()

  # the image used to warm-up TensorFlow model
  warm_up_image_data = tf.gfile.FastGFile('/root/tensorflow-related/test-images/ubike.jpg', 'rb').read()

  # Creates graph from saved GraphDef.
  create_graph()

  with tf.Session() as sess:
    # Some useful tensors:
    # 'softmax:0': A tensor containing the normalized prediction across
    #   1000 labels.
    # 'pool_3:0': A tensor containing the next-to-last layer containing 2048
    #   float description of the image.
    # 'DecodeJpeg/contents:0': A tensor containing a string providing JPEG
    #   encoding of the image.
    # Runs the softmax tensor by feeding the image_data as input to the graph.
    softmax_tensor = sess.graph.get_tensor_by_name('softmax:0')

    print("Warm-up start")
    for i in range(10):
      print("Warm-up for time {}".format(i))
      predictions = sess.run(softmax_tensor,
                             {'DecodeJpeg/contents:0': warm_up_image_data})
    print("Warm-up finished")

    # record the start time of the actual prediction
    start_time = time.time()
    predictions = sess.run(softmax_tensor,
                           {'DecodeJpeg/contents:0': image_data})
    predictions = np.squeeze(predictions)

    # Creates node ID --> English string lookup.
    node_lookup = NodeLookup()

    top_k = predictions.argsort()[-FLAGS.num_top_predictions:][::-1]
    for node_id in top_k:
      human_string = node_lookup.id_to_string(node_id)
      score = predictions[node_id]
      print('%s (score = %.5f)' % (human_string, score))
    print("Prediction used time:{} S".format(time.time() - start_time))

The code we added ourselves consists of the following parts:

# the image used to warm-up TensorFlow model
warm_up_image_data = tf.gfile.FastGFile('/root/tensorflow-related/test-images/ubike.jpg', 'rb').read()

Here a different image is used to warm up the model (not the same image as the one used for the real prediction); its path is hardcoded for simplicity.
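If hardcoding is undesirable, the warm-up image path could instead be taken from a command-line flag. This is a hypothetical refactor, not part of classify_image.py, sketched here with the standard argparse module and the article's hardcoded path as the default:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    '--warmup_image',
    default='/root/tensorflow-related/test-images/ubike.jpg',
    help='Image used only to warm up the model, not for the real prediction.')

# parse_args([]) so the sketch runs standalone; a real script would call
# parser.parse_args() to read sys.argv.
args = parser.parse_args([])
print(args.warmup_image)
```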


print("Warm-up start")
for i in range(10):
  print("Warm-up for time {}".format(i))
  predictions = sess.run(softmax_tensor,
                         {'DecodeJpeg/contents:0': warm_up_image_data})
print("Warm-up finished")

This loop warms up the model 10 times.


# record the start time of the actual prediction
start_time = time.time()
# (code in between omitted)
print("Prediction used time:{} S".format(time.time() - start_time))

This prints the execution time (in seconds) of the real prediction on one image. That number is what we actually care about: how low can it go?


『3』 Test results

Running the same command as in the previous article produces the following output:

/usr/lib/python3.5/site-packages/tensorflow/python/ops/array_ops.py:1750: VisibleDeprecationWarning: converting an array with ndim > 0 to an index will result in an error in the future

result_shape.insert(dim, 1)

Warm-up start

Warm-up for time 0

W tensorflow/core/framework/op_def_util.cc:332] Op BatchNormWithGlobalNormalization is deprecated. It will cease to work in GraphDef version 9. Use tf.nn.batch_normalization().

Warm-up for time 1

Warm-up for time 2

Warm-up for time 3

Warm-up for time 4

Warm-up for time 5

Warm-up for time 6

Warm-up for time 7

Warm-up for time 8

Warm-up for time 9

Warm-up finished

mountain bike, all-terrain bike, off-roader (score = 0.56671)

tricycle, trike, velocipede (score = 0.12035)

bicycle-built-for-two, tandem bicycle, tandem (score = 0.08768)

lawn mower, mower (score = 0.00651)

alp (score = 0.00387)

Prediction used time: 4.141446590423584 Seconds


As you can see, after 10 warm-up runs a single prediction takes 4.14 seconds. Four-plus seconds still falls short of the speed we would ideally like, but it is a huge improvement over the earlier 50 seconds.

The test output also shows that the first few warm-up calls to Session.run() are especially slow and the later ones are much faster, so too few warm-up iterations will not do.
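One way to act on that observation, instead of fixing the iteration count at 10, is to keep warming up until the per-call time stops improving. The warm_up_until_stable helper below is a hypothetical sketch, not something from classify_image.py:

```python
import time

def warm_up_until_stable(run_once, max_iters=10, tolerance=0.5):
    """Call run_once() repeatedly until a call is no longer substantially
    faster than the previous one (i.e. the speed-up has flattened out),
    or max_iters is reached. Returns the number of calls made."""
    prev = None
    for i in range(1, max_iters + 1):
        start = time.time()
        run_once()
        elapsed = time.time() - start
        # stop once this call is not at least `tolerance` faster than the last
        if prev is not None and elapsed >= prev * (1.0 - tolerance):
            return i
        prev = elapsed
    return max_iters

# Toy workload: the first call sleeps to mimic one-time initialization.
state = {'first': True}
def fake_predict():
    if state['first']:
        time.sleep(0.2)
        state['first'] = False

calls = warm_up_until_stable(fake_predict)
print("model considered warm after {} call(s)".format(calls))
```

In classify_image.py, run_once would be a small wrapper around the sess.run(...) warm-up call shown earlier.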

