zonghan's Programming Notes: TensorFlow Usage Notes

Types of input data

tf.Variable

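Typically used for values in a model that are updated repeatedly, such as weights that need training, or the memory in an LSTM, which is not trained but still has to be updated.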

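Creation:

Declare a variable as follows:
w = tf.Variable(<value>, name=<optional-name>)
name is the name of the variable and is optional.
value is the initial data, given as a tensor.
This only declares the variable; its data has not been initialized yet.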

Initialization:

Before running any data, the variables must be initialized. I have used two ways to do this. The first is to call TensorFlow's built-in function tf.global_variables_initializer():
ex:
with tf.Session() as session:
    tf.global_variables_initializer().run()
    ........

The other way is to load previously saved model parameters:
ex:
saver = tf.train.Saver()
saved_file_path = "...saved model path..."
with tf.Session() as sess:
    # Restore variables from disk.
    ckpt = tf.train.get_checkpoint_state(saved_file_path)
    if ckpt and ckpt.model_checkpoint_path:
        saver.restore(sess, ckpt.model_checkpoint_path)

tf.placeholder

Data of this type must be fed in manually at run time. It is typically used for a model's input data.

tf.placeholder(dtype, shape=None, name=None)

dtype: data type of the elements

shape: dimensions of the data

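Feeding data:

When calling session run, the placeholder must be given data, for example:
import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=(1024, 1024))
y = tf.matmul(x, x)
with tf.Session() as sess:
    rand_array = np.random.rand(1024, 1024)
    print(sess.run(y, feed_dict={x: rand_array}))

    # or using eval(), which uses the default session set by the with block
    print(y.eval(feed_dict={x: rand_array}))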

The data is passed through feed_dict as a Python dict. Each key is a placeholder (here the variable x) and each value is the data to feed into it.

tf.constant

The data must be supplied at declaration time and cannot be changed afterwards.

tf.constant(value, dtype=None, shape=None, name='Const')

value: the input data, either a scalar or a tensor. A NumPy array can also be passed.

ex.

import numpy as np
import tensorflow as tf

def twoD_array_random(dim0, dim1):
    # Return a dim0 x dim1 array drawn uniformly from [-0.1, 0.1)
    assert dim0 > 0, 'dim0 need > 0'
    assert dim1 > 0, 'dim1 need > 0'
    b = np.random.uniform(-0.1, 0.1, size=[dim0, dim1])
    return b

np_ix = twoD_array_random(5, 5)

sess = tf.InteractiveSession()
ix = tf.constant(np_ix)
print(ix.eval())
sess.close()
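One detail worth checking in an example like this is the element type. np.random.uniform produces float64 data, and tf.constant takes its dtype from the NumPy array when no dtype argument is given, so the constant would not match float32 tensors elsewhere in a graph. The sketch below uses NumPy only, and np_ix32 is a name introduced here for illustration. It shows one way to inspect and cast the array before handing it to tf.constant.

```python
import numpy as np

# Same kind of array as in the example above
np_ix = np.random.uniform(-0.1, 0.1, size=[5, 5])
print(np_ix.dtype)    # float64

# Cast a copy to float32 (np_ix32 is a hypothetical name for this sketch)
np_ix32 = np_ix.astype(np.float32)
print(np_ix32.dtype)  # float32
```

Passing dtype=tf.float32 to tf.constant is an alternative to casting on the NumPy side.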

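Debugging tips

TensorFlow works by first defining every data-flow path to form a "graph", and only then calling a session to run actual data through it. So while writing the graph, if you want to check whether the data produced at some node is correct, you must open a session and call .eval() on that node before TensorFlow computes the result. Printing the node directly does not show its values. Here is an example: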

sess = tf.InteractiveSession()
inp = tf.constant(np_input)
xx = tf.constant(np_conca_x)
mm = tf.constant(np_conca_m)
bb = tf.constant(np_conca_b)
saved_output = tf.constant(np_saved_output)
saved_state = tf.constant(np_saved_state)

temp = tf.matmul(inp, xx) + tf.matmul(saved_output, mm) + bb
i_gate, f_gate, update, o_gate = tf.split(temp, num_or_size_splits=4, axis=1)
saved_state = saved_state * tf.sigmoid(f_gate) + tf.sigmoid(i_gate) * tf.tanh(update)
saved_output = tf.tanh(saved_state) * tf.sigmoid(o_gate)
print(saved_state.eval())
print(saved_output.eval())

sess.close()

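When debugging, you can open a session at the very top with "sess = tf.InteractiveSession()". In the example I want to know the final values of saved_state and saved_output, so I call eval() on them to run the data and then print the result. If the inputs are constants, eval() needs no arguments. If they are placeholders, the data is something you feed in yourself, so eval() must be given a feed_dict, as in y.eval(feed_dict={x: rand_array}).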
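When checking values like these, it can help to have an independent reference to compare against. The following is a minimal NumPy sketch of the same LSTM step, not the author's code. The array sizes, the random inputs, and the helper names sigmoid and lstm_step are assumptions made for illustration, since the original does not show how np_input and the other arrays are built.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(inp, xx, mm, bb, saved_output, saved_state):
    # Same arithmetic as the graph above, written with NumPy
    temp = inp @ xx + saved_output @ mm + bb
    i_gate, f_gate, update, o_gate = np.split(temp, 4, axis=1)
    new_state = saved_state * sigmoid(f_gate) + sigmoid(i_gate) * np.tanh(update)
    new_output = np.tanh(new_state) * sigmoid(o_gate)
    return new_output, new_state

# Hypothetical sizes and inputs, chosen only so the sketch runs
batch, n_in, n_hidden = 2, 3, 4
rng = np.random.RandomState(0)
np_input = rng.uniform(-0.1, 0.1, size=(batch, n_in))
np_conca_x = rng.uniform(-0.1, 0.1, size=(n_in, 4 * n_hidden))
np_conca_m = rng.uniform(-0.1, 0.1, size=(n_hidden, 4 * n_hidden))
np_conca_b = np.zeros((1, 4 * n_hidden))
np_saved_output = np.zeros((batch, n_hidden))
np_saved_state = np.zeros((batch, n_hidden))

ref_output, ref_state = lstm_step(np_input, np_conca_x, np_conca_m,
                                  np_conca_b, np_saved_output, np_saved_state)
print(ref_state.shape, ref_output.shape)
```

The arrays returned by saved_state.eval() and saved_output.eval() could then be compared with ref_state and ref_output using np.allclose.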

Gradient update for Optimizer

In TensorFlow, a loss function is optimized by calling an optimizer. Optimizers implement many different algorithms; GradientDescent is used as the example here:

ex1.

optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(loss,global_step=global_step)

This is the simpler way: the gradients are applied directly to the variables. If the gradients need extra processing, the following example can be used:

ex2:

optimizer = tf.train.GradientDescentOptimizer(learning_rate)
gradients, v = zip(*optimizer.compute_gradients(loss))  # split gradients and variables
gradients, _ = tf.clip_by_global_norm(gradients, 1.25)  # clip gradients to keep them from exploding
optimizer = optimizer.apply_gradients(zip(gradients, v), global_step=global_step)

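The above is from an LSTM. Because the gradients may become too large, they are not applied right after being computed. Instead tf.clip_by_global_norm rescales them so that their global norm stays below a fixed size, and only then are they applied to the parameters.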
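To make the clipping step concrete, here is a small NumPy sketch of the rescaling that tf.clip_by_global_norm is documented to perform: every gradient is multiplied by clip_norm / max(global_norm, clip_norm), where global_norm is the L2 norm taken over all gradients together. The function below is a stand-in written for illustration and the gradient values are made up.

```python
import numpy as np

def clip_by_global_norm(grads, clip_norm):
    """Rescale a list of gradient arrays so their joint L2 norm is at most clip_norm."""
    global_norm = np.sqrt(sum(np.sum(g ** 2) for g in grads))
    scale = clip_norm / max(global_norm, clip_norm)
    return [g * scale for g in grads], global_norm

# Made-up gradients: joint norm is sqrt(9 + 16 + 144) = 13
grads = [np.array([3.0, 4.0]), np.array([[12.0]])]
clipped, norm_before = clip_by_global_norm(grads, clip_norm=1.25)
norm_after = np.sqrt(sum(np.sum(g ** 2) for g in clipped))
print(norm_before, norm_after)

# Gradients whose joint norm is already below clip_norm are left unchanged
small = [np.array([0.3, 0.4])]
small_clipped, _ = clip_by_global_norm(small, clip_norm=1.25)
```

Because all gradients share one scale factor, their relative directions are preserved, which is the usual reason to prefer this over clipping each tensor separately.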

Other functions

split

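Splits a tensor into multiple tensors

tf.split(value, num_or_size_splits, axis=0, num=None, name='split')

value: the tensor to split

num_or_size_splits: if a scalar ("num"), the number of equal-sized tensors to split into. If a tensor or list ("size_splits"), the size of each output tensor along axis.

axis: the dimension along which to split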

Example from the TensorFlow website:

# 'value' is a tensor with shape [5, 30]
# Split 'value' into 3 tensors with sizes [4, 15, 11] along dimension 1
split0, split1, split2 = tf.split(value, [4, 15, 11], 1)
tf.shape(split0)  # ==> [5, 4]
tf.shape(split1)  # ==> [5, 15]
tf.shape(split2)  # ==> [5, 11]

# Split 'value' into 3 tensors along dimension 1
split0, split1, split2 = tf.split(value, num_or_size_splits=3, axis=1)
tf.shape(split0)  # ==> [5, 10]
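For checking shapes without building a graph, the same split can be reproduced in NumPy. One difference to keep in mind: tf.split takes the size of each piece, while np.split takes the positions at which to cut. The sketch below assumes a zero-filled array standing in for 'value', and cut_points is a name introduced here.

```python
import numpy as np

value = np.zeros((5, 30))  # stand-in for the tensor in the example above
sizes = [4, 15, 11]

# np.split expects cut positions, so convert the sizes with a running sum
cut_points = np.cumsum(sizes)[:-1]  # array([ 4, 19])
split0, split1, split2 = np.split(value, cut_points, axis=1)
print(split0.shape, split1.shape, split2.shape)  # (5, 4) (5, 15) (5, 11)

# An integer argument splits into that many equal parts, as in tf.split
equal_parts = np.split(value, 3, axis=1)
print(equal_parts[0].shape)  # (5, 10)
```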

concat

Joins tensors into a single tensor

tf.concat(values, axis, name='concat')

values: a list of tensors (the tensors to join are passed as a list)

axis: the dimension along which to join

example:

# tensor1, tensor2, tensor3 with shape [5, 4], [5, 15], [5, 11]
tensor = tf.concat([tensor1, tensor2, tensor3], axis=1)
tf.shape(tensor)  # ==> [5, 30]
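The shape rule behind this example is that all dimensions except the one being joined must match. A quick NumPy sketch, using np.concatenate as a stand-in for tf.concat and made-up arrays, illustrates both the successful case and the mismatch case:

```python
import numpy as np

# Made-up arrays with the shapes used in the example above
tensor1 = np.full((5, 4), 1.0)
tensor2 = np.full((5, 15), 2.0)
tensor3 = np.full((5, 11), 3.0)

# Joining along axis 1 works because every array has 5 rows
tensor = np.concatenate([tensor1, tensor2, tensor3], axis=1)
print(tensor.shape)  # (5, 30)

# Joining along axis 0 fails because the column counts differ
try:
    np.concatenate([tensor1, tensor2, tensor3], axis=0)
    axis0_ok = True
except ValueError:
    axis0_ok = False
print(axis0_ok)  # False
```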

Thursday, June 8, 2017
