2018年1月26日 星期五

Point Matching Algorithm for Rigid Body


因為在工作上遇到影像對位的問題,此演算法可以很好的解決(雖然不是100%解決)所以特別將此演算法紀錄下來。這也是我第一次將paper 的演算法用code 從0開始完成實踐。演算法是參考這篇paper "New algorithms for 2d and 3d point matching:: pose estimation and correspondence"

1.問題描述

Network switch (Fig.1)代工廠在機台出場前,要將機台上的燈號全數打開做檢測,因為燈太多,用人眼檢測有困難,所以想用影像自動檢測。檢測的目標很單純:確認每個燈號的亮度、顏色是否異常 (燈不亮、顏色異常)。而我的演算法主要分成1)前處理:標出每個燈號亮點在圖上的位置. 2)對位:找出亮光與機台燈的對應關係(該亮點屬於機台上哪個燈號)。前處理沒什麼困難,只要機台正面沒有反光,用threshold就能將燈號凸顯出來(Fig.2)。但對位就沒這麼容易
Fig.1 Network Switch



Fig.2  前處理threshold 後,亮燈凸顯出來

2.為何需要對位?

如果每次前處理的圖都像Fig.2 這麼完美,沒有燈是不亮的(簡稱Off),也沒有其他不是燈號的光被抓出來(之後簡稱漏光),那其實只要把每個亮點從上往下,從左往右照順序排,就可以知道應該對應機台上哪個位置的燈號。但實際情況會有燈故障(不然幹嘛要檢測)或者會有一些漏光必須排除,只要一發生就無法對位,所以最直接的方法是:將一個golden image (沒有Off 或漏光) 去跟檢測時的亮點作對位,位置最靠近哪個golden 亮點就標記為哪個燈號。


3.困難點在哪?

要兩張圖golden 跟 檢測(以後稱test)作對位,因為記錄golden 時其機台位置不可能剛好跟test時的機台位置完全相同,會有些許差異,像是1)位置差異 2)與camera 距離差異  3)擺放角度差異。反應在camera 2D影像上就有 平移(translation)、scaling、rotation、projection(1-point) 的差異。如下圖所示(平移好理解,就不表示了)



Camera距離差異造成image的縮放差異(scaling),不同方向的旋轉就產生rotation 跟 projection 的差異,所以單純將golden 跟test image 疊在一起,燈位置是不可能對在一起的,所以才用了point matching 演算法,自動修正golden test 的影像差異,得到燈號的對應關係。

4.Point matching Algorithm

此演算法可以將兩組Point set {X} {Y} 之間的 1)空間差異找出 2)同時將XY之間每個點的對應找出來,X point set 的每個點最多只能對應一個Y Point,反之一個Y Point 最多只能對應一個X Point。其作法是用一個Cost function 來量化XY point set之間matching 的程度,matching 的愈好function 愈小,matching愈差funciton 值愈大。然後最佳化Cost Function (找出最小值時的參數)。

4.1 Cost Function 組成

Cost Function組成為Two Point set {Xj} {Yk},空間差異修正 {A, t , fp},對應關係的matching matrix {mjk}:

J K is the number of points in the point set X and Y
t 是translation 修正. A包含了Scaling and rotation 修正,由 {a,  theta} 參數組合(paper 中 shear transform 我用不到,所以沒列)


,

fp 是project 修正,這是paper 上沒有的


p 是projection 參數. Ykx : x component of the k-th point of point set Y
這邊我只修正了x 方向上 projection 造成的差異,因為我的case points x 方向分佈範圍的遠比y方向大,所以修正x方向即可,如果其他case y 方向也需要修正,分母須再加上和外參數乘上y component of point。

.* 是element-wise 乘法

matching matrix element mjk 其實際意義是 j-th point in X 跟 k-th point in Y 之間相關性的機率,必須有以下限制

,,

在這限制下,1個point 在另一個point set 中最多只會對應到一個點(有可能一個點都沒有)

g(A) 是regularization term,讓參數 a , p 不會過大

α 當作是outlier threshold ,如果某點跟另一個point set 所有點的距離都大於α,就被當作是outlier。這非常重要,因為這可以排除我們不要的漏光



4.2 The Softassign

這個演算法最大的重點是matching matrix的決定,如上一段講到最終目標是a)matrix 的每一個element 不是0 就是1,b)且每個row 或 column 的合小於等於1。但如果一開始matching 時,matrix 的element 就只有0跟1兩個選擇,只要一開始有matching 錯誤產生,就沒有修正回來的可能。而Softassign 是讓matrix element 可以介於0~1之間,讓matching 變成連續性的,這樣matrix 決定過程中才更有彈性。

4.2.1 簡化問題,不考慮outlier 的存在,且J=K。
假設現在只看{Y}其中一個點,要對應到{X}裡距離最近的點,點跟點的間距為Qj,matching matrix 為mj,但要滿足∑mj = 1 條件。我們的目標是:

這也等於找到讓 ∑mjQj 會有最小值時的mj。現在我們來讓mj變成連續分佈

這是softmax function,可以確保mj 都是正數且總和為1,當Qj 愈小mj就愈接近1。
這邊的β用處是當其愈大,最小值Qj 對應的mj 會愈近於1,其他會愈近於0 。這在我們的演算法很重要,當一開始matching 混屯未明時,β設定較小,讓matching 的自由度較大,然後β隨著matching 的進行讓其越來越大,到matching 完成時,matching matrix 就會達到我們的目標,這個方法叫作deterministic annealing method。

現在更進一步,不只找一個點,而是point set {X}{Y} 中每個點的matching。mjk ∈{0,1} 且須滿足∀j ∑kmjk = 1 and ∀k ∑j mjk = 1。目標是要找下面式子的最小值時的mjk


方法跟{X}只有一點的情況有點不同,因為現在matrix 的每一個 row 跟 column 都要滿足總和為1,不能單純用一個softmax function 達到。必須先將mjk = exp(-β*Qjk),然後對row 跟 column 反覆normalize,直到matrix 中每一個element 都收斂。

我們可以看的上面E function 其實跟我們的Cost function 的型式很像,cost function 為
兩個式子一對照,Qjk 就等於


4.2.2 考慮outlier 的存在,且J 不等於K
現在我們把outlier 以及{X}{Y} 的點數不同的情況考慮進來。這情況下mjk的限制條件為不等式:∀j ∑kmjk ≤1 and ∀k ∑mjk ≤ 1 ,但softassign必須再等式下才能對row column 進行normalize ,matrix每個row 跟column 必須多加一個slack term element讓不等式就變成等式:

下面範例是一個2*3 matching matrix normalize 前後的樣子。Slack term 起始用1是有意義的。mjk = exp(-β*Qjk) 後,如果某個 row 或 column 全部的element 都小於slack term,經過normalize 後全部都會接近0,也就是找不到對應點,也就是outlier。那slack term 定成1的話 ,只要距離大於α 的兩點,其 Qjk > 0  ,然後 mjk < 1 也就是小於slack,就會被當成outlier,因此α就是outlier threshold 原因就在此 
j=2 就是所謂的outlier 

4.3 Algorithm step by step

下面將演算法每一步驟列出來

Initialize :θ, t, a and p to zero, β to β0, mjk slack term to 1, γ to γ0
Begin A: Do A until (β ≥ βf)
    Begin B: Do B until θ, t, a and p converge or number of iterations > I0
         Begin C : update matching matrix by softassign
         
             mjk = exp(-β*Qjk)
             Begin D : Do untial m converge or number of iterations > I1
                 Update m by normalizing across all rows and columns
             End D
          End C
          Begin E: update θ, t, a and p by coordinate descent
              update  θ  using analytical solution
              update  t   using analytical solution
              update  a  using Newton`s method
              update  p  using Newton`s method
          End E
     End B
     β = βr*β, γ = γ/βr
End A

與paper 不太一樣的是 Qjk 在paper 裡的計算跟我寫的有正負差異,因為我把負號寫在了exp() 裡了,所以最後結果跟paper 一樣。

 θ 的update equations paper 裡給的有錯誤,正確的equation 如下:
 paper 裡的分母+-號搞錯。這邊W = s(a)*Yk   。index 1,2 分別代表座標 x,y。

我的hyper-parameter 設定為 β0 = 0.00009, βr =1.05, βf = 0.1,  γ0 = 100,  α = 200, I0 = 100, I1 = 30. 這些是實驗結果試出最好的參數,其中最重要的是β0,設的太小自由度太高pose parameter 會跑掉,設的太大自由度太低pose parameter 無法變動太多導致沒有matching 效果。次重要的是βr 跟 βf,βr 在不影響結果下可以設大一點減少Step A 的iteration 次數來加快matching速度。βf 也一樣可以設小一點來加快速度,雖然小的βf 會讓matching 的mjk < 1 ,但只要設個threshold就行了,例如設0.6 為threshold ,只要mjk > 0.6 就設成1就行。γ對實驗結果沒什麼影響,可能是因為cost function 裡點間距貢獻要遠大於 γ regularized term 的貢獻吧。

5. Result

藍色為golden LED point,紅色為test 時的亮點。matching 後golden  完美的跟test 重疊,沒有重疊到的紅色點位就是漏光。所以此演算法可以很好的將兩個point set 自動去對位,不論是有outlier 或是有漏點,都不受影響。

matching 前

matching 後


6. 例外情況

此演算法雖然有效,但也不是萬能的,有些特殊狀況會造成對位失敗

6.1 對位前跟太多錯誤的點距離太近

例子如下,matching 前藍色點整個往右shift,剛好point set 是左右長狀分佈,所以跟很多不該對到位的點距離太近,導致跑演算法時被這些錯誤的對應關係(本該為0的mjk 太大)限制住而找不到最適合的 {θ, t, a, p}。解決辦法是先將兩point set 的平均位置移到相同處,再開始matching。
matching 前

matching 後










6.2 頭或尾缺點

這是我這種左右長條分佈會遇到的問題,如果左邊開頭有缺點,matching 時 a, p 會調過度讓兩point set 最左邊的點去對齊,反而讓中間的點沒對到,而被誤認是outlier。我認為原因是因為outlier 的數量不會被算進cost function 裡,所以當update a, p時,頭尾對齊的cost 會比較小(因為整體的 || Xj - Yk ||會較小 ),即使這樣會讓outlier 變多,因為cost 不管outlier 的數量。解決辦法是跑兩次演算法,第1次只更新 {θ, t},將結果當作第2次時 {θ, t}的起始值,並且將第二次時的β0 調大10倍,減少自由度以免a, p 跑太多。但這麼做的前題是兩個point ste 之間 a, p 差異不能太大,我這個case 機台位置不會差異太大,所以能用。 
matching 前

matching 後










6.3 不只一組最佳解

如果缺點後剛好有一組以上的{θ, t, a, p} 可以完美matching,就可能發生錯誤。如下圖,紅色point set 往左或右平移兩個點都可以完美matching,那結果會隨機找到一組最佳解。所以point 分佈太簡單,太有規律的分佈,很可能會發生這種狀況,目前還沒想到解法。不過我的case 目前還沒發生這種極端情況。





Source Code


from __future__ import print_function
import os
import numpy as np
import random
from six.moves import xrange
import math
from matplotlib import pylab
import time

class Point_match:
    
    def __init__(self):
        self.theta = 0  #rotation angle
        self.a = 0  #scaling parameter
        self.p = 0  #projecting parameter
        self.t = np.array([[0.0] , [300.0]]) #translation
        self.degree_to_radian = 3.1415926 / 180
        self.radian_to_degree = 180 / 3.1415926
        
        self.gamma_0 = 30  #initial regularization hyper-parameter
        self.beta_0 = 0.00009 # initial annealing hyper-parameter
        self.beta_f = 0.1  # final annealing hyper-parameter
        self.beta_r = 1.05 # annealing changing rate
        self.ep0 = 0.01 
        self.ep1 = 0.01  # error threshold for point_match step D
        self.ep2 = 0.005 # error threshold for point_match step B
        self.Get_A_matrix = {'Affine': self.Affine,'DA_Da': self.DA_Da,'DA_Da_2': self.DA_Da_2}
        
    def Affine (self,a,theta):
        radians = theta * self.degree_to_radian
        sa = np.array([[math.exp(a) , 0] , [ 0 , math.exp(a)]])
        s_rotate = np.array([[math.cos(radians) , -math.sin(radians)] , [ math.sin(radians) , math.cos(radians)]])
        return sa.dot(s_rotate)

    def DA_Da (self,a,theta):
        radians = theta * self.degree_to_radian
        d_sa = np.array([[math.exp(a) , 0] , [ 0 , math.exp(a)]])
        s_rotate = np.array([[math.cos(radians) , -math.sin(radians)] , [ math.sin(radians) , math.cos(radians)]])
        return d_sa.dot(s_rotate)
    
    def DA_Da_2 (self,a,theta):
        radians = theta * self.degree_to_radian
        d_sa_2 = np.array([[math.exp(a) , 0] , [ 0 , math.exp(a)]])
        s_rotate = np.array([[math.cos(radians) , -math.sin(radians)] , [ math.sin(radians) , math.cos(radians)]])
        return d_sa_2.dot(s_rotate)
    
    def project (self,p,Y):
        scale_low = 1e-4
        if len(Y.shape) == 1:
            assert np.amax(Y[0])*p*scale_low + 1. != 0, '1+p*Y 不能等於0'
            fp = np.ndarray(Y.shape,np.float64)
            for i in range (fp.shape[0]):
                fp[i] = 1./(1.+p*Y[0]*scale_low)
        elif len(Y.shape) == 2:
            assert np.amax(Y[0,:])*p*scale_low + 1. != 0, '1+p*Y 不能等於0'
            fp = np.ndarray(Y.shape,np.float64)
            for i in range (fp.shape[0]):
                fp[i,:] = 1./(1.+p*Y[0,:]*scale_low)
        else:
            print ('input Y not 1D or 2D array,return 1 !!')
            return 1
        return fp


    def D_project_Dp (self,p,Y):
        scale_low = 1e-4
        if len(Y.shape) == 1:
            assert np.amax(Y[0])*p*scale_low + 1. != 0, '1+p*Y 不能等於0'
            fp = np.ndarray(Y.shape,np.float64)
            for i in range (fp.shape[0]):
                fp[i] = -Y[0]*scale_low/( (1+p*Y[0]*scale_low)**2 )
        elif len(Y.shape) == 2:
            assert np.amax(Y[0,:])*p*scale_low + 1 != 0, '1+p*Y 不能等於0'
            fp = np.ndarray(Y.shape,np.float64)
            for i in range (fp.shape[0]):
                fp[i,:] = -Y[0,:]*scale_low/( (1+p*Y[0,:]*scale_low)**2 )
        else:
            print ('input Y not 1D or 2D array,return 1 !!')
            return 1
        return fp

    def D_project_Dp_2 (self,p,Y):
        scale_low = 1e-4
        if len(Y.shape) == 1:
            assert np.amax(Y[0])*p*scale_low + 1 != 0, '1+p*Y 不能等於0'
            fp = np.ndarray(Y.shape,np.float64)
            for i in range (fp.shape[0]):
                fp[i] = 2*Y[0]*Y[0]*scale_low*scale_low/( (1+p*Y[0]*scale_low)**3 )
        elif len(Y.shape) == 2:
            assert np.amax(Y[0,:])*p*scale_low + 1 != 0, '1+p*Y 不能等於0'
            fp = np.ndarray(Y.shape,np.float64)
            for i in range (fp.shape[0]):
                fp[i,:] = 2*Y[0,:]*Y[0,:]*scale_low*scale_low/( (1+p*Y[0,:]*scale_low)**3 )
        else:
            print ('input Y not 1D or 2D array,return 1 !!')
            return 1
        return fp
    
    def Cost_function(self,X,Y,M,t,a,theta,p,gamma,arpha):
        A = self.Get_A_matrix['Affine'](a,theta)
        Y_change = self.project (p,Y)*A.dot(Y) + t

        Y_k = np.ndarray((X.shape[1],Y.shape[1],Y.shape[0]))
        X_i = np.ndarray((Y.shape[1],X.shape[1],X.shape[0]))
        Y_k[:,:,:]=Y_change.T
        X_i[:,:,:]=X.T
        X_i = X_i.transpose(1,0,2)
        diff_ik = np.linalg.norm( (X_i-Y_k),axis = 2 )**2
        
        return np.sum(M*diff_ik) + gamma*(a**2 + p**2) - arpha * np.sum(M)
    
    def Cost_function_da1(self,X,Y,M,t,a,theta,p,gamma,arpha):
        A = self.Get_A_matrix['Affine'](a,theta)
        dA = self.Get_A_matrix['DA_Da'](a,theta)
        Y_change = self.project (p,Y)*A.dot(Y) + t
    
        Y_k = np.ndarray((X.shape[1],Y.shape[1],Y.shape[0]))
        X_i = np.ndarray((Y.shape[1],X.shape[1],X.shape[0]))
        dA_Y_k = np.ndarray((X.shape[1],Y.shape[1],Y.shape[0]))
        Y_k[:,:,:]=Y_change.T
        dA_Y_k[:,:,:] = (self.project (p,Y)*dA.dot(Y)).T
        X_i[:,:,:]=X.T
        X_i = X_i.transpose(1,0,2)
        diff_ik = np.sum( ( (X_i-Y_k)* dA_Y_k ),axis = 2 )
       
        return -2*np.sum(M*diff_ik) + gamma*2*a

    def Cost_function_da2(self,X,Y,M,t,a,theta,p,gamma,arpha):
        A = self.Get_A_matrix['Affine'](a,theta)
        dA = self.Get_A_matrix['DA_Da'](a,theta)
        dA2 = self.Get_A_matrix['DA_Da_2'](a,theta)
        Y_change = self.project (p,Y)*A.dot(Y) + t

        Y_k = np.ndarray((X.shape[1],Y.shape[1],Y.shape[0]))
        X_i = np.ndarray((Y.shape[1],X.shape[1],X.shape[0]))
        dA_Y_k = np.ndarray((X.shape[1],Y.shape[1],Y.shape[0]))
        dA2_Y_k = np.ndarray((X.shape[1],Y.shape[1],Y.shape[0]))
        Y_k[:,:,:]=Y_change.T
        dA_Y_k[:,:,:] = (self.project (p,Y)*dA.dot(Y)).T
        dA2_Y_k[:,:,:] = (self.project (p,Y)*dA2.dot(Y)).T
        X_i[:,:,:]=X.T
        X_i = X_i.transpose(1,0,2)
        diff_ik = np.sum( ( (X_i-Y_k)* dA2_Y_k - dA_Y_k**2 ),axis = 2 )

        return -2*np.sum(M*diff_ik) + gamma*2 
    
    def Cost_function_dp1(self,X,Y,M,t,a,theta,p,gamma,arpha):
        A = self.Get_A_matrix['Affine'](a,theta)
        Y_change = self.project (p,Y)*A.dot(Y) + t
        diff_ik = np.zeros(shape = [X.shape[1], Y.shape[1]])

        Y_k = np.ndarray((X.shape[1],Y.shape[1],Y.shape[0]))
        X_i = np.ndarray((Y.shape[1],X.shape[1],X.shape[0]))
        dA_Y_k = np.ndarray((X.shape[1],Y.shape[1],Y.shape[0]))
        Y_k[:,:,:]=Y_change.T
        dA_Y_k[:,:,:] = (self.D_project_Dp (p,Y)*A.dot(Y)).T
        X_i[:,:,:]=X.T
        X_i = X_i.transpose(1,0,2)
        diff_ik = np.sum( ( (X_i-Y_k)* dA_Y_k ),axis = 2 )
        
        return -2*np.sum(M*diff_ik) + gamma*2*p

    def Cost_function_dp2(self,X,Y,M,t,a,theta,p,gamma,arpha):
        A = self.Get_A_matrix['Affine'](a,theta)
        Y_change = self.project (p,Y)*A.dot(Y) + t

        Y_k = np.ndarray((X.shape[1],Y.shape[1],Y.shape[0]))
        X_i = np.ndarray((Y.shape[1],X.shape[1],X.shape[0]))
        dA_Y_k = np.ndarray((X.shape[1],Y.shape[1],Y.shape[0]))
        dA2_Y_k = np.ndarray((X.shape[1],Y.shape[1],Y.shape[0]))
        Y_k[:,:,:]=Y_change.T
        dA_Y_k[:,:,:] = (self.D_project_Dp (p,Y)*A.dot(Y)).T
        dA2_Y_k[:,:,:] = (self.D_project_Dp_2 (p,Y)*A.dot(Y)).T
        X_i[:,:,:]=X.T
        X_i = X_i.transpose(1,0,2)
        diff_ik = np.sum( ( (X_i-Y_k)* dA2_Y_k - dA_Y_k**2 ),axis = 2 )
        
        return -2*np.sum(M*diff_ik) + gamma*2
    
    def gradient_check_D1(self,X,Y,M,t,a,theta,p,gamma,arpha, epsilon = 1e-7):
        num_parameters = 2 
        parameters_values = np.array([a , p], np.float64)
        grad = np.zeros((num_parameters), np.float64)
        J_plus = np.zeros((num_parameters), np.float64)
        J_minus = np.zeros((num_parameters), np.float64)
        gradapprox = np.zeros((num_parameters), np.float64)


        grad[0] = self.Cost_function_da1(X,Y,M,t,a,theta,p,gamma,arpha)
        grad[1] = self.Cost_function_dp1(X,Y,M,t,a,theta,p,gamma,arpha)

        for i in range(num_parameters):
            param_plus = np.copy(parameters_values)
            print (param_plus)
            param_plus[i] = param_plus[i] +  epsilon
            J_plus[i] = self.Cost_function(X,Y,M,t,param_plus[0] ,theta,param_plus[1],gamma,arpha)

            param_minus = np.copy(parameters_values)
            param_minus[i] = param_minus[i] -  epsilon
            J_minus[i] = self.Cost_function(X,Y,M,t,param_minus[0] ,theta,param_minus[1],gamma,arpha)

            gradapprox[i] = (J_plus[i] - J_minus[i])  / (2*epsilon)


        numerator = np.linalg.norm(grad - gradapprox)                       # Step 1'
        denominator = np.linalg.norm(grad) + np.linalg.norm(gradapprox)     # Step 2'
        difference = numerator / denominator                                # Step 3'

        print (grad)
        print (gradapprox)

        if difference > 2e-7:
            print ("\033[93m" + "There is a mistake in first partial derivative! difference = " + str(difference) + "\033[0m")
        else:
            print ("\033[92m" + "First partial derivative works perfectly fine! difference = " + str(difference) + "\033[0m")

        return difference
    
    def gradient_check_D2(self,X,Y,M,t,a,theta,p,gamma,arpha, epsilon = 1e-7):
        num_parameters = 2 
        parameters_values = np.array([a , p], np.float64)
        grad = np.zeros((num_parameters), np.float64)
        J_plus = np.zeros((num_parameters), np.float64)
        J_minus = np.zeros((num_parameters), np.float64)
        gradapprox = np.zeros((num_parameters), np.float64)


        grad[0] = self.Cost_function_da2(X,Y,M,t,a,theta,p,gamma,arpha)
        grad[1] = self.Cost_function_dp2(X,Y,M,t,a,theta,p,gamma,arpha)

        J_plus[0] = self.Cost_function_da1(X,Y,M,t,a + epsilon,theta,p,gamma,arpha)
        J_minus[0] = self.Cost_function_da1(X,Y,M,t,a - epsilon,theta,p,gamma,arpha)

        J_plus[1] = self.Cost_function_dp1(X,Y,M,t,a,theta,p+ epsilon,gamma,arpha)
        J_minus[1] = self.Cost_function_dp1(X,Y,M,t,a,theta,p- epsilon,gamma,arpha)

        for i in range (num_parameters):
            gradapprox[i] = (J_plus[i] - J_minus[i])  / (2*epsilon)


        numerator = np.linalg.norm(grad - gradapprox)                       # Step 1'
        denominator = np.linalg.norm(grad) + np.linalg.norm(gradapprox)     # Step 2'
        difference = numerator / denominator                                # Step 3'

        print (grad)
        print (gradapprox)

        if difference > 2e-7:
            print ("\033[93m" + "There is a mistake in second partial derivative! difference = " + str(difference) + "\033[0m")
        else:
            print ("\033[92m" + "Second partial derivative works perfectly fine! difference = " + str(difference) + "\033[0m")

        return difference
    
    def get_new_theta(self,X,Y,M,t,a,p):
        A = self.Get_A_matrix['Affine'](a,0)
        W = self.project (p,Y)*A.dot(Y)

        W_k = np.ndarray((X.shape[1],Y.shape[1],Y.shape[0]))
        X_i = np.ndarray((Y.shape[1],X.shape[1],X.shape[0]))
        W_k[:,:,:]=W.T
        X_i[:,:,:]=(X-t).T
        X_i = X_i.transpose(1,0,2)

        sin_ = np.sum( M*( X_i[:,:,1] * W_k[:,:,0] - X_i[:,:,0] * W_k[:,:,1] ) )
        cos_ = np.sum( M*np.sum(X_i*W_k,axis=2))

        theta = math.atan(sin_/(cos_ + 1e-7)) * self.radian_to_degree
           
        return theta
    
    def get_new_t(self,X,Y,M,a,theta,p):
        A = self.Get_A_matrix['Affine'](a,theta)
        Y_change = self.project (p,Y)*A.dot(Y)


        Y_k = np.ndarray((X.shape[1],Y.shape[1],Y.shape[0]))
        X_i = np.ndarray((Y.shape[1],X.shape[1],X.shape[0]))
        M_ik = np.ndarray((X.shape[0],M.shape[0],M.shape[1]))
        Y_k[:,:,:]=Y_change.T
        X_i[:,:,:]=X.T
        X_i = X_i.transpose(1,0,2)
        M_ik[:,:,:] = M
        M_ik = M_ik.transpose(1,2,0)

        Diff = np.sum( ( M_ik * (X_i-Y_k) ),axis = 1 )
        Diff = np.sum( ( Diff ),axis = 0 ).reshape((-1,1))
        
        Diff = Diff / ( np.sum(M) + 1e-7 )
        
        return Diff
    
    def get_M_matrix(self,X,Y,t,a,theta,p,arpha,beta):
        A = self.Get_A_matrix['Affine'](a,theta)
        Y_change = self.project (p,Y)*A.dot(Y) + t

        Y_k = np.ndarray((X.shape[1],Y.shape[1],Y.shape[0]))
        X_i = np.ndarray((Y.shape[1],X.shape[1],X.shape[0]))
        Y_k[:,:,:]=Y_change.T
        X_i[:,:,:]=X.T
        X_i = X_i.transpose(1,0,2)
        M_matrix = np.exp( beta * ( arpha - np.linalg.norm( (X_i-Y_k),axis = 2 )**2 ) )

        return M_matrix

    def soft_assign (self,M_matrix,epsilon = 2e-2):
        sl_M_matrix = np.insert(M_matrix,M_matrix.shape[1],1+epsilon,axis=1)
        sl_M_matrix = np.insert(sl_M_matrix,M_matrix.shape[0],1+epsilon,axis=0)
        sl_M_matrix_pre = np.copy(sl_M_matrix)
        for i in range(30):
            #print (sl_M_matrix)
            sum_axis_0 = np.sum(sl_M_matrix,axis=0)
            sum_axis_0[-1] = 1
            sl_M_matrix = sl_M_matrix / ( sum_axis_0 + 1e-7 )
            #print ( sl_M_matrix )
            sum_axis_1 = np.reshape(np.sum(sl_M_matrix,axis=1), (-1, 1)) 
            sum_axis_1[-1][0] = 1
            sl_M_matrix = sl_M_matrix / ( sum_axis_1 + 1e-7 )

            diff = np.sum( np.abs(sl_M_matrix_pre[0:-1,0:-1] - sl_M_matrix[0:-1,0:-1]))
            if diff < epsilon :
                break
            sl_M_matrix_pre = sl_M_matrix
        return sl_M_matrix[0:-1,0:-1]
    
    def update_pose_Bynewton(self,X,Y,M,t,a,theta,p,gamma,arpha,mode = True):
    
        u_theta = self.get_new_theta(X,Y,M,t,a,p)
        u_t = self.get_new_t(X,Y,M,a,u_theta,p)
        
        u_a = a
        u_p = p

        ax0 = 5
        px0 = 0
        count_a = 0
        count_p = 0

        if mode:
            for i in range(50):
                dv1 = self.Cost_function_da1(X,Y,M,u_t,ax0,u_theta,u_p,gamma,arpha)
                dv2 = self.Cost_function_da2(X,Y,M,u_t,ax0,u_theta,u_p,gamma,arpha)
                u_a = ax0 - dv1/(dv2+1e-5)
               
                if ( abs(u_a) > 200 ):
                    u_a = 0
                    break

                if ( abs((ax0 - u_a)/(u_a+1e-5) ) < 1e-4 ):
                    ax0 = u_a
                    break
                ax0 = u_a

            for i in range(50):
                dv1 = self.Cost_function_dp1(X,Y,M,u_t,u_a,u_theta,px0,gamma,arpha)
                dv2 = self.Cost_function_dp2(X,Y,M,u_t,u_a,u_theta,px0,gamma,arpha)
                u_p = px0 - dv1/(dv2+1e-5)
                if ( abs((px0 - u_p)/(u_p+1e-5) ) < 1e-4 ):
                    px0 = u_p
                    break
                px0 = u_p


            if (abs(u_p) > 0.2):
                u_p = 0
                
        return u_theta, u_t, u_a,u_p
    
    def point_matching(self,X,Y,arpha, mode = True):
        # X,Y : point set. Y will be matched to X
        # arpha : outlier threshold
        # mode: false when rigid transform only (no scaling and project), true when all transform
        
        u_t0 = self.t
        u_a0,u_theta0,u_p0 = self.a,self.theta,self.p
        beta = self.beta_0
        gamma = self.gamma_0
        M = np.ndarray((X.shape[1],Y.shape[1]))
        #count_while = 0
        while (beta < self.beta_f): #stepA
            #count_stepB = 0
            count_inerror = 0
            for i in range(100): # stepB
                
                #stepC and stepD
                M = self.soft_assign(self.get_M_matrix(X,Y,u_t0,u_a0,u_theta0,u_p0,arpha,beta), self.ep1)
               
                #stepE
                u_theta, u_t, u_a,u_p = self.update_pose_Bynewton(X,Y,M,u_t0,u_a0,u_theta0,u_p0,gamma,arpha,mode)

                para_error_t = np.sum(abs(u_t0-u_t))
                para_error =abs(u_a0 - u_a)  + abs(u_theta0 - u_theta) + abs(u_p0 - u_p)
                u_t0,u_a0,u_theta0,u_p0 = u_t,u_a,u_theta, u_p
                #count_stepB = count_stepB + 1
                if (para_error < self.ep2 and para_error_t < 1):
                    count_inerror = count_inerror + 1
                    if (count_inerror > 0):
                        break

            beta = beta*self.beta_r
            gamma = gamma*self.beta_r
            #count_while = count_while+1
          
        A  = self.Get_A_matrix['Affine'](u_a0,u_theta0)
        Y_ = (self.project (u_p0,Y)*A.dot(Y)) + u_t0
        
        return Y_,M,u_t0,u_theta0,u_a0,u_p0 # Y_: Y point set after matching, M:matching matrix


def draw_XY_withM(X,Y,match_XtoY):
    name_X = []
    name_Y = []
    for i in match_XtoY:
        name_X.append(str(i+1))

    for i in range(Y.shape[1]):
        name_Y.append(str(i+1))
    #print (name)

    pylab.figure(figsize=(30,10)) #figsize=(width,height) by inch
    pylab.ylim([0,3000])
    pylab.xlim([0,4000])
    for i, label in enumerate(name_X):
        x,y=X[:,i]
        pylab.scatter(x,y,color=['red']) #plot scatter point
        pylab.annotate(label, xy=(x, y), xytext=(5, 2), textcoords='offset points',ha='right', va='bottom') #label on each point


    for i, label in enumerate(name_Y):
        x,y=Y[:,i]
        pylab.scatter(x,y,color=['blue']) #plot scatter point
        pylab.annotate(label, xy=(x, y), xytext=(5, 2), textcoords='offset points',ha='right', va='bottom') #label on each point

    pylab.show()
    
def draw_XY(X,Y):
    name_X = []
    name_Y = []
    for i in range(X.shape[1]):
        name_X.append(str(i+1))

    for i in range(Y.shape[1]):
        name_Y.append(str(i+1))
    #print (name)

    pylab.figure(figsize=(30,10)) #figsize=(width,height) by inch
    pylab.ylim([0,3000])
    pylab.xlim([0,4000])
    for i, label in enumerate(name_X):
        x,y=X[:,i]
        pylab.scatter(x,y,color=['red']) #plot scatter point
        pylab.annotate(label, xy=(x, y), xytext=(5, 2), textcoords='offset points',ha='right', va='bottom') #label on each point


    for i, label in enumerate(name_Y):
        x,y=Y[:,i]
        pylab.scatter(x,y,color=['blue']) #plot scatter point
        pylab.annotate(label, xy=(x, y), xytext=(5, 2), textcoords='offset points',ha='right', va='bottom') #label on each point

    pylab.show()

X = np.array([[ 325,332,362,492,542, 2872,2899,2926, 2952, 3007, 3034, 3061, 3087, 3145,
              3172, 3199, 3225,  536,  602,  622,  687,  709,  775,  796,  863,  884,  951,  973,
              1041, 1062, 1130, 1153, 1221, 1287, 1355, 1378, 1448, 1470, 1540, 1562, 1632, 1654,
              1724, 1748, 1818, 1841, 1911, 1934, 2004, 2070, 2141, 2164, 2235, 2258, 2327, 2350,
              2419, 2443, 2513, 2536, 2605, 2627, 2697, 2719, 2789,  654, 2873, 2899, 2928, 2954,
              3010, 3036, 3064, 3090, 3148, 3174, 3202, 3228],
             [ 279,  291,  318,  327,  330,  363,  364,  365,  367,  367,  368,  369,  371,  373,
               374,  376,  377,  401,  397,  398,  394,  395,  392,  392,  389,  390,  387,  388,
               385,  385,  383,  384,  382,  382,  380,  380,  379,  379,  378,  378,  377,  378,
               376,  377,  376,  377,  376,  377,  376,  378,  376,  378,  378,  380,  378,  380,
               380,  381,  381,  384,  383,  384,  385,  388,  387,  572,  573,  573,  574,  575,
               576,  577,  577,  579,  581,  581,  581,  584]])

Y = np.array([[ 773,  840,  861,  927,  948, 1015, 1036, 1102, 1123, 1194, 1215, 1281, 1302, 1373,
              1394, 1463, 1527, 1598, 1619, 1690, 1711, 1781, 1804, 1873, 1896, 1965, 1990, 2056,
              2081, 2150, 2173, 2244, 2311, 2379, 2402, 2469, 2494, 2563, 2586, 2654, 2677, 2744,
              2769, 2838, 2856, 2925, 2950, 3017, 3098, 3125, 3150, 3177, 3231, 3254, 3283, 3306,
              3365, 3392, 3417, 3444, 3098, 3125, 3152, 3179, 3234, 3256, 3286, 3308, 3369, 3394,
              3421, 3446],
             [ 462,  459,  459,  456,  456,  453,  453,  453,  453,  453,  453,  450,  450,  450,
               450,  450,  450,  447,  450,  447,  450,  447,  450,  447,  450,  447,  450,  450,
               450,  450,  450,  450,  453,  453,  456,  456,  456,  459,  459,  462,  462,  462,
               464,  464,  467,  467,  470,  470,  445,  445,  450,  450,  450,  450,  456,  456,
               459,  462,  462,  462,  656,  656,  656,  656,  659,  659,  659,  662,  662,  664,
               664,  667,]])

draw_XY(X,Y)

arpha = 200
PM = Point_match()
Y_,M,u_t,u_theta, u_a,u_p = PM.point_matching(X,Y,arpha,mode = True)


match_YtoX = np.argmax(np.insert(M>0.6,0,0,axis=0),axis=0 )-1
match_XtoY = np.argmax(np.insert(M>0.6,0,0,axis=1),axis=1 )-1

draw_XY_withM(X,Y_,match_XtoY)
P_diff = X[:,match_YtoX[match_YtoX>=0]] - Y_[:,match_YtoX>=0]
print ('Point lose number = %d'%(Y.shape[1] - P_diff.shape[1]))
print ('Point mathcing number = %d'%(P_diff.shape[1]))
print ('point matching difference = %f' %(np.linalg.norm( P_diff )) )