Detecting Coin Edges with the Canny Algorithm in Python

Updated: 2022-01-20 15:45:45  Author: 十木

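This article shows how to use the Canny algorithm to detect the edge of a coin lying on a sheet of paper. The example code is explained step by step, so interested readers can follow along and try it themselves.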

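1. Background

A one-yuan coin lies on a sheet of paper. Can you find the equation of its outline with the help of Canny and Hough?

To determine the equation of a circle, we first need to detect its edge, and then work out its position on the paper and its radius.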
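For reference, the equation in question is the standard form of a circle. The symbols a, b and r are labels introduced here for illustration; they are not names used in the original article.

```latex
(x - a)^2 + (y - b)^2 = r^2
```

Here (a, b) would be the center of the coin on the paper and r its radius. Edge detection supplies candidate points (x, y) on the outline; estimating a, b and r from those points is the part left to the Hough step.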

In this article we use the Canny algorithm to detect the edge of the coin on the paper.

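2. The Canny algorithm

Canny extracts the edges of objects in an image. It consists of the following steps:

Apply Gaussian smoothing
Compute the image gradient (recording its magnitude and direction)
Apply non-maximum suppression
Apply hysteresis edge tracking

After these four steps, the extracted edge of the coin on the paper looks like this:

(1) Gaussian smoothing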

import numpy as np
import torch
import torch.nn as nn
from scipy.signal.windows import gaussian  # 1-D Gaussian window

# init_parameter(layer, weight, bias) is the author's helper from the full code (not shown here)

class GaussianSmoothingNet(nn.Module):
    def __init__(self) -> None:
        super(GaussianSmoothingNet, self).__init__()

        filter_size = 5
        # Gaussian kernel of shape (1, 5) with standard deviation 1.0
        generated_filters = gaussian(filter_size,std=1.0).reshape([1,filter_size])

        # GFH (GFV): Gaussian filter for the horizontal (vertical) direction
        self.GFH = nn.Conv2d(1, 1, kernel_size=(1,filter_size), padding=(0,filter_size//2))
        self.GFV = nn.Conv2d(1, 1, kernel_size=(filter_size,1), padding=(filter_size//2,0))

        # Set the weights w to the Gaussian kernel and the bias b to 0.0
        init_parameter(self.GFH, generated_filters, np.array([0.0]))
        init_parameter(self.GFV, generated_filters.T, np.array([0.0]))

    def forward(self, img):
        img_r = img[:,0:1]  # split out the R, G and B channels
        img_g = img[:,1:2]
        img_b = img[:,2:3]

        # Filter each of the three channels horizontally, then vertically
        blurred_img_r = self.GFV(self.GFH(img_r))
        blurred_img_g = self.GFV(self.GFH(img_g))
        blurred_img_b = self.GFV(self.GFH(img_b))

        # Merge the channels back into one image of shape (N, 3, H, W)
        blurred_img = torch.cat([blurred_img_r, blurred_img_g, blurred_img_b], dim=1)

        return blurred_img

After Gaussian smoothing (blurring) the image is softer than the original, as the coin on the right of the figure below shows.

Full code: gaussian_smoothing
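The network above blurs with a 1 x 5 kernel followed by a 5 x 1 kernel instead of one 5 x 5 kernel. The NumPy sketch below illustrates why that is equivalent. The helper names gaussian_window and filter_2d are hypothetical and written for this illustration; gaussian_window is meant to mirror the unnormalized window that SciPy's gaussian returns.

```python
import numpy as np

def gaussian_window(size: int, std: float) -> np.ndarray:
    """Unnormalized 1-D Gaussian window whose center sample equals 1 (odd size)."""
    n = np.arange(size) - (size - 1) / 2.0
    return np.exp(-0.5 * (n / std) ** 2)

def filter_2d(img: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Zero-padded cross-correlation with an odd-sized kernel, same output size."""
    kh, kw = kernel.shape
    h, w = img.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros((h, w), dtype=float)
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

img = np.arange(30, dtype=float).reshape(5, 6)   # toy single-channel image
k = gaussian_window(5, 1.0)

# Horizontal pass followed by vertical pass ...
separable = filter_2d(filter_2d(img, k.reshape(1, 5)), k.reshape(5, 1))
# ... gives the same result as one pass with the 5 x 5 outer-product kernel.
full = filter_2d(img, np.outer(k, k))
print(np.allclose(separable, full))
```

The two-pass form needs 5 + 5 multiplications per pixel instead of 25. Note that this window has a peak of 1 rather than a unit sum; dividing it by k.sum() is a common way to keep the overall brightness unchanged, which is an addition suggested here and not something visible in the code above.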

(2) Computing the gradient with the Sobel operator

import math

PAI = math.pi

class SobelFilterNet(nn.Module):
    def __init__(self) -> None:
        super(SobelFilterNet, self).__init__()
        sobel_filter = np.array([[-1, 0, 1],
                                 [-2, 0, 2],
                                 [-1, 0, 1]])
        self.SFH = nn.Conv2d(1, 1, kernel_size=sobel_filter.shape, padding=sobel_filter.shape[0]//2)
        self.SFV = nn.Conv2d(1, 1, kernel_size=sobel_filter.shape, padding=sobel_filter.shape[0]//2)

        init_parameter(self.SFH, sobel_filter, np.array([0.0]))
        init_parameter(self.SFV, sobel_filter.T, np.array([0.0]))

    def forward(self, img):
        img_r = img[:,0:1]
        img_g = img[:,1:2]
        img_b = img[:,2:3]

        # SFH (SFV): Sobel filter for the horizontal (vertical) direction
        grad_r_x = self.SFH(img_r)  # x-direction gradient of the R channel
        grad_r_y = self.SFV(img_r)
        grad_g_x = self.SFH(img_g)
        grad_g_y = self.SFV(img_g)
        grad_b_x = self.SFH(img_b)
        grad_b_y = self.SFV(img_b)

        # Compute the magnitude and the orientation
        magnitude_r = torch.sqrt(grad_r_x**2 + grad_r_y**2) # Gr^2 = Grx^2 + Gry^2
        magnitude_g = torch.sqrt(grad_g_x**2 + grad_g_y**2) 
        magnitude_b = torch.sqrt(grad_b_x**2 + grad_b_y**2)

        grad_magnitude = magnitude_r + magnitude_g + magnitude_b

        grad_y = grad_r_y + grad_g_y + grad_b_y
        grad_x = grad_r_x + grad_g_x + grad_b_x

        # tanθ = grad_y / grad_x; convert the angle to degrees (the orientation)
        grad_orientation = (torch.atan2(grad_y, grad_x) * (180.0 / PAI))
        grad_orientation =  torch.round(grad_orientation / 45.0) * 45.0  # snap to a multiple of 45
        
        return grad_magnitude, grad_orientation

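Rendering the gradient magnitude as an image gives the rightmost picture in the figure below: the gradient is largest along the edge of the coin (the larger the value, the brighter the pixel).

Full code: sobel_filter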
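To make the magnitude and orientation computation concrete, here is a minimal NumPy sketch on a single-channel image. It is a simplification of the network above, which sums the per-channel results of an RGB image. The names correlate_same and sobel_gradient are hypothetical helpers written for this illustration.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

def correlate_same(img: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Zero-padded cross-correlation with a 3 x 3 kernel, same output size."""
    h, w = img.shape
    padded = np.pad(img, 1)
    out = np.zeros((h, w), dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def sobel_gradient(gray: np.ndarray):
    """Return gradient magnitude and orientation snapped to multiples of 45 degrees."""
    grad_x = correlate_same(gray, SOBEL_X)      # horizontal change
    grad_y = correlate_same(gray, SOBEL_X.T)    # vertical change
    magnitude = np.hypot(grad_x, grad_y)
    orientation = np.round(np.degrees(np.arctan2(grad_y, grad_x)) / 45.0) * 45.0
    return magnitude, orientation

# A dark-to-bright vertical step edge between columns 2 and 3.
step = np.zeros((5, 6))
step[:, 3:] = 1.0
mag, ori = sobel_gradient(step)
print(mag[2])   # large next to the step, zero in the flat interior
print(ori[2])
```

In the interior of the image the magnitude is nonzero only in the two columns next to the step, and the orientation there is 0 degrees (pointing from dark to bright). The nonzero values in the last column come from the zero padding at the image border, not from a real edge.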

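(3) Non-maximum suppression

Non-maximum suppression (NMS) works as follows:

Treat every point of the gradient magnitude matrix grad_magnitude as a center pixel and compare its gradient magnitude with that of the two neighboring points that lie along its gradient direction, one forward and one backward (each pixel has 8 neighbors in total).
If the center point's gradient is not larger than both of these neighbors, set the gradient of the center point to 0.

After these two steps, each gradient ridge is reduced to a width of one pixel, while the gradient magnitude at the top of the ridge (the maximum) is kept.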
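The two steps above can be sketched directly with NumPy loops. This is only an illustration of the idea, not the convolution-based implementation the article uses. The function name non_max_suppression and the OFFSETS table (which neighbor belongs to which direction index) are assumptions made for this sketch.

```python
import numpy as np

# Assumed neighbor offset (dy, dx) for direction indices 0..7, in 45-degree steps.
OFFSETS = [(0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1)]

def non_max_suppression(magnitude: np.ndarray, orientation_deg: np.ndarray) -> np.ndarray:
    """Keep a pixel only if it is strictly larger than both neighbors along its gradient."""
    h, w = magnitude.shape
    padded = np.pad(magnitude, 1)   # zero border so every pixel has 8 neighbors
    direction = np.round(orientation_deg / 45.0).astype(int) % 8
    thin = np.zeros_like(magnitude)
    for y in range(h):
        for x in range(w):
            dy, dx = OFFSETS[direction[y, x]]
            forward = padded[y + 1 + dy, x + 1 + dx]
            backward = padded[y + 1 - dy, x + 1 - dx]
            if magnitude[y, x] > forward and magnitude[y, x] > backward:
                thin[y, x] = magnitude[y, x]
    return thin

# A three-pixel-wide vertical ridge whose gradient points along the x axis.
ridge = np.tile(np.array([0.0, 1.0, 3.0, 1.0, 0.0]), (4, 1))
thin = non_max_suppression(ridge, np.zeros_like(ridge))
print(thin)
```

Only the middle column of the ridge survives, with its original magnitude of 3, which is the one-pixel-wide ridge described above.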

class NonMaxSupression(nn.Module):
    def __init__(self) -> None:
        super(NonMaxSupression, self).__init__()

        # filter_0 ... filter_315 are the eight directional kernels from the full code (not shown here)
        all_filters = np.stack([filter_0, filter_45, filter_90, filter_135, filter_180, filter_225, filter_270, filter_315])
        
        '''
        What directional_filter does is explained in detail below
        '''
        self.directional_filter = nn.Conv2d(1, 8, kernel_size=filter_0.shape, padding=filter_0.shape[-1] // 2)

        init_parameter(self.directional_filter, all_filters[:, None, ...], np.zeros(shape=(all_filters.shape[0],)))

    def forward(self, grad_magnitude, grad_orientation):

        all_orient_magnitude = self.directional_filter(grad_magnitude)     # difference between each point's gradient and that of its neighbor in each of the 8 directions (akin to a second-order gradient)

        r'''
                \ 3|2 /
                 \ | /
            4     \|/    1
        -----------|------------
            5     /|\    8
                 / | \ 
                / 6|7 \ 
        Note: each sector spans 45 degrees
        '''

        positive_orient = (grad_orientation / 45) % 8             # index of the forward direction; there are eight possible directions
        negative_orient = ((grad_orientation / 45) + 4) % 8       # +4 = 4 * 45 = 180, i.e. rotate by 180 degrees (e.g. 1 -(+4)-> 5)

        height = positive_orient.size()[2]                        # image height and width
        width = positive_orient.size()[3]
        pixel_count = height * width                                # total number of pixels in the image
        pixel_offset = torch.FloatTensor([range(pixel_count)])

        position = (positive_orient.view(-1).data * pixel_count + pixel_offset).squeeze() # direction index * pixel count + pixel position

        # For every pixel, take the difference to its forward neighbor (current gradient - forward neighbor's gradient); comparing it with 0 tells whether the current pixel is the larger one
        channel_select_filtered_positive = all_orient_magnitude.view(-1)[position.long()].view(1, height, width)

        position = (negative_orient.view(-1).data * pixel_count + pixel_offset).squeeze()

        # The same difference for the backward neighbor
        channel_select_filtered_negative = all_orient_magnitude.view(-1)[position.long()].view(1, height, width)

        # Combine the two into two channels
        channel_select_filtered = torch.stack([channel_select_filtered_positive, channel_select_filtered_negative])

        is_max = channel_select_filtered.min(dim=0)[0] > 0.0 # if min{current - forward neighbor, current - backward neighbor} > 0, the current gradient is the maximum
        is_max = torch.unsqueeze(is_max, dim=0)

        thin_edges = grad_magnitude.clone()
        thin_edges[is_max==0] = 0.0

        return thin_edges

What is directional_filter for?

# Input
tensor([[[[1., 1., 1.],   
          [1., 1., 1.],   
          [1., 1., 1.]]]])
# Output
tensor([[[[0., 0., 1.], 
          [0., 0., 1.], 
          [0., 0., 1.]],

         [[0., 0., 1.], 
          [0., 0., 1.], 
          [1., 1., 1.]],

         [[0., 0., 0.], 
          [0., 0., 0.], 
          [1., 1., 1.]],

         [[1., 0., 0.], 
          [1., 0., 0.], 
          [1., 1., 1.]],

         [[1., 0., 0.], 
          [1., 0., 0.], 
          [1., 0., 0.]],

         [[1., 1., 1.], 
          [1., 0., 0.], 
          [1., 0., 0.]],

         [[1., 1., 1.], 
          [0., 0., 0.], 
          [0., 0., 0.]],

         [[1., 1., 1.],
          [0., 0., 1.],
          [0., 0., 1.]]]], grad_fn=<ThnnConv2DBackward0>)

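As the output shows, the filter extracts the gradient in each of the eight directions around every input point (in this project's code, that is the difference between the current point's gradient and the gradient of its neighbor in each of the 8 directions).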
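The eight kernels themselves (filter_0 to filter_315) are not listed in this article. The sketch below assumes each one is a 3 x 3 kernel with +1 at the center and -1 at one neighbor, which is enough to reproduce the printed output for an all-ones input. The OFFSETS ordering and the helper names are inferred from that output and are assumptions, not the author's definitions.

```python
import numpy as np

# Assumed neighbor offset (dy, dx) per output channel, inferred from the printed output.
OFFSETS = [(0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1)]

def make_directional_filters() -> np.ndarray:
    """Eight 3 x 3 kernels: +1 at the center, -1 at the neighbor in one direction."""
    filters = np.zeros((8, 3, 3))
    for i, (dy, dx) in enumerate(OFFSETS):
        filters[i, 1, 1] = 1.0
        filters[i, 1 + dy, 1 + dx] = -1.0
    return filters

def directional_differences(magnitude: np.ndarray, filters: np.ndarray) -> np.ndarray:
    """Zero-padded cross-correlation of one image with every kernel: (8, H, W)."""
    h, w = magnitude.shape
    padded = np.pad(magnitude, 1)
    out = np.zeros((len(filters), h, w))
    for i, kernel in enumerate(filters):
        for dy in range(3):
            for dx in range(3):
                out[i] += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

diffs = directional_differences(np.ones((3, 3)), make_directional_filters())
print(diffs[0])
print(diffs[1])
```

For a constant input every interior difference is 0; a 1 appears only where the neighbor falls into the zero padding, which is the pattern visible in the printed tensor.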

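Using the gradient magnitude and direction, the direction is quantized into 8 classes (every point has one of eight possible directions), as the star-shaped ("米") diagram in the code above shows.

The following explains how to obtain the gradient magnitude of the neighbor in the forward direction of the current point (the backward direction works the same way).

Gradient direction grad_orientation: 0, 1, 2, 3, 4, 5, 6, 7 (8 directions in total)

Per-direction gradient magnitude all_orient_magnitude: [[..gradient for direction 0..], [..gradient for direction 1..], ..., [..gradient for direction 7..]]

So for a point whose direction is i, the value we need sits at all_orient_magnitude[i][x, y]. Once all_orient_magnitude is flattened into a one-dimensional vector, the corresponding position is position = current_orient × pixel_count + pixel_offset, and this position gives us the difference between the gradient magnitude of the current point and that of its forward neighbor (the backward one is obtained the same way).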
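The flat-index formula can be checked on a small array. The sketch below uses NumPy instead of PyTorch and a made-up (8, 2, 3) array in place of the real filter output; select_by_direction is a hypothetical helper name.

```python
import numpy as np

def select_by_direction(all_orient: np.ndarray, direction: np.ndarray) -> np.ndarray:
    """For every pixel, pick the value from the channel named by its direction index.

    all_orient has shape (8, H, W); direction has shape (H, W) with integers 0..7.
    """
    _, h, w = all_orient.shape
    pixel_count = h * w
    pixel_offset = np.arange(pixel_count)                  # row-major pixel index
    position = direction.reshape(-1) * pixel_count + pixel_offset
    return all_orient.reshape(-1)[position].reshape(h, w)

all_orient = np.arange(8 * 2 * 3).reshape(8, 2, 3)   # channel i holds 6*i .. 6*i+5
direction = np.array([[0, 7, 3],
                      [4, 1, 6]])
picked = select_by_direction(all_orient, direction)
print(picked)
```

Each entry picked[y, x] equals all_orient[direction[y, x], y, x]: multiplying the direction by pixel_count jumps to the start of the right channel in the flattened vector, and pixel_offset then moves to the pixel inside that channel.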

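An auxiliary illustration follows:

The final result is shown on the right of the figure below (the left side is the image before non-maximum suppression).

Full code: nonmax_supression

(4) Hysteresis edge tracking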

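On closer inspection, a few problems remain at this point:

If the image contains noise, points that have nothing to do with an edge may show up (false edges)
Edge points vary in strength, so some parts of an edge are bright and others faint

So the last step is hysteresis edge tracking, which works as follows:

Choose two thresholds (one high, one low). Set the gradient magnitude of every pixel below the low threshold to 0 to obtain image A
Set the gradient magnitude of every pixel below the high threshold to 0 to obtain image B

Because A uses the lower threshold, its edges are more complete and more continuous, but it may also contain more false edges. B is exactly the opposite.

The idea is therefore to take B as the base and A as a supplement, and to fill in the edge pixels missing from B by tracing recursively.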
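Before the article's recursive PyTorch implementation, the same idea can be sketched with NumPy and an explicit queue, which avoids deep recursion on large images. This is a simplified variant written for illustration: it only follows direct 8-neighbors, whereas the article's code also accepts a connection two pixels away. The function name hysteresis is hypothetical.

```python
from collections import deque

import numpy as np

def hysteresis(thin_edges: np.ndarray, low: float, high: float) -> np.ndarray:
    """Keep strong pixels plus every weak pixel reachable from a strong one."""
    strong = thin_edges >= high     # image B: reliable edge pixels
    weak = thin_edges >= low        # image A: candidates (a superset of B)
    keep = strong.copy()
    h, w = thin_edges.shape
    queue = deque(zip(*np.nonzero(strong)))
    while queue:
        y, x = queue.popleft()
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                inside = 0 <= ny < h and 0 <= nx < w
                if inside and weak[ny, nx] and not keep[ny, nx]:
                    keep[ny, nx] = True          # weak pixel connected to an edge
                    queue.append((ny, nx))
    return np.where(keep, thin_edges, 0.0)

edges = np.array([[0.0, 0.0, 0.0, 0.0, 0.0],
                  [4.0, 2.0, 2.0, 0.0, 2.0],
                  [0.0, 0.0, 0.0, 0.0, 0.0]])
print(hysteresis(edges, low=1.0, high=3.0))
```

With thresholds 1.0 and 3.0, the two weak pixels next to the strong pixel are kept because they are chained to it, while the isolated weak pixel in the last column is treated as a false edge and removed.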

to_bw = lambda image: (image > 0.0).astype(float)

class HysteresisThresholding(nn.Module):
    def __init__(self, low_threshold=1.0, high_threshold=3.0) -> None:
        super(HysteresisThresholding, self).__init__()
        self.low_threshold = low_threshold
        self.high_threshold = high_threshold

    def thresholding(self, low_thresh: torch.Tensor, high_thresh: torch.Tensor):
        died = torch.zeros_like(low_thresh).squeeze()
        low_thresh = low_thresh.squeeze()
        final_image = high_thresh.squeeze().clone()

        height = final_image.shape[0] - 1 
        width = final_image.shape[1] - 1

        def connected(x, y, gap = 1):
            right = x + gap
            bottom = y + gap
            left = x - gap
            top = y - gap

            if left < 0 or top < 0 or right >= width or bottom >= height:
                return False
            
            return final_image[top, left] > 0  or final_image[top, x] > 0 or final_image[top, right] > 0 \
                or final_image[y, left] > 0 or final_image[y, right] > 0 \
                or final_image[bottom, left] > 0 or final_image[bottom, x] > 0 or final_image[bottom, right] > 0

        # Tensors are indexed row (y) first, then column (x)
        def trace(x:int, y:int):
            right = x + 1
            bottom = y + 1
            left = x - 1
            top = y - 1
            if left < 0 or top < 0 or right >= width or bottom >= height or died[y, x] or final_image[y, x] > 0:
                return

            pass_high = final_image[y, x] > 0.0
            pass_low = low_thresh[y, x] > 0.0

            died[y, x] = True

            if pass_high:
                died[y, x] = False
            elif pass_low and not pass_high:
                if connected(x, y) or connected(x, y, 2): # an edge pixel lies in one of the surrounding directions
                    final_image[y, x] = low_thresh[y, x]
                    died[y, x] = False
            
            # Look back
            if final_image[y, x] > 0.0: # the current point is connected
                if low_thresh[top, left] > 0: trace(left, top)
                if low_thresh[top, x] > 0: trace(x, top)    
                if low_thresh[top, right] > 0: trace(right, top)
                if low_thresh[y, left] > 0: trace(left, y)
                if low_thresh[bottom, left] > 0: trace(left, bottom)

            # Move on
            trace(right, y)
            trace(x, bottom)
            trace(right, bottom)
        
        for i in range(width):
            for j in range(height):
                trace(i, j)

        final_image = final_image.unsqueeze(dim=0).unsqueeze(dim=0)

        return final_image

    def forward(self, thin_edges, grad_magnitude, grad_orientation):
        low_thresholded: torch.Tensor = thin_edges.clone()
        low_thresholded[thin_edges<self.low_threshold] = 0.0

        high_threshold: torch.Tensor = thin_edges.clone()
        high_threshold[thin_edges<self.high_threshold] = 0.0

        final_thresholded = self.thresholding(low_thresholded, high_threshold)

        return low_thresholded, high_threshold, final_thresholded

The figures below show, in order, the results with the low threshold and with the high threshold.