Paper list

# Reading Summary

## 1. A Deep Neural Networks Approach for Pixel-Level Runway Pavement Crack Segmentation Using Drone-Captured Images

arXiv:2001.03257 [pdf] cs.CV eess.IV
Authors: Liming Jiang, Yuanchang Xie, Tianzhu Ren
Abstract: Pavement conditions are a critical aspect of asset management and directly affect safety. This study introduces a deep neural network method called U-Net for pavement crack segmentation based on drone-captured images to reduce the cost and time needed for airport runway inspection. The proposed approach can also be used for highway pavement conditions assessment during off-peak periods when there are few vehicles on the road. In this study, runway pavement images are collected using drone at various heights from the Fitchburg Municipal Airport (FMA) in Massachusetts to evaluate their quality and applicability for crack segmentation, from which an optimal height is determined. Drone images captured at the optimal height are then used to evaluate the crack segmentation performance of the U-Net model. Deep learning methods typically require a huge set of annotated training datasets for model development, which can be a major obstacle for their applications. An online annotated pavement image dataset is used together with the FMA data to train the U-Net model. The results show that U-Net performs well on the FMA testing data even with limited FMA training images, suggesting that it has good generalization ability and great potential to be used for both airport runways and highway pavements.
Submitted 9 January, 2020; originally announced January 2020.

### 3. Method

• U-Net

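Before applying it to the drone-collected runway pavement images, the authors tuned some hyperparameters of the original U-Net. Inspired by (4, 5), they adopt a deeper structure and increase the number of channels in each convolutional layer by 0.5 times to improve the model's fitting and generalization ability. In addition, the input image size is set to 256×256 pixels.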

• Data Augmentation

Data augmentation is used to further enlarge the training set.
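The notes do not say which augmentations were applied, so the following is only a minimal sketch under assumptions: random 90-degree rotations and horizontal flips, with a hypothetical helper `augment_pair`. The point it illustrates is that for segmentation the same geometric transform has to be applied to the image and to its crack mask.

```python
import numpy as np

def augment_pair(image, mask, rng):
    """Apply one random rotation/flip to an image and its crack mask together."""
    k = int(rng.integers(0, 4))            # number of 90-degree rotations (assumed choice)
    image, mask = np.rot90(image, k), np.rot90(mask, k)
    if rng.random() < 0.5:                 # horizontal flip half of the time (assumed choice)
        image, mask = np.fliplr(image), np.fliplr(mask)
    return image.copy(), mask.copy()

rng = np.random.default_rng(0)
image = rng.random((256, 256))             # stand-in for a grayscale pavement tile
mask = (image > 0.95).astype(np.uint8)     # stand-in for a crack annotation
aug_image, aug_mask = augment_pair(image, mask, rng)
```

Because both arrays go through the same transform, the augmented mask still lines up pixel for pixel with the augmented image.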
• Hyperparameters

Loss function: binary cross-entropy

Activation functions: Sigmoid in the last layer, ReLU in all other layers.

Training episodes: 1000

Batch size: 5
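As a rough illustration of how a Sigmoid output layer pairs with a binary cross-entropy loss, here is a small NumPy sketch. The logits, targets and the clipping constant `eps` are made-up values for the example, not settings from the paper.

```python
import numpy as np

def sigmoid(logits):
    """Map raw network outputs to per-pixel crack probabilities."""
    return 1.0 / (1.0 + np.exp(-logits))

def binary_cross_entropy(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over all pixels; eps avoids log(0)."""
    probs = np.clip(probs, eps, 1.0 - eps)
    return float(-np.mean(targets * np.log(probs) + (1 - targets) * np.log(1 - probs)))

logits = np.array([[2.0, -1.0], [0.0, -3.0]])   # hypothetical last-layer outputs
targets = np.array([[1.0, 0.0], [1.0, 0.0]])    # 1 = crack, 0 = background
loss = binary_cross_entropy(sigmoid(logits), targets)
print(f"BCE loss: {loss:.4f}")
```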

### 4. Results

The U-Net_Crack500 model also performs well on highway images collected by a laser pavement scanning system.

## 2. Automated Pavement Crack Segmentation Using Fully Convolutional U-Net with a Pretrained ResNet-34 Encoder

arXiv:2001.01912 [pdf] cs.CV
Authors: Stephen L. H. Lau, Xin Wang, Xu Yang, Edwin K. P. Chong
Abstract: Automated pavement crack segmentation is a challenging task because of inherent irregular patterns and lighting conditions, in addition to the presence of noise in images. Conventional approaches require a substantial amount of feature engineering to differentiate crack regions from non-affected regions. In this paper, we propose a deep learning technique based on a convolutional neural network to perform segmentation tasks on pavement crack images. Our approach requires minimal feature engineering compared to other machine learning techniques. The proposed neural network architecture is a modified U-Net in which the encoder is replaced with a pretrained ResNet-34 network. To minimize the dice coefficient loss function, we optimize the parameters in the neural network by using an adaptive moment optimizer called AdamW. Additionally, we use a systematic method to find the optimum learning rate instead of doing parametric sweeps. We used a “one-cycle” training schedule based on cyclical learning rates to speed up the convergence. We evaluated the performance of our convolutional neural network on CFD, a pavement crack image dataset. Our method achieved an F1 score of about 96%. This is the best performance among all other algorithms tested on this dataset, outperforming the previous best method by a 1.7% margin.
Submitted 10 January, 2020; v1 submitted 7 January, 2020; originally announced January 2020.

### 3. Method

• ResNet34 base U-Net

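ResNet34 is a model pretrained on ImageNet; its final average pooling layer and fully connected layer are removed, and an upsampling module is attached in their place. The decoder consists of repeated upsampling blocks (the magenta and purple blocks in Figure 2), each of which doubles the spatial resolution of the output activations while halving the number of feature channels. Each upsampling block consists of one BN layer, a ReLU layer, and one transposed convolution layer (2×2 kernel, stride 2). An SCSE module (concurrent spatial and channel squeeze and excitation module) is added between the BN layer and the transposed convolution layer.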
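The spatial doubling performed by a 2×2, stride-2 transposed convolution can be checked with a small sketch. It is a single-channel NumPy illustration rather than the paper's decoder: the function name and kernel values are hypothetical, and channel halving, BN, ReLU and the SCSE module are left out.

```python
import numpy as np

def transposed_conv_2x2_stride2(x, kernel):
    """Single-channel transposed convolution with a 2x2 kernel and stride 2.

    Because the kernel size equals the stride, the output patches never overlap:
    each input value is multiplied by the kernel and written to its own 2x2 block,
    which is exactly a Kronecker product.
    """
    return np.kron(x, kernel)

x = np.arange(6, dtype=float).reshape(2, 3)     # toy 2x3 feature map
kernel = np.array([[1.0, 0.5], [0.25, 0.0]])    # made-up kernel weights
y = transposed_conv_2x2_stride2(x, kernel)
print(x.shape, "->", y.shape)
```

Height and width both double, which matches the general output-size rule (in - 1) × stride + kernel for a transposed convolution without padding.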

• dice coefficient loss

The Dice coefficient is equivalent to the F1 score, so using Dice coefficient loss as the loss function amounts to optimizing the F1 score directly.
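The equivalence can be checked numerically. The sketch below assumes the common "soft" Dice formulation (one minus the Dice coefficient, with a small smoothing term `eps`); the notes do not give the paper's exact form, and the toy arrays are invented for the example.

```python
import numpy as np

def dice_loss(probs, targets, eps=1e-7):
    """Soft Dice loss: 1 - 2|P∩T| / (|P| + |T|), usable with probabilities."""
    intersection = np.sum(probs * targets)
    dice = (2.0 * intersection + eps) / (np.sum(probs) + np.sum(targets) + eps)
    return float(1.0 - dice)

targets = np.array([1, 1, 0, 0, 0, 1], dtype=float)   # toy ground truth
preds = np.array([1, 0, 0, 1, 0, 1], dtype=float)     # toy binary prediction

# F1 computed from the confusion counts of the same prediction.
tp = np.sum(preds * targets)
fp = np.sum(preds * (1 - targets))
fn = np.sum((1 - preds) * targets)
f1 = float(2 * tp / (2 * tp + fp + fn))

print(f"1 - dice_loss = {1 - dice_loss(preds, targets):.4f}, F1 = {f1:.4f}")
```

For binary predictions the two numbers coincide up to the smoothing term; with probabilities as input the same expression stays differentiable, which is what makes it usable as a training loss.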

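#### Training

• Parameter initialization: the method from the reference below (a Gaussian distribution with mean 0 and variance $2/n_l$, where $n_l$ is the number of channels of the convolutional layer). The ResNet34 part uses its pretrained parameters.

K. He, X. Zhang, S. Ren, and J. Sun, “Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification,” in IEEE ICCV, Santiago, Chile, Dec. 2015.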
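A minimal sketch of this initialization in NumPy is shown below. One assumption needs flagging: the code takes $n_l$ to be the fan-in of the layer (kernel height × kernel width × input channels), as in He et al., which may be what the channel count in the note stands for. The helper name and layer shape are hypothetical.

```python
import numpy as np

def he_normal(shape, rng):
    """Draw conv weights from N(0, 2 / n_l).

    shape is (out_channels, in_channels, kernel_h, kernel_w);
    n_l is assumed to be the fan-in: in_channels * kernel_h * kernel_w.
    """
    _, in_channels, kernel_h, kernel_w = shape
    fan_in = in_channels * kernel_h * kernel_w
    std = np.sqrt(2.0 / fan_in)
    return rng.normal(0.0, std, size=shape)

rng = np.random.default_rng(0)
weights = he_normal((64, 128, 3, 3), rng)   # hypothetical decoder conv layer
print(f"target std: {np.sqrt(2.0 / (128 * 9)):.4f}, sample std: {weights.std():.4f}")
```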
• Optimizer: AdamW ($\lambda = 0.01$, $\alpha$ is the learning rate, $\epsilon = 10^{-8}$)
$$\boldsymbol{\theta}_{t}=(1-\lambda) \boldsymbol{\theta}_{t-1}-\alpha\left(\frac{\widehat{\boldsymbol{m}}_{t}}{\sqrt{\widehat{\boldsymbol{v}}_{t}}+\epsilon}\right)$$
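One step of the update above can be sketched in NumPy as follows. The moment estimates use the standard Adam recursions with bias correction; the decay rates `beta1 = 0.9` and `beta2 = 0.999` and the learning rate `alpha = 1e-3` are assumptions, since the notes only give $\lambda$ and $\epsilon$.

```python
import numpy as np

def adamw_step(theta, grad, m, v, t, alpha=1e-3, lam=0.01,
               beta1=0.9, beta2=0.999, eps=1e-8):
    """One AdamW update following the formula in the notes."""
    m = beta1 * m + (1 - beta1) * grad          # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2     # second-moment estimate
    m_hat = m / (1 - beta1 ** t)                # bias-corrected moments
    v_hat = v / (1 - beta2 ** t)
    # Decoupled weight decay (1 - lam) plus the adaptive gradient step.
    theta = (1 - lam) * theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

theta = np.array([1.0, -2.0])                   # toy parameters
grad = np.array([0.5, -0.5])                    # toy gradient
theta, m, v = adamw_step(theta, grad, np.zeros(2), np.zeros(2), t=1)
print(theta)
```

The weight decay multiplies the parameters directly instead of being added to the gradient, which is what separates AdamW from Adam with an L2 penalty.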
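• Learning rate: the paper devotes considerable space to the learning rate, covering a systematic method for finding the optimum learning rate and a “one-cycle” training schedule based on cyclical learning rates.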

### 4. Results

• Evaluation metrics
$$\begin{aligned} Pr &=\frac{TP}{TP+FP} \\ Re &=\frac{TP}{TP+FN} \\ F1 &=\frac{2 \times Pr \times Re}{Pr+Re} \end{aligned}$$
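These three metrics can be computed from a predicted and a ground-truth mask as in the sketch below. It assumes exact pixel-wise matching; crack benchmarks often allow a small distance tolerance when counting true positives, which is not modeled here. The masks and the helper name are made up for illustration.

```python
import numpy as np

def precision_recall_f1(pred_mask, true_mask):
    """Pixel-wise precision, recall and F1 for binary masks."""
    pred = pred_mask.astype(bool)
    true = true_mask.astype(bool)
    tp = np.sum(pred & true)
    fp = np.sum(pred & ~true)
    fn = np.sum(~pred & true)
    pr = tp / (tp + fp) if tp + fp else 0.0
    re = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * pr * re / (pr + re) if pr + re else 0.0
    return float(pr), float(re), float(f1)

true_mask = np.array([[1, 1, 0], [0, 1, 0]])   # toy ground truth
pred_mask = np.array([[1, 0, 0], [1, 1, 0]])   # toy prediction
pr, re, f1 = precision_recall_f1(pred_mask, true_mask)
print(f"Pr={pr:.3f} Re={re:.3f} F1={f1:.3f}")
```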

## 3. CrackGAN: A Labor-Light Crack Detection Approach Using Industrial Pavement Images Based on Generative Adversarial Learning

arXiv:1909.08216 [pdf, other] cs.CV cs.LG eess.IV
Authors: Kaige Zhang, Yingtao Zhang, Heng-Da Cheng
Abstract: Fully convolutional network is a powerful tool for per-pixel semantic segmentation/detection. However, it is problematic when coping with crack detection using industrial pavement images: the network may easily “converge” to the status that treats all the pixels as background (BG) and still achieves a very good loss, named “All Black” phenomenon, due to the data imbalance and the unavailability of accurate ground truths (GTs). To tackle this problem, we introduce crack-patch-only (CPO) supervision and generative adversarial learning for end-to-end training, which forces the network to always produce crack-GT images while reserves both crack and BG-image translation abilities by feeding a larger-size crack image into an asymmetric U-shape generator to overcome the “All Black” issue. The proposed approach is validated using four crack datasets; and achieves state-of-the-art performance comparing with that of the recently published works in efficiency and accuracy.
Submitted 18 September, 2019; originally announced September 2019.

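### 1. Overview

FCN is a powerful pixel-level segmentation network, but it runs into problems when used for crack segmentation on industrial pavement images. Because crack and background samples are severely imbalanced, the network easily “converges” to a state that treats all pixels as background (BG) while still achieving a very good loss; this is called the “All Black” phenomenon.

• 1. The All Black problem: the network converges to a state in which every pixel is background.
• 2. The paper proposes crack-patch-only (CPO) supervision and generative adversarial learning.
• Only a small amount of manually labeled GTs is needed, which reduces the labeling effort. Even though the network is trained on small image patches, it can detect cracks effectively on full-size images.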
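The “All Black” phenomenon is easy to reproduce numerically. The sketch below uses a made-up 256×256 ground truth with a thin crack and a prediction that assigns every pixel a low crack probability; the mask layout and the probability 0.01 are illustrative assumptions, not data from the paper.

```python
import numpy as np

# Hypothetical ground truth: a 2-pixel-wide vertical crack (under 1% of all pixels).
true_mask = np.zeros((256, 256))
true_mask[:, 100:102] = 1.0

# "All Black" prediction: every pixel gets a low crack probability.
probs = np.full_like(true_mask, 0.01)
pred_mask = (probs > 0.5).astype(float)   # thresholding gives an all-background mask

bce = float(-np.mean(true_mask * np.log(probs) + (1 - true_mask) * np.log(1 - probs)))
accuracy = float(np.mean(pred_mask == true_mask))
recall = float(np.sum(pred_mask * true_mask) / np.sum(true_mask))

print(f"BCE={bce:.4f} accuracy={accuracy:.4f} recall={recall:.1f}")
```

The loss is small and the pixel accuracy is above 99%, yet not a single crack pixel is found, which is why a pixel-wise loss alone gives the network little incentive to leave this state.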

### 3. Method

• Model architecture

D is a pretrained discriminator taken from a DC-GAN trained only on crack-GT patches.

#### Training

$$\begin{aligned} \max _{D} V(D, G) &=E_{x \sim p_{d}(x)}[\log D(x)] \\ &+E_{z \sim p_{d}(z)}[\log (1-D(G(z)))] \\ \max _{G} V(D, G) &=E_{z \sim p_{d}(z)}[\log (D(G(z)))] \end{aligned}$$
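To make the two objectives concrete, the sketch below evaluates them on a handful of made-up discriminator outputs, with expectations replaced by batch means. `d_real` stands for $D(x)$ on real crack-GT patches and `d_fake` for $D(G(z))$ on generator outputs; both arrays and the function names are hypothetical.

```python
import numpy as np

def discriminator_objective(d_real, d_fake):
    """Batch estimate of E[log D(x)] + E[log(1 - D(G(z)))], maximized over D."""
    return float(np.mean(np.log(d_real)) + np.mean(np.log(1.0 - d_fake)))

def generator_objective(d_fake):
    """Batch estimate of E[log D(G(z))], maximized over G."""
    return float(np.mean(np.log(d_fake)))

d_real = np.array([0.9, 0.8])   # D is fairly sure these are real crack-GT patches
d_fake = np.array([0.2, 0.1])   # D is fairly sure these are generated

print(f"V_D = {discriminator_objective(d_real, d_fake):.4f}")
print(f"V_G = {generator_objective(d_fake):.4f}")
```

The generator raises its objective only by producing outputs that the discriminator scores as crack-like, which is the mechanism that pushes it away from all-background outputs.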

## 4. A Cost Effective Solution for Road Crack Inspection using Cameras and Deep Neural Networks

arXiv:1907.06014 [pdf] cs.CV cs.LG eess.IV
Authors: Qipei Mei, Mustafa Gül
Abstract: Automatic crack detection on pavement surfaces is an important research field in the scope of developing an intelligent transportation infrastructure system. In this paper, a cost effective solution for road crack inspection by mounting commercial grade sport camera, GoPro, on the rear of the moving vehicle is introduced. Also, a novel method called ConnCrack combining conditional Wasserstein generative adversarial network and connectivity maps is proposed for road crack detection. In this method, a 121-layer densely connected neural network with deconvolution layers for multi-level feature fusion is used as generator, and a 5-layer fully convolutional network is used as discriminator. To overcome the scattered output issue related to deconvolution layers, connectivity maps are introduced to represent the crack information within the proposed ConnCrack. The proposed method is tested on a publicly available dataset as well our collected data. The results show that the proposed method achieves state-of-the-art performance compared with other existing methods in terms of precision, recall and F1 score.
Submitted 22 October, 2019; v1 submitted 13 July, 2019; originally announced July 2019.

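### 2. Dataset

Reasons for mounting the camera at the rear of the vehicle rather than at the front:

• The windshield can reflect light from inside the car, which degrades image quality in a front-mounted configuration.
• A front camera sits far from the ground, and most of its field of view (FOV) is blocked by the car's hood, so a front-mounted configuration sacrifices too much spatial resolution according to the authors' analysis.
• The authors' ultimate goal is to use the vehicle's own backup camera for crack detection while driving, in which case no external device needs to be installed.

• Dataset comparison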

### 3. Method

• ConnCrack

- Generator: cWGAN

#### Training

• loss function
$$\begin{aligned} &L_{cWGAN}(G, D)=E_{x, y}[D(x, y)]-E_{x}[D(x, G(x))] \\ &G^{*}=\arg \min _{G} \max _{D}\left(\lambda L_{cWGAN}(G, D)+L_{\text{content}}(G)\right) \end{aligned}$$
• Pretraining

On ImageNet and the CFD dataset.

Learning rate: $1 \times 10^{-6}$
• EdmCrack600 dataset

Learning rate: $1 \times 10^{-5}$; $\lambda$ is set to $5 \times 10^{-6}$.

• Pretraining data

- EdmCrack600 dataset

## 5. FPCNet: Fast Pavement Crack Detection Network Based on Encoder-Decoder Architecture

arXiv:1907.02248 [pdf, other] cs.CV
Authors: Wenjun Liu, Yuchun Huang, Ying Li, Qi Chen
Abstract: Timely, accurate and automatic detection of pavement cracks is necessary for making cost-effective decisions concerning road maintenance. Conventional crack detection algorithms focus on the design of single or multiple crack features and classifiers. However, complicated topological structures, varying degrees of damage and oil stains make the design of crack features difficult. In addition, the contextual information around a crack is not investigated extensively in the design process. Accordingly, these design features have limited discriminative adaptability and cannot fuse effectively with the classifiers. To solve these problems, this paper proposes a deep learning network for pavement crack detection. Using the Encoder-Decoder structure, crack characteristics with multiple contexts are automatically learned, and end-to-end crack detection is achieved. Specifically, we first propose the Multi-Dilation (MD) module, which can synthesize the crack features of multiple context sizes via dilated convolution with multiple rates. The crack MD features obtained in this module can describe cracks of different widths and topologies. Next, we propose the SE-Upsampling (SEU) module, which uses the Squeeze-and-Excitation learning operation to optimize the MD features. Finally, the above two modules are integrated to develop the fast crack detection network, namely, FPCNet. This network continuously optimizes the MD features step-by-step to realize fast pixel-level crack detection. Experiments are conducted on challenging public CFD datasets and G45 crack datasets involving various crack types under different shooting conditions. The distinct performance and speed improvements over all the datasets demonstrate that the proposed method outperforms other state-of-the-art crack detection methods.
Submitted 4 July, 2019; originally announced July 2019.

### 2. Datasets

• CFD datasets
• G45 crack datasets

### 3. Method

• Multi-Dilation (MD)

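The MD module combines several dilated convolutions [23] with different rates and a global pooling branch to extract crack features at different context sizes, so that cracks of different widths and topologies can be detected.

• rate=1: this convolution suits thin, simple cracks but cannot effectively detect wide cracks or cracks with complex topology.
• Such cracks can be detected robustly with a dilated convolution that uses a larger rate r (for example, 4).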
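Why a larger rate helps with wide cracks follows from the size of the window a dilated kernel covers. The sketch below computes it, assuming a 3×3 kernel (the notes do not state the kernel size) and using rates 1, 2 and 4 purely as examples.

```python
def effective_kernel_size(kernel_size, rate):
    """Side length of the window covered by a dilated kernel.

    A dilation rate r inserts r - 1 gaps between neighbouring taps,
    so a k-tap kernel spans k + (k - 1) * (r - 1) pixels.
    """
    return kernel_size + (kernel_size - 1) * (rate - 1)

sizes = {rate: effective_kernel_size(3, rate) for rate in (1, 2, 4)}
for rate, size in sizes.items():
    print(f"rate={rate}: covers a {size}x{size} window")
```

With rate 1 the kernel sees a 3×3 window, enough for a thin crack; with rate 4 the same nine weights span a 9×9 window, so a wide crack and its surroundings fit inside one kernel without adding parameters.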
• SE-Upsampling (SEU)

• FPCNet

#### Training

• loss function: binary cross entropy (BCE) + dice coefficient loss
$$\begin{aligned} L\left(Y^{*}, Y\right)=&-\frac{1}{N} \sum_{P \in N}\left(Y_{P}^{*} \cdot \lg Y_{P}+\left(1-Y_{P}^{*}\right) \cdot \lg \left(1-Y_{P}\right)\right) \\ &+1-\frac{2 \times TP}{2 \times TP+FP+FN} \end{aligned}$$
• Optimizer: SGD with momentum (0.9), a batch size of 1, and a weight decay of 0.0001.
• Learning rate: starts at 0.01, is divided by 10 at epochs 50, 80 and 110, and training stops at epoch 120.
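The step schedule can be written as a small function. This is a sketch under two assumptions the notes leave open: epochs are counted from 0, and the drop takes effect at the start of the milestone epoch. The function name is hypothetical.

```python
def learning_rate(epoch, base_lr=0.01, milestones=(50, 80, 110), factor=0.1):
    """Step schedule: multiply the rate by `factor` at each milestone reached."""
    drops = sum(epoch >= milestone for milestone in milestones)
    return base_lr * factor ** drops

for epoch in (0, 49, 50, 80, 110, 119):
    print(f"epoch {epoch:3d}: lr = {learning_rate(epoch):.0e}")
```

The rate therefore goes through 0.01, 0.001, 0.0001 and 0.00001 over the 120 training epochs.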

• Efficiency