TensorFlow 2.0 Tutorial | 6. Classic Convolutional Neural Networks

Classic Convolutional Neural Networks

6.1 LeNet

The LeNet convolutional neural network was proposed by LeCun in 1998 and is the pioneering work of convolutional neural networks. It reduces the number of network parameters by sharing convolution kernels.

When counting the layers of a convolutional neural network, usually only the convolutional layers and the fully connected layers are counted; the remaining operations can be regarded as attachments to the convolutional layers. By this count, LeNet has five layers.

After the two convolutional layers C1 and C3 come three consecutive fully connected layers.

  1. The first convolutional layer uses 6 kernels of 5×5; the stride is 1; no zero padding. BN did not yet exist when LeNet was proposed, so no BN is used. Sigmoid was the mainstream activation function in the LeNet era, so sigmoid is used. Pooling uses a 2×2 kernel with stride 2 and no zero padding. Dropout did not yet exist either, so no Dropout is used.
  2. The second convolutional layer uses 16 kernels of 5×5; stride 1; no zero padding; no BN; sigmoid activation; 2×2 pooling with stride 2 and no zero padding; no Dropout.
  3. Flatten, then three consecutive fully connected layers with 120, 84 and 10 neurons. The first two use sigmoid activation; the last uses softmax so that the output forms a probability distribution (a compact Sequential-API sketch of this structure follows the list).
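The same five-layer structure can also be written compactly with the Keras Sequential API. The following is a minimal sketch, not part of the original script; it assumes the same 32×32×3 cifar10 input that the full code example below uses.

import tensorflow as tf

# Sequential-API sketch of the LeNet-5 structure described above.
lenet5_sketch = tf.keras.Sequential([
    tf.keras.layers.Conv2D(6, (5, 5), activation='sigmoid'),   # C1: 6 kernels of 5x5, stride 1, no padding
    tf.keras.layers.MaxPool2D((2, 2), strides=2),              # P1: 2x2 pooling, stride 2
    tf.keras.layers.Conv2D(16, (5, 5), activation='sigmoid'),  # C2: 16 kernels of 5x5
    tf.keras.layers.MaxPool2D((2, 2), strides=2),              # P2: 2x2 pooling, stride 2
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(120, activation='sigmoid'),
    tf.keras.layers.Dense(84, activation='sigmoid'),
    tf.keras.layers.Dense(10, activation='softmax'),
])
lenet5_sketch.build(input_shape=(None, 32, 32, 3))  # assumed cifar10 input shape
lenet5_sketch.summary()  # should report the same 62,006 parameters as the subclassed model below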

Code example

import tensorflow as tf
import os
import numpy as np
from matplotlib import pyplot as plt
from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation, MaxPool2D, Dropout, Flatten, Dense
from tensorflow.keras import Model

np.set_printoptions(threshold=np.inf)

cifar10 = tf.keras.datasets.cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0


class LeNet5(Model):
    def __init__(self):
        super(LeNet5, self).__init__()
        self.c1 = Conv2D(filters=6, kernel_size=(5, 5),
                         activation='sigmoid')
        self.p1 = MaxPool2D(pool_size=(2, 2), strides=2)

        self.c2 = Conv2D(filters=16, kernel_size=(5, 5),
                         activation='sigmoid')
        self.p2 = MaxPool2D(pool_size=(2, 2), strides=2)

        self.flatten = Flatten()
        self.f1 = Dense(120, activation='sigmoid')
        self.f2 = Dense(84, activation='sigmoid')
        self.f3 = Dense(10, activation='softmax')

    def call(self, x):
        x = self.c1(x)
        x = self.p1(x)

        x = self.c2(x)
        x = self.p2(x)

        x = self.flatten(x)
        x = self.f1(x)
        x = self.f2(x)
        y = self.f3(x)
        return y


model = LeNet5()

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

checkpoint_save_path = "./checkpoint/LeNet5.ckpt"
if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True)

history = model.fit(x_train, y_train, batch_size=32, epochs=5, validation_data=(x_test, y_test), validation_freq=1,
                    callbacks=[cp_callback])
model.summary()

# print(model.trainable_variables)
file = open('./weights.txt', 'w')
for v in model.trainable_variables:
    file.write(str(v.name) + '\n')
    file.write(str(v.shape) + '\n')
    file.write(str(v.numpy()) + '\n')
file.close()

############################################### show ###############################################

# Plot the accuracy and loss curves of the training and validation sets
acc = history.history['sparse_categorical_accuracy']
val_acc = history.history['val_sparse_categorical_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']

plt.subplot(1, 2, 1)
plt.plot(acc, label='Training Accuracy')
plt.plot(val_acc, label='Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.title('Training and Validation Loss')
plt.legend()
plt.show()

Run results

......
48224/50000 [===========================>..] - ETA: 0s - loss: 1.4812 - sparse_categorical_accuracy: 0.4590
48608/50000 [============================>.] - ETA: 0s - loss: 1.4803 - sparse_categorical_accuracy: 0.4594
48960/50000 [============================>.] - ETA: 0s - loss: 1.4796 - sparse_categorical_accuracy: 0.4597
49312/50000 [============================>.] - ETA: 0s - loss: 1.4795 - sparse_categorical_accuracy: 0.4597
49696/50000 [============================>.] - ETA: 0s - loss: 1.4798 - sparse_categorical_accuracy: 0.4597
50000/50000 [==============================] - 8s 166us/sample - loss: 1.4795 - sparse_categorical_accuracy: 0.4599 - val_loss: 1.4530 - val_sparse_categorical_accuracy: 0.4695
Model: "le_net5"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) multiple 456
_________________________________________________________________
max_pooling2d (MaxPooling2D) multiple 0
_________________________________________________________________
conv2d_1 (Conv2D) multiple 2416
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 multiple 0
_________________________________________________________________
flatten (Flatten) multiple 0
_________________________________________________________________
dense (Dense) multiple 48120
_________________________________________________________________
dense_1 (Dense) multiple 10164
_________________________________________________________________
dense_2 (Dense) multiple 850
=================================================================
Total params: 62,006
Trainable params: 62,006
Non-trainable params: 0
_________________________________________________________________

6.2 AlexNet

AlexNet appeared in 2012 and is one of the representative works from Hinton's group; it won that year's ImageNet competition with a Top-5 error rate of 16.4%.

AlexNet uses the relu activation function, which speeds up training, and uses Dropout to mitigate overfitting. AlexNet has 8 layers in total.

  1. 第一层使用了96个33卷积核;步长为1;不使用全零填充;原论文中使用局部响应标准化LRN,由于LRN操作近些年用得很少,它的功能与批标准化BN相似,所以选择当前主流的BN操作实现特征标准化;使用relu激活函数;用33的池化核步长是2做最大池化;不使用Dropout;
  2. 第二层使用了256个33卷积核;步长为1;不使用全零填充;选择BN操作实现特征标准化;使用relu激活函数;用33的池化核步长是2做最大池化;不使用Dropout;
  3. 第三层使用了384个3*3卷积核;步长为1;使用全零填充;不使用BN操作实现特征标准化;使用relu激活函数;不使用池化;不使用Dropout;
  4. 第四层使用了384个3*3卷积核;步长为1;使用全零填充;不使用BN操作实现特征标准化;使用relu激活函数;不使用池化;不使用Dropout;
  5. 第五层使用了256个33卷积核;步长为1;使用全零填充;不使用BN操作;使用relu激活函数;用33的池化核步长是2做最大池化;不使用Dropout;
  6. 第六七八层是全连接层,六七层使用2048个神经元,relu激活函数50%Dropout;第八层使用10个神经元,用softmax使输出符合概率分布
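As a sanity check on the first layer described above, its parameter count can be computed by hand. This is a small sketch; it assumes the 3-channel cifar10 input used in the code example below.

# Parameters of the first AlexNet8 layer: 96 kernels of 3x3 over an assumed 3-channel input,
# plus one bias per kernel, plus the 4 BN variables (gamma, beta, moving mean, moving variance) per channel.
kernel_h, kernel_w, in_channels, n_filters = 3, 3, 3, 96
conv1_params = kernel_h * kernel_w * in_channels * n_filters + n_filters
bn1_params = 4 * n_filters
print(conv1_params)  # 2688, matching conv2d in the model.summary() output further below
print(bn1_params)    # 384, matching batch_normalization in that summary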

Code example

import tensorflow as tf
import os
import numpy as np
from matplotlib import pyplot as plt
from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation, MaxPool2D, Dropout, Flatten, Dense
from tensorflow.keras import Model

np.set_printoptions(threshold=np.inf)

cifar10 = tf.keras.datasets.cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0


class AlexNet8(Model):
    def __init__(self):
        super(AlexNet8, self).__init__()
        self.c1 = Conv2D(filters=96, kernel_size=(3, 3))
        self.b1 = BatchNormalization()
        self.a1 = Activation('relu')
        self.p1 = MaxPool2D(pool_size=(3, 3), strides=2)

        self.c2 = Conv2D(filters=256, kernel_size=(3, 3))
        self.b2 = BatchNormalization()
        self.a2 = Activation('relu')
        self.p2 = MaxPool2D(pool_size=(3, 3), strides=2)

        self.c3 = Conv2D(filters=384, kernel_size=(3, 3), padding='same',
                         activation='relu')

        self.c4 = Conv2D(filters=384, kernel_size=(3, 3), padding='same',
                         activation='relu')

        self.c5 = Conv2D(filters=256, kernel_size=(3, 3), padding='same',
                         activation='relu')
        self.p3 = MaxPool2D(pool_size=(3, 3), strides=2)

        self.flatten = Flatten()
        self.f1 = Dense(2048, activation='relu')
        self.d1 = Dropout(0.5)
        self.f2 = Dense(2048, activation='relu')
        self.d2 = Dropout(0.5)
        self.f3 = Dense(10, activation='softmax')

    def call(self, x):
        x = self.c1(x)
        x = self.b1(x)
        x = self.a1(x)
        x = self.p1(x)

        x = self.c2(x)
        x = self.b2(x)
        x = self.a2(x)
        x = self.p2(x)

        x = self.c3(x)

        x = self.c4(x)

        x = self.c5(x)
        x = self.p3(x)

        x = self.flatten(x)
        x = self.f1(x)
        x = self.d1(x)
        x = self.f2(x)
        x = self.d2(x)
        y = self.f3(x)
        return y


model = AlexNet8()

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

checkpoint_save_path = "./checkpoint/AlexNet8.ckpt"
if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True)

history = model.fit(x_train, y_train, batch_size=32, epochs=5, validation_data=(x_test, y_test), validation_freq=1,
                    callbacks=[cp_callback])
model.summary()

# print(model.trainable_variables)
file = open('./weights.txt', 'w')
for v in model.trainable_variables:
    file.write(str(v.name) + '\n')
    file.write(str(v.shape) + '\n')
    file.write(str(v.numpy()) + '\n')
file.close()

############################################### show ###############################################

# Plot the accuracy and loss curves of the training and validation sets
acc = history.history['sparse_categorical_accuracy']
val_acc = history.history['val_sparse_categorical_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']

plt.subplot(1, 2, 1)
plt.plot(acc, label='Training Accuracy')
plt.plot(val_acc, label='Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.title('Training and Validation Loss')
plt.legend()
plt.show()

Run results

......
49888/50000 [============================>.] - ETA: 0s - loss: 0.9936 - sparse_categorical_accuracy: 0.6564
49952/50000 [============================>.] - ETA: 0s - loss: 0.9934 - sparse_categorical_accuracy: 0.6564
50000/50000 [==============================] - 75s 1ms/sample - loss: 0.9934 - sparse_categorical_accuracy: 0.6564 - val_loss: 1.2769 - val_sparse_categorical_accuracy: 0.5684
Model: "alex_net8"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) multiple 2688
_________________________________________________________________
batch_normalization (BatchNo multiple 384
_________________________________________________________________
activation (Activation) multiple 0
_________________________________________________________________
max_pooling2d (MaxPooling2D) multiple 0
_________________________________________________________________
conv2d_1 (Conv2D) multiple 221440
_________________________________________________________________
batch_normalization_1 (Batch multiple 1024
_________________________________________________________________
activation_1 (Activation) multiple 0
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 multiple 0
_________________________________________________________________
conv2d_2 (Conv2D) multiple 885120
_________________________________________________________________
conv2d_3 (Conv2D) multiple 1327488
_________________________________________________________________
conv2d_4 (Conv2D) multiple 884992
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 multiple 0
_________________________________________________________________
flatten (Flatten) multiple 0
_________________________________________________________________
dense (Dense) multiple 2099200
_________________________________________________________________
dropout (Dropout) multiple 0
_________________________________________________________________
dense_1 (Dense) multiple 4196352
_________________________________________________________________
dropout_1 (Dropout) multiple 0
_________________________________________________________________
dense_2 (Dense) multiple 20490
=================================================================
Total params: 9,639,178
Trainable params: 9,638,474
Non-trainable params: 704
_________________________________________________________________

6.3 VGGNet

VGGNet appeared in 2014 and was the runner-up of that year's ImageNet competition, reducing the Top-5 error rate to 7.3%.

VGGNet uses small convolution kernels, which reduces the number of parameters while improving recognition accuracy; its regular network structure makes it well suited to hardware acceleration.

When designing the network, the number of convolution kernels is increased gradually from 64 to 128, 256 and finally 512, because the feature maps become smaller toward the back of the network. Increasing the number of kernels deepens the feature maps and keeps their information-carrying capacity.
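The parameter saving of small kernels can be checked with a quick calculation. This is a sketch: C is an assumed channel count, biases are ignored, and it only compares weights; two stacked 3×3 convolutions cover the same 5×5 receptive field as a single 5×5 convolution.

# Two stacked 3x3 conv layers vs one 5x5 conv layer, both mapping C channels to C channels.
C = 256                          # assumed channel count, for illustration only
two_3x3 = 2 * (3 * 3 * C * C)    # about 18*C^2 weights
one_5x5 = 5 * 5 * C * C          # about 25*C^2 weights
print(two_3x3, one_5x5)          # 1179648 vs 1638400: same receptive field, fewer parameters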

Code example

import tensorflow as tf
import os
import numpy as np
from matplotlib import pyplot as plt
from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation, MaxPool2D, Dropout, Flatten, Dense
from tensorflow.keras import Model

np.set_printoptions(threshold=np.inf)

cifar10 = tf.keras.datasets.cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0


class VGG16(Model):
    def __init__(self):
        super(VGG16, self).__init__()
        self.c1 = Conv2D(filters=64, kernel_size=(3, 3), padding='same')  # convolution layer 1
        self.b1 = BatchNormalization()  # BN layer 1
        self.a1 = Activation('relu')  # activation layer 1
        self.c2 = Conv2D(filters=64, kernel_size=(3, 3), padding='same')
        self.b2 = BatchNormalization()
        self.a2 = Activation('relu')
        self.p1 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d1 = Dropout(0.2)  # dropout layer

        self.c3 = Conv2D(filters=128, kernel_size=(3, 3), padding='same')
        self.b3 = BatchNormalization()
        self.a3 = Activation('relu')
        self.c4 = Conv2D(filters=128, kernel_size=(3, 3), padding='same')
        self.b4 = BatchNormalization()
        self.a4 = Activation('relu')
        self.p2 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d2 = Dropout(0.2)

        self.c5 = Conv2D(filters=256, kernel_size=(3, 3), padding='same')
        self.b5 = BatchNormalization()
        self.a5 = Activation('relu')
        self.c6 = Conv2D(filters=256, kernel_size=(3, 3), padding='same')
        self.b6 = BatchNormalization()
        self.a6 = Activation('relu')
        self.c7 = Conv2D(filters=256, kernel_size=(3, 3), padding='same')
        self.b7 = BatchNormalization()
        self.a7 = Activation('relu')
        self.p3 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d3 = Dropout(0.2)

        self.c8 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b8 = BatchNormalization()
        self.a8 = Activation('relu')
        self.c9 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b9 = BatchNormalization()
        self.a9 = Activation('relu')
        self.c10 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b10 = BatchNormalization()
        self.a10 = Activation('relu')
        self.p4 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d4 = Dropout(0.2)

        self.c11 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b11 = BatchNormalization()
        self.a11 = Activation('relu')
        self.c12 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b12 = BatchNormalization()
        self.a12 = Activation('relu')
        self.c13 = Conv2D(filters=512, kernel_size=(3, 3), padding='same')
        self.b13 = BatchNormalization()
        self.a13 = Activation('relu')
        self.p5 = MaxPool2D(pool_size=(2, 2), strides=2, padding='same')
        self.d5 = Dropout(0.2)

        self.flatten = Flatten()
        self.f1 = Dense(512, activation='relu')
        self.d6 = Dropout(0.2)
        self.f2 = Dense(512, activation='relu')
        self.d7 = Dropout(0.2)
        self.f3 = Dense(10, activation='softmax')

    def call(self, x):
        x = self.c1(x)
        x = self.b1(x)
        x = self.a1(x)
        x = self.c2(x)
        x = self.b2(x)
        x = self.a2(x)
        x = self.p1(x)
        x = self.d1(x)

        x = self.c3(x)
        x = self.b3(x)
        x = self.a3(x)
        x = self.c4(x)
        x = self.b4(x)
        x = self.a4(x)
        x = self.p2(x)
        x = self.d2(x)

        x = self.c5(x)
        x = self.b5(x)
        x = self.a5(x)
        x = self.c6(x)
        x = self.b6(x)
        x = self.a6(x)
        x = self.c7(x)
        x = self.b7(x)
        x = self.a7(x)
        x = self.p3(x)
        x = self.d3(x)

        x = self.c8(x)
        x = self.b8(x)
        x = self.a8(x)
        x = self.c9(x)
        x = self.b9(x)
        x = self.a9(x)
        x = self.c10(x)
        x = self.b10(x)
        x = self.a10(x)
        x = self.p4(x)
        x = self.d4(x)

        x = self.c11(x)
        x = self.b11(x)
        x = self.a11(x)
        x = self.c12(x)
        x = self.b12(x)
        x = self.a12(x)
        x = self.c13(x)
        x = self.b13(x)
        x = self.a13(x)
        x = self.p5(x)
        x = self.d5(x)

        x = self.flatten(x)
        x = self.f1(x)
        x = self.d6(x)
        x = self.f2(x)
        x = self.d7(x)
        y = self.f3(x)
        return y


model = VGG16()

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

checkpoint_save_path = "./checkpoint/VGG16.ckpt"
if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True)

history = model.fit(x_train, y_train, batch_size=32, epochs=5, validation_data=(x_test, y_test), validation_freq=1,
                    callbacks=[cp_callback])
model.summary()

# print(model.trainable_variables)
file = open('./weights.txt', 'w')
for v in model.trainable_variables:
    file.write(str(v.name) + '\n')
    file.write(str(v.shape) + '\n')
    file.write(str(v.numpy()) + '\n')
file.close()

############################################### show ###############################################

# Plot the accuracy and loss curves of the training and validation sets
acc = history.history['sparse_categorical_accuracy']
val_acc = history.history['val_sparse_categorical_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']

plt.subplot(1, 2, 1)
plt.plot(acc, label='Training Accuracy')
plt.plot(val_acc, label='Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.title('Training and Validation Loss')
plt.legend()
plt.show()

Run results

......
49920/50000 [============================>.] - ETA: 0s - loss: 0.8387 - sparse_categorical_accuracy: 0.7211
49952/50000 [============================>.] - ETA: 0s - loss: 0.8385 - sparse_categorical_accuracy: 0.7212
49984/50000 [============================>.] - ETA: 0s - loss: 0.8381 - sparse_categorical_accuracy: 0.7213
50000/50000 [==============================] - 196s 4ms/sample - loss: 0.8381 - sparse_categorical_accuracy: 0.7213 - val_loss: 0.9402 - val_sparse_categorical_accuracy: 0.6882
Model: "vg_g16"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) multiple 1792
_________________________________________________________________
batch_normalization (BatchNo multiple 256
_________________________________________________________________
activation (Activation) multiple 0
_________________________________________________________________
conv2d_1 (Conv2D) multiple 36928
_________________________________________________________________
batch_normalization_1 (Batch multiple 256
_________________________________________________________________
activation_1 (Activation) multiple 0
_________________________________________________________________
max_pooling2d (MaxPooling2D) multiple 0
_________________________________________________________________
dropout (Dropout) multiple 0
_________________________________________________________________
conv2d_2 (Conv2D) multiple 73856
_________________________________________________________________
batch_normalization_2 (Batch multiple 512
_________________________________________________________________
activation_2 (Activation) multiple 0
_________________________________________________________________
conv2d_3 (Conv2D) multiple 147584
_________________________________________________________________
batch_normalization_3 (Batch multiple 512
_________________________________________________________________
activation_3 (Activation) multiple 0
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 multiple 0
_________________________________________________________________
dropout_1 (Dropout) multiple 0
_________________________________________________________________
conv2d_4 (Conv2D) multiple 295168
_________________________________________________________________
batch_normalization_4 (Batch multiple 1024
_________________________________________________________________
activation_4 (Activation) multiple 0
_________________________________________________________________
conv2d_5 (Conv2D) multiple 590080
_________________________________________________________________
batch_normalization_5 (Batch multiple 1024
_________________________________________________________________
activation_5 (Activation) multiple 0
_________________________________________________________________
conv2d_6 (Conv2D) multiple 590080
_________________________________________________________________
batch_normalization_6 (Batch multiple 1024
_________________________________________________________________
activation_6 (Activation) multiple 0
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 multiple 0
_________________________________________________________________
dropout_2 (Dropout) multiple 0
_________________________________________________________________
conv2d_7 (Conv2D) multiple 1180160
_________________________________________________________________
batch_normalization_7 (Batch multiple 2048
_________________________________________________________________
activation_7 (Activation) multiple 0
_________________________________________________________________
conv2d_8 (Conv2D) multiple 2359808
_________________________________________________________________
batch_normalization_8 (Batch multiple 2048
_________________________________________________________________
activation_8 (Activation) multiple 0
_________________________________________________________________
conv2d_9 (Conv2D) multiple 2359808
_________________________________________________________________
batch_normalization_9 (Batch multiple 2048
_________________________________________________________________
activation_9 (Activation) multiple 0
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 multiple 0
_________________________________________________________________
dropout_3 (Dropout) multiple 0
_________________________________________________________________
conv2d_10 (Conv2D) multiple 2359808
_________________________________________________________________
batch_normalization_10 (Batc multiple 2048
_________________________________________________________________
activation_10 (Activation) multiple 0
_________________________________________________________________
conv2d_11 (Conv2D) multiple 2359808
_________________________________________________________________
batch_normalization_11 (Batc multiple 2048
_________________________________________________________________
activation_11 (Activation) multiple 0
_________________________________________________________________
conv2d_12 (Conv2D) multiple 2359808
_________________________________________________________________
batch_normalization_12 (Batc multiple 2048
_________________________________________________________________
activation_12 (Activation) multiple 0
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 multiple 0
_________________________________________________________________
dropout_4 (Dropout) multiple 0
_________________________________________________________________
flatten (Flatten) multiple 0
_________________________________________________________________
dense (Dense) multiple 262656
_________________________________________________________________
dropout_5 (Dropout) multiple 0
_________________________________________________________________
dense_1 (Dense) multiple 262656
_________________________________________________________________
dropout_6 (Dropout) multiple 0
_________________________________________________________________
dense_2 (Dense) multiple 5130
=================================================================
Total params: 15,262,026
Trainable params: 15,253,578
Non-trainable params: 8,448
_________________________________________________________________

6.4 InceptionNet

InceptionNet appeared in 2014 and won that year's ImageNet competition with a Top-5 error rate of 6.67%. InceptionNet introduced the Inception block, which uses convolution kernels of different sizes within the same layer to improve the model's perception ability, and it uses batch normalization to mitigate vanishing gradients.

The core of InceptionNet is its basic unit, the Inception block. Whether it is GoogLeNet, i.e. Inception v1, or the later versions of InceptionNet such as v2, v3 and v4, all of them are built from Inception blocks.

The Inception block uses convolution kernels of several sizes within the same layer, so features of different scales can be extracted. A 1×1 convolution acts on every pixel of the input feature map; by setting the number of 1×1 kernels to be smaller than the depth of the input feature map, the output depth is reduced, which acts as dimensionality reduction and cuts down the number of parameters and the amount of computation.
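A quick calculation illustrates the saving. This is a sketch with assumed sizes: a 256-channel input is reduced to 16 channels by a 1×1 convolution before a 5×5 convolution that outputs 16 channels; biases are ignored.

# 5x5 convolution applied directly vs 1x1 reduction followed by 5x5 (assumed channel counts).
in_ch, reduce_ch, out_ch = 256, 16, 16
direct_5x5 = 5 * 5 * in_ch * out_ch                                # 102,400 weights
with_1x1 = 1 * 1 * in_ch * reduce_ch + 5 * 5 * reduce_ch * out_ch  # 4,096 + 6,400 = 10,496 weights
print(direct_5x5, with_1x1)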

The figure shows an Inception block. The block contains four branches, which respectively pass through

  • a 1×1 convolution, whose output goes to the filter concatenation;
  • a 1×1 convolution followed by a 3×3 convolution, whose output goes to the filter concatenation;
  • a 1×1 convolution followed by a 5×5 convolution, whose output goes to the filter concatenation;
  • a 3×3 max pooling followed by a 1×1 convolution, whose output goes to the filter concatenation.

The feature maps sent to the filter concatenation all have the same spatial size; the concatenation stacks the four streams of features along the depth dimension to form the output of the Inception block.
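The pooling branch is the one that could break this size requirement, so it uses stride 1 with zero padding. A small sketch with an assumed 8×8 input shows that such a pooling keeps the spatial size.

import tensorflow as tf

# 3x3 max pooling with stride 1 and 'same' padding preserves height and width (assumed 8x8x16 input).
x = tf.zeros([1, 8, 8, 16])
pooled = tf.keras.layers.MaxPool2D(pool_size=3, strides=1, padding='same')(x)
print(pooled.shape)  # (1, 8, 8, 16): same spatial size, ready for depth-wise concatenation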

Described in CBAPD terms, with colours matching the operations inside the Inception block:

  • The first branch uses 16 kernels of 1×1, stride 1, zero padding, with BN and relu activation.
  • The second branch first uses 16 kernels of 1×1 for dimensionality reduction, stride 1, zero padding, with BN and relu activation; it then uses 16 kernels of 3×3, stride 1, zero padding, with BN and relu activation.
  • The third branch first uses 16 kernels of 1×1 for dimensionality reduction, stride 1, zero padding, with BN and relu activation; it then uses 16 kernels of 5×5, stride 1, zero padding, with BN and relu activation.
  • The fourth branch first applies max pooling with a 3×3 kernel, stride 1 and zero padding; it then uses 16 kernels of 1×1 for dimensionality reduction, stride 1, zero padding, with BN and relu activation.

The filter concatenation stacks these four branches along the depth dimension to form the output of the Inception block. Since every convolution inside the Inception block follows the CBA pattern, i.e. convolution, then BN, then relu activation, this pattern is defined as a new class ConvBNRelu, which shortens the code and improves readability.

class ConvBNRelu(Model):
    def __init__(self, ch, kernelsz=3, strides=1, padding='same'):
        # defaults: 3x3 kernel, stride 1, zero padding
        super(ConvBNRelu, self).__init__()
        self.model = tf.keras.models.Sequential([
            Conv2D(ch, kernelsz, strides=strides, padding=padding),
            BatchNormalization(),
            Activation('relu')
        ])

    def call(self, x):
        x = self.model(x, training=False)
        # With training=False, BN normalizes with the mean and variance accumulated over the whole
        # training set; with training=True it uses the statistics of the current batch.
        # training=False works better at inference time.
        return x
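A quick usage check of the ConvBNRelu class defined above; this is a sketch with an assumed dummy input.

import tensorflow as tf

dummy = tf.zeros([4, 32, 32, 3])        # assumed batch of four 32x32 RGB images
layer = ConvBNRelu(ch=16, kernelsz=3)   # 16 output channels, 3x3 kernel, stride 1, 'same' padding
print(layer(dummy).shape)               # (4, 32, 32, 16): spatial size kept, depth becomes 16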

The parameter ch is the number of channels of the feature map, i.e. the number of convolution kernels; kernelsz is the kernel size; strides is the convolution stride; padding indicates whether zero padding is used. With this in place, the basic unit of InceptionNet can be built; again using a class definition, a new class InceptionBlk is defined.

class InceptionBlk(Model):
    def __init__(self, ch, strides=1):
        super(InceptionBlk, self).__init__()
        self.ch = ch
        self.strides = strides
        self.c1 = ConvBNRelu(ch, kernelsz=1, strides=strides)
        self.c2_1 = ConvBNRelu(ch, kernelsz=1, strides=strides)
        self.c2_2 = ConvBNRelu(ch, kernelsz=3, strides=1)
        self.c3_1 = ConvBNRelu(ch, kernelsz=1, strides=strides)
        self.c3_2 = ConvBNRelu(ch, kernelsz=5, strides=1)
        self.p4_1 = MaxPool2D(3, strides=1, padding='same')
        self.c4_2 = ConvBNRelu(ch, kernelsz=1, strides=strides)

    def call(self, x):
        x1 = self.c1(x)
        x2_1 = self.c2_1(x)
        x2_2 = self.c2_2(x2_1)
        x3_1 = self.c3_1(x)
        x3_2 = self.c3_2(x3_1)
        x4_1 = self.p4_1(x)
        x4_2 = self.c4_2(x4_1)
        # concat along axis=channel
        x = tf.concat([x1, x2_2, x3_2, x4_2], axis=3)
        return x

The parameter ch again denotes the number of channels and strides the convolution stride, consistent with the ConvBNRelu class; tf.concat joins the four outputs along the depth dimension, where x1, x2_2, x3_2 and x4_2 are the outputs of the four branches, and the correspondence with the structure diagram is easy to see. The main body of InceptionNet is made up of this basic unit. With the Inception block in place, a reduced version of InceptionNet with 10 layers can be assembled; its model structure is shown in the figure.

The first layer uses 16 kernels of 3×3, stride 1, zero padding, BN and relu activation. It is followed by four Inception blocks connected in sequence; every two Inception blocks form one block. Within each block, the first Inception block uses convolution stride 2 and the second uses stride 1, so the first Inception block halves the spatial size of the output feature map. The output depth is therefore doubled, so that the amount of information carried through feature extraction stays roughly the same.

block_0 is configured with 16 channels; after the four branches, the output depth is 4×16=64. The statement self.out_channels *= 2 doubles the channel count, so block_1 uses 32 channels, twice as many as block_0, and after its four branches the output depth is 4×32=128. These 128 channels of data are fed into global average pooling and then into a 10-class fully connected layer.
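The depth progression can be traced with a short calculation. This is a sketch; the spatial sizes assume a 32×32 input as in the full code below.

# Depth and spatial-size trace through Inception10 for an assumed 32x32x3 input.
ch = 16                      # init_ch: depth after the first 3x3 ConvBNRelu layer
size = 32
for block_id in range(2):    # block_0 and block_1
    size = (size + 1) // 2   # the first Inception block of each block uses stride 2 with 'same' padding
    depth = 4 * ch           # four branches with ch kernels each, concatenated along depth
    print('block_%d: feature map %dx%dx%d' % (block_id, size, size, depth))
    ch *= 2                  # out_channels is doubled for the next block
# block_0: 16x16x64, block_1: 8x8x128 -> 128 channels go into global average pooling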

class Inception10(Model):
    def __init__(self, num_blocks, num_classes, init_ch=16, **kwargs):
        super(Inception10, self).__init__(**kwargs)
        self.in_channels = init_ch
        self.out_channels = init_ch
        self.num_blocks = num_blocks
        self.init_ch = init_ch

        # First layer: 16 kernels of 3x3, stride 1, zero padding, BN, relu activation.
        # The default init_ch=16 sets the default output depth to 16.
        # ConvBNRelu defaults to a 3x3 kernel, stride 1 and zero padding, so it is called directly.
        self.c1 = ConvBNRelu(init_ch)

        # In each block, the first Inception block uses convolution stride 2
        # and the second Inception block uses stride 1.
        self.blocks = tf.keras.models.Sequential()
        for block_id in range(num_blocks):
            for layer_id in range(2):
                if layer_id == 0:
                    block = InceptionBlk(self.out_channels, strides=2)
                else:
                    block = InceptionBlk(self.out_channels, strides=1)
                self.blocks.add(block)
            # enlarge out_channels per block:
            # the channel count is doubled, so block_1 uses 32 channels, twice as many as block_0
            self.out_channels *= 2

        self.p1 = GlobalAveragePooling2D()  # the 128 channels of data go into global average pooling
        self.f1 = Dense(num_classes, activation='softmax')  # then into the 10-class fully connected layer

    def call(self, x):
        x = self.c1(x)
        x = self.blocks(x)
        x = self.p1(x)
        y = self.f1(x)
        return y

  • The parameter num_blocks is the number of InceptionNet blocks; each block consists of two basic units.
  • num_classes is the number of classes, i.e. 10 for the cifar10 dataset.
  • init_ch is the initial channel count, i.e. the initial number of convolution kernels of the InceptionNet basic unit. Unlike VGGNet, InceptionNet does not end with three fully connected layers (the fully connected layers account for about 90% of VGGNet's total parameters); instead it uses global average pooling followed by a fully connected layer, which saves a large number of parameters (a quick parameter comparison follows this list).
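A rough comparison shows why this matters. This is a sketch using the shapes of this network: for a 32×32 input the last feature map is 8×8×128, and biases are ignored.

# Weights needed to reach 10 classes from an assumed 8x8x128 final feature map.
h, w, c, classes = 8, 8, 128, 10
flatten_dense = h * w * c * classes  # Flatten + Dense(10): 81,920 weights
gap_dense = c * classes              # GlobalAveragePooling2D + Dense(10): 1,280 weights
print(flatten_dense, gap_dense)      # consistent with the 1,290-parameter dense layer in the summary below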

Full code

import tensorflow as tf
import os
import numpy as np
from matplotlib import pyplot as plt
from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation, MaxPool2D, Dropout, Flatten, Dense, \
GlobalAveragePooling2D
from tensorflow.keras import Model

np.set_printoptions(threshold=np.inf)

cifar10 = tf.keras.datasets.cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0


class ConvBNRelu(Model):
    def __init__(self, ch, kernelsz=3, strides=1, padding='same'):
        # defaults: 3x3 kernel, stride 1, zero padding
        super(ConvBNRelu, self).__init__()
        self.model = tf.keras.models.Sequential([
            Conv2D(ch, kernelsz, strides=strides, padding=padding),
            BatchNormalization(),
            Activation('relu')
        ])

    def call(self, x):
        # With training=False, BN normalizes with the statistics accumulated during training;
        # with training=True it uses the statistics of the current batch. training=False works better at inference.
        x = self.model(x, training=False)
        return x


class InceptionBlk(Model):
    def __init__(self, ch, strides=1):
        super(InceptionBlk, self).__init__()
        self.ch = ch
        self.strides = strides
        self.c1 = ConvBNRelu(ch, kernelsz=1, strides=strides)
        self.c2_1 = ConvBNRelu(ch, kernelsz=1, strides=strides)
        self.c2_2 = ConvBNRelu(ch, kernelsz=3, strides=1)
        self.c3_1 = ConvBNRelu(ch, kernelsz=1, strides=strides)
        self.c3_2 = ConvBNRelu(ch, kernelsz=5, strides=1)
        self.p4_1 = MaxPool2D(3, strides=1, padding='same')
        self.c4_2 = ConvBNRelu(ch, kernelsz=1, strides=strides)

    def call(self, x):
        x1 = self.c1(x)
        x2_1 = self.c2_1(x)
        x2_2 = self.c2_2(x2_1)
        x3_1 = self.c3_1(x)
        x3_2 = self.c3_2(x3_1)
        x4_1 = self.p4_1(x)
        x4_2 = self.c4_2(x4_1)
        # concat along axis=channel
        x = tf.concat([x1, x2_2, x3_2, x4_2], axis=3)
        return x


class Inception10(Model):
    def __init__(self, num_blocks, num_classes, init_ch=16, **kwargs):
        super(Inception10, self).__init__(**kwargs)
        self.in_channels = init_ch
        self.out_channels = init_ch
        self.num_blocks = num_blocks
        self.init_ch = init_ch

        # First layer: 16 kernels of 3x3, stride 1, zero padding, BN, relu activation.
        # The default init_ch=16 sets the default output depth to 16.
        # ConvBNRelu defaults to a 3x3 kernel, stride 1 and zero padding, so it is called directly.
        self.c1 = ConvBNRelu(init_ch)

        # In each block, the first Inception block uses convolution stride 2
        # and the second Inception block uses stride 1.
        self.blocks = tf.keras.models.Sequential()
        for block_id in range(num_blocks):
            for layer_id in range(2):
                if layer_id == 0:
                    block = InceptionBlk(self.out_channels, strides=2)
                else:
                    block = InceptionBlk(self.out_channels, strides=1)
                self.blocks.add(block)
            # enlarge out_channels per block:
            # the channel count is doubled, so block_1 uses 32 channels, twice as many as block_0
            self.out_channels *= 2

        self.p1 = GlobalAveragePooling2D()  # the 128 channels of data go into global average pooling
        self.f1 = Dense(num_classes, activation='softmax')  # then into the 10-class fully connected layer

    def call(self, x):
        x = self.c1(x)
        x = self.blocks(x)
        x = self.p1(x)
        y = self.f1(x)
        return y


# num_blocks=2 sets the number of InceptionNet blocks to 2: block_0 and block_1;
# num_classes=10 sets up a 10-class network
model = Inception10(num_blocks=2, num_classes=10)

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

checkpoint_save_path = "./checkpoint/Inception10.ckpt"
if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True)

history = model.fit(x_train, y_train, batch_size=1024, epochs=5, validation_data=(x_test, y_test), validation_freq=1,
                    callbacks=[cp_callback])
model.summary()

# print(model.trainable_variables)
file = open('./weights.txt', 'w')
for v in model.trainable_variables:
    file.write(str(v.name) + '\n')
    file.write(str(v.shape) + '\n')
    file.write(str(v.numpy()) + '\n')
file.close()

############################################### show ###############################################

# Plot the accuracy and loss curves of the training and validation sets
acc = history.history['sparse_categorical_accuracy']
val_acc = history.history['val_sparse_categorical_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']

plt.subplot(1, 2, 1)
plt.plot(acc, label='Training Accuracy')
plt.plot(val_acc, label='Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.title('Training and Validation Loss')
plt.legend()
plt.show()

Run results

......
45056/50000 [==========================>...] - ETA: 2s - loss: 1.4840 - sparse_categorical_accuracy: 0.4566
46080/50000 [==========================>...] - ETA: 2s - loss: 1.4839 - sparse_categorical_accuracy: 0.4567
47104/50000 [===========================>..] - ETA: 1s - loss: 1.4829 - sparse_categorical_accuracy: 0.4567
48128/50000 [===========================>..] - ETA: 0s - loss: 1.4828 - sparse_categorical_accuracy: 0.4565
49152/50000 [============================>.] - ETA: 0s - loss: 1.4827 - sparse_categorical_accuracy: 0.4570
50000/50000 [==============================] - 28s 556us/sample - loss: 1.4827 - sparse_categorical_accuracy: 0.4569 - val_loss: 1.4545 - val_sparse_categorical_accuracy: 0.4645
Model: "inception10"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv_bn_relu (ConvBNRelu) multiple 512
_________________________________________________________________
sequential_1 (Sequential) multiple 119616
_________________________________________________________________
global_average_pooling2d (Gl multiple 0
_________________________________________________________________
dense (Dense) multiple 1290
=================================================================
Total params: 121,418
Trainable params: 120,234
Non-trainable params: 1,184
_________________________________________________________________

6.5 ResNet

ResNet appeared in 2015 and won that year's ImageNet competition with a Top-5 error rate of 3.57%. ResNet introduced residual skip connections between layers, which feed forward information from earlier layers, mitigate vanishing gradients and make it possible to increase the number of layers. Looking back over the four convolutional networks just covered, recognition accuracy improved as the networks got deeper:

Model          Layers
LeNet          5
AlexNet        8
VGG            16/19
InceptionNet   22

Clearly, in the quest for better feature extraction by convolution, people kept deepening their networks and obtained better and better results.

The author of ResNet, Kaiming He, ran an experiment on the cifar10 dataset and found that a 56-layer convolutional network had a higher error rate than a 20-layer one. He argued that simply stacking more layers degrades the model, to the point where later features lose the original form of earlier features.

So he added a skip connection that carries earlier features directly to later layers, so that the output H(x) contains both the non-linear output F(x) of the two stacked convolutions and the identity mapping x that skips those two layers, added element-wise. This operation effectively mitigates the degradation caused by stacking layers and lets neural networks grow much deeper.

Note that the "+" in a ResNet block is different from the "+" in an Inception block:

  • In an Inception block, "+" stacks feature maps along the depth dimension (like adding layers to a layer cake).
  • In a ResNet block, "+" adds the corresponding elements of the feature maps (matrix addition); see the small check after this list.
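The difference can be seen directly with dummy tensors; this is a small sketch with assumed shapes.

import tensorflow as tf

a = tf.ones([1, 4, 4, 16])
b = tf.ones([1, 4, 4, 16])
print(tf.concat([a, b], axis=3).shape)  # Inception-style "+": (1, 4, 4, 32), depth is stacked
print((a + b).shape)                    # ResNet-style "+":   (1, 4, 4, 16), values are added element-wise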

There are two cases for a ResNet block.

One case, drawn with a solid line in the figure, is when the two stacked convolution layers do not change the dimensions of the feature map, i.e. the number, height, width and depth of the feature maps stay the same, so F(x) and x can be added directly.

The other case, drawn with a dashed line in the figure, is when the two stacked convolution layers change the dimensions of the feature map; a 1×1 convolution is then needed to adjust the dimensions of x so that W(x) matches the dimensions of F(x).

A 1×1 convolution can change the feature-map size through its stride and change the feature-map depth through the number of kernels (a short shape check follows this explanation).

So a ResNet block comes in two forms: one where the dimensions before and after the stacked layers are the same, and one where they differ. Both structures are wrapped into one orange block by defining a ResnetBlock class; each call to ResnetBlock generates one yellow block.

If the dimensions before and after the stacked convolution layers differ, residual_path equals 1 (True) and the code in the red block runs: a 1×1 convolution adjusts the size or depth of the input feature map inputs, then the stacked-convolution output y and the residual computed in the if branch are added, passed through the activation, and output.

If the dimensions before and after the stacked convolution layers are the same, the code in the red block is not executed, and the stacked-convolution output y is added directly to the input feature map inputs, passed through the activation, and output.
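A short check of how a 1×1 convolution adjusts both dimensions; this is a sketch with assumed sizes.

import tensorflow as tf

x = tf.zeros([1, 32, 32, 64])                                           # assumed input feature map
down = tf.keras.layers.Conv2D(128, (1, 1), strides=2, padding='same')   # more kernels -> deeper, stride 2 -> smaller
print(down(x).shape)  # (1, 16, 16, 128): spatial size halved, depth changed from 64 to 128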

Building the network structure: ResNet18 starts with one convolution layer, followed by 8 ResNet blocks, and ends with one fully connected layer. Each ResNet block contains two convolution layers, giving 18 layers in total.

First layer: 64 kernels of 3×3, stride 1, zero padding, BN, relu activation (the purple block in the code figure).

What follows describes the four orange blocks. The first orange block consists of two ResNet blocks with solid-line skip connections; in the second, third and fourth orange blocks, the first ResNet block uses a dashed-line connection and the second a solid-line connection. This is implemented with a for loop whose iteration count is given by the number of elements in the parameter list; here the list is [2, 2, 2, 2], four elements, so the outer loop runs four times. On each iteration, depending on which element is being processed, either residual_path=True (dashed-line connection) or residual_path=False (solid-line connection) is chosen, and ResnetBlock is called to generate one orange block of the ResNet18 structure on the left. After average pooling and the fully connected layer, the output is produced.

Full code

import tensorflow as tf
import os
import numpy as np
from matplotlib import pyplot as plt
from tensorflow.keras.layers import Conv2D, BatchNormalization, Activation, MaxPool2D, Dropout, Flatten, Dense
from tensorflow.keras import Model

np.set_printoptions(threshold=np.inf)

cifar10 = tf.keras.datasets.cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0


class ResnetBlock(Model):

    def __init__(self, filters, strides=1, residual_path=False):
        super(ResnetBlock, self).__init__()
        self.filters = filters
        self.strides = strides
        self.residual_path = residual_path

        self.c1 = Conv2D(filters, (3, 3), strides=strides, padding='same', use_bias=False)
        self.b1 = BatchNormalization()
        self.a1 = Activation('relu')

        self.c2 = Conv2D(filters, (3, 3), strides=1, padding='same', use_bias=False)
        self.b2 = BatchNormalization()

        # When residual_path is True, the input is downsampled with a 1x1 convolution
        # so that x has the same dimensions as F(x) and the two can be added.
        if residual_path:
            self.down_c1 = Conv2D(filters, (1, 1), strides=strides, padding='same', use_bias=False)
            self.down_b1 = BatchNormalization()

        self.a2 = Activation('relu')

    def call(self, inputs):
        residual = inputs  # residual is the input itself, i.e. residual = x
        # pass the input through convolution, BN and activation to compute F(x)
        x = self.c1(inputs)
        x = self.b1(x)
        x = self.a1(x)

        x = self.c2(x)
        y = self.b2(x)

        if self.residual_path:
            residual = self.down_c1(inputs)
            residual = self.down_b1(residual)

        out = self.a2(y + residual)  # the output is the sum of the two parts, F(x)+x or F(x)+Wx, then the activation
        return out


class ResNet18(Model):

    def __init__(self, block_list, initial_filters=64):  # block_list gives the number of ResNet blocks per block
        super(ResNet18, self).__init__()
        self.num_blocks = len(block_list)  # total number of blocks
        self.block_list = block_list
        self.out_filters = initial_filters

        # first layer
        self.c1 = Conv2D(self.out_filters, (3, 3), strides=1, padding='same', use_bias=False)
        self.b1 = BatchNormalization()
        self.a1 = Activation('relu')

        self.blocks = tf.keras.models.Sequential()
        # build the ResNet structure
        for block_id in range(len(block_list)):  # which ResNet block
            for layer_id in range(block_list[block_id]):  # which conv layer

                if block_id != 0 and layer_id == 0:  # downsample the input of every block except the first
                    block = ResnetBlock(self.out_filters, strides=2, residual_path=True)
                else:
                    block = ResnetBlock(self.out_filters, residual_path=False)
                self.blocks.add(block)  # add the constructed block to the ResNet
            self.out_filters *= 2  # the next block uses twice as many kernels as the previous one
        self.p1 = tf.keras.layers.GlobalAveragePooling2D()
        self.f1 = tf.keras.layers.Dense(10, activation='softmax', kernel_regularizer=tf.keras.regularizers.l2())

    def call(self, inputs):
        x = self.c1(inputs)
        x = self.b1(x)
        x = self.a1(x)
        x = self.blocks(x)
        x = self.p1(x)
        y = self.f1(x)
        return y


model = ResNet18([2, 2, 2, 2])

model.compile(optimizer='adam',
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False),
              metrics=['sparse_categorical_accuracy'])

checkpoint_save_path = "./checkpoint/ResNet18.ckpt"
if os.path.exists(checkpoint_save_path + '.index'):
    print('-------------load the model-----------------')
    model.load_weights(checkpoint_save_path)

cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_save_path,
                                                 save_weights_only=True,
                                                 save_best_only=True)

history = model.fit(x_train, y_train, batch_size=32, epochs=5, validation_data=(x_test, y_test), validation_freq=1,
                    callbacks=[cp_callback])
model.summary()

# print(model.trainable_variables)
file = open('./weights.txt', 'w')
for v in model.trainable_variables:
    file.write(str(v.name) + '\n')
    file.write(str(v.shape) + '\n')
    file.write(str(v.numpy()) + '\n')
file.close()

############################################### show ###############################################

# Plot the accuracy and loss curves of the training and validation sets
acc = history.history['sparse_categorical_accuracy']
val_acc = history.history['val_sparse_categorical_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']

plt.subplot(1, 2, 1)
plt.plot(acc, label='Training Accuracy')
plt.plot(val_acc, label='Validation Accuracy')
plt.title('Training and Validation Accuracy')
plt.legend()

plt.subplot(1, 2, 2)
plt.plot(loss, label='Training Loss')
plt.plot(val_loss, label='Validation Loss')
plt.title('Training and Validation Loss')
plt.legend()
plt.show()

Run results

49600/50000 [============================>.] - ETA: 1s - loss: 0.3901 - sparse_categorical_accuracy: 0.8777
49632/50000 [============================>.] - ETA: 1s - loss: 0.3899 - sparse_categorical_accuracy: 0.8778
49664/50000 [============================>.] - ETA: 1s - loss: 0.3902 - sparse_categorical_accuracy: 0.8778
49696/50000 [============================>.] - ETA: 1s - loss: 0.3901 - sparse_categorical_accuracy: 0.8777
49728/50000 [============================>.] - ETA: 1s - loss: 0.3900 - sparse_categorical_accuracy: 0.8778
49760/50000 [============================>.] - ETA: 1s - loss: 0.3901 - sparse_categorical_accuracy: 0.8778
49792/50000 [============================>.] - ETA: 0s - loss: 0.3900 - sparse_categorical_accuracy: 0.8779
49824/50000 [============================>.] - ETA: 0s - loss: 0.3900 - sparse_categorical_accuracy: 0.8779
49856/50000 [============================>.] - ETA: 0s - loss: 0.3899 - sparse_categorical_accuracy: 0.8778
49888/50000 [============================>.] - ETA: 0s - loss: 0.3899 - sparse_categorical_accuracy: 0.8778
49920/50000 [============================>.] - ETA: 0s - loss: 0.3899 - sparse_categorical_accuracy: 0.8778
49952/50000 [============================>.] - ETA: 0s - loss: 0.3898 - sparse_categorical_accuracy: 0.8778
49984/50000 [============================>.] - ETA: 0s - loss: 0.3899 - sparse_categorical_accuracy: 0.8778
50000/50000 [==============================] - 238s 5ms/sample - loss: 0.3899 - sparse_categorical_accuracy: 0.8778 - val_loss: 0.6399 - val_sparse_categorical_accuracy: 0.8054
Model: "res_net18"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) multiple 1728
_________________________________________________________________
batch_normalization (BatchNo multiple 256
_________________________________________________________________
activation (Activation) multiple 0
_________________________________________________________________
sequential (Sequential) multiple 11176448
_________________________________________________________________
global_average_pooling2d (Gl multiple 0
_________________________________________________________________
dense (Dense) multiple 5130
=================================================================
Total params: 11,183,562
Trainable params: 11,173,962
Non-trainable params: 9,600
_________________________________________________________________

6.6 Summary of Classic Convolutional Networks

What is convolution? Convolution is a feature extractor; it is exactly CBAPD (Convolution, Batch normalization, Activation, Pooling, Dropout).

Convolutional neural network: spatial features are extracted with convolution kernels and then fed into a fully connected network.

This feature extraction relies on the parameter sharing of convolution kernels: spatial information is extracted by the convolutional layers. For example, convolution kernels can extract the spatial features of an image, and the extracted spatial features are then fed into a fully connected network to perform the classification.
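The five-step CBAPD recipe summarised above maps one-to-one onto Keras layers. The following is a minimal sketch of a single feature-extraction block; the specific layer arguments are illustrative assumptions, not values from any of the networks above.

import tensorflow as tf

# One generic CBAPD feature-extraction block (illustrative hyperparameters).
cbapd_block = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), padding='same'),  # C: convolution
    tf.keras.layers.BatchNormalization(),                # B: batch normalization
    tf.keras.layers.Activation('relu'),                  # A: activation
    tf.keras.layers.MaxPool2D((2, 2), strides=2),        # P: pooling
    tf.keras.layers.Dropout(0.2),                        # D: dropout
])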
