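In this post, I use the fastai library (built on PyTorch) to train a convolutional neural network that classifies images as cardboard, glass, metal, paper, plastic, or trash. I use the image dataset hand-collected by Gary Thung and Mindy Yang. Download the dataset here: https://github.com/garythung/trashnet/blob/master/data/dataset-resized.zip. Note: you will want a GPU to speed up training.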

Import Python libraries

from fastai.vision import *
from fastai.metrics import error_rate
from pathlib import Path
from glob2 import glob
from sklearn.metrics import confusion_matrix
import pandas as pd
import numpy as np
import os
import random
import zipfile as zf
import shutil
import re
import matplotlib.pyplot as plt
import seaborn as sns

Extract the data

First, we need to extract the contents of "dataset-resized.zip".

files = zf.ZipFile("dataset-resized.zip",'r')
files.extractall()
files.close()
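
The same extraction can be written with a context manager, which closes the archive even if extraction fails partway. This is a minimal sketch: extract_archive is a hypothetical helper name, and the demo builds a throwaway archive rather than touching the real dataset.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract every member of zip_path into dest_dir and return the member names."""
    # The with-block closes the archive even if extraction raises.
    with zipfile.ZipFile(zip_path, "r") as archive:
        archive.extractall(dest_dir)
        return archive.namelist()


# Self-contained demo on a throwaway archive (not the real dataset).
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = Path(tmp) / "demo.zip"
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("dataset-resized/glass/glass1.jpg", b"fake image bytes")
    names = extract_archive(demo_zip, Path(tmp) / "out")
    extracted_ok = (Path(tmp) / "out" / "dataset-resized" / "glass" / "glass1.jpg").exists()
```

For the real dataset, the call would be extract_archive("dataset-resized.zip", ".").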

Once unzipped, the dataset-resized folder contains six subfolders (plus a stray .DS_Store file):

os.listdir(os.path.join(os.getcwd(),"dataset-resized"))

['paper', 'trash', '.DS_Store', 'cardboard', 'metal', 'glass', 'plastic']
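
Because os.listdir() also returns plain files such as .DS_Store, it can help to keep only the directories when the listing is used as the list of classes. A small sketch, with list_class_folders as a hypothetical helper and a temporary folder standing in for dataset-resized:

```python
import tempfile
from pathlib import Path


def list_class_folders(root):
    """Return the sorted names of the sub-directories of root, skipping plain files."""
    return sorted(p.name for p in Path(root).iterdir() if p.is_dir())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ["paper", "trash", "glass"]:
        (root / name).mkdir()
    (root / ".DS_Store").write_bytes(b"")  # stray macOS metadata file
    class_folders = list_class_folders(root)
```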

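Organize the images into different folders

Now that the data is extracted, I split the images into training, validation, and test folders using a 50-25-25 split. First, I need to define some helper functions.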

## helper functions ##

## splits indices for a folder into train, validation, and test indices with random sampling
    ## input: folder path and two random seeds
    ## output: train, valid, and test indices
def split_indices(folder,seed1,seed2):
    n = len(os.listdir(folder))
    full_set = list(range(1,n+1))

    ## train indices
    random.seed(seed1)
    train = random.sample(full_set,int(.5*n))

    ## temp
    remain = list(set(full_set)-set(train))

    ## separate remaining into validation and test
    random.seed(seed2)
    valid = random.sample(remain,int(.5*len(remain)))
    test = list(set(remain)-set(valid))

    return(train,valid,test)

## gets file names for a particular type of trash, given indices
    ## input: waste category and indices
    ## output: file names
def get_names(waste_type,indices):
    file_names = [waste_type+str(i)+".jpg" for i in indices]
    return(file_names)

## moves group of source files to another folder
    ## input: list of source files and destination folder
    ## no output
def move_files(source_files,destination_folder):
    for file in source_files:
        shutil.move(file,destination_folder)
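
As an alternative view of the same idea, a 50-25-25 split can be sketched by shuffling once and slicing. This is not the author's function: split_three_way and its parameters are hypothetical, and for sizes that do not divide evenly its rounding may differ slightly from split_indices.

```python
import random


def split_three_way(items, seed, train_frac=0.5, valid_frac=0.25):
    """Shuffle items with a private RNG and cut them into train/valid/test lists."""
    rng = random.Random(seed)  # private generator: the global random state is untouched
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_train = int(train_frac * len(shuffled))
    n_valid = int(valid_frac * len(shuffled))
    train = shuffled[:n_train]
    valid = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]  # whatever is left over
    return train, valid, test


train_idx, valid_idx, test_idx = split_three_way(range(1, 101), seed=1)
```

Slicing one shuffled list guarantees that the three subsets are disjoint and together cover every index.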

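Next, I create a set of destination folders following the ImageNet directory convention: data/train and data/valid each contain one subfolder per waste type, and data/test holds all test images mixed together in a single folder.

Each image file name is just the material name followed by a number (e.g. cardboard1.jpg).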

## paths will be train/cardboard, train/glass, etc...
subsets = ['train','valid']
waste_types = ['cardboard','glass','metal','paper','plastic','trash']

## create destination folders for data subset and waste type
for subset in subsets:
    for waste_type in waste_types:
        folder = os.path.join('data',subset,waste_type)
        if not os.path.exists(folder):
            os.makedirs(folder)

if not os.path.exists(os.path.join('data','test')):
    os.makedirs(os.path.join('data','test'))

## move files to destination folders for each waste type
for waste_type in waste_types:
    source_folder = os.path.join('dataset-resized',waste_type)
    train_ind, valid_ind, test_ind = split_indices(source_folder,1,1)

    ## move source files to train
    train_names = get_names(waste_type,train_ind)
    train_source_files = [os.path.join(source_folder,name) for name in train_names]
    train_dest = "data/train/"+waste_type
    move_files(train_source_files,train_dest)

    ## move source files to valid
    valid_names = get_names(waste_type,valid_ind)
    valid_source_files = [os.path.join(source_folder,name) for name in valid_names]
    valid_dest = "data/valid/"+waste_type
    move_files(valid_source_files,valid_dest)

    ## move source files to test
    test_names = get_names(waste_type,test_ind)
    test_source_files = [os.path.join(source_folder,name) for name in test_names]
    ## I use data/test here because the images can be mixed up
    move_files(test_source_files,"data/test")
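
The folder layout produced above can also be sketched with pathlib, where mkdir(parents=True, exist_ok=True) makes the step safe to re-run. make_layout is a hypothetical helper, and the demo writes into a temporary directory rather than the real data folder.

```python
import tempfile
from pathlib import Path

SUBSETS = ["train", "valid"]
WASTE_TYPES = ["cardboard", "glass", "metal", "paper", "plastic", "trash"]


def make_layout(root):
    """Create root/train/<class>, root/valid/<class> and root/test; safe to re-run."""
    root = Path(root)
    for subset in SUBSETS:
        for waste_type in WASTE_TYPES:
            (root / subset / waste_type).mkdir(parents=True, exist_ok=True)
    (root / "test").mkdir(parents=True, exist_ok=True)
    # Report every directory that now exists under root, relative to root.
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_dir())


with tempfile.TemporaryDirectory() as tmp:
    created = make_layout(Path(tmp) / "data")
    created_again = make_layout(Path(tmp) / "data")  # second call changes nothing
```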

For reproducibility, I set the seed to 1 for both random samples.
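
A short illustration of why seeding matters, using only the standard library: re-seeding the generator with the same value replays the same draw. The population here is made up.

```python
import random

population = list(range(1, 11))

random.seed(1)
first = random.sample(population, 5)

random.seed(1)  # re-seeding rewinds the generator to the same state
second = random.sample(population, 5)
```

Anyone who reruns the split with the same seeds and the same files should therefore get the same train, validation, and test sets.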

## get a path to the folder with images
path = Path(os.getcwd())/"data"
tfms = get_transforms(do_flip=True,flip_vert=True)
data = ImageDataBunch.from_folder(path,test="test",ds_tfms=tfms,bs=16)

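ImageDataBunch.from_folder() specifies that the training, validation, and test data come from a folder with an ImageNet-style structure. The batch size bs is the number of images trained on at a time; choose a smaller batch size if your machine has less memory. The get_transforms() function is used to augment the data.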
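
To make the idea of flip augmentation concrete, here is a toy sketch on a nested-list "image". It only illustrates what horizontal and vertical flips do to pixel positions; it is not fastai's implementation, and both function names are hypothetical.

```python
def flip_horizontal(image):
    """Mirror a row-major image left to right."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Mirror a row-major image top to bottom."""
    return [row[:] for row in image[::-1]]


tiny = [[1, 2, 3],
        [4, 5, 6]]
mirrored = flip_horizontal(tiny)
upside_down = flip_vertical(tiny)
```

Flips are a reasonable augmentation here because a piece of trash has no canonical orientation in a photo.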

Here is a sample of the data:

data.show_batch(rows=4,figsize=(10,8))

Model training

learn = create_cnn(data,models.resnet34,metrics=error_rate)

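What is resnet34?

A residual neural network is a convolutional neural network (CNN) with many layers. resnet34 has 34 layers and comes pretrained on the ImageNet database. A pretrained CNN tends to perform better on a new image classification task because it has already learned some visual features and can transfer that knowledge over (hence "transfer learning").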

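Because deep neural networks can describe more complexity, they should in theory perform better than shallow networks on the training data. In practice, though, very deep plain networks often train worse than shallower ones.

Resnets were created to get around this problem using a trick called shortcut connections. If some nodes in a layer have suboptimal values, their weights and biases can be adjusted; if a node is already optimal (its residual is 0), why not leave it alone? Adjustments are made only where needed (where the residual is non-zero).

The shortcut connection applies the identity function to pass a layer's input on to later layers, so when no adjustment is needed the block simply hands its input through. This effectively shortens the network where possible and lets resnets have deep architectures while behaving more like shallow networks. The 34 in resnet34 simply refers to the number of layers.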
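
The shortcut idea can be sketched in a few lines of plain Python: a residual block returns its input plus a learned correction, so a correction of zero leaves the input untouched. This is a toy illustration on lists of numbers, not the actual resnet34 layers, and all function names are hypothetical.

```python
def residual_block(x, residual_fn):
    """Return x + F(x): the shortcut carries x forward, residual_fn supplies F(x)."""
    correction = residual_fn(x)
    return [xi + ci for xi, ci in zip(x, correction)]


def zero_residual(x):
    """A learned correction of exactly zero: nothing needs adjusting."""
    return [0.0 for _ in x]


def small_residual(x):
    """A hypothetical learned correction that nudges each value by 10%."""
    return [0.1 * xi for xi in x]


features = [1.0, -2.0, 3.0]
unchanged = residual_block(features, zero_residual)
adjusted = residual_block(features, small_residual)
```

With a zero residual the block is exactly the identity, which is why stacking more such blocks should not make the network harder to fit.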

Finding a learning rate

I will find a learning rate for gradient descent that makes the network converge reasonably quickly.

learn.lr_find(start_lr=1e-6,end_lr=1e1)
learn.recorder.plot()

The learning rate finder suggests a learning rate of 5.13e-03.
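
A learning rate finder of this kind typically trains for a short while as the learning rate grows from start_lr to end_lr, recording the loss at each step. The sketch below only generates a geometrically spaced sweep between the two bounds used above; lr_sweep is a hypothetical helper, and the exact schedule fastai uses internally is not shown here.

```python
def lr_sweep(start_lr, end_lr, num_steps):
    """Return num_steps learning rates spaced geometrically from start_lr to end_lr."""
    ratio = end_lr / start_lr
    return [start_lr * ratio ** (i / (num_steps - 1)) for i in range(num_steps)]


rates = lr_sweep(1e-6, 1e1, num_steps=8)
# Each step multiplies the rate by the same factor (here about 10x).
step_factors = [b / a for a, b in zip(rates, rates[1:])]
```

Spacing the rates geometrically gives every order of magnitude the same number of trial steps, which is why the resulting plot is read on a logarithmic axis.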

Training

learn.fit_one_cycle(20,max_lr=5.13e-03)

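The model ran for 20 epochs. The nice thing about this fitting method is that the learning rate ramps up early and then decays over the rest of training, which helps the model settle closer to an optimum. A validation error of 8.6% looks very good; let's see how the model performs on the test data.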
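
The shape of such a schedule can be sketched as a warm-up phase followed by a long decay. The code below is a simplified illustration, not fastai's exact implementation: one_cycle_lr is a hypothetical function, and warmup_frac, start_div, and end_div are assumed values chosen for the demo.

```python
import math


def one_cycle_lr(step, total_steps, max_lr, warmup_frac=0.3, start_div=25.0, end_div=1e4):
    """Learning rate at `step` for a simplified one-cycle schedule.

    warmup_frac, start_div and end_div are illustrative assumptions.
    """
    warmup_steps = int(warmup_frac * total_steps)
    start_lr = max_lr / start_div
    end_lr = max_lr / end_div
    if step < warmup_steps:
        # Phase 1: cosine ramp from start_lr up towards max_lr.
        progress = step / warmup_steps
        return start_lr + (max_lr - start_lr) * (1 - math.cos(math.pi * progress)) / 2
    # Phase 2: cosine decay from max_lr down to end_lr.
    progress = (step - warmup_steps) / (total_steps - 1 - warmup_steps)
    return end_lr + (max_lr - end_lr) * (1 + math.cos(math.pi * progress)) / 2


schedule = [one_cycle_lr(s, 100, max_lr=5.13e-03) for s in range(100)]
peak_step = schedule.index(max(schedule))
```

In this sketch the rate peaks at max_lr after the warm-up and ends far below its starting value.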

First, we can look at which images were classified most incorrectly.

Visualizing the most incorrect images

interp = ClassificationInterpretation.from_learner(learn)
losses,idxs = interp.top_losses()
interp.plot_top_losses(9, figsize=(15,11))

These photos look overexposed, so the mistakes are not really the model's fault!

doc(interp.plot_top_losses)
interp.plot_confusion_matrix(figsize=(12,12), dpi=60)

The model often confuses plastic with glass and metal with glass. The most frequently confused pairs of categories are listed below:

interp.most_confused(min_val=2)
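
What a "most confused" listing computes can be sketched directly from a confusion matrix: collect the off-diagonal cells whose count reaches min_val and sort them by count. The function below is a hypothetical stand-in written for illustration, and the counts are made up rather than taken from the model above.

```python
def most_confused(matrix, classes, min_val=1):
    """List (actual, predicted, count) for off-diagonal cells with count >= min_val."""
    pairs = [
        (classes[r], classes[c], matrix[r][c])
        for r in range(len(classes))
        for c in range(len(classes))
        if r != c and matrix[r][c] >= min_val
    ]
    # Largest counts first.
    return sorted(pairs, key=lambda item: item[2], reverse=True)


# Made-up counts for three classes; rows are actual labels, columns are predictions.
demo_classes = ["glass", "metal", "plastic"]
demo_matrix = [
    [50, 1, 4],
    [3, 40, 0],
    [6, 1, 45],
]
confused = most_confused(demo_matrix, demo_classes, min_val=2)
```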

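Making predictions on the test data

To see how this model really performs, we need to make predictions on the test data. First, I use the learn.get_preds() method to predict on the test set.

Note: learn.predict() predicts a single image, while learn.get_preds() predicts a set of images.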

preds = learn.get_preds(ds_type=DatasetType.Test)

The ds_type argument of get_preds(ds_type) takes a DatasetType value. Example values are DatasetType.Train, DatasetType.Valid, and DatasetType.Test.

print(preds[0].shape)
preds[0]

These are the predicted probabilities for each image. The tensor has 635 rows (one per image) and 6 columns (one per material category).

data.classes

['cardboard', 'glass', 'metal', 'paper', 'plastic', 'trash']

Now I convert the probabilities in the tensor above into class-name strings.

## saves the index (0 to 5) of most likely (max) predicted class for each image
max_idxs = np.asarray(np.argmax(preds[0],axis=1))
yhat = []
for max_idx in max_idxs:
    yhat.append(data.classes[max_idx])
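
The same conversion can be sketched without NumPy to show what the argmax step does: for each row of probabilities, pick the position of the largest value and look up the class name at that position. The probability rows below are made up for the demo, and predicted_labels is a hypothetical helper.

```python
def predicted_labels(probabilities, classes):
    """Map each row of class probabilities to the name of its most likely class."""
    labels = []
    for row in probabilities:
        best_index = max(range(len(row)), key=row.__getitem__)
        labels.append(classes[best_index])
    return labels


classes = ["cardboard", "glass", "metal", "paper", "plastic", "trash"]
# Two made-up probability rows, one per image.
demo_probs = [
    [0.01, 0.90, 0.03, 0.02, 0.03, 0.01],
    [0.05, 0.10, 0.05, 0.70, 0.05, 0.05],
]
demo_yhat = predicted_labels(demo_probs, classes)
```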

Let's check whether the first image really is glass.

learn.data.test_ds[0][0]

Next, I get the actual labels from the test dataset.

y = []
## keep only the file name of each POSIX path, as a string
for label_path in data.test_ds.items:
    y.append(Path(label_path).name)

## then extract waste type from the file name
pattern = re.compile("([a-z]+)[0-9]+")
for i in range(len(y)):
    y[i] = pattern.search(y[i]).group(1)
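
Label extraction from a file name can be made a little stricter by matching the whole stem (the file name without its extension), so that letters and digits elsewhere in the path cannot be picked up by accident. This is a sketch with hypothetical paths; label_from_path is not part of fastai.

```python
import re
from pathlib import Path

LABEL_PATTERN = re.compile(r"([a-z]+)[0-9]+")


def label_from_path(path):
    """Return the waste type encoded in a file name such as glass12.jpg."""
    match = LABEL_PATTERN.fullmatch(Path(path).stem)
    if match is None:
        raise ValueError(f"unexpected file name: {path}")
    return match.group(1)


demo_paths = ["data/test/glass12.jpg", "/home/user1/data/test/paper7.jpg"]
demo_labels = [label_from_path(p) for p in demo_paths]
```

Raising on an unexpected name surfaces stray files early instead of silently mislabeling them.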

A quick check

## predicted values
print(yhat[0:5])
## actual values
print(y[0:5])

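It looks like the first five predictions match! To see how the model does on the whole test set, we can again use a confusion matrix.

Confusion matrix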

cm = confusion_matrix(y,yhat)
df_cm = pd.DataFrame(cm,waste_types,waste_types)
plt.figure(figsize=(10,8))
sns.heatmap(df_cm,annot=True,fmt="d",cmap="YlGnBu")

Again, the model seems to confuse metal with glass and plastic with glass.
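
For readers who want to see what a confusion matrix counts, here is a plain-Python sketch that builds one from two label lists, with rows for actual classes and columns for predicted classes. The labels are made up, and build_confusion_matrix is a hypothetical helper rather than the scikit-learn function used above.

```python
def build_confusion_matrix(actual, predicted, classes):
    """Count predictions: rows are actual classes, columns are predicted classes."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix


demo_actual = ["glass", "glass", "metal", "plastic", "plastic"]
demo_predicted = ["glass", "plastic", "glass", "plastic", "plastic"]
demo_cm = build_confusion_matrix(demo_actual, demo_predicted, ["glass", "metal", "plastic"])
```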

correct = 0
for r in range(len(cm)):
    for c in range(len(cm)):
        if (r==c):
            correct += cm[r,c]
accuracy = correct/sum(sum(cm))
accuracy

0.9212598425196851
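
The accuracy computation amounts to dividing the diagonal of the confusion matrix by the total count. A compact sketch of that, plus a per-class recall that shows which categories drag the overall number down; both helpers are hypothetical and the 2x2 matrix is made up.

```python
def accuracy_from_matrix(matrix):
    """Overall accuracy: correct predictions (the diagonal) over all predictions."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def recall_per_class(matrix):
    """Fraction of each actual class (row) that was predicted correctly."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


demo_matrix = [
    [8, 2],
    [1, 9],
]
demo_accuracy = accuracy_from_matrix(demo_matrix)
demo_recall = recall_per_class(demo_matrix)
```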

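I ended up with 92.1% accuracy on the test data, which is pretty great: the original creators of the TrashNet dataset achieved 63% test accuracy with a support vector machine on a 70-30 train/test split.

Closing thoughts

I would suggest removing the overexposed photos from the dataset. This was just a quick mini-project, and it shows that training an image classification model can be very fast.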
