Improving person re-identification by attribute and identity learning[reading notes]


Most existing re-ID methods only take identity labels of pedestrians into consideration.



However, we find that attributes, containing detailed local descriptions, are beneficial in allowing the re-ID model to learn more discriminative feature representations.


Attributes help the re-ID model learn more discriminative feature representations.


In this paper, based on the complementarity of attribute labels and ID labels, we propose an attribute-person recognition (APR) network, a multi-task network which learns a re-ID embedding and at the same time predicts pedestrian attributes.

Key idea: attribute labels and identity (ID) labels are complementary, which motivates the attribute-person recognition (APR) network.



We manually annotate attribute labels for two large-scale re-ID datasets, and systematically investigate how person re-ID and attribute recognition benefit from each other. In addition, we re-weight the attribute predictions considering the dependencies and correlations among the attributes.

Attribute labels were manually annotated for two large-scale re-ID datasets, and the paper systematically investigates how person re-ID and attribute recognition benefit each other.



The experimental results on two large-scale re-ID benchmarks demonstrate that by learning a more discriminative representation, APR achieves competitive re-ID performance compared with the state-of-the-art methods.

We use APR to speed up the retrieval process by ten times with a minor accuracy drop of 2.92% on Market-1501. Besides, we also apply APR on the attribute recognition task and demonstrate improvement over the baselines.

By learning a more discriminative representation, APR is competitive with state-of-the-art methods.





Attributes describe detailed information for a person, including gender, accessories, the color of clothes, etc.

Attributes: gender, accessories, clothing color, etc.


In this paper, we aim to improve the performance of large-scale person re-ID, using complementary cues from attribute labels.



The motivation of this paper is that existing large-scale pedestrian datasets for re-ID contain only annotations of identity labels, while we believe that attribute labels are complementary with identity labels in person re-ID.

Existing datasets annotate only identity, yet the authors are convinced that attribute labels complement identity labels in the re-ID task.


The effectiveness of attribute labels is three-fold:

First, training with attribute labels improves the discriminative ability of a re-ID model.

Attribute labels can depict pedestrian images with more detailed descriptions.

These local descriptions push pedestrians with similar appearances closer to each other and those with different appearances away from each other.

Second, detailed attribute labels explicitly guide the model to learn the person representation by designated human characteristics.

With the attribute labels, the model is able to learn to classify the pedestrians by explicitly focusing on some local semantic descriptions, which greatly eases the training of models.

Third, attributes can be used to accelerate the retrieval process of re-ID.

The main idea is to filter out some gallery images that do not have the same attributes as the query.



In [6], the PETA dataset is proposed, which contains both attribute and identity labels. However, PETA is comprised of small datasets, and most of the datasets only contain one or two images for an identity.

[6] proposes the PETA dataset, which contains both attribute and identity labels. However, PETA is assembled from small datasets, most of which contain only one or two images per identity.

When using attributes for re-ID, attributes can be used as auxiliary information for low level features [9] or used to better match images from two cameras [10–12] .

[9]: attributes are used as auxiliary information.

[10–12]: attributes are used to better match images from two cameras.

In recent years, some deep learning methods have been proposed [13–15]. In these works, the network is usually trained in several stages.

In recent deep learning methods, the network is trained in several stages.

Franco et al. [13] propose a coarse-to-fine learning framework. The network is comprised of a set of hybrid deep networks, and one of the networks is trained to classify the gender of a person. In this work, the networks are trained separately and thus may overlook the complementarity of the general ID information and the attribute details. Besides, since gender is the only attribute used in the work, the correlation between attributes is not leveraged in [13].


One of the networks classifies a person's gender. Because the networks are trained separately, the complementarity of general ID information and attribute details may be overlooked. Gender is the only attribute used, so the correlation between attributes is not exploited.

In [14,15] , the network is first trained on an independent attribute dataset, and then the learned information is transferred to the re-ID task.


The work closest to ours is [16], in which the CNN embedding is only optimized by the attribute loss. We will show that by combining the identification and attribute recognition with an attribute re-weighting module, the APR network is superior to the method proposed in [16].

[16]: the CNN embedding is optimized only by the attribute loss.

In contrast, APR combines identification with attribute recognition through an attribute re-weighting module, and the paper shows that it outperforms the method proposed in [16].


First, our work systematically investigates how person re-ID and attribute recognition benefit each other by a jointly learned network.

Through a jointly learned network, the paper systematically investigates how re-ID and attribute recognition benefit each other.

On the one hand, identity labels provide global descriptions for person images, which have been proved effective for learning a good person representation in many re-ID works [17-19].

On the other hand, attribute labels provide detailed local descriptions.

Identity labels provide global descriptions.


Combining the two yields more accurate attribute recognition and more accurate re-ID.

Second, in previous works, the correlations of attributes are hardly considered.


In fact, many attributes usually co-occur for a person, and the correlations of attributes may be helpful to re-weight the prediction of each attribute.

We thereby introduce an Attribute Re-weighting Module to utilize correlations among attributes and optimize attribute predictions.




In this paper, we propose the attribute-person recognition (APR) network to exploit both identity labels and attribute annotations for person re-ID.

By combining the attribute recognition task and identity classification task, the APR network is capable of learning more discriminative feature representations for pedestrians, including global and local descriptions.

By combining the **attribute recognition task** with the identity classification task, the APR network learns more discriminative feature representations that include both global and local descriptions.

Specifically, we take attribute predictions as additional cues for the identity classification. Considering the dependencies among pedestrian attributes, we first re-weight the attribute predictions and then build identification upon these re-weighted attributes descriptions.


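Considering the dependencies among attributes, the attribute predictions are first re-weighted, and identification is then built on these re-weighted attribute descriptions.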

The attributes are also used to speed up the retrieval process by filtering out the gallery images whose attributes differ from those of the query image.

The attributes accelerate the retrieval process.


In the experiment, we show that by applying the attribute acceleration process, the evaluation time is reduced significantly.


We evaluate the performance of the proposed method APR on two large-scale re-ID datasets and an attribute recognition dataset. The experimental results show that our method achieves re-ID accuracy competitive with the state-of-the-art methods.

APR performs on par with the state-of-the-art methods.

In addition, we demonstrate that the proposed APR yields improvement in the attribute recognition task over the baseline in all the testing datasets.

APR also improves on the attribute recognition task.


(1) We have manually labeled a set of pedestrian attributes for the Market-1501 dataset and the DukeMTMC-reID dataset. Attribute annotations of both datasets are publicly available on our website ( https://vana77.github.io ).


(2) We propose a novel attribute-person recognition (APR) framework. It learns a discriminative Convolutional Neural Network (CNN) embedding for both person re-identification and attributes recognition.


(3) We introduce the Attribute Re-weighting Module (ARM), which corrects predictions of attributes based on the learned dependency and correlation among attributes.

An Attribute Re-weighting Module (ARM) is introduced; it corrects attribute predictions based on the learned correlations and dependencies among attributes.

(4) We propose an attribute acceleration process to speed up the retrieval process by filtering out the gallery images whose attributes differ from those of the query image. The experiment shows that the size of the gallery is reduced by ten times, with only a slight accuracy drop of 2.92%.

An attribute acceleration process is proposed.

(5) We achieve competitive accuracy compared with the state-of-the-art re-ID methods on two large-scale datasets, i.e., Market-1501 [17] and DukeMTMC-reID [20]. We also demonstrate improvements in the attribute recognition task.

Results are on par with state-of-the-art re-ID methods, with improvements on the attribute recognition task.

Related work

CNN-based person re-ID

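CNN-based methods dominate the field [20-26].

[23] inserts a gating function into the convolutional layers to capture subtle differences between two input images.


[24] proposes learning a generic feature embedding by training a classification model on multiple domains with domain guided dropout.

[20] shows that combining verification and classification losses is effective.

[32] proposes a pose-guided PPA model that extracts attention-aware features from a base network. These features are then re-weighted to form the final feature vector.

[33,34,35] use GANs to address re-ID.

[34] proposes the PTGAN model, which transfers the image style of one dataset to another, using identity information as the bridge.

[36] applies a dictionary-learning scheme to transfer features learned through object recognition and person detection.

[19,37] are recently proposed semi-supervised methods that address the data problem in re-ID.

[22,38] are recently proposed unsupervised methods that address the data problem in re-ID.


Attribute information is beneficial to these methods in semi-supervised tasks.

In this paper, we adopt a simple classification model as the baseline and further explore how traditional identity labels and attribute labels benefit each other.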

Attribute for person re-ID


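[9,39] use low-level descriptors and SVMs to train attribute detectors, and the attributes are integrated with several metric learning methods.

[11,12] exploit low-level features learned from attributes together with camera correlations.

[41] proposes a dictionary learning model that exploits discriminative attributes for the classification task.

[13] proposes a coarse-to-fine learning framework composed of a set of hybrid deep networks, trained to distinguish person from non-person and to predict the gender of a person. In this work the networks are trained separately, which overlooks the complementarity of identity labels and attribute labels. In addition, gender is the only attribute used, so the correlation between attributes is not considered. The work also does not show whether the proposed method improves attribute recognition over the baselines.

[14] first trains the network on an independent dataset with attribute labels, then fine-tunes it on the target dataset using only identity labels with a triplet loss. Finally, the predicted attribute labels of the target dataset are combined with the independent dataset for a final round of fine-tuning.

[15] pre-trains the network on an independent attribute-labeled dataset and then fine-tunes it on another set with person IDs.

[42] uses a set of attribute labels as the query to retrieve person images. Adversarial learning is used to generate image-analogous concepts for the query attributes and to match them with images at both the global level and the semantic ID level. Attributes also serve as supervision for unsupervised learning.

Wang et al. [43] propose an unsupervised re-ID method that shares the source domain knowledge through attributes learned from labelled source data and transfers such knowledge to unlabelled target data by a joint attribute identity transfer learning across domains.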

Attribute annotation

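Attribute labels were manually annotated for Market-1501 [17] and DukeMTMC-reID [35].

Both datasets were collected on university campuses and the people in them are students, but the datasets cover different seasons, so the clothing differs noticeably.

In Market-1501 most people wear dresses or shorts, while in DukeMTMC-reID they wear long pants.



For Market-1501



For DukeMTMC-reID

23 attributes are labeled.


All attributes are annotated at the identity level. Take the first two images in the second row of Fig. 2: although the backpack is not clearly visible in the second image, both images show the same person, so the second image is still labeled "backpack". The attribute distributions of both datasets are illustrated in Fig. 3.

We define the correlation of two attributes as the likelihood that they co-occur on a person. Fig. 4 shows the correlations between representative attributes.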
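The co-occurrence definition above can be made concrete with a small sketch. It assumes that attributes are binary and that "likelihood of co-occurring" means the fraction of identities carrying both attributes; the notes do not give the exact normalization, and the function name `cooccurrence` is hypothetical.

```python
def cooccurrence(labels):
    """labels: one list of 0/1 attribute flags per identity.

    Returns an m x m matrix whose (j, k) entry is the fraction of
    identities that have both attribute j and attribute k.
    (Assumed normalization: joint frequency over all identities.)
    """
    n = len(labels)
    m = len(labels[0])
    counts = [[0] * m for _ in range(m)]
    for row in labels:
        present = [j for j in range(m) if row[j]]
        for j in present:
            for k in present:
                counts[j][k] += 1
    return [[c / n for c in row] for row in counts]


# Four identities, three attributes.
corr = cooccurrence([[1, 1, 0], [1, 0, 0], [0, 1, 1], [1, 1, 0]])
```

The diagonal holds the plain frequency of each attribute, and the matrix is symmetric by construction.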


The proposed method



Let $S_I = \left\{(x_1,y_1),...,(x_n,y_n)\right\}$ be the dataset with identity annotations,

where $x_i$ and $y_i$ denote the i-th image and its identity label, respectively.

Each image $x_i \in S_I$ has a corresponding attribute annotation $a_i = (a_i^1,a_i^2,...,a_i^m)$, where $a_i^j$ is the j-th attribute label of image $x_i$ and $m$ is the number of attribute types.

Let $S_A=\left\{(x_1,a_1),...,(x_n,a_n)\right\}$ be the attribute-annotated set.


Baseline 1 ID-discriminative Embedding (IDE).

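As in [17], we adopt IDE to train the re-ID model, treating re-ID training as an image identity classification task. Only $S_I$ is used for training, with the identity classification objective of Eq. (1) in the paper.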

Baseline 2 Attribute Recognition Network (ARN).

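ARN is trained using only $S_A$. In its objective function, Eq. (2) in the paper, $f_{A_j}$ is the classifier for the j-th attribute, and the loss for the i-th sample is the sum of the losses over all $m$ attribute predictions on the input image $x_i$.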
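As a rough illustration of "sum of the losses over all m attribute predictions", the sketch below assumes each attribute classifier outputs class probabilities and that the per-attribute loss is cross-entropy. The loss type is an assumption, since Eq. (2) is not reproduced in these notes, and the function names are hypothetical.

```python
import math


def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[label])


def attribute_loss(attr_probs, attr_labels):
    """Per-sample attribute loss: sum over the m attribute classifiers.

    attr_probs[j]  -- class probabilities from the j-th classifier
    attr_labels[j] -- ground-truth class index of the j-th attribute
    (Cross-entropy per attribute is an assumption, not taken from the notes.)
    """
    return sum(cross_entropy(p, y) for p, y in zip(attr_probs, attr_labels))


# Two attributes: a binary one and another binary one with uneven confidence.
loss = attribute_loss([[0.5, 0.5], [0.25, 0.75]], [0, 1])
```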



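An embedding function maps each image into the feature space. The retrieval result is a ranking list of all gallery data, ordered by the Euclidean distance between the query and each gallery item. For attribute recognition, we take the attribute predictions $f_A(w_A; \phi(\theta;\cdot))$ as output and evaluate them against the ground truth with classification metrics.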
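The ranking step can be sketched as follows. This is a minimal illustration that assumes the features have already been extracted as plain lists of floats; `rank_gallery` is a hypothetical name, not a function from the paper's code.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def rank_gallery(query_feat, gallery_feats):
    """Return gallery indices sorted from nearest to farthest."""
    dists = [euclidean(query_feat, g) for g in gallery_feats]
    return sorted(range(len(gallery_feats)), key=lambda i: dists[i])


# Distances to the query are 5, 1 and 2, so index 1 ranks first.
ranking = rank_gallery([0.0, 0.0], [[3.0, 4.0], [1.0, 0.0], [0.0, 2.0]])
```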

Attribute-Person Recognition network

Architecture overview


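The APR network has two parts, one for attribute recognition and one for identity classification. Given a pedestrian image as input, APR first extracts image features, then predicts the attributes and computes the attribute loss from the predictions and the ground-truth labels.

For the identity classification part, the attribute predictions serve as additional cues. To make better use of the attributes, APR computes a loss for each of the M individual attributes; the M prediction scores are then concatenated and fed to the Attribute Re-weighting Module (ARM). The output of the ARM is concatenated with the global image feature, and the ID loss is computed on the result. The final classification is therefore built on the concatenated **local-global feature**.

Attribute re-weighting module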

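Suppose the attribute predictions for image $x$ are $\{a'^1,a'^2,...,a'^m\}$, where $a'^j$ is the prediction score of the j-th attribute.

We concatenate these prediction scores into a vector $a'$, with $a' \in R^{1 \times m}$.

The confidence score $C$ for the prediction vector $a'$ is learned as:

$$C = \mathrm{Sigmoid}(v a'^T + b) \quad (3)$$

where $v \in R^{m \times m}$ and $b \in R^{m \times 1}$ are trainable parameters.

The confidence score $C \in R^{m \times 1}$ is the set of learned weights.

The attribute re-weighting module thus converts the original prediction $a'$ into a new prediction score:

$$a = C \cdot a'^T \quad (4)$$

where $\cdot$ denotes element-wise multiplication.

This prediction score $a$ is then concatenated with the global image representation for identity classification.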
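Eq. (3) and Eq. (4) amount to a single linear layer followed by a sigmoid gate. The sketch below shows only the forward pass in plain Python; the parameter values are made up for illustration, whereas in the network `v` and `b` would be learned. The function name `arm` is hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def arm(a_prime, v, b):
    """Forward pass of the re-weighting step.

    a_prime -- list of m attribute prediction scores
    v       -- m x m weight matrix (list of rows)
    b       -- list of m biases
    Returns (C, a): the confidence scores of Eq. (3) and the
    re-weighted predictions of Eq. (4).
    """
    m = len(a_prime)
    c = [
        sigmoid(sum(v[j][k] * a_prime[k] for k in range(m)) + b[j])
        for j in range(m)
    ]
    a = [c[j] * a_prime[j] for j in range(m)]  # element-wise product
    return c, a


# With zero weights and biases every confidence is 0.5,
# so each score is simply halved.
c, a = arm([0.2, 0.8], [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
```

Because each row of `v` mixes all m scores, the confidence for one attribute depends on the predictions of the others, which is how the module can use attribute correlations.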



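The combination of $S_I$ and $S_A$ is denoted $S$, with $S = \left\{ (x_1,y_1,a_1),...,(x_n,y_n,a_n)\right\}$.

Based on the extracted image representation $\phi(\theta;x_i)$, two objective functions are optimized simultaneously:


As in the ARN baseline, the attribute predictions are obtained from a set of attribute classifiers, and the objective for attribute prediction is the same as Eq. (2).

Objective for identity: to bring attribute predictions into identity prediction, we combine the attribute predictions $\left\{f_{A_j}(w_{A_j}; \phi(\theta; x_i)) \right\}$ with the weights produced by the re-weighting module. In the new objective function for identity prediction, $a$ is the concatenated, re-weighted attribute prediction.


In the overall loss, $\lambda$ is a hyperparameter that balances the attribute recognition loss and the identity classification loss.


Attribute acceleration process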



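In the offline stage, we extract features and predict attributes for the gallery images.





When an attribute is reliably predicted for both the query and a gallery image, we check whether the two images have the same prediction for that attribute. If not, the candidate image is removed from the gallery pool.


When the threshold is close to 0, most candidates are removed and only a few are kept for online matching. This suits applications where retrieval speed is the main concern.


In the Market-1501 experiments, the threshold is set to 0.7. The retrieval process is accelerated by more than ten times, with an accuracy drop of 2.92%.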
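The filtering rule can be sketched as below. It assumes that an attribute counts as "reliable" when the confidence of its predicted label exceeds the threshold; the notes do not spell out the reliability criterion, so this is an assumption, and `filter_gallery` is a hypothetical name.

```python
def filter_gallery(query_pred, gallery_preds, threshold=0.7):
    """Drop gallery candidates that confidently disagree with the query.

    Each prediction is a list of (label, confidence) pairs, one per attribute.
    A candidate is removed if, for some attribute, both sides are confident
    (confidence > threshold, an assumed criterion) and the labels differ.
    Returns the indices of the candidates that are kept.
    """
    kept = []
    for i, g_pred in enumerate(gallery_preds):
        mismatch = any(
            q_conf > threshold and g_conf > threshold and q_label != g_label
            for (q_label, q_conf), (g_label, g_conf) in zip(query_pred, g_pred)
        )
        if not mismatch:
            kept.append(i)
    return kept


query = [("male", 0.9), ("backpack", 0.6)]
gallery = [
    [("male", 0.95), ("no-backpack", 0.9)],  # backpack differs, but query is unsure
    [("female", 0.8), ("backpack", 0.9)],    # confident gender mismatch
    [("female", 0.6), ("backpack", 0.9)],    # gender mismatch, but gallery is unsure
]
kept = filter_gallery(query, gallery, threshold=0.7)
```

Lowering the threshold treats more predictions as reliable and so removes more candidates, which matches the behavior described for thresholds near 0.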

Experimental results

Datasets and evaluation protocol


re-ID datasets: Market-1501 and DukeMTMC-reID

Attribute recognition dataset: PETA


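Market-1501: 19,732 images of 751 identities are used for training and 13,328 images of 750 identities for testing. Each image has 27 attributes. To validate the hyperparameter $\lambda$, 651 identities from the training set are used for training and the other 100 identities serve as a validation set to determine the value of $\lambda$. This value is then used with the normal 751/750 split.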
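A validation split of this kind has to be made by identity, not by image, so that no person appears on both sides. The sketch below assumes the 100 held-out identities are picked at random with a fixed seed; the notes do not say how they were chosen, and `split_by_identity` is a hypothetical helper.

```python
import random


def split_by_identity(samples, num_val_ids=100, seed=0):
    """Hold out `num_val_ids` identities as a validation set.

    samples -- list of (image_path, identity) pairs
    Returns (train, val); all images of an identity land on the same side.
    (Random selection with a fixed seed is an assumption.)
    """
    ids = sorted({pid for _, pid in samples})
    rng = random.Random(seed)
    rng.shuffle(ids)
    val_ids = set(ids[:num_val_ids])
    train = [s for s in samples if s[1] not in val_ids]
    val = [s for s in samples if s[1] in val_ids]
    return train, val


# Toy stand-in for a 751-identity training set, two images per identity.
samples = [(f"img_{pid}_{k}.jpg", pid) for pid in range(751) for k in range(2)]
train, val = split_by_identity(samples, num_val_ids=100)
```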


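DukeMTMC-reID is a subset of the DukeMTMC dataset, with 16,522 training images of 702 identities and 19,889 test images of 702 identities. Each image has 23 attribute labels.


PETA is annotated with 61 binary attributes and 4 multi-class attributes for 19,000 images. As in [6], our experiments use the 35 most important and meaningful attributes. Most identities have very few training images, some only one, so PETA is not ideal. To evaluate our method on PETA, we re-split the dataset and use 17,100 images of 4,981 identities. In the new split, 9,500 images of 4,558 identities are used for training, 423 images as queries, and 7,177 images as the gallery.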


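The CMC curve and mAP are used for evaluation. In the experiments, we use the evaluation packages publicly provided in [17,35]. To verify the accuracy of attribute classification, the gallery images are used as the test set.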
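For reference, the two metrics can be sketched in a few lines. This is a simplified version that works on ranked lists of gallery identity labels and ignores protocol details such as discarding same-camera or junk images, which the official evaluation packages handle; the function names are hypothetical.

```python
def average_precision(ranked_ids, query_id):
    """AP for one query, given gallery identity labels in ranked order."""
    hits = 0
    precisions = []
    for rank, gid in enumerate(ranked_ids, start=1):
        if gid == query_id:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / hits if hits else 0.0


def cmc_hit(ranked_ids, query_id, k):
    """1 if a correct match appears within the top k results, else 0."""
    return int(query_id in ranked_ids[:k])


def evaluate(rankings, query_ids, ks=(1, 5)):
    """Return (mAP, {k: CMC accuracy at rank k}) over all queries."""
    n = len(query_ids)
    mean_ap = sum(average_precision(r, q) for r, q in zip(rankings, query_ids)) / n
    cmc = {
        k: sum(cmc_hit(r, q, k) for r, q in zip(rankings, query_ids)) / n
        for k in ks
    }
    return mean_ap, cmc


rankings = [["a", "b", "a"], ["c", "b", "b"]]
mean_ap, cmc = evaluate(rankings, ["a", "b"])
```

CMC at rank k only asks whether any correct match is in the top k, while mAP rewards ranking all correct matches high, so the two can diverge when an identity has several gallery images.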

Implementation details

Evaluation of Person Re-ID task

Comparison with the state-of-the-art methods

Ablation studies

[To be continued...]
