
Paper notes: A brief introduction to weakly supervised learning

Posted: 2020-02-27 23:36:36


A brief introduction to weakly supervised learning

Zhi-Hua Zhou (Nanjing University), National Science Review (IF 17.3), 815 citations

ABSTRACT

This article reviews research progress in weakly supervised learning, focusing on three typical types of weak supervision (see the Introduction for a more detailed explanation):

incomplete supervision, where only a subset of training data is given with labels; inexact supervision, where the training data are given with only coarse-grained labels; and inaccurate supervision, where the given labels are not always ground-truth.

INTRODUCTION

Typically, there are three types of weak supervision.

incomplete supervision, i.e. only a (usually small) subset of training data is given with labels while the other data remain unlabeled.

For example, in image categorization the ground-truth labels are given by human annotators; it is easy to collect a huge number of images from the Internet, whereas only a small subset of them can be annotated due to the cost of human annotation.

inexact supervision, i.e. only coarse-grained labels are given.

It is desirable to have every object in the images annotated; however, usually we only have image-level labels rather than object-level labels.

inaccurate supervision, i.e. the given labels are not always ground-truth.

Such a situation occurs, e.g. when the image annotator is careless or weary, or some images are difficult to categorize.

INCOMPLETE SUPERVISION

Incomplete supervision concerns the situation in which we are given a small amount of labeled data, which is insufficient to train a good learner, while abundant unlabeled data are available.

Formally, the task is to learn $f: \mathcal{X} \mapsto \mathcal{Y}$ from a training data set $D = \{(x_1, y_1), \ldots, (x_l, y_l), x_{l+1}, \ldots, x_m\}$.
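To make the setting concrete, here is a minimal sketch (mine, not the paper's) of such a data set $D$, using scikit-learn's convention of marking unlabeled instances with the placeholder label -1; the data and the split size $l$ are made up.

```python
import numpy as np
from sklearn.datasets import make_classification

# Toy data set D with m = 200 instances, of which only l = 20 carry labels.
X, y_true = make_classification(n_samples=200, n_features=10, random_state=0)

l = 20                                   # number of labeled instances
y = np.full_like(y_true, fill_value=-1)  # -1 marks an unlabeled instance
y[:l] = y_true[:l]                       # only the first l labels are revealed

# The task: learn f: X -> Y from the l labeled and m - l unlabeled points.
print(f"labeled: {(y != -1).sum()}, unlabeled: {(y == -1).sum()}")
```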

There are two major techniques for this purpose:

$$\text{incomplete supervision} \begin{cases} \text{active learning (with human intervention)} \\ \text{semi-supervised learning (without human intervention)} \end{cases}$$

active learning;

Active learning assumes that there is an 'oracle', such as a human expert, that can be queried to get ground-truth labels for selected unlabeled instances.

$$\text{selection criteria of active learning} \begin{cases} \text{informativeness} \\ \text{representativeness} \end{cases}$$
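Below is a minimal sketch of pool-based active learning using uncertainty sampling as the informativeness criterion; the classifier, query budget, and the `y_pool_oracle` array standing in for the human 'oracle' are illustrative assumptions, not the article's setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def uncertainty_sampling(X_pool, y_pool_oracle, X_init, y_init, n_queries=10):
    """Pool-based active learning: repeatedly query the most uncertain instance."""
    X_lab, y_lab = X_init.copy(), y_init.copy()
    pool_idx = np.arange(len(X_pool))

    for _ in range(n_queries):
        clf = LogisticRegression(max_iter=1000).fit(X_lab, y_lab)
        proba = clf.predict_proba(X_pool[pool_idx])
        # Informativeness: pick the instance whose top-class probability is lowest.
        query = pool_idx[np.argmin(proba.max(axis=1))]
        # The 'oracle' (e.g. a human expert) supplies the ground-truth label.
        X_lab = np.vstack([X_lab, X_pool[query]])
        y_lab = np.append(y_lab, y_pool_oracle[query])
        pool_idx = pool_idx[pool_idx != query]

    return LogisticRegression(max_iter=1000).fit(X_lab, y_lab)
```

A representativeness-based criterion would instead (or additionally) prefer instances that cover the structure of the unlabeled pool, e.g. cluster centres, rather than only the most uncertain ones.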

semi-supervised learning.

In contrast, semi-supervised learning attempts to automatically exploit unlabeled data in addition to labeled data to improve learning performance, where no human intervention is assumed.

$$\text{semi-supervised learning} \begin{cases} \text{(pure) semi-supervised learning} \\ \text{transductive learning} \end{cases}$$

Actually, in semi-supervised learning there are two basic assumptions, i.e. the cluster assumption and the manifold assumption; both are about data distribution. The former assumes that data have inherent cluster structure, and thus, instances falling into the same cluster have the same class label. The latter assumes that data lie on a manifold, and thus, nearby instances have similar predictions. The essence of both assumptions lies in the belief that similar data points should have similar outputs, whereas unlabeled data can be helpful to disclose which data points are similar.

$$\text{four major categories of semi-supervised learning} \begin{cases} \text{generative methods} \\ \text{graph-based methods} \\ \text{low-density separation methods} \\ \text{disagreement-based methods} \end{cases}$$
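As one concrete example of a graph-based method (a sketch of mine, not an experiment from the article), scikit-learn's LabelSpreading propagates the few known labels over a k-nearest-neighbour graph built from all instances, which directly operationalizes the cluster/manifold assumptions above; the data set and neighbourhood size below are arbitrary choices.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.semi_supervised import LabelSpreading

# Two-moons data has clear cluster/manifold structure.
X, y_true = make_moons(n_samples=300, noise=0.1, random_state=0)

y = np.full_like(y_true, -1)      # -1 marks unlabeled instances
labeled = np.random.RandomState(0).choice(len(y), size=10, replace=False)
y[labeled] = y_true[labeled]      # reveal only 10 labels

# Propagate labels through a k-NN graph over all (labeled + unlabeled) points.
model = LabelSpreading(kernel="knn", n_neighbors=7).fit(X, y)
print("accuracy on all points:", (model.transduction_ == y_true).mean())
```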

INEXACT SUPERVISION

Formally, the task is to learn $f: \mathcal{X} \mapsto \mathcal{Y}$ from a training data set $D = \{(X_1, y_1), \ldots, (X_m, y_m)\}$, where $X_i = \{x_{i1}, \ldots, x_{i m_i}\} \subseteq \mathcal{X}$ is called a bag, $x_{ij} \in \mathcal{X}\ (j \in \{1, \ldots, m_i\})$ is an instance, $m_i$ is the number of instances in $X_i$, and $y_i \in \mathcal{Y} = \{Y, N\}$. $X_i$ is a positive bag, i.e. $y_i = Y$, if there exists an $x_{ip}$ that is positive, while $p \in \{1, \ldots, m_i\}$ is unknown. The goal is to predict labels for unseen bags. This is called multi-instance learning.
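For illustration only, here is a simple multi-instance baseline under the standard MIL assumption stated above (a bag is positive iff at least one of its instances is positive): assign each instance its bag's label, train an instance-level scorer, and score a bag by its most positive instance. This naive training scheme and the toy bags below are my own choices, not a method or data from the article.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fit_mil_baseline(bags, bag_labels):
    """Simple MIL baseline: give every instance its bag's label, train an
    instance-level classifier, and score a bag by its most positive instance."""
    X = np.vstack(bags)
    y = np.concatenate([[label] * len(bag) for bag, label in zip(bags, bag_labels)])
    clf = LogisticRegression(max_iter=1000).fit(X, y)

    def predict_bag(bag):
        # Standard MIL assumption: the bag is positive if any instance looks positive.
        instance_scores = clf.predict_proba(bag)[:, 1]
        return int(instance_scores.max() > 0.5)

    return predict_bag

# Hypothetical usage: each bag is an (m_i, d) array of instances, labels are 0/1.
rng = np.random.RandomState(0)
bags = [rng.randn(rng.randint(3, 8), 5) for _ in range(20)]
bag_labels = [int(bag.max() > 2.0) for bag in bags]   # made-up labeling rule
predict_bag = fit_mil_baseline(bags, bag_labels)
print(predict_bag(bags[0]), bag_labels[0])
```

Dedicated MIL algorithms replace this naive instance-level label assignment with bag-level objectives, but the max aggregation over instance scores is the part that encodes the standard multi-instance assumption.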
