Multi-View Embedding Learning for Incompletely Labeled Data
Wei Zhang, Ke Zhang, Pan Gu, Xiangyang Xue
With the recent development of Web 2.0, many content-sharing platforms, e.g., Flickr and PhotoStuff, among others, provide annotation functions that enable users to tag images at any time, from anywhere. As a result, there already exists a large number of images tagged with words describing the image contents as perceived by human users. These tagged image databases contain rich information about the semantic meanings of the images.
The main characteristics of current multimedia data are as follows.
- Multiple features & high dimensionality
- Multiple concepts & incomplete labels
To handle high-dimensional heterogeneous features efficiently, we learn a compact low-dimensional embedding that captures:
1. Feature correlations
We identify the similarity between samples represented by multiple features, where each datum is described by heterogeneous features from several views.
We use the CCA technique to obtain projection matrices $W^{(1)}, \dots, W^{(V)}$ (one per view) that transfer the heterogeneous features into a uniform space. The similarity between two images across heterogeneous features is then calculated as

$$s(x_i, x_j) = \sum_{u=1}^{V} \sum_{v=1}^{V} \cos\!\big( W^{(u)\top} x_i^{(u)},\; W^{(v)\top} x_j^{(v)} \big),$$

where $x_i^{(u)}$ denotes the feature vector of sample $x_i$ in the $u$-th view.
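The CCA step above can be sketched as follows. This is a standard whitening-plus-SVD CCA solver for two views, not the paper's exact implementation; the function names and the cosine-based cross-view similarity are illustrative assumptions.

```python
import numpy as np

def cca_projections(X, Y, dim, reg=1e-6):
    """Compute CCA projection matrices mapping two feature views
    (n_samples x d1, n_samples x d2) into a shared `dim`-dim space.
    Standard solver: whiten each view, then SVD the cross-covariance."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Cxx = X.T @ X / n + reg * np.eye(X.shape[1])  # regularized covariances
    Cyy = Y.T @ Y / n + reg * np.eye(Y.shape[1])
    Cxy = X.T @ Y / n
    # Whitening matrices: Wx.T @ Cxx @ Wx = I (via Cholesky factors).
    Wx = np.linalg.inv(np.linalg.cholesky(Cxx)).T
    Wy = np.linalg.inv(np.linalg.cholesky(Cyy)).T
    # SVD of the whitened cross-covariance gives the canonical directions.
    U, s, Vt = np.linalg.svd(Wx.T @ Cxy @ Wy)
    return Wx @ U[:, :dim], Wy @ Vt[:dim].T

def cross_view_similarity(xi, yj, Px, Py):
    """Cosine similarity between one sample's view-1 features and another
    sample's view-2 features, after projection into the shared space."""
    a, b = Px.T @ xi, Py.T @ yj
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
```

Because both views land in the same space, the similarity of any pair of samples can be computed even when they are observed under different feature types.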
2. Label correlations
In the real world, semantic concepts usually do not appear independently but occur correlatively. The correlation between concepts can be initialized as the harmonic mean of the empirical conditional probabilities:

$$C_{st} = \frac{2\, P(s \mid t)\, P(t \mid s)}{P(s \mid t) + P(t \mid s)},$$

where the empirical conditional probabilities $P(s \mid t)$ and $P(t \mid s)$ are derived from the labeled samples and measure the co-occurrence of the concept pair $(s, t)$ on the given data.
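This initialization can be sketched directly from an observed binary label matrix. A minimal NumPy version, assuming `Y` is an n_samples x n_concepts 0/1 matrix (the function name is illustrative):

```python
import numpy as np

def label_correlation(Y):
    """Initialize concept correlations as the harmonic mean of the
    empirical conditional probabilities P(s|t) and P(t|s), estimated
    from co-occurrence counts in the observed label matrix Y."""
    counts = Y.sum(axis=0).astype(float)      # how often each concept occurs
    co = (Y.T @ Y).astype(float)              # co-occurrence counts
    # P[s, t] = P(s | t) = co[s, t] / count[t]; guard unseen concepts.
    P = co / np.maximum(counts[None, :], 1.0)
    num = 2.0 * P * P.T                       # P(s|t) * P(t|s)
    den = P + P.T
    # Harmonic mean; define C = 0 where the pair never co-occurs.
    return np.divide(num, den, out=np.zeros_like(num), where=den > 0)
```

The resulting matrix is symmetric, equals 1 on the diagonal for observed concepts, and is 0 for concept pairs that never co-occur, which matches the intuition that the harmonic mean is dominated by the smaller conditional probability.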
3. Feature-label associations
By mapping data from the multiple feature spaces to the embedding space and then to the concept space, we learn an embedding that preserves the neighborhood context of the original spaces and completes the labels at the same time. There is a semantic gap between the input multi-view feature spaces and the semantic concept space; the compact embedding space can be regarded as a bridge between them.
where $F_{\cdot u}$ denotes the $u$-th column of the matrix $F$, which is the estimated multi-label vector for the sample $x_u$; $F_{s \cdot}$ denotes the $s$-th row of $F$, which indicates, for the $s$-th concept, which samples are positive while the others are negative. Solving the objective over binary labels is difficult, so we relax the domain of $F$ from $\{0, 1\}$ to $[0, 1]$. $C_{st}$ is defined to capture the correlation between concepts $s$ and $t$, and $\alpha$ and $\beta$ are the tradeoff parameters.
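The effect of the relaxation can be illustrated with a deliberately simplified stand-in for the objective: fit the observed labels, penalize disagreement between correlated concepts through the graph Laplacian of $C$, and keep the relaxed $F$ inside $[0, 1]$ by projection. This sketch omits the embedding and feature terms of the full objective; the function name, step size, and fixed-positives rule are assumptions.

```python
import numpy as np

def complete_labels(Y, C, lam=1.0, lr=0.05, iters=200):
    """Projected-gradient sketch of label completion. Y and F are
    (n_concepts x n_samples); F is relaxed from {0,1} to [0,1], and
    correlated concepts (per C) are encouraged to share labels via
    the penalty lam * tr(F^T L F) with L the Laplacian of C."""
    F = Y.astype(float).copy()
    L = np.diag(C.sum(axis=1)) - C          # graph Laplacian over concepts
    for _ in range(iters):
        grad = 2.0 * (F - Y) + 2.0 * lam * (L @ F)
        F -= lr * grad
        F = np.clip(F, 0.0, 1.0)            # project back onto [0, 1]
        F[Y == 1] = 1.0                     # keep observed positives fixed
    return F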
We experimentally evaluate the performance of the proposed method, denoted 'ours', and compare it with the state-of-the-art methods WELL [Sun et al., 2010] and PU WLR [Lee and Liu, 2003]. PU WLR is a method for learning with Positive and Unlabeled data using Weighted Logistic Regression; WELL (WEak Label Learning) is designed for incompletely labeled datasets. Furthermore, we also evaluate a degenerated version of our method, denoted 'ours−', in which the multiple heterogeneous features are simply concatenated into a high-dimensional vector without capturing inter-feature correlations. We conduct experimental evaluations on three image datasets: MSRC, LabelMe [Russell et al., 2008], and NUS-WIDE [Chua et al., 2009].
In this paper we propose a novel method to learn a compact embedding that captures inter-feature correlations, inter-label correlations, and feature-label associations simultaneously from multi-view, incompletely labeled data. By mapping data from the multiple feature spaces to the embedding space and then to the concept space, we learn an embedding that preserves the neighborhood context of the original spaces and completes the labels at the same time. There is a semantic gap between the input multi-view feature spaces and the semantic concept space; the compact embedding space can be regarded as a bridge between them.
[Chua et al., 2009] Tat-Seng Chua, Jinhui Tang, Richang Hong, Haojie Li, Zhiping Luo, and Yantao Zheng. NUS-WIDE: A real-world web image database from National University of Singapore. In CIVR, 2009.
[Lee and Liu, 2003] Wee Sun Lee and Bing Liu. Learning with positive and unlabeled examples using weighted logistic regression. In ICML, 2003.
[Russell et al., 2008] Bryan C. Russell, Antonio Torralba, Kevin P. Murphy, and William T. Freeman. LabelMe: A database and web-based tool for image annotation. IJCV, 2008.
[Sun et al., 2010] Yu-Yin Sun, Yin Zhang, and Zhi-Hua Zhou. Multi-label learning with weak label. In AAAI, 2010.