
International Journal of Pure and Applied Mathematics

Volume 116 No. 21 2017, 549-558


ISSN: 1311-8080 (printed version); ISSN: 1314-3395 (on-line version)
url: http://www.ijpam.eu
Special Issue

CROSS DOMAIN SENTIMENT CLASSIFICATION USING DUAL TRANSFER LEARNING


M. Rajesh
Vice President, Melange Technologies, Pondicherry, India

Manikanthan
President, Melange Technologies, Pondicherry, India

ABSTRACT

Sentiment classification aims to identify the sentiment polarity of reviews as positive or negative based on the subjective information expressed in them. Generally, all learning techniques require labeled data to train a classifier. However, obtaining labeled data in every domain is impractical, and labeling each domain is time consuming and costly. Moreover, a classifier trained in one domain may not produce good results when it is applied to another domain. To solve this issue, the proposed method develops a cross domain sentiment classification framework using dual transfer learning, which learns both the marginal and conditional distributions of features from the source and target domains. The proposed method is based on joint nonnegative matrix tri-factorization (NMTF). The two distributions are learned from the decomposed latent factors, which exhibit the duality property. Extensive experiments were performed on the Amazon data set, showing that dual transfer learning is more effective than structural correspondence learning for cross domain sentiment classification.

Key words: Cross domain sentiment classification, Transfer learning, Domain adaptation, Data mining

1. INTRODUCTION

Sentiment analysis focuses on analyzing the reviews posted by customers on e-commerce web sites. A sentiment classifier is needed to identify the sentiment polarity (positive or negative) of a product, which helps both customers and manufacturers understand the opinion about the products. In machine learning approaches [4,5], the classifier is trained on labeled data, which gives higher accuracy than other methods. However, many machine learning algorithms rely strictly on the assumption that the training data and the testing data are drawn from the same feature space and the same distribution, and in many real world applications this assumption may not hold. Generally, machine learning techniques require labeled data, but obtaining labeled data in every domain is impractical. Moreover, a classifier trained in one domain may not produce good results when applied to another domain. For example, in the books domain, users may express domain specific features such as "well researched", "thrilling" and "boring". In the electronics domain, they may use domain specific features such as "reliable", "compact", "sharp" and "blurry". Domain independent features occur in all domains, e.g. "good", "excellent" and "never buy"; domain specific features occur only in a particular domain, e.g. "well researched" and "boring" relate to the books domain. Due to these domain specific feature mismatches between source and target domains, a classifier trained in the source (books) domain may not work well when directly applied to the target (electronics) domain. Transfer learning tackles the problem of predicting the labels of target domain data drawn from a different but related distribution compared with the source domain data. Transfer learning [6] utilizes the labeled data in a source domain to


help train better models in the target domain. Although the data distributions of the source and target domains differ, some common knowledge structure is shared across domains, and this common structure can be utilized for the learning task in the target domain.

Generally, the source domain has labeled data, denoted Psrc(x, y), and the target domain has only unlabeled data, denoted Ptar(x). Psrc and Ptar are the data distributions of the source and target domains respectively, where x is an example review and y is its label. Transfer learning aims to predict the labels of the target domain by transferring knowledge from the source domain labeled data. Many methods have been proposed to extract the common structure across domains so that the distribution divergence between domains is reduced and traditional learning algorithms can be applied. By definition, P(x, y) = P(x) P(y|x), where P(x) is the marginal distribution and P(y|x) is the conditional distribution, which can be viewed as a classification model. A common assumption of existing transfer learning methods is that if the marginal distributions of examples are similar in some latent space, then the conditional distributions of the corresponding examples will also be similar [17]. More intuitively, if two data points are close in the latent space, their class labels should also be similar. In single bridge transfer, either the marginal distribution or the conditional distribution is used as the bridge for knowledge transfer, and a latent space is usually constructed to represent the common structure shared across domains. This approach has some limitations: 1) when the distributions of the two domains are entirely different, little knowledge can be shared, and forcing a transfer results in negative transfer; 2) the "share all latent factors" assumption may cause existing methods to underperform when the marginal or conditional distribution can only be drawn closer in a subspace of the latent space; 3) existing methods do not consider the duality between the marginal and conditional distributions, namely that learning one distribution can help to learn the other.

The proposed approach performs cross domain sentiment classification using dual transfer learning, which simultaneously learns the marginal and conditional distributions and exploits the duality between them to achieve effective knowledge transfer. When the marginal distributions of the two domains are close in a latent space, a classification model can be learned in this space and shared across domains to draw the conditional distributions closer. In turn, the learned conditional distributions can be used to refine the latent space so that the marginal distributions become closer. Some latent factors cause the discrepancy between the data distributions in different domains; these are referred to as domain-specific latent factors. On other latent factors, the data distributions may be similar across domains; these are referred to as common latent factors. Dual transfer learning finds the latent feature space where the marginal distributions across domains are close, and simultaneously learns a shared classification model in that space to make the conditional distributions across domains closer.

The rest of the paper is organized as follows: Section 2 describes related work. Section 3 deals with the proposed methodology. Experimental results and a comparison of SCL with DTL are described in Section 4. Section 5 concludes and gives directions for future work.

2. RELATED WORKS

Sinno Jialin Pan et al. [1] proposed a spectral feature alignment (SFA) algorithm to align domain-specific words from different domains into unified clusters, with the help of domain independent words as a bridge. The clusters can then be used to reduce the gap between the domain-specific words of the two domains and to train sentiment classifiers for the target domain accurately. Compared with previous approaches, SFA can discover a robust representation for cross-domain data by fully exploiting the relationship between domain-specific and domain independent words via simultaneously co-clustering them in a common latent space. Many researchers have addressed the problem of cross domain classification [1][2][3][22][12]. Blitzer et al. [2] addressed cross domain sentiment classification using the structural correspondence learning (SCL) algorithm, where words frequent in both the source and target domains are selected as candidate pivot features and linear predictors are trained to predict the occurrences of those pivot features. In the structural correspondence learning-mutual information (SCL-MI) approach, the mutual information between a feature and the domain label is used to select pivot features instead of co-occurrence frequency. Danushka Bollegala et al. [3] proposed cross domain sentiment classification by creating a sentiment sensitive thesaurus which aligns different words that express the same sentiment; the feature vector is expanded using the created thesaurus while training a binary classifier.


Transfer learning [6] is widely applied where the training data and the test data come from different sources with different distributions. Existing methods can be categorized into two types: instance transfer learning and feature representation transfer learning. Instance transfer learning uses a re-weighting strategy for instances: the idea is to increase the weights of source domain instances that are close to target domain instances, and to decrease the weights of source domain instances that are far from target domain instances [12, 16]. Feature representation transfer learning aims to discover a shared feature space in which the data distributions across domains are close to each other. W. Y. Dai et al. [28] proposed boosting for transfer learning to increase the classification accuracy in the target domain. The shared feature space can be constructed either in the original feature space [7, 24] or in a transformed subspace [11, 26]. In the original feature space, correspondences among features are identified by modeling their correlations with pivot features that behave similarly across domains. In a transformed subspace, dimensionality reduction methods are applied to extract the underlying common structure. The papers [8, 9] discussed learning a target domain by knowledge transfer. W. Y. Dai et al. [13] proposed the EigenTransfer technique, in which eigenvectors are computed on the feature matrix. The papers [14, 15, 20] discussed cross domain text classification using transfer learning. Nonnegative matrix tri-factorization [21, 23] is mainly used to decompose a matrix for clustering the data. X. Tian et al. [18] proposed sparse transfer learning for interactive video search reranking. B. Geng et al. [19] proposed domain adaptation metric learning. T. Hofmann [24] proposed unsupervised learning by probabilistic latent semantic analysis. D. Hosmer et al. [26] presented applied logistic regression; logistic regression classifiers are often used when the data distributions differ.

The existing feature representation transfer learning methods [25, 27] focus on learning either the marginal distribution or the conditional distribution for knowledge transfer. For example, Co-Clustering based Classification (CoCC) [10] and Label Propagation [21] transfer the common feature clusters, which can be regarded as learning the marginal distribution. Collaborative Dual-PLSA [12] and Matrix Tri-Factorization based Classification (MTrick) [20] transfer the common association between feature clusters and example classes, which can be regarded as learning the conditional distribution. Another well-designed method for learning the marginal distribution is Joint Subspace Nonnegative Matrix Factorization [16]. It learns the common latent factors and the domain-specific latent factors that span a shared subspace where the marginal distributions across domains are close. However, this method does not learn the conditional distribution and thus cannot be applied to cross-domain classification tasks. The proposed method uses the transductive transfer learning setting [4], in which the source domain has labeled and unlabeled data and the target domain has only unlabeled data. In this work, a classifier is learned from the source and target domains using dual transfer learning [17].

3. METHODOLOGY
A. Definitions
In this section, several definitions are given to clarify basic terminology.
Domain: A domain D denotes a class of entities in the real world, for example different types of products such as DVDs, kitchen appliances, books, electronics and music.
Sentiment: Given a specific domain D, sentiment data are the text documents which express user opinions about the entities of the domain.
Cross domain sentiment classification: Given a set of labeled reviews Ds = {(xi, yi)} from the source domain, where xi represents the features and yi ∈ {+1, -1} the sentiment label, the task is to predict the labels of the unlabeled target domain Dt = {xj}, where xj represents the features in the target domain. The classifier is trained on the labeled reviews of the source domain and applied to classify the reviews of the unlabeled target domain.

B. Architecture
In cross domain sentiment classification, a classifier trained in one domain (source) is applied to another (target) domain. This naturally produces poor performance, because the trained features are mismatched with the target domain features. To solve this issue, the dual transfer learning technique is applied, which learns both the marginal and conditional distributions of data from the source and target domains. Figure 1 shows the framework of cross domain sentiment classification using dual transfer learning.

[Figure: pipeline diagram. Reviews → POS tagging and stop word removal → pivot feature selection using mutual information → co-occurrence matrix creation using TF-IDF → L1 classifier learns the data using dual transfer learning. Test reviews from the target domain are fed to the trained classifier to predict the sentiment polarity of the reviews.]

Fig. 1 Cross domain sentiment classification using dual transfer learning

First, labeled/unlabeled reviews from the source domain and unlabeled reviews from the target domain are collected from Amazon product reviews, and these reviews are split into sentences. Second, parts of speech (POS) tagging and lemmatization are applied to the reviews using the RASP system. The POS tagger attaches a POS tag to each word of the sentences in the given reviews. Lemmatization reduces the features by removing inflectional endings only and returning the base or dictionary form of a word, known as the lemma. Using a stop word removal algorithm, unwanted words are filtered out and only POS classes such as nouns, verbs, adjectives and adverbs are retained.
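The paper's preprocessing relies on the RASP system, which is not shown here; a rough functional equivalent of this step, assuming NLTK as a stand-in toolchain (not the authors' setup), could look as follows:

import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer

# One-time setup (assumed): nltk.download for "punkt",
# "averaged_perceptron_tagger", "wordnet" and "stopwords".
KEEP = ("NN", "VB", "JJ", "RB")   # nouns, verbs, adjectives, adverbs
lemmatizer = WordNetLemmatizer()
stop_words = set(stopwords.words("english"))

def preprocess(review):
    # POS-tag the tokens, drop stop words and non-words, keep content
    # words only, and lemmatize each surviving word.
    tagged = nltk.pos_tag(nltk.word_tokenize(review.lower()))
    return [lemmatizer.lemmatize(word) for word, tag in tagged
            if tag.startswith(KEEP) and word.isalpha() and word not in stop_words]

print(preprocess("The plot was thrilling and well researched."))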

Pivot features play a vital role in cross domain sentiment classification because they act as a bridge between the source and target domains. To discriminate features into pivot and non-pivot features, the mutual information between features and domains is computed using Equation (1), as in [1]. Each domain maintains a list of features and their term frequency occurrences in the reviews of that domain. If a feature has high mutual information with a particular domain, it is domain specific; otherwise it is domain independent. For example, if the mutual information between the feature "well researched" and the books domain is high compared with the other domains, then "well researched" is domain specific to the books domain.

MI(Xi, D) = Σ_x Σ_d p(x, d) log2 [ p(x, d) / (p(x) p(d)) ]    (1)

where p(x, d) is the joint probability distribution function of feature x and domain d, and p(x) and p(d) are the marginal probability distribution functions of feature x and domain d. X denotes the features to be classified as domain independent or domain specific based on mutual information, and D denotes the set of all domains, namely books, DVDs, electronics and kitchen appliances.
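As an illustration of Equation (1), a minimal numpy sketch of the pivot versus domain-specific decision is given below; the feature-by-domain count matrix and the median thresholding rule are assumptions for the example, not taken from the paper:

import numpy as np

def mutual_information(counts):
    # counts[i, d]: term frequency of feature i in domain d (Equation 1).
    joint = counts / counts.sum()                   # p(x, d)
    px = joint.sum(axis=1, keepdims=True)           # p(x)
    pd = joint.sum(axis=0, keepdims=True)           # p(d)
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = joint * np.log2(joint / (px * pd))
    return np.nan_to_num(terms).sum(axis=1)         # MI score per feature

# Toy example: feature 0 is balanced across domains (candidate pivot),
# feature 1 skews heavily toward domain 0 (domain specific).
counts = np.array([[50.0, 50.0], [90.0, 10.0]])
scores = mutual_information(counts)
is_domain_specific = scores > np.median(scores)     # assumed thresholding rule
print(scores, is_domain_specific)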

A co-occurrence matrix is created between the pivot and non-pivot features; the values for the pivot features are their TF-IDF weights. Let wi,j be the weight of word i in paragraph j, defined as

wi,j = TFi,j × log(N / Pfi)    (2)

where TFi,j is the term frequency of word i in paragraph j, N is the total number of paragraphs, and Pfi is the number of paragraphs in which word i occurs. Next, the data matrix W is created by updating the values of the co-occurrence matrix with an empirical risk minimization technique:

w = argmin_w ( Σ_j L(w · xj, pl(xj)) + λ ||w||² )    (3)

where L is the loss function, λ ||w||² is the regularization term, pl(xj) indicates whether pivot feature pl occurs in example xj, and the learned weight vectors w1, w2, ..., wk (one per pivot feature) supply the updated values of W.
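A small numpy sketch of Equations (2) and (3) follows, with squared loss standing in for the unspecified loss function L and a hypothetical regularization weight lam; the toy matrix sizes are arbitrary:

import numpy as np

def tfidf(tf, pf, n_paragraphs):
    # Equation (2): w_ij = TF_ij * log(N / Pf_i), computed per feature row.
    return tf * np.log(n_paragraphs / pf)[:, None]

def pivot_predictor(X, pivot_presence, lam=1.0):
    # Equation (3) with squared loss: a ridge-regularized linear predictor
    # of one pivot feature's presence from the non-pivot features.
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ pivot_presence)

# Toy data: 6 features x 4 paragraphs; predict pivot feature 0 from the rest.
rng = np.random.default_rng(0)
tf = rng.integers(0, 5, size=(6, 4)).astype(float)
pf = np.maximum((tf > 0).sum(axis=1), 1).astype(float)
W = tfidf(tf, pf, n_paragraphs=4)
w0 = pivot_predictor(W[1:].T, (W[0] > 0).astype(float))
print(w0)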

Nonnegative Matrix Tri-Factorization

The nonnegative matrix tri-factorization (NMTF) technique [16, 21] has been widely used for text classification. Hence this technique is applied to the data matrix W, decomposing it into a product of three nonnegative factors U, H and VT:


W = U H VT    (4)

where U ∈ R^(m×k) is the cluster assignment matrix representing the feature clusters, V is the cluster assignment matrix representing the example clusters, and H is the cluster association matrix representing the association between feature clusters and example clusters. The approximation is obtained by the following matrix norm optimization:

L_NMTF = || X - U H VT ||²    (5)

where X = [x1, ..., xn] is an m × n data matrix containing n examples, and each example xi is an m × 1 feature vector in the original feature space of m dimensions. The factorization can also be interpreted as two linear transformations, and this interpretation is used in learning the marginal and conditional distributions:

X ≈ U X1    (6)

X1 ≈ H VT    (7)

where X1 is the representation of the data in the latent feature space.

Assume the data is in the original feature space, with marginal distribution P(x) and conditional distribution P(y|x). Equation (6) derives a linear transformation φ: Rm → Rk: each feature in the original feature space is mapped to the latent feature space spanned by the columns of U. After mapping, the marginal distribution changes from P(x) to P(x1). We refer to φ as the marginal mapping; φ and U relate to learning the marginal distribution.

Equation (7) derives a linear transformation ψ: Rk → Rc: vT = ψ(x1) maps each example in the latent feature space to the example cluster/class space spanned by the columns of H. After mapping, the conditional distribution changes from P(y|x) to P(y|x1). We refer to ψ as the conditional mapping; ψ and H relate to learning the conditional distribution. Such transformations draw the data points closer across domains, and the duality between the two distributions is exploited to facilitate knowledge transfer.
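The paper does not list its single-domain iteration, so the sketch below implements Equations (4) and (5) with the multiplicative updates commonly used for NMTF; the exact update schedule should be read as an assumption:

import numpy as np

def nmtf(X, k, c, iters=200, eps=1e-9, seed=0):
    # Decompose X (m x n) into U (m x k), H (k x c), V (n x c) so that
    # X ~ U H V^T with all factors nonnegative (Equations 4 and 5).
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U, H, V = rng.random((m, k)), rng.random((k, c)), rng.random((n, c))
    for _ in range(iters):
        # Multiplicative updates keep every factor nonnegative.
        U *= (X @ V @ H.T) / (U @ H @ V.T @ V @ H.T + eps)
        V *= (X.T @ U @ H) / (V @ H.T @ U.T @ U @ H + eps)
        H *= (U.T @ X @ V) / (U.T @ U @ H @ V.T @ V + eps)
    return U, H, V

X = np.abs(np.random.default_rng(1).random((20, 12)))   # toy nonnegative data
U, H, V = nmtf(X, k=4, c=2)
print(np.linalg.norm(X - U @ H @ V.T))                  # reconstruction error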

Dual Transfer Learning

The proposed method uses the transductive transfer learning setting, in which the source domain has labeled data and the target domain has only unlabeled data. The goal of transfer learning is to alleviate this difficulty by drawing the data distributions of the domains closer in a latent space, so that the cross-domain classifier f can be trained as accurately as possible.

The data matrix of each domain d is factorized and clustered with NMTF using Equation (8):

Ld = || Xd - Ud1 Hd VdT ||²    (8)

The feature clusters across domains can be partitioned into a common part and a domain-specific part. The common part is used to draw the marginal distributions across domains closer, so we partition Ud1 = [U, Ud], where U is the common part and Ud is the domain-specific part. This also induces a partition of the marginal mapping, φd1 = [φ, φd], where only the common part φ is shared across domains to draw the marginal distributions closer, while the domain-specific part φd preserves domain specific knowledge.

The conditional mapping is derived by learning the cluster association matrix H. The association between feature clusters and example clusters usually remains stable across domains. This assumption is further strengthened because the feature clusters are learned adaptively in the marginal mapping, which in return helps learn the conditional distribution more effectively due to the duality property. Therefore, we set Hd = H and ψd = ψ for all domains.


The entire conditional mapping is thus shared across domains to draw the conditional distributions closer. Hence, Equation (8) is extended into Equation (9), so that the learning of the two distributions is integrated into a unified subspace learning problem:

Ld = || Xd - [U, Ud] H VdT ||²    (9)

The factors U, Ud, Vd and H are updated for each domain until convergence using Equations (10) to (13), following the dual transfer learning approach [17].


Writing Ud1 = [U, Ud], ⊙ and ⊘ for element-wise multiplication and division, and (·)c and (·)s for the columns corresponding to the common and domain-specific parts, the multiplicative updates over the r domains are:

Ud ← Ud ⊙ (Xd Vd HT)s ⊘ (Ud1 H VdT Vd HT)s    (10)

U ← U ⊙ (Σ_{d=1..r} Xd Vd HT)c ⊘ (Σ_{d=1..r} Ud1 H VdT Vd HT)c    (11)

Vd ← Vd ⊙ (XdT Ud1 H) ⊘ (Vd HT Ud1T Ud1 H)    (12)

H ← H ⊙ (Σ_{d=1..r} Ud1T Xd Vd) ⊘ (Σ_{d=1..r} Ud1T Ud1 H VdT Vd)    (13)

By updating these factors, the classifier learns both the marginal and conditional distributions across the domains. Finally, the classifier is applied to predict the labels of the target domain.
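Putting Equations (8) to (13) together, a hedged numpy sketch of the joint update loop is given below. It follows the general dual transfer learning scheme of [17], with U and H shared and Ud and Vd per domain; the initialization, iteration count and eps smoothing are arbitrary choices, not the authors' settings:

import numpy as np

def dtl(Xs, k_common, k_spec, c, iters=200, eps=1e-9, seed=0):
    # Each domain d is factorized as X_d ~ [U, U_d] H V_d^T (Equation 9),
    # with the common feature clusters U and the association H shared.
    rng = np.random.default_rng(seed)
    m = Xs[0].shape[0]
    U = rng.random((m, k_common))
    H = rng.random((k_common + k_spec, c))
    Ud = [rng.random((m, k_spec)) for _ in Xs]
    Vd = [rng.random((X.shape[1], c)) for X in Xs]
    for _ in range(iters):
        numU = np.zeros_like(U); denU = np.zeros_like(U)
        numH = np.zeros_like(H); denH = np.zeros_like(H)
        for d, X in enumerate(Xs):
            Ut = np.hstack([U, Ud[d]])                          # [U, U_d]
            # Equation (12): per-domain example clusters.
            Vd[d] *= (X.T @ Ut @ H) / (Vd[d] @ H.T @ Ut.T @ Ut @ H + eps)
            top = X @ Vd[d] @ H.T
            bot = Ut @ H @ Vd[d].T @ Vd[d] @ H.T
            # Equation (10): domain-specific feature clusters (right columns).
            Ud[d] *= top[:, k_common:] / (bot[:, k_common:] + eps)
            # Equations (11) and (13): shared factors pool evidence over domains.
            numU += top[:, :k_common]; denU += bot[:, :k_common]
            numH += Ut.T @ X @ Vd[d]
            denH += Ut.T @ Ut @ H @ Vd[d].T @ Vd[d]
        U *= numU / (denU + eps)
        H *= numH / (denH + eps)
    return U, Ud, H, Vd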

Algorithm:
Input: Labeled reviews of the source domains Dsrc = {Xi, Yi} and unlabeled reviews of the target domain Dtar = {Xi}.
Output: A classifier that predicts the sentiment polarity of the target domain.
1. Pre-processing such as POS tagging and stop word removal is applied to all domains.
2. For each domain 1 to k:
   Features are extracted from the reviews.
   Features are discriminated into pivot and non-pivot features using Equation (1).
   A co-occurrence matrix is created between pivot and non-pivot features using Equation (2), and its values are updated using Equation (3).
3. Nonnegative matrix tri-factorization is applied to decompose each data matrix using Equation (5).
4. Dual transfer learning is applied across domains:
   All NMTF data matrices are clustered into feature clusters using Equation (8).
   Feature clusters are partitioned into a common part and a domain-specific part using Equation (9).
   The L1 classifier learns the features of the target domain by updating U, Ud, H and Vd until convergence.
5. Finally, the trained classifier is applied to the target domain to predict its sentiment polarity.
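A toy end-to-end run of step 4, reusing the dtl() sketch above on synthetic stand-ins for the matrices of steps 2 and 3 (the sizes, seed and domain names are hypothetical):

import numpy as np

# Hypothetical stand-ins for the source (books) and target (music)
# co-occurrence matrices, sharing the same 500-feature vocabulary.
rng = np.random.default_rng(2)
W_src, W_tar = rng.random((500, 160)), rng.random((500, 100))
U, Ud, H, Vd = dtl([W_src, W_tar], k_common=8, k_spec=4, c=2)
# Step 5: read the predicted polarity of each target review off the class
# columns of V_tar (class identity would be anchored by the source labels).
polarity = Vd[1].argmax(axis=1)
print(polarity[:10])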

4. EXPERIMENTS
A. Data sets
The benchmark Amazon product reviews data set is used; three product types (Books, DVDs and Music) are chosen for the proposed work. The structure of the data sets is shown in Table 1. For each domain there are 2000 positive reviews and 2000 negative reviews, and each domain also has some unlabeled reviews.


Table 1- Amazon product data sets


Domain Positive Negative Unlabeled
DVDs 2000 2000 34377
Music 2000 2000 39291
Books 2000 2000 23055

For the experiments, one of the three domains is selected as the source domain and another as the target domain. To learn the L1 classifier, 800 positive reviews and 800 negative reviews are selected from the source domain and 1000 unlabeled reviews are selected from the target domain.

The classification accuracy is computed as shown in Equation (14):

Accuracy = (number of correctly classified target reviews) / (total number of reviews in the target domain)    (14)
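In code, Equation (14) is a one-line helper (a trivial sketch, assuming label lists of equal length):

def accuracy(predicted, actual):
    # Equation (14): correctly classified target reviews / total target reviews.
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)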

Table 2 shows a comparison of the existing SCL [2] and DTL when adapting a single source domain. From Table 2, the Music source domain (M→B) produced higher accuracy than the DVD source domain for the Books target domain, the DVD source domain (D→M) produced higher accuracy than the Books source domain for the Music target domain, and similarly the Music source domain (M→D) produced higher accuracy than the Books source domain for the DVD target domain.

Table 2 - Comparison of SCL and DTL accuracy (%) when adapting a single source domain (source→target).

        B→M     B→D     M→B     M→D     D→B     D→M
SCL     69.87   72.19   70.64   68.45   66.74   70.89
DTL     79.72   77.64   82.78   81.12   81.27   82.09

Table 3 - Comparison of SCL and DTL accuracy (%) when adapting two source domains.

        B,D→M   B,M→D   M,D→B
SCL     78.65   75.33   77.64
DTL     84.72   86.64   85.78

To study the effect of adapting two source domains for one target domain, different combinations of two source domains with one target domain have been taken and the classification performance in the target domain was analyzed. Table 3 shows the comparison of SCL and DTL classification accuracy when adapting two source domains. First, Books (B) and DVD (D) are taken as source domains and Music (M) as the target domain; the classifier trained using dual transfer learning and applied to the Music domain produced 84.72% accuracy. Second, (B, M) are taken as source domains and D as the target domain, producing 86.64% accuracy in the DVD domain. Third, (M, D) are taken as source domains and B as the target domain, producing 85.78% accuracy in the Books domain. Figure 2 shows the classification accuracy of the proposed framework for the above experimental setup as the number of unlabeled reviews from the target domain increases.


[Figure: line plot of classification accuracy (82% to 89%) against the number of unlabeled target-domain reviews (500 to 2500) for the Books, DVD and Music target domains.]

Fig. 2 Classification accuracy in each target domain

From Fig. 2, the classification accuracy in the target domain increases as the number of unlabeled reviews from the target domain increases. Based on the analysis of multiple source domains, adapting two source domains with one target domain improves the classification performance in the target domain.

5. CONCLUSION

The proposed work performed cross domain sentiment classification using dual transfer learning, which increases the classification performance in the target domain. First, labeled reviews are collected from the source domain and unlabeled reviews from the target domain. Second, preprocessing steps such as POS tagging and stop word removal are applied to the reviews. Third, pivot features are extracted using the mutual information technique and a co-occurrence matrix is created between pivot and non-pivot features. Fourth, the data matrix of each domain is decomposed by nonnegative matrix tri-factorization. Finally, the dual transfer learning technique is applied to learn the marginal and conditional distributions across domains, and an alternately iterative algorithm is used to solve the proposed optimization problem. This method performs better than other transfer learning methods.

REFERENCES

S. J. Pan, X. Ni, J.-T. Sun, Q. Yang, and Z. Chen, "Cross-domain sentiment classification via spectral feature alignment," in Proc. 19th Int'l Conf. World Wide Web, pp. 751-760, Apr. 2010.
J. Blitzer, R. McDonald, and F. Pereira, "Domain adaptation with structural correspondence learning," in Proc. Conf. Empirical Methods in Natural Language Processing (EMNLP 2006), pp. 120-128, 2006.
D. Bollegala, D. Weir, and J. Carroll, "Cross-domain sentiment classification using a sentiment sensitive thesaurus," IEEE Trans. Knowledge and Data Engineering, vol. 25, no. 8, pp. 1719-1731, Aug. 2013, doi:10.1109/TKDE.2012.103.
T. Joachims, "Transductive inference for text classification using support vector machines," in Proc. 16th ICML, 1999, pp. 200-209.
B. Pang, L. Lee, and S. Vaithyanathan, "Thumbs up? Sentiment classification using machine learning techniques," in Proc. Conf. Empirical Methods in Natural Language Processing (EMNLP 2002), pp. 79-86, July 2002.
S. J. Pan and Q. Yang, "A survey on transfer learning," IEEE Trans. Knowl. Data Eng., vol. 22, no. 10, pp. 1345-1359, Oct. 2010.
W. Y. Dai, Y. Q. Chen, G. R. Xue, Q. Yang, and Y. Yu, "Translated learning: Transfer learning across different feature spaces," in Proc. 22nd NIPS, 2008, pp. 353-360.
J. Gao, W. Fan, J. Jiang, and J. W. Han, "Knowledge transfer via multiple model local structure mapping," in Proc. 14th ACM SIGKDD, 2008, pp. 283-291.
M. Rajesh and J. M. Gnanasekar, "Path observation-based physical routing protocol for wireless ad hoc networks," International Journal of Wireless and Mobile Computing, vol. 11, no. 3, pp. 244-257, 2016.
M. Rajesh and J. M. Gnanasekar, "Congestion control in heterogeneous wireless ad hoc network using FRCC," Australian Journal of Basic and Applied Sciences, vol. 9, no. 7, pp. 698-702, 2015.
M. Rajesh and J. M. Gnanasekar, "GCC over heterogeneous wireless ad hoc networks," Journal of Chemical and Pharmaceutical Sciences, pp. 195-200, 2015.
M. Rajesh and J. M. Gnanasekar, "Congestion control using AODV protocol scheme for wireless ad-hoc network," Advances in Computer Science and Engineering, vol. 16, no. 1/2, p. 19, 2016.
M. Rajesh and J. M. Gnanasekar, "An optimized congestion control and error management system for OCCEM," International Journal of Advanced Research in IT and Engineering, vol. 4, no. 4, pp. 1-10, 2015.
M. Rajesh and J. M. Gnanasekar, "Constructing well-organized wireless sensor networks with low-level identification," World Engineering & Applied Sciences Journal, vol. 7, no. 1, 2016.
M. Rajesh, "Traditional courses into online moving strategy," The Online Journal of Distance Education and e-Learning, vol. 4, no. 4, 2016.
J. Gao, W. Fan, Y. Z. Sun, and J. W. Han, "Heterogeneous source consensus learning via decision propagation and negotiation," in Proc. 15th ACM SIGKDD, 2009, pp. 339-348.
W. Y. Dai, G. R. Xue, Q. Yang, and Y. Yu, "Co-clustering based classification for out-of-domain documents," in Proc. 13th ACM SIGKDD, 2007, pp. 210-219.
P. Luo, F. Z. Zhuang, H. Xiong, Y. H. Xiong, and Q. He, "Transfer learning from multiple source domains via consensus regularization," in Proc. 17th ACM CIKM, 2008, pp. 103-112.
G. R. Xue, W. Y. Dai, Q. Yang, and Y. Yu, "Topic-bridged PLSA for cross-domain text classification," in Proc. 31st ACM SIGIR, 2008, pp. 627-634.
W. Y. Dai, O. Jin, G. R. Xue, Q. Yang, and Y. Yu, "EigenTransfer: A unified framework for transfer learning," in Proc. 26th ICML, 2009, pp. 193-200.
F. Z. Zhuang, P. Luo, H. Xiong, Q. He, Y. H. Xiong, and Z. Z. Shi, "Exploiting associations between word clusters and document classes for cross-domain text categorization," in Proc. 10th SIAM SDM, 2010, pp. 13-24.
F. Zhuang, P. Luo, Z. Shen, Q. He, Y. Xiong, Z. Shi, and H. Xiong, "Collaborative dual-PLSA: Mining distinction and commonality across multiple domains for text classification," in Proc. 19th ACM CIKM, 2010, pp. 359-368.
H. Wang, H. Huang, F. Nie, and C. Ding, "Cross-language web page classification via dual knowledge transfer using nonnegative matrix tri-factorization," in Proc. 34th ACM SIGIR, 2011, pp. 933-942.
M. Long, J. Wang, G. Ding, W. Cheng, X. Zhang, and W. Wang, "Dual transfer learning," in Proc. 12th SIAM SDM, 2012, pp. 540-551.
X. Tian, D. C. Tao, and Y. Rui, "Sparse transfer learning for interactive video search reranking," ACM Trans. Multimedia Comput., Commun., Applicat., vol. 8, no. 3, pp. 26:1-26:19, 2012.
B. Geng, D. Tao, and C. Xu, "DAML: Domain adaptation metric learning," IEEE Trans. Image Process., vol. 20, no. 10, pp. 2980-2989, 2011.
F. Zhuang, P. Luo, C. Du, Q. He, and Z. Shi, "Triplex transfer learning: Exploiting both shared and distinct concepts for text classification," in Proc. 6th ACM WSDM, 2013, pp. 425-434.
C. Ding, T. Li, W. Peng, and H. Park, "Orthogonal nonnegative matrix tri-factorizations for clustering," in Proc. 12th ACM SIGKDD, 2006, pp. 126-135.
T. Li, V. Sindhwani, C. Ding, and Y. Zhang, "Knowledge transformation for cross-domain sentiment classification," in Proc. 32nd ACM SIGIR, 2009, pp. 716-717.
F. Wang, T. Li, and C. Zhang, "Semi-supervised clustering via matrix factorization," in Proc. 8th SIAM SDM, 2008, pp. 1041-1048.
T. Li, C. Ding, Y. Zhang, and B. Shao, "Knowledge transformation from word space to document space," in Proc. 31st ACM SIGIR, 2008, pp. 187-194.
T. Hofmann, "Unsupervised learning by probabilistic latent semantic analysis," J. Mach. Learn., vol. 42, no. 1-2, pp. 177-196, 2001.
D. Hosmer and S. Lemeshow, Applied Logistic Regression. New York, NY, USA: Wiley, 2000.
D. Zhang, J. He, Y. Liu, L. Si, and R. Lawrence, "Multi-view transfer learning with a large margin approach," in Proc. 17th ACM SIGKDD, 2011, pp. 1208-1216.
W. Y. Dai, Q. Yang, G. R. Xue, and Y. Yu, "Boosting for transfer learning," in Proc. 24th ICML, 2007, pp. 193-200.
