Documentos de Académico
Documentos de Profesional
Documentos de Cultura
processing strategies that can handle, and take Dealing with the huge amounts of data avail-
advantage, of such amount of data. However, able on Twitter demand clever strategies. One
processing Twitter is all but an easy task, not only interesting idea, explored by (Go, Bhayani, and
because of specific phenomena that can be found Huang, 2009) consists of using emoticons, abun-
in the data, but also because it may require to dantly available on tweets, to automatically label
process a continuous stream of data, and possibly the data and then use such data to train ma-
to store some of the data in a way that it can be chine learning algorithms. The paper shows that
accessed in the future. machine learning algorithms trained with such
This paper tackles two well-known Natural approach achieve above 80% accuracy, when
Language Processing (NLP) tasks, commonly classifying messages as positive or negative. A
applied both to written and speech corpora: senti- similar idea was previously explored by (Pang,
ment analysis and topic detection. The two tasks Lee, and Vaithyanathan, 2002) for movie re-
have been performed over Spanish Twitter data views, by using star ratings as polarity signals
provided in the context of a contest proposed in their training data. This latter paper analyses
by “TASS – workshop on Sentiment Analysis” the performance of different classifiers on movie
(Villena-Román et al., 2012), a satellite event of reviews, and presents a number of techniques that
the SEPLN 2012 conference. were used by many authors and served as base-
The paper is organized as follows: Section 2 line for posterior studies. As an example, they
overviews the related work, previously done for have adapted a technique, introduced by (Das
each task. Section 3 presents a brief description and Chen, 2001), for modeling the contextual
of the data. Section 4 describes the most relevant effect of negation, adding the prefix NOT to every
strategies that have been considered for tackling word between a “negation word” and the first
the problem. Section 5 presents and analyses punctuation mark following the negation word.
a number of experiments, and reports the re- Common approaches to sentiment analysis
sults for each one of the approaches. Section involve the use of sentiment lexicons of positive
6 presents some conclusions and discusses the and negative words or expressions. The General
future work. Inquirer (Stone et al., 1966) was one of the first
available sentiment lexicons freely available for
2 Related work research, which includes several categories of
Sentiment analysis and topic detection are two words, such as: positive vs. negative, strong
well-known NLP (Natural Language Processing) vs. week. Two other examples include (Hu and
tasks. Sentiment analysis is often referred by Liu, 2004), an opinion lexicon containing about
other names (e.g. sentiment mining) and consists 7000 words, and the MPQA Subjectivity Cues
of assigning a sentiment, from a set of possible Lexicon (Wilson, Wiebe, and Hoffmann, 2005),
values, to a given portion of text. Topic detection where words are annotated not only as positive
consists of assigning a class (or topic) from a set vs. negative, but also with intensity. Finally,
of possible predefined classes to a given docu- (Baccianella, Esuli, and Sebastiani, 2010) is an-
ment. Often, these two tasks are viewed as two other available resource that assigns sentiment
classification problems that, despite being char- scores to each synset of the wordnet.
acterized by their specificities, can be tackled Learning polarity lexicons is another research
using similar strategies. The remainder of this approach that can be specially useful for dealing
section overviews the related work previously with large corpora. The process starts with a
done concerning these two tasks. seed set of words and the idea is to increasingly
find words or phrases with similar polarity, in
2.1 Sentiment analysis semi-supervised fashion (Turney, 2002). The
Sentiment analysis can be performed at different final lexicon contains much more words, pos-
complexity levels, where the most basic one con- sibly learning domain-specific information, and
sists just on deciding whether a portion of text therefore is more prone to be robust. The work
contains a positive or a negative sentiment. How- reported by (Kim and Hovy, 2004) is another
ever, it can be performed at more complex levels, example of learning algorithm that uses WordNet
like ranking the attitude into a set of more than synonyms and antonyms to learn polarity.
two classes or, even further, it can be performed
in a way that different complex attitude types can 2.2 Topic Detection
be determined, as well as finding the source and Work on Topic Detection has its origins in 1996
the target of such attitudes. with the Topic Detection and Tracking (TDT)
78
Sentiment Analysis and Topic Classification based on Binary Maximum Entropy Classifiers
initiative sponsored by the US government (Al- tweets). The provided test data is also available
lan, 2002). The main motivation for this initia- in XML and contains about 60800 unlabeled
tive was the processing of the large amounts of tweets. The goal consists in providing automatic
information coming from newswire and broad- sentiment and topic classification for that data.
cast news. The main goal was to organize the Each tweet in the labelled data is annotated
information in terms of events and stories that in terms of polarity, using one of six possible
discussed them. The concept of topic was de- values: NONE, N, N+, NEU, P, P+ (Section
fined as the set of stories about a particular event. 4.2 contains information about their meaning).
Five tasks were defined: story segmentation, Moreover, each annotation is also marked as
first story detection, cluster detection, tracking, AGREEMENT or DISAGREEMENT, indicating
and story link detection. The current impact whether all the annotators performed the anno-
and the amount information generated by social tation coherently. In what concerns topic detec-
media led to a state of affairs similar to the one tion, each tweet was annotated with one or more
that fostered the pioneer work on TDT. Social topics, from a list of 10 possible topics: polı́tica
media is now the context for research tasks like (politics), otros (others), entretenimiento (enter-
topic (cluster) detection (Lee et al., 2011; Lin et tainment), economı́a (economics), música (mu-
al., 2012) or emerging topic (first story) detec- sic), fútbol (football), cine (movies), tecnologı́a
tion (Kasiviswanathan et al., 2011). (technology), deportes (sports), and literatura
In that sense, closer to our work are the ap- (literature).
proaches described by (Sriram et al., 2010) and It is also important to mention that, besides
(Lee et al., 2011), where tweets are classified the tweets, an extra XML file is also available,
into previously defined sets of generic topics. In containing information about each one of the
the former, a conventional bag-of-words (BOW) users that authored at least one of the tweets in
strategy is compared to a specific set of features the data. In particular, the information includes
(authorship and the presence of several types of the type of user, assuming one of three possible
twitter-related phenomena) using a Naı̈ve Bayes values – periodista (journalist), famoso (famous
(NB) classifier to classify tweets into the follow- person), and politico (politician) – which may
ing generic categories: News, Events, Opinions, provide valuable information for these tasks.
Deals, and Private Messages. Findings show that Apart from the provided data, some exper-
authorship is a quite important feature. In the iments described in this paper also made use
latter, two strategies, BOW and network-based of Sentiment Lexicons in Spanish1 , a resource
classification, are explored to classify clusters of created at the University of North Texas (Perez-
tweets into 18 general categories, like Sports, Rosas, Banea, and Mihalcea, 2012). From this
Politics, or Technology. In the BOW approach, resource, only the most robust part was used,
the clusters of tweets are represented by tf-idf known as fullStrengthLexicon, and containing
vectors and NB, NB Multinomial, and Support 1346 words automatically labelled with senti-
Vector Machines (SVM) classifiers are used to ment polarity.
perform classification. The network-based clas-
sification approach is based on the links between
users and C5.0 decision tree, k-Nearest Neigh- 4 Approach
bor, SVM, and Logistic Regression classifiers We have decided to consider both tasks as clas-
were used. Network-based classification was sification tasks, thus sharing the same method.
shown to achieve a better performance, but being The most successful and recent experiments cast
link-based, it cannot be used for all situations. the problem as a binary classification problem,
which aims at discriminating between two pos-
3 Data sible classes. Binary classifiers are easier to
Experiments described in this paper use Span- develop, offer faster convergence ratios, and can
ish Twitter data provided in the context of the be executed in parallel. The final results are then
TASS contest (Villena-Román et al., 2012). The produced by combining all the different binary
provided training data consists of an XML file classifiers.
containing about 7200 tweets, each one labelled The remainder of this section describes the
with sentiment polarity and the corresponding method and the architecture of the system when
topics. We decided to consider the first 80% of applied to each one of the tasks.
the data for training our models (5755 tweets)
and the remaining 20% for development (1444 1 http://lit.csci.unt.edu/
79
Fernando Batista, Ricardo Ribeiro
cine deportes economía entretenimiento fútbol literatura música otros polít. tec
ET ET#O M O P T ET F M O P ET ET#P F O O#P P F L M M#O O O#P O#T P T M O P T O P O P T P T T
ET
ET#O
cine
M
O
P
T
ET
deportes
ET#M
ET#O
F
M
M#O
P
economía
ET
ET#P
F
O
O#P
P
F
F#O
entretenimiento
L
L#O
L#T
M
M#O
M#P
O
O#T
P
P#T
T
M
fútbol
M#O
O
P
T
liter.
M
O
P
t pol otros mús
O
T
P
T
Figure 3: Topics confusion matrix (higher values darker; lower values lighter).
Das, Sanjiv and Mike Chen. 2001. Yahoo! for Pang, Bo, L. Lee, and S. Vaithyanathan. 2002.
Amazon: Extracting market sentiment from Thumbs up? sentiment classification using
stock message boards. In Proceedings of machine learning techniques. In Proceedings
the Asia Pacific Finance Association Annual of Empirical Methods in Natural Language
Conference (APFA). Processing (EMNLP), pages 79–86.
Daumé III, Hal. 2004. Notes on CG and Perez-Rosas, V., C. Banea, and R. Mihalcea.
LM-BFGS optimization of logistic regres- 2012. Learning Sentiment Lexicons in Span-
sion. http://hal3.name/megam/. ish. In N. Calzolari, K. Choukri, T. Declerck,
M. U. Doğan, B. Maegaard, J. Mariani,
Go, Alec, Richa Bhayani, and Lei Huang. 2009. J. Odijk, and S. Piperidis, editors, Proceed-
Twitter Sentiment Classification using Distant ings of the International Conference on Lan-
Supervision. Technical report, Stanford Uni- guage Resources and Evaluation (LREC’12).
versity. ELRA.
Hu, Minqing and Bing Liu. 2004. Mining Sriram, Bharath, David Fuhry, Engin Demir,
and summarizing customer reviews. In Pro- Hakan Ferhatosmanoglu, and Murat Demir-
ceedings of the 10t h ACM SIGKDD Interna- bas. 2010. Short Text Classification in
tional Conference on Knowledge Discovery Twitter to Improve Information Filtering. In
and Data Mining, KDD ’04, pages 168–177. SIGIR’10: Proceedings of the 33rd interna-
ACM. tional ACM SIGIR conference on Research
and Development in Information Retrieval,
Kasiviswanathan, S. P., P. Melville, A. Baner-
pages 841–842. ACM.
jee, and V. Sindhwani. 2011. Emerg-
ing Topic Detection using Dictionary Learn- Stone, P., D. Dunphy, M. Smith, and D. Ogilvie.
ing. In CIKM’11: Proceedings of the 1966. The General Inquirer: A Computer
20th ACM international conference on Infor- Approach to Content Analysis. MIT Press.
mation and Knowledge Management, pages
Turney, Peter D. 2002. Thumbs Up or Thumbs
745–754. ACM.
Down? Semantic Orientation Applied to Un-
Kim, Soo-Min and Eduard Hovy. 2004. De- supervised Classification of Reviews. In Pro-
termining the sentiment of opinions. In Pro- ceedings of the 40th Annual Meeting on Asso-
ceedings of the 20th International Conference ciation for Computational Linguistics, pages
on Computational Linguistics (COLING ’04). 417–424. ACL.
ACL. Villena-Román, J., J. Garcı́a-Morera, C. Moreno-
Lee, K., D. Palsetia, R. Narayanan, M. Patwary, Garcia, L. Ferrer-Ureña, S. Lana-Serrano,
A. Agrawal, and A. Choudhary. 2011. Twit- J. C. González-Cristobal, A. Westerski,
ter trending topic classification. In Interna- E. Martı́nez-Cámara, M. A. Garcı́a-
tional Conference on Data Mining Workshops Cumbreras, M. T. Martı́n-Valdivia, and
(ICDMW), pages 251–258. IEEE. Ureña-López L. A. 2012. TASS-
Workshop on Sentiment Analysis at SEPLN.
Lin, C., Y. He, R. Everson, and S. Rüger. Procesamiento de Lenguaje Natural, 50.
2012. Weakly Supervised Joint Sentiment-
Topic Detection from Text. IEEE Transac- Wilson, T., J. Wiebe, and P. Hoffmann.
tions On Knowledge And Data Engineering, 2005. Recognizing contextual polarity in
24(6):1134–1145. phrase-level sentiment analysis. In Pro-
ceedings of Human Language Technology
Makhoul, J., F. Kubala, R. Schwartz, and Conference and Conference on Empirical
R. Weischedel. 1999. Performance measures Methods in Natural Language Processing
for information extraction. In Proceedings of (HLT/EMNLP), pages 347–354. ACL.
the DARPA Broadcast News Workshop.
O’Connor, B., R. Balasubramanyan, B. R. Rout-
ledge, and N. A. Smith. 2010. From tweets to
polls: Linking text sentiment to public opin-
ion time series. In Proceedings of the Fourth
International AAAI Conference on Weblogs
and Social Media (ICWSM’10).
84