Text and Web Mining: Sentiment Analysis of Financial Markets


E. Zvizdic

Master Thesis, HS15 (10321)

Sentiment analysis is a machine learning procedure that aims to transform a collection of raw text documents into features and map them to labels that convey the attitude of each document towards a particular topic. The transformed feature space is typically very sparse and noisy. In addition, many frequently used approaches neglect the richness of information stored in the syntactic and semantic structure of documents. We propose a novel framework that analyzes the syntactic, semantic and sentiment properties at the document, phrase and word level to derive a more structured representation of text. We decompose each document into syntactic units, obtain the vocabulary, and assign semantic and sentiment priors to each extracted word and phrase using abundant online lexical resources. We use an unsupervised method to contextualize the prior assignments and a Bayesian classifier to predict sentiment labels for documents. Additionally, we use synonym sets to alleviate data sparsity and to allow better generalization to unseen documents. We evaluated the developed model on the standard movie review data set and on a newly created supervised data set consisting of comments on the performance of stocks. The experimental evaluation shows that contextualized semantic and sentiment features play a crucial part in our approach and consistently improve prediction accuracy. On the other hand, our syntactic representation did not outperform the benchmark methods, presumably because the feature selection process may be more suitable for topic detection and clustering. We also observed that our sparsity alleviation method improves the feature representation and allows us to apply our method to online sentiment estimation. The results on the stock comment data set show promise for future applications in finance.
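
As a rough illustration of two ideas named in the abstract, a Bayesian classifier for sentiment labels and synonym sets that reduce sparsity, the following minimal Python sketch may help. It is not the thesis implementation: the synonym sets, the `tokenize` helper, the `NaiveBayes` class and the two training sentences are all hypothetical, and a plain multinomial naive Bayes with add-one smoothing stands in for whatever Bayesian model the thesis actually uses. The syntactic decomposition, the lexicon-based priors and the unsupervised contextualization step are not modeled here.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy synonym sets (assumption); the thesis relies on
# online lexical resources that are not reproduced here.
SYNSETS = [
    {"gain", "rise", "rally", "surge"},
    {"loss", "drop", "fall", "plunge"},
]
# Map every word to one canonical representative of its synonym set.
CANONICAL = {word: min(synset) for synset in SYNSETS for word in synset}


def tokenize(text):
    """Lowercase, split on whitespace, and collapse synonyms to one feature."""
    return [CANONICAL.get(tok, tok) for tok in text.lower().split()]


class NaiveBayes:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.word_counts = defaultdict(Counter)
        self.class_counts = Counter(labels)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(tokenize(doc))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log class prior
            for tok in tokenize(doc):
                if tok in self.vocab:  # skip features never seen in training
                    score += math.log((counts[tok] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical two-document training set (assumption, for illustration only).
train_docs = [
    "shares rally after strong earnings",
    "shares plunge after weak earnings",
]
train_labels = ["positive", "negative"]
model = NaiveBayes().fit(train_docs, train_labels)
```

In this toy setting, the words "surge" and "fall" never occur in the training sentences, yet `model.predict("stock surge")` and `model.predict("stock fall")` can still be scored because each word is mapped to the same feature as a synonym that was seen during training. This is the sense in which synonym sets can help a model generalize to unseen documents.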

Supervisors: Sebastiano Rossi (swissQuant AG), Nathaniel Zollinger (swissQuant AG), Angelos Georghiou, Chithrupa Ramesh, John Lygeros


Type of Publication:

Diploma/Master Thesis

J. Lygeros
