Document Type

Theses, Ph.D

Disciplines

Computer Sciences

Publication Details

A thesis submitted for the degree of Doctor of Philosophy (Ph.D.), Technological University Dublin, December 2023.

Abstract

The concept of linguistic style denotes that many aspects of a text can vary while its core semantic meaning stays the same. For example, a message may be written in a formal or an informal style. The textual style transfer problem aims at generating a paraphrase of a given text by modifying its style while preserving its content. To the best of our knowledge, there is no standard, widely accepted definition of style within the literature on textual style transfer. Moreover, very few works have investigated the characteristics of language styles. Consequently, as far as we are aware, previous research has not taken the variations between different textual style domains into account when dealing with the style transfer task. This research investigates domain-specific style characteristics by examining the separation of style and content, as well as the variations in style across domains and how these variations are encoded. Furthermore, it looks into the factors that are relevant to a comprehensive evaluation of textual style transfer models. The research uses the domains of sentiment and formality.

A variety of frameworks are employed as textual style transfer models throughout the experiments, with networks such as RNNs and transformers used as the encoders, decoders and discriminators. These models are trained in an unsupervised manner, i.e. the data is non-parallel or is treated as such. The experimental methodology frames style transfer as a multi-objective problem, evaluating each approach through three aspects of the generated style-shifted outputs: presence of the target style (the style-shift power of the approach), presence of the input content (its content preservation power), and the fluency and grammatical correctness of the output sequences. To evaluate these dimensions, various automatic methods are applied, and their results are further confirmed by human evaluation tests. The performance of the style transfer systems reveals a trade-off between the evaluation aspects. This confirms the need for this comprehensive evaluation methodology and questions the approach taken in some previous studies, which focus on only one or two evaluation dimensions and thereby risk neglecting the remaining aspect(s).
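The trade-off between the three evaluation aspects can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the names `TransferScores`, `unigram_overlap`, `evaluate`, `toy_classifier` and `toy_fluency` are hypothetical, the word-set overlap is only a crude stand-in for a content preservation metric, and the classifier and fluency scorer are toy placeholders for whatever automatic methods are actually applied.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TransferScores:
    style_accuracy: float        # share of outputs judged to carry the target style
    content_preservation: float  # mean word-set overlap between input and output
    fluency: float               # mean score returned by the fluency scorer


def unigram_overlap(source: str, output: str) -> float:
    """Jaccard overlap of lower-cased word sets (illustrative proxy only)."""
    src, out = set(source.lower().split()), set(output.lower().split())
    if not src and not out:
        return 1.0
    return len(src & out) / len(src | out)


def evaluate(
    sources: List[str],
    outputs: List[str],
    target_style: str,
    style_classifier: Callable[[str], str],
    fluency_scorer: Callable[[str], float],
) -> TransferScores:
    """Report the three aspects separately instead of collapsing them into one number."""
    if not sources or len(sources) != len(outputs):
        raise ValueError("sources and outputs must be non-empty and of equal length")
    n = len(sources)
    style = sum(style_classifier(o) == target_style for o in outputs) / n
    content = sum(unigram_overlap(s, o) for s, o in zip(sources, outputs)) / n
    fluency = sum(fluency_scorer(o) for o in outputs) / n
    return TransferScores(style, content, fluency)


# Toy stand-ins (hypothetical): a tiny sentiment lexicon and a trivial fluency check.
POSITIVE_WORDS = {"great", "good", "delicious"}


def toy_classifier(text: str) -> str:
    return "positive" if POSITIVE_WORDS & set(text.lower().split()) else "negative"


def toy_fluency(text: str) -> float:
    return 1.0 if text.strip() else 0.0


sources = ["the food was terrible", "the service was awful"]
copy_system = list(sources)           # degenerate system: returns the input unchanged
constant_system = ["great", "great"]  # degenerate system: ignores the input entirely

copy_scores = evaluate(sources, copy_system, "positive", toy_classifier, toy_fluency)
constant_scores = evaluate(sources, constant_system, "positive", toy_classifier, toy_fluency)
print(copy_scores)
print(constant_scores)
```

In this toy setting, the system that copies its input scores perfectly on content preservation but fails on style shift, while the system that always emits the same positive word does the opposite. Judging either system on a single dimension would hide the failure on the other.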

Our research first looks into the separation of style and content in chapters 4 and 5. To do so, different experiments are conducted to probe the latent space of a variety of adversarial RNN-based style-shift frameworks, with sentiment and formality as the style domains. The main focus of these experiments is to investigate the presence of the source stylistic features, i.e. to analyse how these models encode style-related features in their latent spaces. The results, which hold for both style domains, indicate that style cannot be totally separated from content.

A series of experiments is then designed in chapter 5 to examine whether the concept of style is consistent across different style domains. These include experiments that study the correlation of style and content across the domains, as well as the effect of modifying the latent space across different style domains. The findings indicate that style and content are more closely entangled in the sentiment domain than in the formality domain. Observing that the concept of style can vary across different domains shifts the attention of this study towards analysing how this variation is encoded.

To explore the variations across the style domains, a number of experiments are conducted in chapter 6. These include a series of probing classification tasks that examine how different layers of the encoders of adversarial transformer-based style transfer models encode the style of the input. Furthermore, some unigram-based experiments are conducted, which further confirm the variations observed. The results indicate that formality is encoded more globally, whereas sentiment is encoded more locally. Finally, a series of experiments looks into how placing more emphasis on encoding the input affects the style-shift power of the models across different style domains. The outcome is in line with the previous results and implies that formality is a more complex style domain to handle in style transfer than sentiment. The findings of these experiments contribute to a better understanding of style and highlight the question of how the characteristics of various style domains can affect the framing of the textual style transfer task.
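The idea of a layer-wise probing classification task can be sketched as follows. This is a minimal illustration under stated assumptions, not the probing setup used in the thesis: a nearest-centroid classifier is swapped in as the probe for simplicity, the helper names (`fit_probe`, `probe_accuracy`, `probe_layers`) are hypothetical, and the two-dimensional "layer representations" are hand-made toy values rather than encoder outputs.

```python
from typing import Dict, List, Sequence

Vector = Sequence[float]


def centroid(vectors: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def fit_probe(features: List[Vector], labels: List[str]) -> Dict[str, List[float]]:
    """Fit the probe on frozen representations: one centroid per style label."""
    by_label: Dict[str, List[Vector]] = {}
    for feature, label in zip(features, labels):
        by_label.setdefault(label, []).append(feature)
    return {label: centroid(vs) for label, vs in by_label.items()}


def probe_accuracy(probe: Dict[str, List[float]],
                   features: List[Vector], labels: List[str]) -> float:
    """Accuracy of nearest-centroid predictions on held-out representations."""
    correct = 0
    for feature, label in zip(features, labels):
        predicted = min(probe, key=lambda lab: sq_dist(feature, probe[lab]))
        correct += predicted == label
    return correct / len(labels)


def probe_layers(train_by_layer: List[List[Vector]], test_by_layer: List[List[Vector]],
                 train_labels: List[str], test_labels: List[str]) -> List[float]:
    """Fit one probe per layer and return its held-out accuracy, layer by layer."""
    return [
        probe_accuracy(fit_probe(train, train_labels), test, test_labels)
        for train, test in zip(train_by_layer, test_by_layer)
    ]


# Toy data (hand-made): layer 0 carries no style signal, layer 1 separates the styles.
train_labels = ["formal", "informal", "formal", "informal"]
test_labels = ["formal", "informal"]
train_by_layer = [
    [[0.0, 1.0], [0.0, 1.0], [1.0, 0.0], [1.0, 0.0]],
    [[1.0, 0.0], [-1.0, 0.0], [0.9, 0.1], [-0.9, 0.1]],
]
test_by_layer = [
    [[0.0, 1.0], [1.0, 0.0]],
    [[0.8, 0.0], [-0.8, 0.0]],
]

accuracies = probe_layers(train_by_layer, test_by_layer, train_labels, test_labels)
print(accuracies)
```

Comparing the per-layer accuracies is what makes the probe informative: a layer whose representations let a simple classifier recover the style label encodes more style information than a layer where the probe performs at chance.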

This open question is investigated as a final step through experiments in chapter 6 that illustrate how the characteristics of different styles should be considered when selecting evaluation methods. In particular, this step focuses on the content preservation dimension and shows how it can be computed more effectively by considering the variation in the characteristics and encoding of style across the formality and sentiment domains.

DOI

https://doi.org/10.21427/R2PW-DQ03

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

