This is the project I worked on for my master's thesis. The full project description and code can be found on my GitHub page.
Abstract of the thesis, describing the motivation and idea of the project:
Motivated by the potential benefits of a system that accelerates the process of writing radiological reports, we present a Recurrent Neural Network Language Model for modeling radiological language. We show that Recurrent Neural Network Language Models can be used to produce convincing radiological reports and investigate how their performance can be improved by using advanced regularization techniques like embedding dropout or weight tying, and advanced initialization techniques like pre-trained word embeddings. Furthermore, we study the use of transfer learning to create topic-specific language models. To test the applicability of our techniques to other domains, we perform experiments on a second dataset consisting of forum posts on motorized vehicles. In addition to our experiments on Recurrent Neural Network Language Models, we train a Continuous Bag-of-Words model on the radiological dataset and analyze the resulting medical word embeddings. We show that the embeddings encode medical relationships and semantic similarities, and that certain medical relationships can be represented as linear translations.
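To give a rough idea of what "represented as linear translations" means for word embeddings, here is a minimal Python sketch. It is not the thesis code. The vectors are made-up toy values rather than trained embeddings, and the names `toy_embeddings`, `cosine`, and `analogy` are hypothetical. The sketch forms vec(b) - vec(a) + vec(c) and returns the remaining word whose vector is closest by cosine similarity.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only. They are NOT
# taken from the thesis or from any trained model.
toy_embeddings = {
    "lung": [1.0, 0.0, 0.0],
    "pneumonia": [1.0, 1.0, 0.0],
    "liver": [0.0, 0.0, 1.0],
    "hepatitis": [0.0, 1.0, 1.0],
    "fracture": [-1.0, 0.5, -1.0],
}


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def analogy(a, b, c, embeddings):
    """Return the word closest to vec(b) - vec(a) + vec(c), excluding a, b, c."""
    target = [
        eb - ea + ec
        for ea, eb, ec in zip(embeddings[a], embeddings[b], embeddings[c])
    ]
    candidates = [w for w in embeddings if w not in {a, b, c}]
    return max(candidates, key=lambda w: cosine(target, embeddings[w]))


# "lung" is to "pneumonia" as "liver" is to ...?
print(analogy("lung", "pneumonia", "liver", toy_embeddings))
```

In this toy setup the offset from "lung" to "pneumonia" is the same vector as the offset from "liver" to "hepatitis", which is the property the abstract calls a linear translation. Whether a particular relationship behaves this way in trained embeddings has to be checked empirically.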