Deep learning of humor from Gary Larson’s cartoons (2020) – Research Unit Virtual & Augmented Reality

Abstract

The aim of this thesis is to model humor using deep learning based on Gary Larson’s cartoons. The recent success of deep learning in computer vision and natural language processing suggests that similar techniques can be applied in the field of computational humor. Training deep learning models requires a dataset with many training samples, which is why I created a novel dataset containing several thousand of Gary Larson’s cartoons, their punchlines and corresponding funniness annotations. The dataset was annotated by a single person using a custom labelling tool and therefore captures the humor of that one person. With this dataset, that person's humor can be compared quantitatively with the results of the deep learning models or with other people.

After an extensive dataset analysis, I designed and trained several deep neural architectures. First, I focused on the visual domain (cartoons), using convolutional neural networks, transfer learning and object detection techniques. Afterwards, I focused on the text domain (punchlines), using Long Short-Term Memory networks, several word embeddings (deep learning based and classical) and an automated machine learning approach. Finally, I tried to combine all the findings into a unified two-stage architecture.

Unfortunately, the evaluation revealed that this task is not yet tractable with the deep learning techniques applied. I chose two performance metrics (mean absolute error and accuracy) and several baseline models (most frequent class, mean class, etc.), and no model improved on the baselines significantly. On the test set, a transfer learning based approach achieved the best accuracy of 26.10%, while the most frequent class baseline scored 24.50%. Both a deep learning approach and the mean class baseline reached a mean absolute error of 1.57. These results show that the semantic gap between computers and humans is too large for current deep learning based approaches to successfully model the humor of a single person. It seems that another breakthrough besides deep learning is required for this task.
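To make the comparison against the baselines concrete, the following is a minimal sketch of how the two metrics and the two named baselines could be computed. It is not the thesis code: the function names, the toy labels and the integer rating scale are hypothetical, and "mean class" is assumed here to mean the rounded mean of the training labels.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the annotated class exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average absolute distance between predicted and annotated class."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def most_frequent_class(train_labels):
    """Baseline: always predict the most common training label."""
    return Counter(train_labels).most_common(1)[0][0]


def mean_class(train_labels):
    """Baseline (assumed definition): always predict the rounded mean label."""
    return round(sum(train_labels) / len(train_labels))


def evaluate_constant_baseline(constant, test_labels):
    """Score a baseline that predicts the same class for every sample."""
    predictions = [constant] * len(test_labels)
    return (accuracy(test_labels, predictions),
            mean_absolute_error(test_labels, predictions))


# Hypothetical funniness annotations on an integer scale (toy data only).
train_labels = [1, 1, 1, 2, 4, 5, 7]
test_labels = [1, 3, 4, 6]

mfc_scores = evaluate_constant_baseline(most_frequent_class(train_labels), test_labels)
mean_scores = evaluate_constant_baseline(mean_class(train_labels), test_labels)
print("most frequent class (accuracy, MAE):", mfc_scores)
print("mean class (accuracy, MAE):", mean_scores)
```

On skewed ordinal labels the most frequent class tends to be the stronger baseline for accuracy, while a class near the mean tends to be stronger for mean absolute error. This is consistent with the abstract quoting a different baseline for each metric.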

Reference

Fischer, R. (2020). Deep learning of humor from Gary Larson’s cartoons [Diploma Thesis, Technische Universität Wien]. reposiTUm. https://doi.org/10.34726/hss.2020.56860