In this thesis we develop an architecture for humour intensity prediction. In contrast to much previous work, which has mostly concerned itself with discrete (and often binary) classification, the task has a continuous label space. The regression model combines several techniques in order to incorporate many aspects of humour. We hypothesise that combining modern neural encoders with classical hand-crafted features and neural language models makes it possible to capture many perspectives of this complex task. Comparing a variety of configurations to relevant baselines, we conclude that the proposed model performs well. An ablation study shows that the main contributor to the model's success is the neural language model. By analysing the components further, the work explores why this is, proposes some possible explanations for why the other components underperform, and suggests how this can be addressed in future work.