De-identification of Privacy-related Entities in Job Postings

De-identification is the task of detecting privacy-related entities in text, such as person names, emails and contact data. It has been well studied within the medical domain. The need for de-identification technology is increasing, as …

DAN+: Danish Nested Named Entities and Lexical Normalization

This paper introduces DAN+, a multi-domain resource for nested named entities (NEs) and lexical normalization for Danish, a less-resourced language. We empirically assess three strategies to model the two-layer NE annotations, cross-lingual …

Cross-Domain Sentiment Classification using Vector Embedded Domain Representation

Due to the differences between reviews in different product categories, creating a general model for cross-domain sentiment classification can be a difficult task. This paper proposes an architecture that incorporates domain knowledge into a neural …