We present a neural network model that computes embeddings of words using a recurrent network based on long short-term memory (LSTM) units to read in characters. As an alternative to word lookup tables, which require a set of parameters for every word type in the vocabulary, our model requires only a lookup table for characters and a fixed number of parameters for the compositional model, independent of the vocabulary size. As a consequence, our model uses fewer parameters and is also sensitive to lexical form, such as morphology, making it better suited to tasks where morphological information is required. In part-of-speech tagging, it performs competitively with state-of-the-art systems, without explicitly engineered lexical features, and using a relatively small number of parameters.
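To make the composition concrete, here is a minimal sketch in PyTorch (not the authors' implementation; the class name `CharToWord`, the dimensions, and the unidirectional LSTM are illustrative assumptions): a character lookup table feeds an LSTM whose final hidden state serves as the word embedding, so the parameter count is independent of the word vocabulary.

```python
# A minimal sketch, not the paper's code: compose a word embedding
# from its characters with an LSTM. Names and dimensions are illustrative.
import torch
import torch.nn as nn

class CharToWord(nn.Module):
    def __init__(self, n_chars, char_dim=50, word_dim=100):
        super().__init__()
        # Only a character lookup table is needed; its size is
        # independent of the size of the word vocabulary.
        self.char_emb = nn.Embedding(n_chars, char_dim)
        self.lstm = nn.LSTM(char_dim, word_dim, batch_first=True)

    def forward(self, char_ids):            # char_ids: (batch, word_len)
        chars = self.char_emb(char_ids)     # (batch, word_len, char_dim)
        _, (h, _) = self.lstm(chars)        # final hidden state per word
        return h[-1]                        # (batch, word_dim) word vectors

# Usage: embed the word "cats" given a (hypothetical) char-to-index map.
char2idx = {c: i for i, c in enumerate("abcdefghijklmnopqrstuvwxyz")}
model = CharToWord(n_chars=len(char2idx))
ids = torch.tensor([[char2idx[c] for c in "cats"]])
print(model(ids).shape)  # torch.Size([1, 100])
```

Because the LSTM reads characters in sequence, morphologically related words such as "cat" and "cats" receive related representations, which a per-word lookup table cannot guarantee.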
NLP with characters
April 28, 2015
1:00 pm