Word embeddings, such as Word2Vec or GloVe, are vector representations that capture lexical-semantic properties of words. They offer a practical way of transferring knowledge between machine learning models, and they greatly reduce the learning time required to solve various NLP tasks. There is great practical interest in experimenting with different word embedding models. Neural models, thanks to their flexibility, are a good framework for that experimentation. However, that same flexibility also introduces many degrees of freedom, which become a challenge in themselves. In this talk, we will present Syntagma, a Python toolkit (still under development) that enables rapid experimentation with neural word embedding models. We will present preliminary results of experimenting with some of the hyper-parameters of a baseline word embedding model (similar to Word2Vec), and we will discuss the next steps for Syntagma.
Going Neurotic With Neural Word Embeddings… again!
July 18, 2017
1:00 pm
Luís Sarmento
Luís Sarmento holds a PhD in Computer Science from the University of Porto (2010), with a background in Electrical Engineering (BSc and MSc) and Artificial Intelligence (MSc). He has been working in the fields of Natural Language Processing and Search for about 15 years, both as a member of research groups at the University of Porto and in industry. In 2010 he joined Portugal Telecom / SAPO as tech lead for Big Data and Recommender Systems, and in 2012 he joined Amazon, where, until early 2017, he led research teams in the fields of Query Understanding and Voice Shopping. He is now CTO of Tonic App (http://www.tonicapp.com/), a startup developing productivity tools for medical doctors.

