Exploring Label Structure and Spatial Attention for Fashion Images Classification

May 26, 2020

1:00 pm

To make decisions, for instance when purchasing a product, people rely on rich and accurate descriptions, which entail multi-label retrieval processes. However, multi-label classification is challenged by high-dimensional, complex feature spaces and by its dependency on large, accurately annotated datasets. Deep learning approaches have brought a definite breakthrough in performance across numerous machine learning problems, and image classification is undoubtedly one of the tasks where these approaches have had the greatest impact. In this presentation we will focus on the classification of fashion images, using deep learning approaches to tackle the multi-class/multi-label problems in order to generate rich image descriptions. Fashion datasets are challenging because they include a vast number of similar-looking images and are annotated with a large diversity of attributes but with few labels per exemplar. To address these issues, we explore domain knowledge to constrain the (otherwise completely data-driven) solutions. Specifically, we first show how to incorporate knowledge about the annotation structure. Second, we use context and semantic localization to guide an attention mechanism that shapes the feature space by focusing on visually meaningful regions. We show through thorough experimentation the performance gains achieved in both cases.

Beatriz Ferreira

Beatriz Quintino Ferreira is a PhD student in the NETSyS program, in the Signal and Image Processing Group at Instituto de Sistemas e Robótica. Her main research interests lie at the intersection of Computer Vision and Machine Learning. She is also an advocate of interpretable models, as she deems interpretability to be fundamental to the development of richer and more robust models that are more easily comprehended by humans. She has been a PhD student intern at Farfetch and a visiting scholar at CMU. Some of her recent publications have appeared at KDD and in ICCV workshops.

ISR/IST

Seminars

Latest seminars

Unlocking Latent Discourse Translation in LLMs Through Quality-Aware Decoding
June 17, 2025
Large language models (LLMs) have emerged as strong contenders in machine translation. Yet, they often fall behind specialized neural machine…
Speech as a Biomarker for Disease Detection
May 20, 2025
Today’s overburdened health systems face numerous challenges, exacerbated by an aging population. Speech emerges as a ubiquitous biomarker with strong…
Enhancing Uncertainty Estimation in Neural Networks
May 6, 2025
Neural networks are often overconfident about their predictions, which undermines their reliability and trustworthiness. In this presentation, I will present…
Improving Evaluation Metrics for Vision-and-Language Models
April 22, 2025
Evaluating image captions is essential for ensuring both linguistic fluency and accurate semantic alignment with visual content. While reference-free metrics…
