Linguistic Benchmarks of Online News Article Quality

April 4, 2017

1:00 pm

Online news editors repeatedly ask themselves the same question: what is this news article missing before it can go online? This question is not easy to answer with computational linguistic methods. In this work, we address it by characterising the constituents of a news article’s editorial quality. More specifically, we identify 14 aspects related to the content of news articles. Through a correlation analysis, we quantify how independent these aspects are and how they relate to an article’s editorial quality. We also demonstrate that the identified aspects, when combined, can be used effectively in quality control methods for online news.

Filipa Peleja

Filipa Peleja is a data scientist in the Big Data Analytics team at Vodafone. She holds a Ph.D. in Computer Science, having studied topics in machine learning, information retrieval, natural language processing, sentiment analysis and recommendation systems. During her Ph.D. she completed a nine-month internship at Yahoo! Labs and worked as a data scientist researcher at Eurecat Technology Centre of Catalonia. She has been involved in several computer software projects in collaboration with private companies, public institutions and academia. She studied at the Faculty of Science and Technology of the NOVA University of Lisbon, where she obtained her B.Sc. and M.Sc. in Computer Software/Informatics Engineering. She has also worked as a Business Consultant at IP2CS and as a researcher at the research centers CITI and NOVALINCS. Filipa Peleja has published several scientific articles and demos in international conferences.

