Large-scale models are trained on massive amounts of data, yet the secrecy surrounding training datasets makes it difficult to determine whether specific content was included. In this talk, I introduce two novel approaches for addressing this challenge in the context of large language and vision-language models.
First, I present DE-COP, a method designed to detect whether copyrighted text has been included in a language model’s training data. By leveraging multiple-choice questions that contrast verbatim text with its paraphrases, DE-COP effectively exposes memorization, significantly outperforming prior methods. Unlike most existing training data detectors, it does not rely on access to token probabilities, making it fully applicable to black-box models.
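To make the probe concrete, here is a minimal sketch of a DE-COP-style multiple-choice trial. Everything here is illustrative: the black-box `query_model` callable, the prompt wording, the four-option format, and the answer parsing are assumptions for exposition, not the paper's exact protocol.

```python
import random
from typing import Callable, Sequence

def decop_trial(query_model: Callable[[str], str],
                verbatim: str,
                paraphrases: Sequence[str],
                seed: int = 0) -> bool:
    """One DE-COP-style multiple-choice probe (illustrative sketch).

    Shuffles the verbatim passage among its paraphrases, asks the model
    which option is an exact quote, and returns True if the model picks
    the verbatim one. Only the generated answer text is needed, so this
    works on black-box models without token probabilities.
    """
    options = [verbatim, *paraphrases]
    random.Random(seed).shuffle(options)
    letters = "ABCD"[: len(options)]
    prompt = (
        "Which of the following passages is an exact quote from the book?\n"
        + "\n".join(f"{l}. {o}" for l, o in zip(letters, options))
        + "\nAnswer with a single letter."
    )
    answer = query_model(prompt).strip().upper()[:1]
    if answer not in list(letters):  # empty or malformed reply counts as a miss
        return False
    return options[letters.index(answer)] == verbatim

def detection_rate(query_model: Callable[[str], str],
                   trials: Sequence[tuple[str, Sequence[str]]]) -> float:
    """Fraction of trials where the verbatim option is chosen. A rate well
    above chance (1/k for k options) is evidence of memorization."""
    hits = sum(decop_trial(query_model, v, ps, seed=i)
               for i, (v, ps) in enumerate(trials))
    return hits / len(trials)
```

In practice, this would be repeated over many passages from a suspect work, with text the model cannot have seen (e.g., published after its training cutoff) serving as a chance-level baseline.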
Second, I extend this investigation to vision-language models with DIS-CO, a new approach for identifying copyrighted visual content in training data. DIS-CO queries models with frames from movies and evaluates whether they can correctly guess the corresponding titles in free-form text generation. Using our MovieTection benchmark, built from 14,000 frames across various films, we find that many popular VLMs display clear signs of memorization, raising broader concerns about AI training practices and copyright compliance.
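A minimal sketch of the DIS-CO-style evaluation loop follows. The `query_vlm` callable (taking an image path and a prompt), the prompt wording, and the exact-match scoring are assumptions for illustration; the benchmark's actual title-matching may be more forgiving.

```python
from typing import Callable

def disco_frame_accuracy(query_vlm: Callable[[str, str], str],
                         frames: list[tuple[str, str]]) -> float:
    """DIS-CO-style probe (illustrative sketch): show each movie frame,
    ask for the title in free-form generation, and score matches.

    `frames` holds (image_path, true_title) pairs. Exact case-insensitive
    matching is an assumption here, chosen for simplicity. A high hit rate
    on frames that occur only in the film itself suggests the movie was
    part of the model's training data.
    """
    prompt = "What movie is this frame from? Answer with the title only."
    hits = 0
    for image_path, true_title in frames:
        guess = query_vlm(image_path, prompt)
        if guess.strip().lower() == true_title.strip().lower():
            hits += 1
    return hits / len(frames)
```

Because the model must produce the title unprompted rather than rank candidates, the test needs no access to logits and applies to fully black-box VLM APIs, mirroring the design of DE-COP.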