Empirical Insights from Code: Data Mining and Machine Learning for Software Security and Quality
- Speaker(s)
- Dr. Piotr Przymus
- Affiliation
- Uniwersytet Mikołaja Kopernika w Toruniu
- Language of the talk
- English
- Date
- May 23, 2025, 4 p.m.
- Room
- room 4060
- Link
- https://meet.google.com/jbj-tdsr-aop
- Seminar
- Seminar Intelligent Systems
This talk explores how machine learning and data mining
techniques can advance empirical software engineering, with a focus on
improving software quality and security. It presents methods for bug
localization and code reviewer recommendation, analyzes vulnerability
lifecycles (including those involving transitive dependencies), and
introduces tools for studying real-world bug-fixing practices. The
evolving role of large language models (LLMs) is examined from dual
perspectives: as development aids and as potential sources of new
security threats. A case study of the XZ Utils supply chain attack,
informed by insights from repository mining, illustrates how modern
development workflows can be exploited. These findings highlight the
potential of data-driven research to enhance the development of secure
and trustworthy software systems.