Bias Propagation in Large Scale Machine Learning Pipelines in the Pharmaceutical Sector

Authors

  • Eitan Rosenfeld, Holon Institute of Technology, Israel
  • Lior Katz, Shamoon College of Engineering, Israel
  • Fredrik Heintz, Linköping University, Sweden
  • Francesco Piccialli, University of Naples Federico II, Italy

Keywords:

Bias propagation, machine learning pipelines, pharmaceutical analytics, clinical decision support systems, artificial intelligence, data governance, explainable AI

Abstract

Machine learning pipelines in the pharmaceutical sector increasingly influence discovery, clinical decision support, safety monitoring, and operational planning. While these systems promise efficiency and scale, they also introduce complex mechanisms through which bias is accumulated, amplified, and propagated across interconnected data and model layers. Unlike bias in an isolated model, pipeline-level bias emerges from interactions between data acquisition, preprocessing, feature engineering, learning architectures, and deployment feedback loops. This work presents a systematic investigation of bias propagation in large-scale pharmaceutical machine learning pipelines. We propose a formal pipeline bias decomposition framework, introduce quantitative propagation metrics, and demonstrate how bias evolves across discovery, development, and post-market surveillance workflows. Experimental results highlight measurable distortions in risk prediction, patient stratification, and adverse event detection. The study emphasizes the need for architecture-aware mitigation strategies that extend beyond single-model interventions.

Published

2022-04-15