The quality of semantic image segmentation models can degrade under external factors such as weather or time of day, and the resulting errors can be safety-critical. In this work, we propose a systematic approach to detect and alleviate such weaknesses. We first evaluate a semantic segmentation model under different external factors and analyze which factors have the largest impact on performance. We then collect new training data under the most harmful factors and fine-tune the model. To study the approach, we use the CARLA simulator to obtain driving data under various environment settings and deploy a state-of-the-art semantic segmentation model in two distinct driving environments. Applying the proposed process in each environment, we identify the factors that affect model performance the most, collect new training data under those factors, and fine-tune the model. The proposed approach outperforms collecting the same amount of random additional data by up to 10.6%. Our results show the benefit of iterative refinement over merely collecting larger data sets. Finally, we use the knowledge of which factors affect performance the most to train a simple decision tree classifier that predicts the model’s performance from the current external factors. Problematic environments are detected with an average accuracy of 87.5%.
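
The factor-analysis step can be illustrated with a minimal sketch. It assumes each evaluation run is stored as a pair of external-factor settings and a quality score (for example mean IoU), and it ranks each factor by the spread between its best and worst mean score. The function name `rank_factors`, the record layout, and the spread criterion are hypothetical choices made for illustration and are not taken from this work; the scores are made-up values, not reported results.

```python
from collections import defaultdict
from statistics import mean


def rank_factors(records):
    """Rank external factors by how much the mean score varies across their values.

    records: list of (factors, score) pairs, where factors is a dict such as
    {"weather": "rain", "time_of_day": "night"} and score is a per-run quality
    metric (for example mean IoU). Returns (factor, spread, worst_value) tuples,
    largest spread first.
    """
    # scores[factor][value] collects every score observed under that setting.
    scores = defaultdict(lambda: defaultdict(list))
    for factors, score in records:
        for name, value in factors.items():
            scores[name][value].append(score)

    ranking = []
    for name, by_value in scores.items():
        means = {value: mean(vals) for value, vals in by_value.items()}
        worst = min(means, key=means.get)
        # Spread between the best and the worst mean score (assumed criterion).
        spread = max(means.values()) - means[worst]
        ranking.append((name, spread, worst))
    return sorted(ranking, key=lambda item: item[1], reverse=True)


# Made-up illustrative scores, not results from the paper.
example_records = [
    ({"weather": "clear", "time_of_day": "noon"}, 0.80),
    ({"weather": "clear", "time_of_day": "night"}, 0.70),
    ({"weather": "rain", "time_of_day": "noon"}, 0.50),
    ({"weather": "rain", "time_of_day": "night"}, 0.40),
]

for name, spread, worst in rank_factors(example_records):
    print(f"{name}: spread={spread:.2f}, worst value={worst}")
```

In this sketch the worst-scoring value of the top-ranked factor is the natural candidate for targeted data collection. The ranking treats each factor marginally, so it would miss interactions between factors.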