Toward Data-driven, Semi-automatic Inference of Phenomenological Physical Models: Application to Eastern Sahel Rainfall

First-principles based predictive understanding of complex, dynamic physical phenomena, such as regional precipitation or hurricane intensity and frequency, is quite limited due to the lack of complete phenomenological models underlying their physics. To address this gap, hypothesis-driven, manually constructed conceptual hurricane models and models for regional-scale precipitation extremes have been emerging. To complement both approaches, we propose a methodology for data-driven, semi-automatic inference of plausible phenomenological models and apply it to derive a model for eastern Sahel rainfall, an important factor for socioeconomic growth and development of this region. At its core, our methodology derives cause-effect relationships using the Lasso multivariate regression model and quantifies the compound effect that the complex interplay among the key predictors at their prominent temporal phases has on the response (rainfall). Specifically, we propose methods for (a) detecting and ranking predictors' prominent temporal phases, (b) optimizing the regularization penalty, (c) assessing predictor statistical significance, (d) performing impact analysis of data normalization on model inference, and (e) calculating the ECI (Expected Causality Impact) score to quantify impact analysis. The culmination of this study is a plausible phenomenological model of the eastern Sahel seasonal rainfall, together with quantified key climate drivers involved in the rainfall variability at different time lags. To the best of our knowledge, this is the first phenomenological model of this phenomenon; several of its components are consistent with the known evidence from the literature. [pdf]
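As a rough illustration of the core idea in the abstract (Lasso regression over predictors taken at several time lags), the following minimal sketch fits a Lasso model by coordinate descent on synthetic data. It is not the authors' implementation: the predictor names `index_a` and `index_b`, the lag set, the penalty value, and the synthetic "rainfall" series are all assumptions made for illustration, and the steps for penalty optimization, significance assessment, and the ECI score are not reproduced because the abstract does not define them.

```python
import random


def soft_threshold(value, threshold):
    """Soft-thresholding operator used by coordinate-descent Lasso."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def standardize(column):
    """Center a column to zero mean and scale it to unit variance."""
    n = len(column)
    mean = sum(column) / n
    std = (sum((v - mean) ** 2 for v in column) / n) ** 0.5
    return [(v - mean) / std for v in column]


def lasso_coordinate_descent(columns, response, penalty, sweeps=200):
    """Minimize (1/2n)*||y - Xb||^2 + penalty*||b||_1 for standardized columns."""
    n = len(response)
    betas = [0.0] * len(columns)
    residual = list(response)  # equals y - Xb; b starts at zero
    for _ in range(sweeps):
        for j, col in enumerate(columns):
            # Correlation of column j with the residual after adding column j back.
            rho = sum(c * (r + c * betas[j]) for c, r in zip(col, residual)) / n
            new_beta = soft_threshold(rho, penalty)  # columns have unit variance
            delta = new_beta - betas[j]
            if delta != 0.0:
                residual = [r - c * delta for c, r in zip(col, residual)]
                betas[j] = new_beta
    return betas


def lagged_columns(series_by_name, lags, start, end):
    """Return {(name, lag): column} with column[i] = series[start + i - lag]."""
    return {
        (name, lag): [series[t - lag] for t in range(start, end)]
        for name, series in series_by_name.items()
        for lag in lags
    }


# Synthetic monthly indices (hypothetical stand-ins for climate predictors).
rng = random.Random(0)
n_steps = 300
series = {
    "index_a": [rng.gauss(0, 1) for _ in range(n_steps)],
    "index_b": [rng.gauss(0, 1) for _ in range(n_steps)],
}
lags = [1, 2, 3]
start = max(lags)

# Assumed ground truth: the response depends on index_a at lag 2 and index_b at lag 1.
rainfall = [
    2.0 * series["index_a"][t - 2] - 1.5 * series["index_b"][t - 1] + rng.gauss(0, 0.3)
    for t in range(start, n_steps)
]

features = lagged_columns(series, lags, start, n_steps)
keys = list(features)
X = [standardize(features[k]) for k in keys]
mean_y = sum(rainfall) / len(rainfall)
y = [v - mean_y for v in rainfall]

betas = lasso_coordinate_descent(X, y, penalty=0.3)
selected = {k: round(b, 3) for k, b in zip(keys, betas) if b != 0.0}
print(selected)
```

In this toy setting the L1 penalty drives the coefficients of the uninformative (predictor, lag) pairs to exactly zero, which is the property that makes Lasso usable for picking out prominent temporal phases. The penalty is fixed by hand here, whereas the paper proposes a method for optimizing it.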

Authors: Saurabh V. Pendse, Isaac K. Tetteh, Fredrick Semazzi, Vipin Kumar, Nagiza F. Samatova

Supplement Files

Supplement 1 : Beta coefficients for predictors over different temporal phases (Methods A, B, C, D and E). Download here
Supplement 2 : Bar graphs depicting the outputs of the Lasso algorithm (beta coefficients) with statistical significance assessment. Download here
Supplement 3 : Tables representing prominent temporal phases selections for all indices (Methods A, B, C and D). Download here
Supplement 4 : Tables representing significant betas corresponding to the bar charts in Supplement 2. Download here
Supplement 5 : Monthly probabilities of influence of predictors on rainfall as the response. Download here
Supplement 6 : Cumulative probability distribution for detection of predictors and the ECI scores for Methods A, B, C, D and E. Download here

For more information contact: Dr. Nagiza Samatova - samatova@csc.ncsu.edu