Vishal Subedi is a first-year PhD student at UMBC whose interests lie in applied statistics, machine learning, and deep learning.
Domestic crime, conflict, and instability pose a significant threat to many contemporary governments. These challenges have proven particularly acute in modern-day Mexico. While there have been significant developments in predicting intrastate armed and electoral conflict in various contemporary settings, such efforts have so far made limited use of spatial and temporal correlations and have considered a limited set of features. Machine learning, especially deep learning with word embeddings in Convolutional Neural Networks (CNNs), has proven highly effective in predicting future conflicts, but it lacks spatial structure and, owing to its black-box nature, cannot explain the importance of predictors.
We develop a novel machine learning methodology that can accurately classify future anti-government violence in Mexico. We further demonstrate that our approach can identify important leading predictors of such violence. This can help policymakers make informed decisions and help governments and NGOs better allocate security and humanitarian resources, which could prove beneficial in tackling this problem. Using a variety of political event aggregations from the ICEWS database alongside other textual and demographic features, we trained various classical machine learning algorithms, including but not limited to Logistic Regression, Random Forest, XGBoost, and a Voting Classifier.
The research developed stepwise, with each phase building on the shortcomings of the previous ones. In the first phase, we considered a combination of CNN and Long Short-Term Memory (LSTM) networks to decode the spatial and temporal relationships in the data. None of the black-box deep learning models performed on par with the classical machine learning models. The second phase analyses the temporal relationships in the data to identify the dependence of conflicts over time and their lagged relationships. This also serves to reduce the dimension of the feature space by removing variables not covered by the cutoff lag. The third phase covers the general variable selection methods used to reduce the feature space further and to identify the important predictors that fuel anti-government violence, along with their directional effects, using Shapley additive explanation (SHAP) values. The Voting Classifier, using a subset of features selected by LASSO across 100 simulations, consistently outperforms the alternative models and accurately classifies future anti-government conflicts. Notably, Random Forest feature importance indicates that features including, but not limited to, homicides, accidents, material conflicts, and positively worded citizen information sentiments emerge as pivotal predictors of anti-government conflicts.
Finally, in the fourth phase, we conclude the research by analysing the spatial structure of the data. We use an extended version of Moran's I index for spatiotemporal data to identify global spatial dependency and local clusters, then model the data spatially and evaluate the result using Gaussian Process Boosting (GPBoost). The global spatial autocorrelation is minimal, with conflicts forming localized clusters within the region. Furthermore, the Voting Classifier outperforms GPBoost, leading to the inference that no substantial spatial dependency exists among the various locations.
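As a rough illustration of the third phase, the sketch below pairs LASSO-based feature selection with a soft-voting ensemble in scikit-learn. It is not the authors' pipeline. The data are synthetic, the hyperparameters are arbitrary, `GradientBoostingClassifier` stands in for XGBoost, and a single Lasso fit on the 0/1 label stands in for the selection repeated across 100 simulations. All of these are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    VotingClassifier,
)
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso, LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the event, text, and demographic features (assumption).
X, y = make_classification(
    n_samples=600, n_features=40, n_informative=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# L1-penalised linear model; features whose coefficient shrinks to zero are dropped.
selector = SelectFromModel(Lasso(alpha=0.02))

# Soft voting averages the predicted class probabilities of the base models.
voter = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("rf", RandomForestClassifier(n_estimators=200, random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),  # stand-in for XGBoost
    ],
    voting="soft",
)

model = Pipeline(
    steps=[("scale", StandardScaler()), ("select", selector), ("vote", voter)]
)
model.fit(X_train, y_train)

n_selected = int(model.named_steps["select"].get_support().sum())
accuracy = model.score(X_test, y_test)
print(f"features kept: {n_selected} of {X.shape[1]}, test accuracy: {accuracy:.3f}")
```

Putting the selector inside the pipeline means it is fit on the training split only, so the held-out score is not contaminated by feature selection on test data.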
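For readers unfamiliar with the spatial statistic used in the fourth phase, here is a minimal sketch of the classic global Moran's I, I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, where S0 is the sum of all weights. It is the plain cross-sectional statistic, not the spatiotemporal extension used in the talk. The 4x4 grid, the rook-adjacency weights, and the helper names `morans_i` and `rook_weights` are hypothetical choices for the example.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and an n x n spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # total weight
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    denom = sum(d * d for d in dev)
    return (n / s0) * (cross / denom)


def rook_weights(rows, cols):
    """Binary weights: cells are neighbours if they share an edge (assumption)."""
    n = rows * cols
    w = [[0.0] * n for _ in range(n)]
    for r in range(rows):
        for c in range(cols):
            i = r * cols + c
            for dr, dc in ((1, 0), (0, 1)):
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    j = rr * cols + cc
                    w[i][j] = w[j][i] = 1.0
    return w


W = rook_weights(4, 4)
# Left half "conflict", right half "no conflict": spatially clustered values.
clustered = [1.0 if c < 2 else 0.0 for r in range(4) for c in range(4)]
# Alternating cells: every neighbour differs, the most dispersed pattern.
checkerboard = [float((r + c) % 2) for r in range(4) for c in range(4)]

i_clustered = morans_i(clustered, W)
i_checker = morans_i(checkerboard, W)
print(f"clustered: {i_clustered:.3f}, checkerboard: {i_checker:.3f}")
```

Values near zero indicate little global spatial autocorrelation, which is the situation the abstract reports. Positive values indicate clustering of similar values, and negative values indicate dispersion.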
Session Materials: dataworks.test...