02 → Predicting Crop Damages
CONTEXT:
Academic Deep Learning Project
Spring 2024
SKILLS DEVELOPED:
Deep Learning, Neural Networks, Data Development, Data Cleaning
ABOUT:
Small farms, which make up over 90% of U.S. farms, are highly vulnerable to climate change-driven weather extremes, such as droughts and floods. Many of these farms lack insurance and risk management tools, making it difficult to prepare for crop loss events. This project aimed to explore whether weather data could be used to predict crop loss occurrences, providing small-scale farmers with a proactive tool to mitigate risks. Using open-source data from the USDA and NOAA, I built machine learning models to classify whether specific weather patterns would lead to crop losses. The ultimate goal was to develop a predictive model that could serve as the foundation for a digital platform assisting farmers in climate adaptation.
DATA + METHODOLOGY:
A significant portion of this project was dedicated to data collection and processing. I combined two primary datasets: (1) USDA's crop loss records (1989-present), which document crop damages and their causes, and (2) NOAA’s historical monthly weather summaries, detailing extreme temperatures, precipitation, and seasonal variations. Data preprocessing involved mapping counties to their nearest weather stations, filtering out non-weather-related losses (e.g., disease, pests), and structuring input features to capture four-month weather trends leading up to a crop loss event. The modeling process evolved through multiple iterations, starting with logistic regression as a baseline, followed by a feedforward neural network (FNN), and ultimately an LSTM-based recurrent neural network (RNN), which captured temporal dependencies in weather patterns.
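The two preprocessing steps described above (matching each county to its nearest weather station, and assembling the four months of weather leading up to an event) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the station IDs, the coordinates, and the feature values are all hypothetical. It also assumes that "nearest" means great-circle distance from a county centroid and that the four-month window includes the month of the event.

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearest_station(county_centroid, stations):
    """Return the id of the station closest to a (lat, lon) county centroid.

    `stations` maps a station id to its (lat, lon) pair.
    """
    lat, lon = county_centroid
    return min(stations, key=lambda sid: haversine_km(lat, lon, *stations[sid]))

def four_month_window(monthly, year, month, n_months=4):
    """Collect the n_months feature vectors ending at (year, month), oldest first.

    `monthly` maps (year, month) to a list of weather features.
    Returns None if any month in the window is missing.
    """
    window = []
    for back in range(n_months - 1, -1, -1):
        idx = year * 12 + (month - 1) - back  # months counted from year 0
        key = (idx // 12, idx % 12 + 1)
        if key not in monthly:
            return None
        window.append(monthly[key])
    return window

# Hypothetical stations and county centroid (illustrative values only).
stations = {"STN_A": (41.0, -93.0), "STN_B": (35.0, -100.0)}
closest = nearest_station((41.6, -93.6), stations)

# Hypothetical monthly features: [max temp (C), total precipitation (mm)].
monthly = {
    (2019, 11): [12.0, 40.0],
    (2019, 12): [3.0, 25.0],
    (2020, 1): [-1.0, 30.0],
    (2020, 2): [2.0, 18.0],
}
window = four_month_window(monthly, 2020, 2)  # Nov 2019 through Feb 2020
```

Each window is a short sequence of per-month feature vectors (4 time steps by however many features are used), which is the shape a recurrent model such as an LSTM expects as input.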
RESULTS:
Early modeling attempts with logistic regression and the FNN showed limited accuracy in classifying specific causes of crop loss, largely because the categories overlap (e.g., drought vs. heat-related losses). Reframing the task as a simpler binary classification (loss vs. no-loss) yielded significantly better results. The final LSTM-based RNN model achieved 89% overall accuracy, with 93% accuracy for no-loss predictions and 84% accuracy for loss predictions. This highlighted the potential for time-series models to provide meaningful insights into climate-related agricultural risks.
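To show how an overall accuracy figure relates to the per-class figures, here is a small sketch that computes both from true and predicted labels. It assumes the per-class numbers are the share of each true class that was predicted correctly (per-class recall); the write-up does not specify this, so treat that reading as an assumption. The function name and the labels below are made up for illustration and are not the project's data or results.

```python
def binary_report(y_true, y_pred):
    """Overall accuracy plus per-class accuracy for labels 0 = no-loss, 1 = loss.

    Assumes both classes appear at least once in y_true.
    """
    counts = {0: [0, 0], 1: [0, 0]}  # label -> [correct, total]
    for t, p in zip(y_true, y_pred):
        counts[t][1] += 1
        if t == p:
            counts[t][0] += 1
    total = sum(c[1] for c in counts.values())
    correct = sum(c[0] for c in counts.values())
    return {
        "overall": correct / total,
        "no_loss": counts[0][0] / counts[0][1],
        "loss": counts[1][0] / counts[1][1],
    }

# Made-up labels for illustration only.
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
y_pred = [0, 0, 0, 1, 1, 1, 1, 0, 1, 0]
report = binary_report(y_true, y_pred)
```

Reporting the per-class values alongside the overall figure matters when the classes are imbalanced, since a model can score well overall while missing many of the rarer loss events.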
FUTURE DIRECTIONS + APPLICATIONS:
Although the project primarily served as a proof-of-concept, its findings have practical implications for small-scale farming communities. The predictive model could be integrated into a digital risk management tool that provides farmers with real-time insights into climate-related threats. Beyond simple risk alerts, such a platform could offer crop selection recommendations based on resilience to local weather patterns and facilitate farmer-to-farmer knowledge sharing. Future directions include expanding the dataset to cover additional states, incorporating more granular weather variables (e.g., humidity, consecutive frost days), and refining predictions to estimate the extent of crop damage rather than just its occurrence.