Pragmatic and ML Approaches to Backfilling Missing Data Within Time Series Datasets
Document Type
Conference Proceeding
Publication Date
2024
Abstract
This paper explores innovative approaches to addressing data gaps in time series datasets, specifically focusing on monthly mean water levels along the US East and Gulf of Mexico coasts. We employ both pragmatic and machine learning (ML) techniques to backfill missing data, enhancing the reliability of historical water level records that are essential for climate and coastal hazard studies. Our pragmatic approach includes a comprehensive framework for distance calculation, correlation analysis, and relative intensity measurement, providing a robust baseline for data imputation. In parallel, we develop advanced ML models, including ensemble methods such as Voting and Stacking Regressors, which significantly outperform traditional techniques in hindcasting missing values. These models leverage feature engineering and data augmentation, incorporating climatic indices and temporal features to capture complex patterns in the data. The results demonstrate that the combined approach not only fills data gaps effectively but also offers a versatile solution for improving the quality of environmental time series datasets, contributing to better understanding and forecasting of coastal water level variations. © 2025 Elsevier B.V., All rights reserved.
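As a rough illustration of the kind of pragmatic baseline the abstract describes (distance calculation plus correlation analysis between stations), the sketch below backfills gaps at one gauge from a nearby gauge. It is not the authors' framework: the function names, the use of great-circle distance, Pearson correlation, and a least-squares line are all assumptions made for illustration, and the sample values are made up.

```python
import math
from statistics import mean


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    r = 6371.0  # mean Earth radius in km (assumed constant)
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (neither may be constant)."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def backfill(target, neighbor):
    """Fill None entries in `target` from `neighbor` using a least-squares line.

    The line is fitted only on months where both stations have a value.
    Months missing at both stations stay None.
    """
    pairs = [(n, t) for t, n in zip(target, neighbor)
             if t is not None and n is not None]
    xs = [n for n, _ in pairs]
    ys = [t for _, t in pairs]
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    intercept = my - slope * mx
    filled = []
    for t, n in zip(target, neighbor):
        if t is not None:
            filled.append(t)                      # keep observed value
        elif n is not None:
            filled.append(intercept + slope * n)  # estimate from neighbor
        else:
            filled.append(None)                   # nothing to estimate from
    return filled


# Hypothetical monthly mean water levels (metres) at two stations.
neighbor_levels = [1.0, 2.0, 3.0, 4.0, 5.0]
target_levels = [2.5, None, 6.5, 8.5, None]
filled_levels = backfill(target_levels, neighbor_levels)
```

In practice one would presumably rank candidate neighbor stations by `haversine_km` and by `pearson` over their overlapping months before choosing which one to regress against; how the paper combines these measures, and how it defines relative intensity, is not specified in the abstract.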
Recommended Citation
Brown, Taylor J.; Wilkerson, Matthew; Blanton, Brian O.; and Bhattacharya, Sambit, "Pragmatic and ML Approaches to Backfilling Missing Data Within Time Series Datasets" (2024). College of Health, Science, and Technology. 1159.
https://digitalcommons.uncfsu.edu/college_health_science_technology/1159