Abstract
Cloudbursts pose a significant threat in India, especially during the South-West Monsoon season, which begins in June. They occur sporadically across India's diverse climatic regions, including the northern Himalayan region, the Indo-Gangetic Plain, the southern peninsula, and coastal areas; only 31 instances have been recorded, mainly in Himachal Pradesh, Uttarakhand, and Jammu and Kashmir. To address the lack of comprehensive Indian cloudburst data, we have curated a dataset of meteorological factors for cloudburst prediction. The dataset includes Temperature, Wind Gust, Wind Gust Speed, Humidity, Monsoon patterns, Air Pressure, and Cloud Density. Our goal is to improve preparedness and mitigation strategies, safeguarding lives and property in cloudburst-prone areas. Using optimized machine learning algorithms, our model analyzes these parameters alongside prevailing weather conditions to predict cloudburst events. We evaluate the prediction performance of machine learning algorithms, including k-nearest neighbors (KNN); KNN outperformed the others with an accuracy of 86.18%. We also present graphical insights into the correlation between humidity and cloudburst occurrence, underscoring the importance of weather variables in prediction models. This research contributes to cloudburst forecasting despite the limited Indian data and highlights the potential of diverse machine learning techniques for improved accuracy.
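As a minimal illustration of the KNN classification step, the sketch below scales a few meteorological features to a common range and labels a new observation by majority vote among its nearest neighbors. The feature rows, the labels, the choice of k = 3, and the helper names `min_max_scale`, `scale_query`, and `knn_predict` are all illustrative assumptions; none of them come from the curated dataset or the tuned model.

```python
import math
from collections import Counter


def min_max_scale(rows):
    """Rescale each column to [0, 1] so no single variable dominates the distance."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    # Guard against zero-width columns to avoid division by zero.
    spans = [(max(c) - min(c)) or 1.0 for c in cols]
    scaled = [[(v - lo) / s for v, lo, s in zip(row, lows, spans)] for row in rows]
    return scaled, lows, spans


def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training rows."""
    dists = []
    for row, label in zip(train_X, train_y):
        # Euclidean distance between the query and one training row.
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(row, query)))
        dists.append((d, label))
    dists.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Made-up rows for illustration only: (temperature in C, humidity in %, pressure in hPa).
X = [
    (18.0, 95.0, 990.0),
    (19.5, 92.0, 992.0),
    (17.0, 97.0, 988.0),
    (28.0, 55.0, 1012.0),
    (30.0, 48.0, 1015.0),
    (26.5, 60.0, 1010.0),
]
y = [1, 1, 1, 0, 0, 0]  # hypothetical labels: 1 = cloudburst, 0 = no cloudburst

scaled_X, lows, spans = min_max_scale(X)


def scale_query(q):
    """Apply the training-set scaling to a new observation."""
    return [(v - lo) / s for v, lo, s in zip(q, lows, spans)]


humid_day = knn_predict(scaled_X, y, scale_query((18.5, 94.0, 991.0)), k=3)
dry_day = knn_predict(scaled_X, y, scale_query((29.0, 50.0, 1013.0)), k=3)
print(humid_day, dry_day)
```

Scaling matters here because KNN relies on raw distances: without it, air pressure (values near 1000) would outweigh humidity and temperature regardless of their predictive value.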