Associations between clusters of states formed by pollutants and by diseases
The values shown below are normalized and do not correspond directly to the actual measured values
Inference
Low count of disease cases ==> Low water pollution
Moderate-High water pollution ==> Moderate count of disease cases
Data Sets
H-DS-1: All India (from 2000 to 2011) and State-wise (2010 and 2011) number of cases and deaths due to specified diseases (Acute Diarrhoeal Diseases, Malaria, Acute Respiratory Infection, Japanese Encephalitis, Viral Hepatitis)
W-DS-2: Status of Water Quality in India - 2008 and 2011
Methodology
For every State:
Consider the “disease cases/deaths per unit population” and the “water quality factors per unit population”, each normalized to the range [0,1]
Generate clusters of states using k-means with k=5 (chosen from the elbow curve), considering:
normalized counts of cases for the different diseases => resulting in cluster set A
normalized values of the different water quality factors => resulting in cluster set B
Using Apriori association rule mining, uncover associations between cluster sets A and B, reporting each rule with its confidence and lift, based on state-level pollutant and disease data
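The normalization and rule-scoring steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the function names min_max and rule_metrics and the six-state labels are hypothetical, the k-means step is assumed to have already produced one cluster label per state, and the code scores only single-label rules (cluster in set A => cluster in set B) rather than running the full Apriori search with minimum-support pruning.

```python
from collections import Counter


def min_max(values):
    """Rescale a list of numbers to the range [0, 1].

    A constant column has no spread, so it is mapped to 0.0 throughout.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def rule_metrics(antecedent, consequent):
    """Score every observed rule (A = a) => (B = b).

    antecedent and consequent hold one cluster label per state, in the
    same state order. Returns {(a, b): (support, confidence, lift)}.
    """
    n = len(antecedent)
    count_a = Counter(antecedent)
    count_b = Counter(consequent)
    count_ab = Counter(zip(antecedent, consequent))

    rules = {}
    for (a, b), joint in count_ab.items():
        support = joint / n                  # P(a and b)
        confidence = joint / count_a[a]      # P(b | a)
        lift = confidence / (count_b[b] / n) # P(b | a) / P(b)
        rules[(a, b)] = (support, confidence, lift)
    return rules


# Hypothetical cluster labels for six states (illustrative only,
# not taken from H-DS-1 or W-DS-2).
disease_clusters = ["low", "low", "moderate", "moderate", "low", "moderate"]
water_clusters = ["low", "low", "high", "moderate", "low", "high"]

rules = rule_metrics(disease_clusters, water_clusters)
for (a, b), (support, confidence, lift) in sorted(rules.items()):
    print(f"disease={a} => water={b}: "
          f"support={support:.2f}, confidence={confidence:.2f}, lift={lift:.2f}")
```

A lift above 1 means the two cluster labels occur together more often than they would if disease clusters and water quality clusters were independent; confidence alone can look high simply because the consequent cluster is common. Swapping the two arguments scores the rules in the opposite direction (water quality cluster => disease cluster).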