For every State:

Consider the disease case/death counts per unit population and the water quality factors per unit population, each normalized to the range [0, 1].

Diseases considered are:
- Acute diarrhoea, malaria, Japanese encephalitis, viral hepatitis

Water quality factors considered are:
- Temperature, Dissolved Oxygen, pH, Conductivity, Biochemical Oxygen Demand, Nitrate, Coliform level (all mean values)
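The preprocessing described above can be sketched as follows. This is a minimal illustration that assumes min-max scaling is what maps values onto [0, 1]; the source only states the target range. The function names and the state figures are hypothetical and are not taken from the study.

```python
def per_capita(counts, populations):
    """Divide each state's raw count by its population."""
    return {state: counts[state] / populations[state] for state in counts}


def min_max(values):
    """Rescale a {state: value} mapping linearly onto [0, 1] (assumed scheme)."""
    lo, hi = min(values.values()), max(values.values())
    span = hi - lo
    if span == 0:
        # All states share one value; map everything to 0.0 by convention.
        return {state: 0.0 for state in values}
    return {state: (v - lo) / span for state, v in values.items()}


# Hypothetical figures for illustration only.
cases = {"A": 1200, "B": 300, "C": 4500}
population = {"A": 1_000_000, "B": 500_000, "C": 3_000_000}

rates = per_capita(cases, population)
normalized = min_max(rates)
```

After scaling, the state with the lowest per-capita rate maps to 0 and the state with the highest maps to 1, so each feature contributes on a comparable scale to a distance-based method such as k-means.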

Generate clusters of states using k-means with k=5 (as suggested by the elbow curve), once on the disease features and once on the water quality features listed above.
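As a rough sketch of the clustering step, the block below implements plain Lloyd's k-means in pure Python and records the within-cluster sum of squares (inertia) for several values of k, which is the quantity an elbow curve plots. The toy points, the helper names, and the single random initialization are assumptions made for illustration; a library implementation such as scikit-learn's `KMeans` would normally be preferred in practice.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm; returns (labels, centroids, inertia)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(iters):
        # Assignment step: index of the nearest centroid for every point.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        converged = new_labels == labels
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if converged:
            break
    inertia = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, centroids, inertia


# Hypothetical normalized feature vectors, one per state.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0), (0.9, 1.0), (1.0, 0.9)]

# Inertia per k; the "elbow" is where adding clusters stops paying off.
elbow = {k: kmeans(points, k)[2] for k in range(1, len(points) + 1)}
```

Because Lloyd's algorithm can settle in a local minimum, a single run per k may give a noisy curve; repeating each k with several seeds and keeping the lowest inertia is a common remedy.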

Using Apriori association rules, we uncover the associations between cluster sets A and B, each reported with its confidence and lift, based on the state-level pollutant and disease data.
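To make the rule metrics concrete, the sketch below treats each state as one transaction holding a pair of cluster labels and computes support, confidence, and lift for every single-antecedent rule. It is not a full Apriori implementation, since no candidate generation or pruning is needed for one-to-one rules; the label names (`P0`, `D1`, and so on), the direction of the rule, and the assignments are hypothetical.

```python
from collections import Counter


def rule_metrics(pairs, min_support=0.0):
    """Return {(a, b): (support, confidence, lift)} for rules a -> b.

    pairs: one (label_a, label_b) tuple per state.
    """
    n = len(pairs)
    a_counts = Counter(a for a, _ in pairs)
    b_counts = Counter(b for _, b in pairs)
    ab_counts = Counter(pairs)
    rules = {}
    for (a, b), n_ab in ab_counts.items():
        support = n_ab / n                   # P(a and b)
        if support < min_support:
            continue
        confidence = n_ab / a_counts[a]      # P(b | a)
        lift = confidence / (b_counts[b] / n)  # P(b | a) / P(b)
        rules[(a, b)] = (support, confidence, lift)
    return rules


# Hypothetical cluster labels for six states (illustration only).
assignments = [
    ("P0", "D1"), ("P0", "D1"), ("P0", "D2"),
    ("P1", "D2"), ("P1", "D2"), ("P2", "D1"),
]

rules = rule_metrics(assignments)
```

A lift above 1 means the two clusters co-occur more often than they would if the two clusterings were independent; a lift near 1 suggests no association.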

Associations among the clusters of states formed by pollutants and by diseases

The values shown below are normalized and do not correspond directly to the actual measured values.