ABSTRACT
Typical spatial disease surveillance systems associate a single address with each reported disease case, usually the residence address. Social network data offer a unique opportunity to obtain information on the spatial movements of individuals as well as their disease status as cases or controls. This information makes it possible to identify visited locations with a high risk of infection, even in regions where no one lives, such as parks and entertainment zones. We develop two probability models to characterize the high-risk regions. We use a large Twitter dataset from Brazilian users to search for spatial clusters by analyzing the tweets' locations and textual content. We apply our models to both real-world and simulated data, demonstrating their advantage over the usual spatial scan statistic for this type of data.