Socioeconomic Characteristics of Cancer Mortality in the United States of America: A Spatial Data Mining Approach
Cancer is the second leading cause of death in the United States of America. Though it is generally known that cancer is influenced by the environment, its relation to socioeconomic conditions is still widely debated. This research analyzed the spatial distribution of breast, colorectal, lung, and prostate cancer mortality and the associated socioeconomic characteristics using association rule mining. The mortality patterns were analyzed at the county level for the years 1999-2002 and at the health service area level for the years 1988-1992.
Association rule mining revealed distinct socioeconomic characteristics of cancer mortality. Counties with very high rates of breast cancer mortality also had a very low percentage of whites who walked to work; very high rates of colorectal cancer mortality were associated with a very low percentage of foreign-born population; very high rates of lung cancer mortality were associated with a very low percentage of whites who walked to work; and counties with very high prostate cancer mortality rates had a very low percentage of residents born in the West.
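To make the notion of an association rule concrete, the following is a minimal sketch of how support and confidence could be computed for a rule such as "very low percentage walking to work implies very high breast cancer mortality". The county records, attribute labels, and function names are hypothetical illustrations, not the thesis data or its implementation.

```python
from typing import FrozenSet, List


def support(records: List[FrozenSet[str]], itemset: FrozenSet[str]) -> float:
    """Fraction of records that contain every item in `itemset`."""
    if not records:
        return 0.0
    matches = sum(1 for record in records if itemset <= record)
    return matches / len(records)


def confidence(records: List[FrozenSet[str]],
               antecedent: FrozenSet[str],
               consequent: FrozenSet[str]) -> float:
    """Support of (antecedent and consequent) divided by support of antecedent."""
    base = support(records, antecedent)
    if base == 0.0:
        return 0.0  # the rule never fires, so confidence is reported as 0
    return support(records, antecedent | consequent) / base


# Hypothetical discretized county records (illustrative values only).
counties = [
    frozenset({"walk_to_work=very_low", "breast_mortality=very_high"}),
    frozenset({"walk_to_work=very_low", "breast_mortality=very_high"}),
    frozenset({"walk_to_work=very_low", "breast_mortality=medium"}),
    frozenset({"walk_to_work=high", "breast_mortality=low"}),
]

antecedent = frozenset({"walk_to_work=very_low"})
consequent = frozenset({"breast_mortality=very_high"})

rule_support = support(counties, antecedent | consequent)
rule_confidence = confidence(counties, antecedent, consequent)
print(rule_support, rule_confidence)
```

In this toy example, two of the four counties satisfy both sides of the rule, and two of the three counties matching the antecedent also match the consequent.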
The cancer mortality and socioeconomic variables were discretized using the equal interval, natural breaks, and quantile methods to analyze the impact that the discretization technique has on the patterns obtained through association rule mining. The three discretization techniques produced patterns that involved different rates of cancer mortality and different socioeconomic characteristics. The analysis showed that a natural breaks discretization with five class intervals achieved the highest discretization accuracy, while the equal interval method produced the association rules with the highest support values.
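As an illustration of why the choice of discretization matters, the sketch below assigns the same values to five classes using equal interval and quantile methods. The function names and sample values are assumptions made for this example; natural breaks is omitted because it requires an optimization step, and the quantile version here is a simple rank-based simplification that does not keep tied values in the same class.

```python
from typing import List


def equal_interval(values: List[float], k: int) -> List[int]:
    """Split the range [min, max] into k equal-width classes (0 .. k-1)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k
    if width == 0:
        return [0 for _ in values]  # all values identical: one class
    # The maximum value falls on the upper edge, so clamp it into class k-1.
    return [min(int((v - lo) / width), k - 1) for v in values]


def quantile(values: List[float], k: int) -> List[int]:
    """Assign classes by rank so each class holds roughly the same count."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    classes = [0] * n
    for rank, index in enumerate(order):
        classes[index] = rank * k // n
    return classes


# Hypothetical skewed rates (illustrative values only).
rates = [1.0, 2.0, 3.0, 4.0, 100.0]

equal_classes = equal_interval(rates, 5)
quantile_classes = quantile(rates, 5)
print(equal_classes)
print(quantile_classes)
```

With skewed values like these, equal interval places most observations in the lowest class and leaves the middle classes empty, whereas quantile spreads the observations evenly. Crowded classes make frequent itemsets more likely, which is one plausible reason an equal interval scheme could yield rules with higher support.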
The research also analyzed the effect of scale on the patterns produced by association rule mining. At the county level, breast and lung cancer mortality were associated with mode of transportation to work, whereas colorectal and prostate cancer mortality were associated with place of birth. At the health service area level, the association rules with the highest support values for breast, colorectal, and prostate cancer mortality rates involved a household family characteristic, whereas high lung cancer mortality rates were associated with low educational attainment.
Advisor: Kelley Pace; Gerald M. Knapp; Michael Leitner; Andrew Curtis; Nina S.-N. Lam
School: Louisiana State University in Shreveport
School Location: USA - Louisiana
Source Type: Master's Thesis
Date of Publication: 11/14/2006