Article ID | Journal | Published Year | Pages | File Type |
---|---|---|---|---|
383470 | Expert Systems with Applications | 2013 | 8 Pages |
• We propose CT4RDD, a novel methodology for the analysis and modeling of the digital divide.
• CT4RDD is based on C4.5, a well-known classification tree algorithm from the AI domain.
• Data from the 2010 Mexican Census are used to create and evaluate our methodology.
• Internet presence in households of cities is represented on a discrete value scale.
• CT4RDD produces if-then rules that describe the digital divide profiles of cities.
This paper presents CT4RDD (classification trees for research on digital divide), a novel methodology for the quantitative analysis and modeling of the digital divide phenomenon within a single country. It builds on Quinlan’s well-known C4.5 algorithm for automatically producing classification trees, as implemented in Witten & Frank’s WEKA software toolkit. The methodology is created and evaluated on data from the 2010 Mexican Population and Housing Census, which include a number of variables whose interactions capture aspects of the phenomenon. In particular, the data relate Internet service presence in households to features such as educational and economic levels, gender, age, housing characteristics, and ratios of indigenous population. Discretization is used to represent the percentage of households with Internet in each municipality as a nominal target attribute, from which classification trees are produced. Results suggest that the methodology can produce quantitative profiles that describe similarities and differences among a series of municipality classes with different percentages of households with Internet. The discovered profiles provide scholars, government officials and enterprise managers with valuable insight for research, planning and decision making.
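The two steps named in the abstract, discretizing the Internet percentage into a nominal target and then choosing tree splits in the C4.5 manner, can be illustrated with a minimal Python sketch. It is not the authors' WEKA pipeline: the cut points (10% and 30%), the class labels, the `schooling` attribute and all toy records are hypothetical, and only C4.5's gain-ratio criterion for a single binary split is shown, without tree growing, pruning or missing-value handling.

```python
import math
from collections import Counter


def discretize(pct, cut_points=(10.0, 30.0), labels=("low", "medium", "high")):
    """Map a percentage to a nominal class (cut points are assumed, not from the paper)."""
    for cut, label in zip(cut_points, labels):
        if pct < cut:
            return label
    return labels[-1]


def entropy(classes):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(classes)
    return -sum((n / total) * math.log2(n / total) for n in Counter(classes).values())


def gain_ratio(values, classes, threshold):
    """Gain ratio of the binary split `value <= threshold` on a numeric attribute."""
    left = [c for v, c in zip(values, classes) if v <= threshold]
    right = [c for v, c in zip(values, classes) if v > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    total = len(classes)
    weights = [len(left) / total, len(right) / total]
    info_gain = entropy(classes) - (weights[0] * entropy(left) + weights[1] * entropy(right))
    split_info = -sum(w * math.log2(w) for w in weights)
    return info_gain / split_info


# Made-up toy records: (mean years of schooling, % of households with Internet)
municipalities = [(5.1, 2.0), (6.0, 8.5), (7.2, 14.0), (8.0, 22.0), (9.5, 35.0), (10.4, 48.0)]
schooling = [s for s, _ in municipalities]
target = [discretize(p) for _, p in municipalities]

# Score every candidate threshold on the schooling attribute.
for t in sorted(set(schooling))[:-1]:
    print(f"schooling <= {t}: gain ratio = {gain_ratio(schooling, target, t):.3f}")
```

A full tree learner would apply this scoring recursively to each resulting subset and across all attributes. Each root-to-leaf path then reads as an if-then rule of the kind the highlights mention.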