ZRSA

Technologies We Use

Interactive Data Visualization

Click below to see a county outline

The relatively new D3 JavaScript library provides the means to look at data in new and engaging ways. Data can be shuffled, colored and linked in ways that can help you uncover new patterns and new connections. Although there is a wide range of data visualization libraries (e.g. gRaphaël, Highcharts), D3 allows significant control over how your data is presented.

An Example: Asian immigration in New York State counties

Given the attention that Congress and the news media have focused on immigration, we set up an example data visualization using US Census data on the foreign-born population, specifically those born in Asia. Here in Tompkins County we benefit from a large immigrant population, thanks largely to Cornell University and Ithaca College, and the relatively high proportion of residents born in Asia is evident in the visualizations below. See if you can spot Tompkins County (low population, significant population from Asia) in the chart and map.

Map control: Scale Circle Area By (options: Total Population, % born in Asia)

NYS counties with circle areas proportional to total population. Hover to select a county.
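Making a circle's area, rather than its radius, proportional to a value means the radius has to grow with the square root of that value. The following is a minimal sketch of that calculation in plain JavaScript, not the code behind the map above; the function name radiusForValue, the 40-pixel maximum radius and the sample populations are all assumptions made for illustration. D3 offers a square-root scale that serves the same purpose.

```javascript
// Sketch: size a circle so that its AREA is proportional to a value.
// radiusForValue, maxRadius and the sample figures are illustrative assumptions.
function radiusForValue(value, maxValue, maxRadius) {
  if (maxValue <= 0 || value < 0) return 0;
  // area = pi * r^2, so r must scale with the square root of the value
  return maxRadius * Math.sqrt(value / maxValue);
}

const maxRadius = 40; // pixels, hypothetical
const small = radiusForValue(250000, 1000000, maxRadius);
const large = radiusForValue(1000000, 1000000, maxRadius);

// A county with a quarter of the population gets half the radius,
// which is a quarter of the area.
console.log(small, large); // 20 40
```

Scaling the radius directly would make the larger county look sixteen times bigger instead of four times bigger, which is why area-based sizing is the usual choice for proportional symbols.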

Chart controls: Sort Data By and Color Data By (options include Total Population and % born in Asia)

The height of the bars is determined by county population. The bars are sorted (left to right) by total population. Hover to select a county.
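The bar chart logic described above can be sketched in a few lines: bar height always follows population, while the left-to-right order follows whichever sort key is selected. This is an illustrative sketch only. The records, the field names population and pctAsia, the helper layoutBars and the choice of descending order are all assumptions, not details taken from the chart itself.

```javascript
// Hypothetical records; the names and figures are made up for illustration.
const counties = [
  { name: "A", population: 100000, pctAsia: 4.0 },
  { name: "B", population: 400000, pctAsia: 1.5 },
  { name: "C", population: 200000, pctAsia: 9.0 },
];

// Return one bar per county: order follows sortKey, height follows population.
function layoutBars(data, sortKey, chartHeight) {
  const maxPopulation = Math.max(...data.map((d) => d.population));
  return data
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b[sortKey] - a[sortKey]) // descending order is an assumption
    .map((d, i) => ({
      name: d.name,
      position: i, // 0 is the leftmost bar
      height: (chartHeight * d.population) / maxPopulation,
    }));
}

const byPopulation = layoutBars(counties, "population", 200);
const byPctAsia = layoutBars(counties, "pctAsia", 200);
console.log(byPopulation.map((b) => b.name)); // [ 'B', 'C', 'A' ]
console.log(byPctAsia.map((b) => b.name)); // [ 'C', 'A', 'B' ]
```

Keeping the height tied to population while only the order changes is what lets a county with a small population and a high share born in Asia stand out as a short bar near one end of the re-sorted chart.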

Tabular Data Prep

The visualization is based on US Census table B05006, by county: place of birth for the foreign-born population in the United States. This data can be downloaded from FactFinder2 or from DataFerrett. We did the data processing in R (http://www.r-project.org/).

Geographic Data Prep

The geographic file is the detailed county boundaries data available through Environmental Systems Research Institute (Esri). Versions of this data can also be downloaded for free from a variety of sources, including the US Census, as well as versions that have already been processed for use on the web (Mike Bostock's examples). We like to use detailed county boundaries in our maps but have been hampered by the file size needed to represent them on the web in vector format. In past examples we have used tiling through Google Maps, CartoDB or MapBox, but a new JavaScript library allowed us to reduce the overall file size.

Using Mike Bostock's new TopoJSON library we converted the shapefile to TopoJSON format, which reduces the overall size by eliminating topological redundancy (for example, avoiding having to store or draw a shared border more than once). In our example, we started with a shapefile of ~5.5 MB which, when converted to standard GeoJSON (using ogr2ogr), is ~9.4 MB, but with TopoJSON we have approximately 800 KB. This size comparison is not perfect because the original shapefile and GeoJSON file included several Census variables that were stripped in the processing but, regardless, the size difference is substantial.
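To make the redundancy concrete, the sketch below counts edges in two hypothetical adjacent square "counties". When each polygon carries its own full outline, as in GeoJSON, the shared border appears twice; a topology-aware format only needs it once. This is a toy illustration of the idea, not TopoJSON's actual algorithm or API, and the shapes and the helpers edgeKey and countEdges are assumptions made for the example.

```javascript
// Two hypothetical adjacent squares that share the border at x = 1.
// Each ring is closed (first point repeated at the end), as in GeoJSON.
const polygons = {
  west: [[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]],
  east: [[1, 0], [2, 0], [2, 1], [1, 1], [1, 0]],
};

// Key an edge independently of its direction, so A->B and B->A match.
function edgeKey(p, q) {
  const a = p.join(",");
  const b = q.join(",");
  return a < b ? `${a}|${b}` : `${b}|${a}`;
}

// Count edges as stored per polygon (total) versus distinct edges (unique).
function countEdges(polys) {
  const unique = new Set();
  let total = 0;
  for (const ring of Object.values(polys)) {
    for (let i = 0; i < ring.length - 1; i++) {
      unique.add(edgeKey(ring[i], ring[i + 1]));
      total++;
    }
  }
  return { total, unique: unique.size };
}

const counts = countEdges(polygons);
console.log(counts); // { total: 8, unique: 7 }
```

With only two squares the saving is a single edge, but in a county map nearly every interior boundary is shared by two polygons, so the duplicated share of the geometry is much larger.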

Acknowledgments

The code used to create these graphics borrows heavily from nicely documented examples provided by the creator of D3, Mike Bostock. This site also makes use of code included in Scott Murray's useful book, Interactive Data Visualization for the Web.