
Dealing with data: tips and resources

Getting to the root of a deep sociological question like the one we’re asking you to answer — Why don’t more U.S. high school students graduate on time, and how can we help them do that? — isn’t an easy task. Here are some things to consider when thinking about the graduation rate challenge and how to approach the data.

  1. Think about ways to leverage the Census data. AT&T has included data relating to every Census tract that touches a particular school district. Is this information useful, or is it too much? Do school districts with more variation in their tract variables offer any more insight into graduation rates?
  2. Incorporate other data sets. We’ve given you a head start by collecting some preliminary data, but we strongly believe the analysis shouldn’t stop there. What other publicly available data sets do you think could provide additional insights? For example, do geographic crime rates make a difference? Does weather have anything to do with the variation in graduation rates? How can civic and sociological data inform your findings? We’ve put together some additional suggested data sources and listed them on the resources page.
  3. Data visualizations can help. Like most data scientists, we love numbers! But raw numbers alone are often not especially intuitive when trying to paint a larger picture. Interactive visualizations can make all the difference when trying to communicate your findings. There are plenty of open source tools for creating data visualizations; we’ve listed a few below. (Please note: the rules require you to create EITHER a data visualization OR a data-centric app; we’re just pointing out some ways you could derive value from the data.)
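
As one way to start on the question in item 1, here is a minimal sketch that measures how much a single tract variable varies within each school district and checks whether that spread tracks graduation rates. Everything in it is an assumption for illustration: the district IDs, the income figures, the graduation rates, and the choice of median household income as the tract variable are all made up and do not come from the challenge data.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical tract-level rows: (district_id, median_household_income).
# These values are invented for illustration only.
tracts = [
    ("D1", 40000), ("D1", 42000), ("D1", 41000),
    ("D2", 30000), ("D2", 90000), ("D2", 60000),
    ("D3", 55000), ("D3", 75000),
]

# Hypothetical district-level on-time graduation rates (also invented).
graduation_rate = {"D1": 0.88, "D2": 0.71, "D3": 0.80}


def tract_spread_by_district(rows):
    """Return {district_id: population std dev of the tract variable}."""
    grouped = defaultdict(list)
    for district_id, value in rows:
        grouped[district_id].append(value)
    return {d: pstdev(vals) for d, vals in grouped.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


spread = tract_spread_by_district(tracts)

# Only compare districts that appear in both data sets.
districts = sorted(spread.keys() & graduation_rate.keys())
r = pearson(
    [spread[d] for d in districts],
    [graduation_rate[d] for d in districts],
)
print(f"Correlation between tract income spread and graduation rate: {r:.2f}")
```

The same grouping step works for item 2: any outside data set keyed by district (crime, weather, and so on) can be lined up against `graduation_rate` in the same way. A correlation on a handful of districts proves nothing by itself, so treat a result like this as a prompt for a closer look rather than a finding.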

Tools for building data visualizations

Additional tools:

IBM Bluemix tools such as Analytics for Hadoop, Watson, and Data Analytics (sign up for a free 30-day Bluemix trial).


