Results

I selected this topic because I have a passion for using technology to enhance education, and I thought a data visualization could make it easier to compare boroughs, school districts, and schools on metrics that the user chooses. The goal is to enable users to enter their own definition of best, geospatially visualize all of the boroughs, school districts, or schools that meet this definition, and select specific points of interest for detailed comparison so they can decide which are the best fit. The visualization also supports two other scenarios: determining which boroughs, school districts, or schools are the worst fit, and exploring how metrics relate to one another, for instance, whether class size is associated with graduation rate.

The terms of use of the New York City Open Data portal state: "Public data sets made available on the NYC OpenData portal are provided for informational purposes. The City does not warranty the completeness, accuracy, content, or fitness for any particular purpose or use of any public data set made available on the NYC OpenData portal, nor are any such warranties to be implied or inferred with respect to the public data sets furnished therein. The City is not liable for any deficiencies in the completeness, accuracy, content, or fitness for any particular purpose or use of any public data set, or application utilizing such data set, provided by any third party."

A key outcome of completing this project was recognizing some of the limitations of the New York City Open Data portal. First, the most recent school year available across all of the metrics is 2010-2011, so conditions may have changed in the 4 years since. Next, some errors in the data became apparent once the data was projected geospatially. For instance, the School Point shape file had several Brooklyn schools tagged as being in the Bronx. Another data set had Queens labeled "Oueens". All of these errors were corrected during data cleansing. I learned that the New York City Open Data portal has a feedback mechanism for reporting errors in data, and I plan to report my findings. The portal also lacks a data dictionary. I learned a tremendous amount about the New York City school data by crawling the New York City Department of Education website to piece together definitions. Lastly, data is not available for some schools. I was not able to use some data sets because they contained too little data to be meaningful, such as demographic statistics for only 2,000 students when New York City schools serve over 1 million students. Some data sets were missing altogether, such as Graduation Outcome data at the district level. Despite these limitations, New York City should be commended for releasing a subset of its education data in this manner. This project was only possible because that data was released, and most school districts do not release their data this way.
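The label corrections described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the cleansing code actually used for this project: the `BOROUGH_FIXES` table and the `normalize_borough` helper are hypothetical names, and the only misspelling taken from the data is "Oueens".

```python
# Hypothetical lookup of known bad labels (lowercased) -> canonical borough names.
# Only "Oueens" comes from the data described above; extend as other errors are found.
BOROUGH_FIXES = {"oueens": "Queens"}

CANONICAL = ("Bronx", "Brooklyn", "Manhattan", "Queens", "Staten Island")


def normalize_borough(raw):
    """Return the canonical borough name for a raw label, or None if unknown."""
    key = raw.strip().lower()
    # Apply known spelling corrections first.
    if key in BOROUGH_FIXES:
        return BOROUGH_FIXES[key]
    # Otherwise accept any case-insensitive match of a canonical name.
    for name in CANONICAL:
        if key == name.lower():
            return name
    # Unknown labels are returned as None so they can be reviewed by hand.
    return None
```

A spelling lookup like this only catches mislabeled text. Schools tagged with the wrong borough, such as the Brooklyn schools tagged as the Bronx, would need a spatial check of each school point against the borough boundaries.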

Another key outcome of this project was bringing together 20 different data sets for user consumption in one place. If I were a user looking for more information on New York City schools, I wouldn't want to open 20 different files and read the New York City Department of Education website. In addition, though the New York City Department of Education website offers detailed guides on most metric categories, users are typically limited to searching by individual school. This data visualization enables a user to perform more open-ended, holistic searching.

I was surprised by some of the results I saw within the visualization. There was a school with a 4% graduation rate for the cohort graduation year ending in August, and a handful of schools with graduation rates under 10%. On the other end of the spectrum, there were schools with 100% graduation rates. For the NYS Math and ELA Tests, the Bronx and Brooklyn have lower test scores than the other boroughs. Because test scores are a component of the progress report scores, the Bronx and Brooklyn also have lower progress report scores. Another interesting observation is that schools with larger average general education class sizes have higher graduation rates. School buildings with lower crime rates also have higher graduation rates. The Bronx performed the worst across most of the categories.
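One way to put a number on relationships like class size versus graduation rate is a Pearson correlation coefficient. The following is a minimal sketch, not the method used in the visualization; `pearson_r` is a hypothetical helper, and it assumes the two metrics have already been paired up school by school.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sequence is constant")
    return cov / sqrt(var_x * var_y)
```

A value near +1 or -1 indicates a strong linear association and a value near 0 indicates little linear association. A correlation alone does not show that one metric causes the other.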

I started this project looking to give users a holistic view of New York City schools so they could make their own informed decisions about which schools, districts, and boroughs would be best for them. I feel I have accomplished this goal and learned a lot about the New York City school system in the process. There seems to be an endless list of questions that can be addressed through this visualization, and I have enjoyed asking many of my own.

I would like to thank all of the individuals who have participated in rounds of iterative feedback to enhance this visualization. It is a better product because of you.