Seminar 4: Shaping the Future of New York—Focus: Understanding the City through Data Analysis

Instructor: Katherine St. John
Offered by Lehman College
Thursdays 5:30-8:10pm
Macaulay Honors College, Classroom 308 

The purpose of Seminar 4 is to analyze the ongoing interplay of social, economic, and political forces that shape the physical form and social dynamics of New York City. Throughout the semester, students engage in a team research project, sometimes including Public Service Announcement Videos, to be presented at a model academic conference.

This special cross-campus section of the seminar, held at the Macaulay Building and open to students from all eight campuses, will focus on understanding New York City via data analysis. Over the last decade, New York City has made a concerted effort to provide open access to data collected by and about the city. This public access allows information to be aggregated, analyzed, and visualized to explore the interplay of social, economic, and political forces that shape the physical and social dynamics of the city. In this course, we work directly with city data (via the NYC Open Data project as well as federal census and labor data) and focus on acquiring and cleaning the data, inferring and analyzing patterns, and visualizing the results.

This course serves as an introduction to data science. Data science uses techniques from computing, mathematics, and statistics to extract new insights from large data sets. It is a very broad field, but at its core is the use of automated techniques to analyze and make inferences from input data.

This course will focus on data acquisition (how do you take data from multiple sources and put it in usable forms), data storage, data mining and basic machine learning, and visualization.

This course assumes proficiency in Python programming. In particular, you should be able to open and close files; loop through files, strings, and lists; use conditionals and functions; and be comfortable using packages. We will cover all the statistics and linear algebra needed for analysis in the course, but you should be comfortable working with formulas and lists of numbers. If you have not taken Python but are very interested in the course, contact joseph.ugoretz@mhc.cuny.edu and we will find appropriate tutorials to get you up to speed before the course starts.
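
As a rough self-check of the expected Python level, the short sketch below uses each of the skills listed above: opening a file, looping over its lines, making a decision, and wrapping the work in a function. It is only an illustration and is not taken from the course materials. The function name `count_by_borough`, the two-column file layout, and the sample rows are all made up for this example.

```python
import os
import tempfile


def count_by_borough(path):
    """Count how many rows of a comma-separated file fall in each borough.

    Assumes a hypothetical layout: one header line, then rows of "id,borough".
    """
    counts = {}
    with open(path) as f:            # the with-block closes the file for us
        next(f, None)                # skip the header line, if there is one
        for line in f:               # loop through the file line by line
            fields = line.strip().split(",")
            if len(fields) < 2:      # decision: skip blank or malformed lines
                continue
            borough = fields[1].title()   # normalize capitalization
            counts[borough] = counts.get(borough, 0) + 1
    return counts


# Made-up sample data, written to a temporary file so the example is self-contained.
sample = "id,borough\n1,bronx\n2,QUEENS\n3,bronx\n\n4,Staten Island\n"
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write(sample)
    sample_path = tmp.name

result = count_by_borough(sample_path)
os.remove(sample_path)               # clean up the temporary file
print(result)
```

If you can read this code and predict what it prints before running it, you are likely at the level the course assumes.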

Course Codes:

MHC 353 (Lehman)