SF Crime Map

This project maps crimes in San Francisco. You can filter the map by types of crime, district of SF, and date. Also, try clicking on the tab titled 'Graphs' to see summary graphs of:

  • Total crimes per day of week
  • Crimes by date
  • Crimes by time of day
  • Crimes by type of crime

Try panning the map and notice how these graphs automatically update based on the crimes that are visible in the window. This feature makes it fast and easy to compare crimes from one neighborhood to another - all you have to do is pan the map and see how the graphs change!
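For the curious, here is a rough sketch of the idea behind the auto-updating graphs: keep only the crimes that fall inside the current map window, then recount. The actual tool is built with Shiny and Leaflet in R; this is a standalone Python illustration, and the sample records, the `visible` and `crimes_per_weekday` helpers, and the bounding box are all made up for the example.

```python
from collections import Counter
from datetime import date

DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
        "Friday", "Saturday", "Sunday"]

# Hypothetical sample records: (latitude, longitude, date of incident).
crimes = [
    (37.7749, -122.4194, date(2015, 3, 2)),
    (37.7755, -122.4180, date(2015, 3, 3)),
    (37.8000, -122.4100, date(2015, 3, 2)),  # falls outside the window below
    (37.7760, -122.4200, date(2015, 3, 9)),
]

def visible(crimes, south, west, north, east):
    """Keep only the crimes inside the map window (a lat/lon bounding box)."""
    return [c for c in crimes
            if south <= c[0] <= north and west <= c[1] <= east]

def crimes_per_weekday(crimes):
    """Count crimes per day of week for whatever subset is passed in."""
    return Counter(DAYS[c[2].weekday()] for c in crimes)

# Assumed map window; panning the map would simply change these numbers.
window = dict(south=37.77, west=-122.43, north=37.78, east=-122.41)
summary = crimes_per_weekday(visible(crimes, **window))
print(summary)
```

Every pan or zoom just reruns the same filter with new bounds, which is why the graphs can follow the map so closely.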

If there is a map you'd like to make, feel free to reach out to me at jrpepper@gmail.com with a description of your project.

- Josh

Link: Full Screen Version of the SF Crime Map.

This tool was built with RStudio, Leaflet and Shiny, using data from the SF data portal. (Please note: this tool currently works best on a computer.)

Visualizing Weather Data

In my last post I talked about how Shiny can be used to create dynamic data exploration tools. I wanted to provide an example that didn't use maps, to show that these tools are great for displaying non-spatial data as well. Below is a graph showing annual temperature data, the "normal temperature range" (defined as a 95% confidence interval around the historical mean) and the total temperature range from 1995 to 2015. (Huge thanks to Brad Boehmke for his tutorial on creating these graphs.)
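To make the "normal temperature range" concrete, here is a minimal sketch of a 95% confidence interval around a historical mean for a single calendar day. It is written in Python for illustration rather than the R used for the graph, the five readings are invented, and the 1.96 multiplier is the normal approximation (an assumption on my part; a t-based multiplier would give a slightly wider band for small samples).

```python
from math import sqrt
from statistics import mean, stdev

def normal_range(temps, z=1.96):
    """Return (low, high): mean +/- z standard errors of the mean."""
    m = mean(temps)
    se = stdev(temps) / sqrt(len(temps))  # standard error of the mean
    return m - z * se, m + z * se

# Hypothetical readings for the same calendar day across five years.
history = [50.0, 52.0, 54.0, 56.0, 58.0]

low, high = normal_range(history)
record_low, record_high = min(history), max(history)  # the "total" range

print(f"normal range: {low:.1f} to {high:.1f}")
print(f"total range:  {record_low:.1f} to {record_high:.1f}")
```

Repeating this for each day of the year gives the two bands drawn behind the current year's line.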

With Shiny, we can connect this graph to user input, so we can dynamically change the graph to show different cities or different years, and can even toggle styling features such as the historical highs and lows in temperature.

Data exploration with R and Shiny

I've begun experimenting with R and its incredibly powerful package Shiny. Shiny lets you create interactive web apps using your R data, passing inputs and outputs back and forth between a graphical user interface and R.

Below you can see a tool I've created for exploring pesticide use reporting (PUR) data in California. The raw data from the Department of Pesticide Regulation is hard to work with - it's gigabytes of information in large tables. This tool puts the same information on a map, making it intuitive and even fun to play with the data. I've only included data for four counties surrounding Woodland, CA.

If you're interested in learning more about how to build one of these tools, or would like help constructing one with your dataset, feel free to contact me directly or visit my consulting website at http://www.jrpepperconsulting.com.

Full-screen version of the PUR Data Explorer here.

Make and aggregate your own dot density maps in R

Last post, I talked about dot density maps as a way of taking polygons (like zip codes) and distributing their population evenly across the area in the form of dots. If 1000 people live in a zip code, then place 1000 random dots inside it. That's useful for visualizing population density, but what if you could then add these dots back together across new polygons? You'd be able to then get a pretty good estimate of the population within a new boundary.
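The dot-placing step can be sketched in a few lines. This is a standalone Python illustration, not the R code from the post: it uses simple rejection sampling (draw a random point in the polygon's bounding box, keep it only if it lands inside the polygon), and the L-shaped "zip code", the helper names and the seed are all hypothetical.

```python
import random

def point_in_polygon(x, y, polygon):
    """Ray-casting test: count edge crossings to the right of (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def random_dots(polygon, count, rng):
    """Place `count` uniformly random dots inside `polygon`."""
    xs = [p[0] for p in polygon]
    ys = [p[1] for p in polygon]
    dots = []
    while len(dots) < count:
        x = rng.uniform(min(xs), max(xs))
        y = rng.uniform(min(ys), max(ys))
        if point_in_polygon(x, y, polygon):
            dots.append((x, y))
    return dots

# Hypothetical L-shaped zip code with a population of 1000.
zip_code = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
dots = random_dots(zip_code, 1000, random.Random(42))
print(len(dots))
```

Rejection sampling is wasteful for long, thin polygons, but it is easy to reason about and works for any shape.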

For example, we can take the locations of clinics in Pretoria and create "Thiessen" polygons (together they form what is otherwise known as a Voronoi diagram), each of which shows the area for which a particular clinic is the closest one. In other words, if you are in the Thiessen polygon for Clinic A, then you are closer to Clinic A than to any other clinic. It's a rudimentary way of calculating a catchment area for each clinic. Then, we can overlay our population dot density map. Finally, we can just count the dots inside each clinic catchment to estimate the total population served by that clinic.
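Because a dot lies in a clinic's polygon exactly when that clinic is its nearest one, the counting step can be sketched without building the polygons at all. The snippet below is a small Python illustration of that shortcut, not the R code from this post; the clinic coordinates, the dots and the `people_per_dot` parameter are invented for the example.

```python
from collections import Counter
from math import hypot

# Hypothetical clinic locations and population dots (x, y).
clinics = {"Clinic A": (0.0, 0.0), "Clinic B": (10.0, 0.0)}
dots = [(1.0, 1.0), (2.0, -1.0), (9.0, 0.5), (4.0, 0.0), (6.0, 3.0)]

def nearest_clinic(dot, clinics):
    """Name of the clinic with the shortest straight-line distance to dot."""
    return min(
        clinics,
        key=lambda name: hypot(dot[0] - clinics[name][0],
                               dot[1] - clinics[name][1]),
    )

def population_served(dots, clinics, people_per_dot=1):
    """Count dots per catchment and scale by the people each dot represents."""
    counts = Counter(nearest_clinic(d, clinics) for d in dots)
    return {name: counts[name] * people_per_dot for name in clinics}

served = population_served(dots, clinics)
print(served)
```

With real data, a spatial index or the polygons themselves would scale better than comparing every dot with every clinic, but the estimate is the same.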

I'll include my R code below so you can replicate these results if you'd like. A sidenote on mapping in R: I never understood why people would map in R when you could map in QGIS, Tilemill, Mapbox, Leaflet or other programs that seemed more intuitive. What I now recognize is that R keeps everything reproducible, which is essential for publishing. It also makes it incredibly easy to rerun your entire script with new data or variables. For example, we can rerun this same analysis for the entire country of South Africa with only a few extra lines of code!

Larger resolution image: here.

Below is the R code. It is thoroughly commented but email me or write a comment if something doesn't make sense and I can update the Gist. The code is a bit long, but that's partly due to some data cleanup in the beginning, plus additional code for printing all four of the maps in the GIF above.

Dot Density Maps

I'm working on building dot density maps of South Africa using census data. Below is an example of what I'm talking about. Each dot on the map represents 100 people living in Tshwane, the municipality in which I'm currently living. If we can then calculate "catchment" areas for each clinic (the simplest way is to use the shortest Euclidean distance by creating a Voronoi diagram), we will be able to estimate the total number of people served by each clinic. This information, combined with what we know about the health outcomes and financial resources of each clinic, will help identify the clinics with the highest burden and fewest resources.

This map was created in R using the dotsInPolys function (which is part of the maptools package) and ggplot2. For a good example of how to build these types of maps in R, check out AnthroSpace's tutorial, Dot Density Maps in R.

In researching these types of maps, I've seen other people create massive dot density maps that are color coded by ethnicity/race. Adrian Frith created a great dot map of South Africa. Below I've embedded a similar map made by the Cooper Center. I'm reminded of the Facebook connections map, in that you can clearly visualize the roads, boundaries and the topography of the entire US, even though the only thing that is mapped is dots - there are no lines in the entire map.