Oakland gets its crime data feed on

You may not have heard about this yet, which is a shame. It’s a shame because it’s a rare good thing in local government tech, because it’s a serious milestone for our city hall, and because our local government isn’t yet good at telling our community about the awesome things that do happen in Oakland’s city hall. But I’m excited, and I’m impressed that we’ve gotten here: Oakland’s crime report data is now being published daily, automatically, to the web, freely available to all.

This is quite a leap forward from where we were several years ago, and in fact just a year ago to be honest: spreadsheet hell. Often photocopied spreadsheet hell. Things happen slowly, but some things we’ve pushed for because we recognize their potential to change the game forever. First we pushed for open data as a policy in the city, and we got there quickly enough, but we’re now waiting in expectation for our new CIO Bryan Sastokas to publish the city’s very crucial open data implementation plan. Then we started pushing public records into the open with the excellent RecordTrac app, which makes all public information requests public unless they relate to sensitive crime info. And now, with local developers soaking up the public data available, we have the first ever Oakland crime feed, and an API to boot!

The API isn’t actually something the city built; it’s a significant side benefit of their Socrata data platform: publish data in close to real time, and their system will give you a tidy API to make it discoverable and connectable. You can access their API here:
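To make that concrete, here’s a minimal sketch of building a query against a Socrata SODA endpoint. Note that the dataset ID (`EXAMPLE-ID`) and the field names (`crimetype`, `datetime`) are placeholders I’ve assumed, not the city’s actual schema; grab the real values from the data portal.

```python
from urllib.parse import urlencode

# Hypothetical SODA endpoint for Oakland's crime feed; the real
# dataset ID would come from the city's Socrata portal.
BASE_URL = "https://data.oaklandnet.com/resource/EXAMPLE-ID.json"

def build_crime_query(crime_type=None, since=None, limit=1000):
    """Build a SODA API query URL with optional filters.

    SODA supports SQL-like parameters such as $where, $limit, $order.
    Field names here (crimetype, datetime) are assumed, not verified.
    """
    params = {"$limit": limit, "$order": "datetime DESC"}
    clauses = []
    if crime_type:
        clauses.append(f"crimetype = '{crime_type}'")
    if since:
        clauses.append(f"datetime >= '{since}'")
    if clauses:
        params["$where"] = " AND ".join(clauses)
    return BASE_URL + "?" + urlencode(params)

url = build_crime_query(crime_type="BURGLARY", since="2014-01-01")
```

Feed the resulting URL to any HTTP client and you get JSON rows back, which is exactly the “tidy API” benefit: no city-built infrastructure required.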


If you’re more of an analyst, a journo, or a citizen scientist, you may want the data in bulk, which you can grab here. That link will get you a file that is updated on a daily basis. Pretty rad, huh? Given how crime reports tend to trickle in (some get reported days later, some months later, and some get adjusted), the data will change over time; the only way to build a complete, accurate dataset is to constantly review for changes and update or replace records, which is a complicated task. If you want a bulk chunk of data covering multiple years, with many richer fields and much higher quality geocoding, you can grab what we’ve published at http://newdata.openoakland.org/dataset/crime-reports (covers 2003 to 2013), and for the more recent history you can use this file: http://newdata.openoakland.org/dataset/oakland-crime-data-2007-2014
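The constant-review problem can be sketched in a few lines: merge each daily snapshot into a running master set, keyed on something like a case number, and count what was newly added versus adjusted after the fact. The key and field names below are assumptions for illustration, not the feed’s actual schema.

```python
# Sketch: fold a daily snapshot into a running dataset, keyed on a
# case number. Field names are assumed, not the real feed schema.

def merge_snapshot(master, snapshot, key="casenumber"):
    """Update master records in place from today's snapshot.

    Returns (added, updated) counts so changes can be audited.
    """
    index = {rec[key]: i for i, rec in enumerate(master)}
    added = updated = 0
    for rec in snapshot:
        k = rec[key]
        if k not in index:
            master.append(rec)       # newly reported incident
            added += 1
        elif master[index[k]] != rec:
            master[index[k]] = rec   # report adjusted after the fact
            updated += 1
    return added, updated

master = [{"casenumber": "14-001", "crimetype": "THEFT"}]
snapshot = [
    {"casenumber": "14-001", "crimetype": "BURGLARY"},  # reclassified
    {"casenumber": "14-002", "crimetype": "ROBBERY"},   # new report
]
added, updated = merge_snapshot(master, snapshot)
# added == 1, updated == 1
```

Run daily against the published file, a loop like this is the simplest way to keep a local archive honest as old reports get reclassified or back-filled.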

Now that we’ve figured out how to pump crime report data out of the city firewall, we can get to work connecting it to oakland.crimespotting.org and building dashboards to support community groups, city staff and more!

So thank you to the city staff who worked to get this done- now let’s get hacking!

Side bar: Oakland has finally gotten hold of its new Twitter handle: @Oakland is now online! More progress…

Numbers and nonsense in Oakland’s Search for Public Safety

Oakland is once again talking about data and facts concerning crime, causes, and policing practices. Except we’re not, really. We’re talking about an incredibly thin slice of a big reality, a thin slice that’s not particularly helpful, revealing, or empowering. And this is how we always do it.

Chip Johnson is raising the flag on our lack of a broad discussion about the complexity of policing practices and the involvement of African-Americans in the majority of serious crimes in our city, and on that I say he’s dead right: these are hard conversations and we’ve not really had them openly. The problem is that the data we’re given as the public (and our decision makers have about the same) is not sufficient to plan with, make decisions from, or understand much at all. Once again we’re given a limited set of summary tables that present just tiny slivers of reality and do not allow for any actual analysis by the public or by policy makers. And if you believe that internal staff get richer analysis and research to work with, you’re largely wrong.

When we assume that a few tables of selectively chosen metrics suffice as public information and justification for decisions or statements, we’re all getting ripped off. And the truth is that our city departments (OPD especially) do not have the capacity for thoughtful analytics and research into complex data problems like these. And this is a real problem. Our city desperately needs applied data capacity, not from outside firms on consultancy (disclosure: my current role does this sometimes for the city) but as built-up internal capacity. There is a strong argument for external, independent access to data for reliable analysis in many cases, but our city spends hundreds of millions per year and we don’t have a data SWAT team to work on these issues for internal planning. Take a look at what New York City does with simple yet powerful data analytics that save lives, save money, and make the city safer. This is what smart businesses do to drive better decision making.

Data, in context, with local knowledge and experience, evidence based practices (those showing success elsewhere) and a good process will yield smarter decisions for our city.

Data tables do not tell us about any nuances in police stops: we don’t know how these data vary across different neighborhoods, nor anything about the actual situation surrounding each stop. The lack of real, incident-level data makes any genuine understanding impossible.

For example, the data report shows that stops of White people yield a slightly higher proportion of seizures/recoveries, so logic asks: why doesn’t OPD pull over more White folks, if those stops lead to solid hits at least as often?
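To make the hit-rate comparison concrete, here is the arithmetic with made-up numbers. These figures are purely illustrative, not OPD’s actual stop data.

```python
# Illustrative only: invented stop counts, NOT OPD's actual numbers.
stops = {
    "White": {"stops": 1000, "recoveries": 80},
    "Black": {"stops": 5000, "recoveries": 300},
}

def hit_rate(group):
    """Recoveries per stop: the 'yield' of stopping this group."""
    g = stops[group]
    return g["recoveries"] / g["stops"]

# Here White stops yield 8% and Black stops 6%: a higher yield per
# stop for the less-stopped group, which is the pattern the summary
# tables hint at but only incident-level data would let us test.
```

A summary table gives you only these two ratios; without the underlying incidents you cannot tell whether the gap reflects stop criteria, neighborhood, time of day, or anything else.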

Back in 2012, OPD gave Urban Strategies Council all their stop report data to analyze, but there was no context nor any clear path of analysis suggested, making it near impossible to produce thoughtful results; nor was it part of our actual contract. But the data exist, and should be used by the city to really understand how our police operate, the context of their work, and the patterns that lead to meaningful impacts, rather than habits that are never reflected upon, questioned, or changed.

It is not our city’s job to just do the work, process the paperwork, and never objectively review meta-level issues. According to our Mayor, “Moving forward, police will be issuing similar reports twice a year.” We need data geeks in city hall to support our police and all departments, and in 2014 we need to do better than data reports that consist of a set of summary tables alone. Pivot tables are not enough for public policy.

If you’re still reading: the same problem arises with the Shot Spotter situation. The Chief doesn’t think it’s worth the money, but our Mayor and council members want to keep it. We now have the data available to the public, but we’ve not really had any objective evaluation of the system’s utility for OPD use, and we’ve certainly not had a public conversation about the potential benefits of public access to this data in closer to real time! Just looking at the horrendous reality of shootings in East Oakland over the past five years makes one pause very somberly when considering how much OPD must deal with, and how much they need more analytical guidance to do their jobs better and more efficiently.


For a crazy look at shootings by month over these five years, take a look at this animation, with the caveats that not all of the city had sensors installed the whole time and that on holidays a lot of incidents in the data are likely fireworks! It makes me want to know why there is a small two-block section of East Oakland with no gunshots in five years. The data have been fuzzed to be accurate to no more than 100 feet, but this still looks like an oasis. Who knows why?

Upping the numbers game in Oakland PD

There is a lot of public pressure and public expectation in Oakland these days as a result of the increased oversight and monitoring of our police department. The public has been promised much in the way of reforms: better service, smarter policing, something about community policing (but no one is quite sure what that means), better management (by having multiple people in charge, presumably?), and a safer city overall. All the high-powered, well-compensated experts and consultants in town come with varied baggage and success, and all have something important to offer this city. There is one thing that does seem to unify them all (Frazier, Wasserman, Bratton): data. They all talk up the importance of having and using good data. Data to drive geographic or hotspot policing, data for investigations, data for tracking processes, and data to spot trends and patterns.

We’ve had some version of the popular CompStat model in Oakland for a few years now; Batts implemented it when we were still contracting for him to do crime analysis. Every city does it differently, as they should, and every city understands it slightly differently. If you still see that word and think it means a computer system or program, then please consider yourself corrected: despite its name, it’s not a program or a piece of software. It’s a method, a process. You could do it with pen and paper if you were a genius. Most of us prefer computers, however. When things get tight in any city or company, there are two options commonly considered: do much less, or try to be smarter and do more with less. Sure, there are other options, but this is a blog, not a PhD. With ~635 sworn officers, Oakland has to get smarter. CompStat approaches can help, but they are not THE answer. (Given that this post was drafted a week ago, news today from Bratton’s report is important context: the city’s CompStat process was more of a “presentation by a captain than a system of vigorous strategic oversight.” Source)

There is a fascinating battle raging (perhaps that’s overdramatic, but it’s Saturday and I have a good beer open) between some respected academics over their opinions and statements on the validity of things like the actual impact of CompStat and other policing strategies, especially from the NYPD. New York is a favorite because of its huge size and correspondingly huge samples in the available data, its massive drop in crime counts over 20 years, and its scandals and successes.

When we move to a model where our police department is truly, heavily data driven (no, I do not believe it is that way currently), we must be aware of the good and the bad of this approach, and more importantly be aware of the ways this approach can be abused. Trust in police in Oakland is incredibly low, and this is sad, wrong, and broken, from all angles. There is much to do to repair this, and I propose that we must; it’s not OK to maintain the status quo here. But with more numbers involved, better reporting, and better analysis and communication of these data (that’s the plan, right?) comes an increase in the types of activities that have sullied New York City’s reputation and cast valid doubts over the veracity of the crime reduction figures touted there.

Whenever there is a major change in policy or practice we should, as a smart society, be evaluating the impact of these changes. Better sign-up process for food stamps online? You should see an uptick in enrollment; if not, you’re missing something. That kind of simple evaluation. Change police reporting for multiple types of basic crimes, including burglaries, so people can only report them online and no officer will show? You should be providing solid numbers to the public and the city administration on the trends in all those crimes, as well as numbers on closure and conviction for those crimes. Things don’t stop at a report: if your reports go up, are you finding more people, fewer, or no change? Any answer means something and should be critically considered to see what it tells us.

When OPD ramps up its technology (oh God, let it be soon) and data use, and builds its capacity for dynamic use of CompStat methods, we will need to be ever more vigilant about the types of manipulation that have been documented in New York City. Eterno & Silverman have a book in print that does a stunning job of documenting the NYPD’s manipulation of the numbers used in the CompStat program (it’s expensive, sorry). It provides what are, to me, the most comprehensive and broad analytical assessments of the claims of crime reduction in New York City, and shows them to be fraudulent and false overall. Knowing the kinds of improbable realities they describe should position Oakland’s city staff and our community well to judge if these things begin to occur in Oakland. For example, if we hear claims of a 50% reduction in assaults, yet our hospitalization reports increase by 90%, we should be asking what the hell is going on. Crime data are, in my experience, the most manipulated and most misleading figures in common use. Pressure from senior officers to suppress major crime statistics would erode the remaining trust in OPD and would not have a positive impact on crime and violence prevention in our city.

If you want to see more of the critique of Zimring’s ideas, which is quite relevant to Oakland, check out this brief, then buy the book 😉