Australasian Science: Australia's authority on science since 1938

A Social Approach to Crime Prediction

A/Prof Flora Salim is leading a range of projects using social media data to predict human activity, including crime.

By Shakila Khan Rumi

Computers can be trained to analyse location information generated by social media users to predict the likely time and place of specific crimes.

Imagine a program that could tell you where and when crimes were about to happen. While it may sound far-fetched, a recent study I published in EPJ Data Science (https://bit.ly/2Vlei3L) with Dr Ke Deng and A/Prof Flora Salim at RMIT University has used location and activity data from Foursquare social media app users in New York City and Brisbane to predict specific types of crime better than any other existing method.

The Foursquare app, which peaked in popularity 5–6 years ago with over 45 million users, allowed people to share their location and activity by checking in at various places. For this study we gathered data from over 20,000 check-ins by users in Brisbane, and nearly 230,000 check-ins by users in New York City.

The large majority of people in the cities we studied were not using the app, and those committing crimes were unlikely to be posting about it. So it’s obviously not as simple as waiting for people to check in at a murder scene or announce their intentions to commit crimes in a status update. What we actually do is use the location data from mobile phones to understand the flow of people and the types of activity around a city at any given moment, as this correlates with the likelihood of crimes being committed. The same type of location data is captured by Twitter and other social media apps on almost every mobile phone these days, so the same approach to data gathering could work no matter what the app.
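
As a rough illustration of what "understanding the flow of people and types of activity" can mean in practice, the sketch below counts check-ins per region, hour of day and venue category. It is not the study's pipeline: the record layout, the region labels and the function name `aggregate_checkins` are assumptions made for this example, and the sample records are invented.

```python
from collections import Counter
from datetime import datetime

# Hypothetical check-in records: (ISO timestamp, region id, venue category).
# These values are invented for illustration; they are not data from the study.
checkins = [
    ("2019-03-01T22:15:00", "region_A", "bar"),
    ("2019-03-01T22:40:00", "region_A", "bar"),
    ("2019-03-01T09:05:00", "region_B", "cafe"),
    ("2019-03-01T22:50:00", "region_A", "restaurant"),
]


def aggregate_checkins(records):
    """Count check-ins per (region, hour of day, venue category)."""
    counts = Counter()
    for timestamp, region, category in records:
        hour = datetime.fromisoformat(timestamp).hour
        counts[(region, hour, category)] += 1
    return counts


activity = aggregate_checkins(checkins)
```

Counts like these describe activity at an aggregated level, without following any individual user, and could sit alongside slower-changing information as short-term inputs to a prediction model.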

Real-time data on people’s movements around a city is highly valuable for understanding the likelihood of different situations in an area, from parking or taxi availability to crowd behaviour and crime. Even with the many gaps in the information collected and the relative scarcity of crime in relation to non-crime, there was enough data for computers using machine-learning algorithms to make predictions.

To enable this, our team developed recommendation algorithms, similar to those used to recommend related songs on Spotify or movies you might like on Netflix based on your past choices, to predict where crimes would happen next.
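
To give a feel for the recommendation idea, here is a minimal sketch of neighbourhood-based collaborative filtering, with city regions playing the role of "users" and crime types the role of "items". This is a textbook technique swapped in for illustration, not the model published in the study. The region names, the counts and the helper names `cosine` and `score` are all hypothetical.

```python
import math

# Hypothetical past crime counts per region and crime type (invented numbers).
history = {
    "region_A": {"assault": 5, "theft": 3, "fraud": 0},
    "region_B": {"assault": 4, "theft": 2, "fraud": 0},
    "region_C": {"assault": 0, "theft": 1, "fraud": 4},
}


def cosine(u, v):
    """Cosine similarity between two sparse count vectors stored as dicts."""
    keys = set(u) | set(v)
    dot = sum(u.get(k, 0) * v.get(k, 0) for k in keys)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def score(target, crime_type, data):
    """Similarity-weighted average of other regions' counts for crime_type."""
    weighted = 0.0
    total = 0.0
    for region, counts in data.items():
        if region == target:
            continue  # a region does not vote for itself
        sim = cosine(data[target], counts)
        weighted += sim * counts.get(crime_type, 0)
        total += sim
    return weighted / total if total else 0.0


assault_score = score("region_A", "assault", history)
fraud_score = score("region_A", "fraud", history)
```

With these invented numbers, region_A resembles region_B far more than region_C, so its score for assault comes out higher than its score for fraud. In the same way, a music service leans on listeners with similar histories when suggesting the next song.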

In tests on both cities, the system predicted specific types of crime in specific parts of the city better than existing crime prediction models based on crime trends. In Brisbane, the system was 16% more accurate at predicting assaults than current models, 6% more accurate for predicting unlawful entry, 4% better for drug offences and theft, and 2% better for fraud prediction. In New York City it improved prediction accuracy by 4% for theft and drug offences, fraud and unlawful entry, and improved predictions of assault by 2%.

Given how little data were available in the study, these results are significant as a proof of concept. Imagine how accurate the system could be when using data from many more people.

Based on these positive results, we’re really excited about how this technology could help police design more effective patrol strategies. With limited resources, they could concentrate on sending officers to the places where crime is more likely.

Current state-of-the-art crime prediction models generally rely on relatively static features, including long-term historical, geographical and demographic information. This information changes slowly over time, so these traditional models can’t capture short-term variations in crime event occurrences.

Our test results demonstrate that the improvement in prediction performance is considerable and statistically significant when dynamic features are added. That we can now harness these data to keep people safer really is revolutionary.

While there is interest from police in this work, we also see applications for citizens themselves. For instance, it could warn them about areas where crime is more likely as they plan their trip or walk the city streets at night. Our follow-up work has shown that analysing each person’s risk of crime in various places at particular times of day can provide useful information.

We are now planning to extend the work by training the algorithms using data from one city, and increasing their ability to apply those learnings in a different city where data patterns are different.

I should also point out to anyone concerned about privacy that these analyses have been done anonymously and at an aggregated level. We are not interested in an individual’s action as much as the pattern of movement and activity more generally. And we make sure not to introduce demographic attributes like ethnicity that may lead to bias in how this is used.

Importantly, our system can be easily scaled up to process larger samples from almost any social media platform, app or mobile network that collects location-based data.

The widespread use of social media apps like Twitter, which gather huge amounts of data about our location, activities and preferences, provides unprecedented opportunities to capture the movement and activity of people across a city. The decisions we make on where to stop and where to shop, which street to take and who to meet up with along the way are complex, but they are not completely random. Patterns of human movement are defined by a whole set of variables, including working hours and peak commuting times, festivals and events, social relationships, weather patterns and the type of work we do.

A/Prof Salim is working on another project developing machine-learning algorithms to predict, with high levels of accuracy, what we’ll do in the second half of our day based on historical patterns and data collected from the first half of our day. She says that research into the pattern of human movement, based on data from our mobile apps, often shows how predictable many of our activities are.

In the era of big data, humans become moving sensors. Many of us may not even realise it, but all the mobile phones, smart watches or other devices we carry report our location and mountains of other information throughout the day. This data is used to tailor our apps, so that Google Maps knows where you are when giving you directions, or where to send you when you Google “pizza near me”.

But there are many applications at the larger scale. We can analyse the location, movement and even the activity of large numbers of people to get a bird’s eye view of the population like never before.

For instance, another project of our team involves setting up sensors throughout the beachside holiday hotspot of Rye to give real-time information on parking and amenity use, along with water quality at the pier and even air quality testing at the toilet blocks. Before long this system, trained by machine-learning algorithms, will be predicting tourist flows and giving council workers a head-start on managing the summer peak.

Other projects are tracking responses to natural disasters, managing foot traffic through shopping centres or crowd behaviour in large public spaces using the same idea.

Our smart cities will soon be more responsive and tailored to population needs than ever before. Hopefully, with the promising results from our crime prediction project, our streets will be safer than ever.


Shakila Khan Rumi is a PhD student in RMIT University’s School of Computer Science and Information Technology.