Mobile technologies offer new opportunities to obtain data on travel patterns, and the resulting insights can help advance sustainable mobility.
However, personal mobility data is sensitive and there is a risk that privacy protection rights are violated. Fosca Giannotti offered her views on this issue. She is Head of the Knowledge Discovery & Data Mining Lab, a joint research initiative of an Institute of the National Research Council of Italy (ISTI) and the Computer Science Department of the University of Pisa, Italy.
Can transport planners seize the benefits of collecting data while preserving privacy rights?
Definitely yes! In recent decades privacy-enhancing technologies have made significant progress, particularly in the case of mobility and location data. If one wants to provide a service based on this kind of data, it is almost always possible to transform personal mobility data in such a way that privacy is guaranteed while the data remains adequate for providing the service.
This paradigm is what we call ‘privacy by design’. It can be applied to tasks such as traffic monitoring, identifying the most frequently used routes to reach an attractor (such as a parking lot or the location of an extraordinary event), identifying travel patterns, and more.
These possibilities are surprisingly little known to the general public and to transport planners: while most believe that privacy regulations are insurmountable barriers, privacy-by-design technologies have the potential to set the power of data free.
A lot of technology is available and there is great untapped potential. I expect that it will only be about two years before we see much more of it being used.
When it comes to transforming data to protect privacy rights, is the assumption that anonymous data does not infringe privacy laws correct?
‘De-identified data’, in other words data from which the associated personal identity has been detached, is not necessarily ‘anonymous data’.
For example, if you can monitor the de-identified mobility data of an individual who moves from place A to place B between 8:00 and 9:00 every day and returns to place A between 18:00 and 21:00, you can easily conclude that A might be the home and B the place of work. If I then just link this to a phone directory, I can narrow down my guess and possibly re-identify the mysterious person.
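The home-and-work inference described here can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function name `infer_home_and_work`, the input format (hour of day plus a place label for one de-identified record), and the time windows, which simply mirror the 8:00 to 9:00 departure and 18:00 to 21:00 return in the example. It is not a description of any particular tool.

```python
from collections import Counter

def infer_home_and_work(pings):
    """Guess home and work for one de-identified person.

    pings: list of (hour_of_day, place_label) observations.
    The windows below are illustrative assumptions: the place seen most
    often at night is guessed to be home, the place seen most often
    during office hours is guessed to be work.
    """
    night = Counter(place for hour, place in pings if hour >= 21 or hour < 8)
    day = Counter(place for hour, place in pings if 9 <= hour < 18)
    home = night.most_common(1)[0][0] if night else None
    work = day.most_common(1)[0][0] if day else None
    return home, work

# Hypothetical observations for a single pseudonymous ID.
pings = [(23, "A"), (2, "A"), (10, "B"), (14, "B"),
         (22, "A"), (12, "C"), (16, "B")]
home, work = infer_home_and_work(pings)
print(home, work)
```

No name is attached to the record, yet the pair of guessed places is exactly the kind of detail that could be matched against a directory.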
In fact, the techniques used to make data anonymous employ more sophisticated data transformations, such as scrambling, generalising and adding noise, precisely to make the kind of tracking described above very difficult. A transport planner does not need to know the precise GPS locations of individual travellers; it is sufficient to know about larger movement patterns, for instance from a neighbourhood, so the data can be distorted to some extent.
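As a rough illustration of two of these transformations, generalising and adding noise, the sketch below snaps GPS points to coarse grid cells and perturbs the per-cell counts. The function names, the 0.01-degree cell size and the noise scale are all assumptions chosen for the example, and the sketch makes no claim to meet any formal privacy guarantee.

```python
import math
import random
from collections import Counter

def generalise(lat, lon, cell_deg=0.01):
    """Replace a precise point by the index of the grid cell containing it."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))

def laplace_noise(scale, rng):
    """Laplace-distributed noise, drawn as the difference of two exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)

def noisy_cell_counts(points, cell_deg=0.01, noise_scale=1.0, seed=None):
    """Count points per grid cell, then add noise to every count.

    Counts are rounded and clipped at zero so they remain usable as counts.
    """
    rng = random.Random(seed)
    counts = Counter(generalise(lat, lon, cell_deg) for lat, lon in points)
    return {cell: max(0, round(n + laplace_noise(noise_scale, rng)))
            for cell, n in counts.items()}

# Hypothetical GPS fixes; the first two fall into the same cell.
points = [(43.7167, 10.4017), (43.7169, 10.4012), (43.7231, 10.3966)]
print(noisy_cell_counts(points, seed=1))
```

The planner still sees roughly how many trips touch each neighbourhood-sized cell, while individual coordinates and exact counts are no longer exposed.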
Even with these techniques, is it possible to guarantee anonymity?
Is it possible to design a door-locking technology that is impossible to hack? No, but we are happy enough if the probability that someone can reproduce the key is very, very low.
Similarly for anonymity, it would be presumptuous to state that it is impossible to re-identify someone’s personal data from an anonymous data set. But we can accept the risk if the chance of a successful hack is extremely low.
What is your assessment of risks that data is obtained and possibly misused by third parties, for instance through cyber attacks or an intelligence service?
The current public debate about misuse of personal data is not about anonymised data but about the unauthorised acquisition and use of massive amounts of identified personal data by intelligence agencies. However, as I said before, even with anonymised data it is impossible to completely rule out that risk.
What is your advice to urban transport professionals who want to make the most of the benefits offered by transport telematics while preserving privacy rights?
Go for privacy by design!