Leveraging the M2DC technology towards state-of-the-art image anonymization

April 29, 2019, 12:26 p.m.

Authors: Dejan Štepec (XLAB), Michał (PSNC)

The recent success of deep neural networks (DNNs) has attracted a lot of research across many fields. Several factors made this possible. Advances in the theory, together with algorithmic modifications, made it possible to train ever deeper models with strong representational power. Annotated data became available at massive scale. GPUs were another major factor, as they made it possible to process such large quantities of data on complex architectures in a reasonable time. The methods are now mature enough to be used in commercial products, solving problems that were thought unsolvable not so long ago. This is one of the main reasons for the surge of research interest in performing DNN computations efficiently. Specialized GPU architectures, along with supporting software, were developed to better utilize GPUs for specific tasks, and they are also used in the M2DC Cloud Appliance. This is a great opportunity for M2DC in the world of machine learning and its growing market.

Figure 1 - Deep learning perspective by Nvidia (Nvidia developer website, 2019)

Community policing promotes the implementation of bidirectional collaboration channels between citizens and Law Enforcement Agencies (LEAs). By enhancing the discovery of relevant and up-to-date information, it speeds up the detection of risks, eases their prevention and builds a continuum of collaboration that motivates citizens and LEAs to work together. M2DC has been working in collaboration with the European project TRILLION – TRusted, CItizen – LEA coILaboratIon over sOcial Networks. TRILLION delivers a fully-fledged, service-based platform and mobile applications that support knowledge-based, real-time collaboration among law enforcement agents, first responders and citizens, whilst ensuring that privacy and data protection are taken into account. The TRILLION consortium and supporting organizations include 6 citizen communities, 6 law enforcement stakeholders, 3 industrial players, and 5 universities and research centers. Extensive trials take place through pilots, early validations and serious game-based training across Italy, Portugal, Sweden, the Netherlands, and the United Kingdom, involving close to 2,000 citizens and law enforcement agency representatives.

The problem and how we solved it

One of the challenges of the TRILLION platform is providing anonymity to its users. Users can choose to report events anonymously, in which case we have to protect not only the identity of the user but also the multimedia attachments they may send. An image is worth a thousand words, and knowing that an image will be anonymized, with any content that may disclose identities discarded, may encourage more users to add attachments.

One may argue that such anonymized attachments are of no use to LEAs, but they still show, for example, what is going on and how many people are injured, which definitely enriches the basic textual information about the event.

We have built an image anonymization service that anonymizes faces, license plates and other visual information that could reveal identities in images submitted through the TRILLION platform in anonymous mode. We use modified state-of-the-art object detection methods based on deep learning, and we use GPUs for both training and inference. For the anonymization step, we use some of the methods provided by our partner from the IDENTITY project. The whole service was developed and deployed on an M2DC architecture.
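To make the anonymization step concrete, here is a minimal sketch of one common technique, pixelation of rectangular regions. It is only an illustration: the article does not say which anonymization methods the service actually uses, so the function name `pixelate_boxes`, the `(x_min, y_min, x_max, y_max)` box format and the `block` size are all assumptions.

```python
import numpy as np


def pixelate_boxes(image, boxes, block=8):
    """Return a copy of `image` with every box replaced by coarse tiles.

    image: array of shape (H, W) or (H, W, C).
    boxes: iterable of (x_min, y_min, x_max, y_max), max edges exclusive.
    block: tile size in pixels (hypothetical default, tune per use case).
    """
    out = image.copy()  # never modify the caller's image
    height, width = out.shape[:2]
    for x_min, y_min, x_max, y_max in boxes:
        # Clip the box to the image so out-of-range boxes are harmless.
        x0, y0 = max(0, x_min), max(0, y_min)
        x1, y1 = min(width, x_max), min(height, y_max)
        for y in range(y0, y1, block):
            for x in range(x0, x1, block):
                # `tile` is a view into `out`, so assigning writes in place.
                tile = out[y:min(y + block, y1), x:min(x + block, x1)]
                # Replace the tile with its mean value (per channel).
                tile[...] = tile.mean(axis=(0, 1)).astype(out.dtype)
    return out


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    photo = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
    # Assume a detector reported one face at this (made-up) location.
    anonymized = pixelate_boxes(photo, [(10, 10, 40, 40)], block=10)
    print(anonymized.shape)
```

Replacing each tile with its mean discards the fine detail inside the region while keeping the rest of the scene intact, which matches the goal of hiding identities without removing the context that is useful to LEAs.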

Figure 2 - Example of anonymized faces and scene text.

Why use M2DC?

The M2DC platform is particularly well suited to this kind of problem: it offers a wide range of the newest NVIDIA GPUs that can be deployed and utilized efficiently, together with the optimizations NVIDIA provides for them, such as reduced-precision computation. We modified the general-purpose object detection method Faster R-CNN to detect the categories that we want to anonymize. The powerful GPUs available in the M2DC testbeds were needed first to train the models efficiently, as training is the most computationally intensive part. The trained models are then used for inference at deployment.
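At inference time, a detector of this kind typically returns a list of labelled, scored bounding boxes, from which the regions to anonymize have to be selected. The sketch below shows one way this selection could look. It is not the service's actual code: the `Detection` type, the label names (taken loosely from the categories mentioned in this article) and the `min_score` threshold are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


@dataclass(frozen=True)
class Detection:
    """One detector output: class label, confidence and bounding box."""
    label: str
    score: float
    box: Box


# Assumed label names for the categories that may reveal identities.
SENSITIVE_LABELS: FrozenSet[str] = frozenset(
    {"face", "license_plate", "scene_text"}
)


def select_regions(
    detections: Iterable[Detection],
    min_score: float = 0.5,
    labels: FrozenSet[str] = SENSITIVE_LABELS,
) -> List[Box]:
    """Keep the boxes of sensitive categories with enough confidence."""
    return [
        d.box
        for d in detections
        if d.label in labels and d.score >= min_score
    ]


if __name__ == "__main__":
    detections = [
        Detection("face", 0.93, (12, 20, 60, 80)),
        Detection("car", 0.99, (0, 90, 200, 180)),  # not sensitive
        Detection("license_plate", 0.31, (80, 150, 120, 165)),  # low score
    ]
    print(select_regions(detections))
    print(select_regions(detections, min_score=0.3))
```

For anonymization, a fairly low threshold may be preferable, since anonymizing an extra region is usually less harmful than leaving a face or license plate visible.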

The TRILLION project aims at large-scale deployments at the national level, which would require a lot of computational power that is scalable, easy to maintain and at the same time cost-efficient to run. This is exactly what the M2DC project offers. In the following months, we plan to integrate our service as a Cloud Appliance in order to utilize all the capabilities that M2DC has to offer. The anonymization service was developed in collaboration with the TRILLION project to anonymize the content of attachments sent by users, as in the example in Figure 2. The TRILLION project makes extensive use of GPUs, and we plan to integrate it into the Cloud Appliance.

Read more about the M2DC cloud appliance, its functionalities and other use-cases at: https://m2dc.eu/en/appliances/m2dc-cloud-appliance/.