The AA ecosystem does not currently have a defined method for bulk access to data. Access to the data is more critical than its form: for the AA to be both privacy-conscious and provide federated access to consented data, the data must be user-anonymised.
This will eventually open the door to better modelling and more insightful research, and pave the way for a more sustainable lending ecosystem.
We built the project on the assumption that a new 'consent-type: Anonymous' will be supported in the consent artefact.
The functionalities we built:
A Query Layer
Integrated Privacy Models
The primary areas of confusion that came up while building a privacy framework compatible with model building are outlined below:
Getting anonymised data into the system:
We propose that, to get such data into the system, the AA collective must consider introducing a new type of consent, one specific to anonymisation. Permission to consume anonymised data could be a de facto consent secured from the user by FIUs/AAs.
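As an illustration, such an anonymisation-specific consent could ride on the existing consent-artefact structure. The sketch below is a simplified Python representation; the field names are indicative rather than the exact ReBIT schema, and the "ANONYMOUS" consent type is the hypothetical addition proposed above.

    # Illustrative only: simplified consent-artefact fields, not the exact ReBIT schema.
    # "ANONYMOUS" is the proposed new consent type for anonymised, bulk data access.
    anonymised_consent_detail = {
        "consentTypes": ["TRANSACTIONS", "ANONYMOUS"],   # "ANONYMOUS" is the proposed addition
        "fetchType": "PERIODIC",
        "fiTypes": ["DEPOSIT"],
        "Purpose": {
            "code": "ANON-RESEARCH",                     # hypothetical purpose code for anonymised use
            "text": "Aggregate modelling on user-anonymised data",
        },
        "DataConsumer": {"id": "FIU-123"},               # requesting FIU
        "DataLife": {"unit": "MONTH", "value": 6},
    }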
Getting anonymisation models to work:
The anonymisation models described in our problem statement are meant to operate on the full dataset.
Applying them directly to model-training data raises the following issue: each data pull would be anonymised with freshly computed settings, so the transformed variables would differ from one pull to the next.
For models built on the anonymised data to work in a production setting for predictive purposes, the anonymisation process itself has to be repeatable.
Our solution: The system will require a method to save the anonymisation settings used for a data pull by a requesting entity for a pre-defined interval. A subsequent request by the same entity for the same set of variables will reuse those settings, ensuring the data remains compatible with their existing models. This has been factored into the proposed architecture, which we will build going forward.
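A minimal sketch of the settings-reuse idea follows, assuming a simple keyed store; the names settings_store and get_settings, and the 30-day expiry, are hypothetical placeholders for the pre-defined interval and persistence layer the actual architecture would provide.

    # Minimal sketch: reuse anonymisation settings for repeat pulls by the same entity.
    # settings_store, get_settings and EXPIRY_SECONDS are hypothetical placeholders.
    import hashlib
    import time

    EXPIRY_SECONDS = 30 * 24 * 3600   # assumed pre-defined retention interval (30 days)
    settings_store = {}               # (entity_id, variable-set hash) -> (settings, saved_at)

    def _key(entity_id, variables):
        # Hash the sorted variable names so the same set of variables maps to the same key.
        var_hash = hashlib.sha256(",".join(sorted(variables)).encode()).hexdigest()
        return (entity_id, var_hash)

    def get_settings(entity_id, variables, build_settings):
        """Return saved anonymisation settings for this entity and variable set if still
        valid; otherwise build fresh settings (e.g. generalisation bins, noise scale)."""
        key = _key(entity_id, variables)
        entry = settings_store.get(key)
        if entry and time.time() - entry[1] < EXPIRY_SECONDS:
            return entry[0]                       # same settings -> repeatable anonymisation
        settings = build_settings(variables)      # fit settings on the current pull
        settings_store[key] = (settings, time.time())
        return settings

Keying the store on the requesting entity and the sorted variable set means a repeat request reproduces the same generalisation bins or noise parameters, so the features stay compatible with models trained on the earlier pull.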
Discussion