Created on 27th March 2021
Our project aims to mask sensitive Personally Identifiable Information (PII) on the web. The masking logic works in real time: it can connect to the company VPN and intercept all traffic passing through the network. Clients can configure the masking logic once their accounts have been authorized by the admin. Several types of masks are provided so that the software covers all types of PII, especially those found in the pharmaceutical industry.

The software can be deployed either in the cloud or on-premise, depending on the client company’s preference. Containerized deployment on Google Kubernetes Engine speeds up the anonymization process and provides auto-scaling, auto-healing in case of errors, regular health checks, and periodic report generation. The CI/CD pipeline makes it easy to push and deploy code changes. We used Istio’s service mesh architecture to deploy the project on Google Kubernetes Engine.

Squid Proxy acts as a reverse proxy capable of intercepting all the traffic on a given network. It runs as a sidecar to the Python ICAP server, which masks and unmasks PII in the intercepted traffic. Redis provides in-memory caching of the masking logic, request configurations, response configurations, and user ID management. The configuration software is built with the Flask framework, and PostgreSQL serves as the relational database. Presidio's Analyzer Engine, which runs on spaCy NLP models, is used to detect the sensitive PII in requests and responses so that it can be anonymized.
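To make the mask/unmask step more concrete, here is a minimal Python sketch of reversible masking. It is not the project's implementation: it swaps in two simple regular expressions for Presidio's detection and a plain dictionary for the Redis cache, and the names `PII_PATTERNS` and `TokenMasker` are hypothetical, chosen only for illustration.

```python
import re
from typing import Dict

# Assumption: two illustrative regex patterns stand in for real PII detection.
# The actual project relies on Presidio's analyzer, not on these patterns.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


class TokenMasker:
    """Replaces detected PII with placeholder tokens and can reverse the mapping.

    Assumption: the two dictionaries stand in for an in-memory cache such as Redis.
    """

    def __init__(self) -> None:
        self._token_to_value: Dict[str, str] = {}
        self._value_to_token: Dict[str, str] = {}

    def _token_for(self, kind: str, value: str) -> str:
        # Reuse the existing token so the same value always maps to the same mask.
        token = self._value_to_token.get(value)
        if token is None:
            token = f"<{kind}_{len(self._token_to_value) + 1}>"
            self._value_to_token[value] = token
            self._token_to_value[token] = value
        return token

    def mask(self, text: str) -> str:
        # Apply each pattern in turn, substituting every match with its token.
        for kind, pattern in PII_PATTERNS.items():
            text = pattern.sub(
                lambda m, k=kind: self._token_for(k, m.group(0)), text
            )
        return text

    def unmask(self, text: str) -> str:
        # Restore the original values for every token seen so far.
        for token, value in self._token_to_value.items():
            text = text.replace(token, value)
        return text


if __name__ == "__main__":
    masker = TokenMasker()
    request_body = "Contact jane.doe@example.com or call 555-123-4567 today"
    masked = masker.mask(request_body)
    print(masked)                 # Contact <EMAIL_1> or call <PHONE_2> today
    print(masker.unmask(masked))  # the original request body
```

In a proxy setup like the one described, a step of this kind would presumably run on the request path (mask) and the response path (unmask), with the token mapping kept in the shared cache so that both directions agree.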
We ran into a couple of challenges while developing our project: