Security - PII
When building chat applications, it is crucial to handle sensitive data securely, especially Personally Identifiable Information (PII). PII is any data that can be linked, directly or indirectly, to an individual, so protecting user privacy means preventing such data from being transmitted to language models.
Example of PII
Consider the text below, which contains several pieces of PII (a name, a location, a credit card number, and a phone number):
Hello, my name is John and I live in New York. My credit card number is 3782-8224-6310-005 and my phone number is (212) 688-5500.
And here is the anonymized version:
Hello, my name is <PERSON> and I live in <LOCATION>. My credit card number is <CREDIT_CARD> and my phone number is <PHONE_NUMBER>.
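As a rough illustration of what produces the anonymized version, the sketch below replaces detected character spans with placeholders named after their entity type. It is a conceptual sketch only, not Presidio's implementation: replace_spans and span are hypothetical helpers, and the spans are located by hand rather than by a detector.

```python
def replace_spans(text, spans):
    """Replace each (start, end, entity_type) span with an <ENTITY_TYPE> placeholder.

    Spans are assumed not to overlap. They are processed from right to left so
    that earlier offsets stay valid while the text changes length.
    """
    for start, end, entity_type in sorted(spans, reverse=True):
        text = text[:start] + f"<{entity_type}>" + text[end:]
    return text


text = "Hello, my name is John and I live in New York."


def span(value, entity_type):
    # Hypothetical stand-in for a detector: find the value by exact match.
    start = text.index(value)
    return (start, start + len(value), entity_type)


spans = [span("John", "PERSON"), span("New York", "LOCATION")]
print(replace_spans(text, spans))
# prints: Hello, my name is <PERSON> and I live in <LOCATION>.
```

A real detector supplies the spans and entity types itself; only the replacement step is shown here.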
Analyze and anonymize data
Integrate Microsoft Presidio for robust data sanitization in your Chainlit application.
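Before proceeding, install the Python packages required for PII analysis and anonymization. With pip, these are presidio-analyzer, presidio-anonymizer, and spacy, plus a spaCy language model. Presidio's default configuration expects the English model en_core_web_lg, which can be downloaded with python -m spacy download en_core_web_lg.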
Create an async context manager that uses the Presidio Analyzer to inspect incoming text for PII, and wrap the logic of your main message handler in it so that messages are checked before they are processed. When PII is detected, give the user the option to continue or cancel the operation, using Chainlit’s messaging system to ask.
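A minimal sketch of such a context manager might look like the following. Several names are assumptions made for illustration: PIIDetectedError, get_analyzer, presidio_analyze, and ask_user are hypothetical, and the analyze and confirm parameters exist only so the check can be exercised without Presidio or Chainlit installed. The cl.Action arguments follow recent Chainlit releases, which take payload; older releases used value instead, so adjust to the installed version.

```python
from contextlib import asynccontextmanager
from functools import lru_cache


class PIIDetectedError(Exception):
    """Raised when the user chooses not to send text that contains PII."""


@lru_cache(maxsize=1)
def get_analyzer():
    # Imported lazily; building the engine loads an NLP model, so do it once.
    from presidio_analyzer import AnalyzerEngine
    return AnalyzerEngine()


def presidio_analyze(text):
    """Return Presidio's findings for `text` (an empty list means no PII found)."""
    return get_analyzer().analyze(text=text, language="en")


async def ask_user(entity_types):
    """Ask the user in the Chainlit UI whether to continue; True means continue."""
    import chainlit as cl

    # Assumption: a Chainlit version where cl.Action takes `payload`.
    res = await cl.AskActionMessage(
        content=f"Possible PII detected: {', '.join(entity_types)}. Continue?",
        actions=[
            cl.Action(name="continue", payload={"value": "continue"}, label="Continue"),
            cl.Action(name="cancel", payload={"value": "cancel"}, label="Cancel"),
        ],
    ).send()
    # A timeout returns None, which is treated as a cancel.
    return bool(res) and res.get("payload", {}).get("value") == "continue"


@asynccontextmanager
async def check_text(text, analyze=presidio_analyze, confirm=ask_user):
    """Enter the block only if `text` has no PII or the user agrees to continue."""
    results = analyze(text)
    if results:
        entity_types = sorted({r.entity_type for r in results})
        if not await confirm(entity_types):
            raise PIIDetectedError("User cancelled: message contains PII")
    yield
```

Inside a handler registered with @cl.on_message, the processing would then sit in an `async with check_text(message.content):` block, with PIIDetectedError caught around it to end the turn quietly when the user cancels.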
If your application needs to anonymize PII rather than only detect it, Presidio can do that too. Modify the check_text context manager so that it yields the anonymized text when PII is detected.
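One way this modification could look is sketched below, assuming Presidio's AnalyzerEngine and AnonymizerEngine with their default settings. The get_engines helper and the optional analyzer and anonymizer parameters are hypothetical additions that let the engines be built once and swapped out in tests. The user confirmation step is left out to keep the sketch short.

```python
from contextlib import asynccontextmanager
from functools import lru_cache


@lru_cache(maxsize=1)
def get_engines():
    # Imported lazily; both engines are built once and reused.
    from presidio_analyzer import AnalyzerEngine
    from presidio_anonymizer import AnonymizerEngine
    return AnalyzerEngine(), AnonymizerEngine()


@asynccontextmanager
async def check_text(text, analyzer=None, anonymizer=None):
    """Yield `text` with detected PII replaced, or unchanged if none is found."""
    if analyzer is None or anonymizer is None:
        analyzer, anonymizer = get_engines()
    results = analyzer.analyze(text=text, language="en")
    safe_text = text
    if results:
        # With no operators configured, Presidio substitutes <ENTITY_TYPE>.
        safe_text = anonymizer.anonymize(text=text, analyzer_results=results).text
    yield safe_text
```

A handler would then use `async with check_text(message.content) as safe_text:` and pass safe_text, not the original message, on to the language model.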