Data enrichment is the process of combining existing customer data with external data sources to enhance its value and usability. This can include appending demographic, firmographic, or behavioral data to customer records, providing a more complete and accurate picture of the customer. Data enrichment can be used to update existing customer data, validate data quality, and even identify new business opportunities.
Because PDL enriches so aggressively, our own customer records became a liability. We accidentally exposed inferred data (e.g., “likely income range”) to sales reps who had no business seeing it. Worse, PDL doesn’t offer granular field-level suppression: you either accept their full enrichment payload or build a custom middleware filter yourself.
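A middleware filter of the kind described above could be as simple as a per-role allowlist applied to each enriched record before it reaches an internal tool. The sketch below is illustrative only: the role names, the field names, and the `filter_enrichment` helper are assumptions made for the example, not PDL's actual schema or API.

```python
from typing import Any, Dict, Mapping

# Hypothetical per-role allowlists. The field names are made up for this
# example and do not reflect any provider's real payload schema.
ALLOWED_FIELDS = {
    "sales": frozenset({"full_name", "job_title", "company_name"}),
    "compliance": frozenset(
        {"full_name", "job_title", "company_name", "inferred_income_range"}
    ),
}


def filter_enrichment(payload: Mapping[str, Any], role: str) -> Dict[str, Any]:
    """Return only the fields that the given role is allowed to see.

    Unknown roles get an empty allowlist, so the filter fails closed.
    """
    allowed = ALLOWED_FIELDS.get(role, frozenset())
    return {key: value for key, value in payload.items() if key in allowed}


# Example enriched record (fabricated sample values).
record = {
    "full_name": "Jane Doe",
    "job_title": "Engineer",
    "company_name": "Example Corp",
    "inferred_income_range": "100k-150k",
}

sales_view = filter_enrichment(record, "sales")
```

An allowlist is used rather than a blocklist so that any new field the provider adds stays hidden until someone explicitly approves it.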
If an attacker compromises the enriched database, they gain access to a "shadow profile": a highly detailed dossier on the customer that the customer never explicitly authorized. This drastically increases the severity of a breach. A leak of emails is a nuisance; a leak of enriched data, with its insights into personal habits and demographics, is raw material for identity theft and social engineering.
Imagine a company holds a list of 10,000 customers. To enrich this list, they send queries to an enrichment provider such as PDL. While modern APIs are usually served over encrypted connections, legacy systems or bulk processing scripts sometimes write these queries to unencrypted logs or send them over insecure channels.
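One way to reduce that risk in a bulk script is to refuse any endpoint that is not HTTPS and to log only keyed digests of the identifiers being looked up. The following is a minimal sketch under stated assumptions: the query field names, the `LOG_KEY` placeholder, and both helper functions are hypothetical and not part of any provider's SDK.

```python
import hashlib
import hmac
from urllib.parse import urlparse

# Hypothetical query fields that identify a person.
SENSITIVE_KEYS = {"email", "phone", "name"}

# Placeholder only: in practice the key would come from a secret store.
LOG_KEY = b"replace-with-a-real-secret"


def require_https(url: str) -> str:
    """Return the URL unchanged, or raise if it would not be sent over TLS."""
    scheme = urlparse(url).scheme
    if scheme != "https":
        raise ValueError(f"refusing to send enrichment query over scheme {scheme!r}")
    return url


def redact_for_log(query: dict) -> dict:
    """Replace sensitive values with a truncated HMAC-SHA256 digest.

    The digest lets log lines for the same person be correlated without
    writing the raw identifier to disk.
    """
    redacted = {}
    for key, value in query.items():
        if key in SENSITIVE_KEYS:
            mac = hmac.new(LOG_KEY, str(value).encode("utf-8"), hashlib.sha256)
            redacted[key] = "hmac:" + mac.hexdigest()[:16]
        else:
            redacted[key] = value
    return redacted


safe_entry = redact_for_log({"email": "jane@example.com", "min_likelihood": 6})
```

A keyed digest is used instead of a plain hash because email addresses are easy to guess and a plain SHA-256 of one can be reversed by a dictionary attack.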
The exposed data sat in indices whose labels pointed to People Data Labs (PDL) and another broker called OxyData.
While the data originated from the San Francisco-based data broker People Data Labs (PDL), the exposure was caused by a customer who failed to secure their own database. This incident highlights the profound risks inherent in the data enrichment industry, where sensitive information is sold and then left vulnerable by third parties.
Data brokers such as PDL aggregate data from various sources: public records, scraping, and partner sharing. When a company integrates this data, it inherits the privacy risks of the broker’s collection methods. If a source turns out to be unethical or non-compliant (e.g., it contains data on people protected by strict data-protection laws such as the GDPR in the EU), the broker’s customer may find itself liable for processing unlawfully collected data.