Masking personal data in Symfony3 and Symfony4

Find out how you can protect sensitive data against indexing and access by third parties.

Ecommerce and shopping without registration

In modern Internet applications, e.g. e-commerce, it is quite common to serve users without the requirement of user authentication. In simple terms, it is about making it possible to place orders without prior registration or logging in. From the user's point of view, it is very convenient and, looking at the statistics, quite commonly used. This practice applies to various sales, offer and customer support systems.

From the technical side the situation is less comfortable, because after accepting the order we usually have to provide the user with a summary or confirmation page. For orders, this is a page the user can return to later to check the progress of the order. Such a page usually contains sensitive personal data, e.g. home address, contact telephone number or e-mail address: in general, all the data necessary to handle the order.

Most modern ecommerce systems that process orders without registration let the user check the status and content of the order at any time through a special, unique and secure link to the page containing the order data. The length and complexity of the link matter for the security of access to the data, but even the most complex link does not guarantee data security, because anyone who holds the link has access to the full customer data, delivery details and status information.

We often don't realize how easy it is to make a unique URL publicly available. It is enough to accidentally paste the link into a messenger window, e.g. Skype or Discord, for a bot to view and download the entire content behind our unique ("safe") URL. There is also the risk of such pages being indexed by search engine bots, which scrupulously scour all accessible websites, documents and files for URLs. One link inadvertently pasted in the wrong place is enough for our personal data to be indexed and made available to the public.

What can we do to improve the security of our e-commerce system?

Unique and safe URL

First, let's start with our unique URL. A correctly prepared link does not expose integer (INT) record identifiers; it can be implemented as a single token or in the identifier + token format. Even a relatively short token, provided it is generated randomly, makes accidental discovery of the URL very unlikely.

Sample unique URL:

https://example.lsb.com.pl/order/3xAvK12fke1kefg

Unique URL using UUIDv4:

https://example.lsb.com.pl/order/72afc88e-c126-48b1-852b-c3ee535eb649

Two-component formats (ID + token) are also often seen:

https://example.lsb.com.pl/order/12/3xAvK12fke1kefg

(Not recommended: the object identifier can be iterated, so a flaw in token verification creates a risk of data disclosure.)

Two-component formats (UUID + token):

https://example.lsb.com.pl/order/72afc88e-c126-48b1-852b-c3ee535eb649/3xAvK12fke1kefg
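
The article does not show how such tokens are generated, so the following is only a minimal sketch in plain PHP. The function names are hypothetical, and the token here is hexadecimal rather than the alphanumeric form used in the sample URLs. It relies on random_bytes() for unpredictable values and on hash_equals() for a constant-time comparison when the token from the URL is verified.

```php
<?php
declare(strict_types=1);

/**
 * Returns a random hexadecimal token (hypothetical helper).
 * 16 random bytes give 128 bits of entropy and a 32-character string.
 */
function generateOrderToken(int $bytes = 16): string
{
    return bin2hex(random_bytes($bytes));
}

/**
 * Generates a random (version 4) UUID.
 */
function generateUuidV4(): string
{
    $data = random_bytes(16);
    $data[6] = chr((ord($data[6]) & 0x0f) | 0x40); // set the version nibble to 4
    $data[8] = chr((ord($data[8]) & 0x3f) | 0x80); // set the RFC 4122 variant bits

    // 32 hex characters split into 8 groups of 4, joined as 8-4-4-4-12
    return vsprintf('%s%s-%s-%s-%s-%s%s%s', str_split(bin2hex($data), 4));
}

/**
 * Compares the stored token with the one taken from the URL in constant time.
 */
function isValidOrderToken(string $storedToken, string $tokenFromUrl): bool
{
    return hash_equals($storedToken, $tokenFromUrl);
}
```

In the UUID + token format, the UUID would identify the order and the token would be checked with isValidOrderToken() before any data is displayed.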

Meta tags

Secondly, with a properly prepared URL in place, you should tell indexing bots not to index pages containing personal or sensitive data in case a bot finds such a page. To do this, add the appropriate META tags in the HEAD section of the page:

<META NAME="robots" CONTENT="noindex"> (do not index the page)
<META NAME="robots" CONTENT="nofollow"> (do not follow links on the page)
<META NAME="robots" CONTENT="noindex,nofollow"> (do not index the page, do not follow links)
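
META tags only work for HTML pages. For other responses, such as JSON, search engines also read the same directives from the X-Robots-Tag HTTP header, which the original text does not cover. The sketch below is plain PHP with hypothetical helper names; it builds the directive string once and reuses it for both the tag and the header line.

```php
<?php
declare(strict_types=1);

/**
 * Builds the robots directive string (hypothetical helper).
 * Returns "all" when nothing is restricted.
 */
function robotsDirectives(bool $allowIndex, bool $allowFollow): string
{
    $directives = [];
    if (!$allowIndex) {
        $directives[] = 'noindex';
    }
    if (!$allowFollow) {
        $directives[] = 'nofollow';
    }

    return $directives === [] ? 'all' : implode(', ', $directives);
}

/**
 * Renders the META tag for the HEAD section of an HTML page.
 */
function robotsMetaTag(string $directives): string
{
    return sprintf('<meta name="robots" content="%s">', htmlspecialchars($directives, ENT_QUOTES));
}

/**
 * Renders the equivalent HTTP header line, usable for non-HTML responses.
 */
function robotsHeaderLine(string $directives): string
{
    return 'X-Robots-Tag: ' . $directives;
}
```

In plain PHP the header line can be passed to header() before any output is sent; in a Symfony controller the same value can be set with $response->headers->set('X-Robots-Tag', $directives).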

Data Masking

Third, it is worth anonymizing the displayed data, because even the most complex link is at risk of accidental disclosure. The next step to implement in our system is therefore masking sensitive data, combined with simple authentication and authorization of the user before the unmasked data is shown. In the case discussed here the user has no account in our system, so authentication can rely on something known only to the person for whom the page at the secret, unique URL is intended. For an order confirmation this may be the telephone number or e-mail address provided when placing the order, or a PIN that we send to the user by e-mail or SMS after receiving a request to access the data.

How the authentication service is organized technically depends largely on how your application is built, so we will not go into it here. We will focus only on anonymizing data through masking.
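
Although the authentication flow itself is out of scope here, the PIN variant can be illustrated in a few lines. This is a minimal sketch in plain PHP with hypothetical function names, assuming the application stores only a hash of the PIN next to the order. Delivery by e-mail or SMS, expiry and limiting the number of attempts are left out.

```php
<?php
declare(strict_types=1);

/**
 * Generates a numeric PIN using a cryptographically secure source.
 */
function generatePin(int $digits = 6): string
{
    $pin = '';
    for ($i = 0; $i < $digits; $i++) {
        $pin .= (string) random_int(0, 9);
    }

    return $pin;
}

/**
 * Hashes the PIN so that the plain value never has to be stored.
 */
function hashPin(string $pin): string
{
    return password_hash($pin, PASSWORD_DEFAULT);
}

/**
 * Checks the PIN submitted by the user against the stored hash.
 */
function verifyPin(string $submittedPin, string $storedHash): bool
{
    return password_verify($submittedPin, $storedHash);
}
```

Only after verifyPin() succeeds would the application show the unmasked data; until then the page displays the masked version.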

An example of data masking can be done in the following way:

Masked (anonymized) content:

Jo**** D****

Not masked content:

John Doe

In this example we have used "*" as the masking character. The number of masking characters should not match the actual number of masked characters; this is an additional safeguard that prevents outsiders from determining the real length of the masked data. At the same time, the masking method should let the intended recipient match the masked data against their actual data, so that the user can tell whether the page displayed is really the one they wanted to view.

[Figure: masked data compared with unmasked data]

Using annotations to mask data

For web applications based on the Symfony3 or Symfony4 framework, a bundle comes to the rescue: https://packagist.org/packages/superbrave/gdpr-bundle

Using the provided anonymizing service and annotations, we can easily indicate which entity properties are to be anonymized. The bundle has built-in anonymization methods for various data types, e.g. IP addresses, dates and data collections, and it also lets you plug in your own anonymizer classes.

In one of our projects, we used the approach described above and prepared three dedicated data anonymizers:

  • EmailAnonymizer
  • MaskAnonymizer
  • ZipCodeAnonymizer

The anonymizers mask data in such a way that only the real owner can judge, with a high degree of probability, whether the masked data belongs to them.
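
Only MaskAnonymizer is shown in this article. To give an idea of what e-mail and postal code masking might look like, here is a hypothetical sketch written as plain functions rather than bundle classes. The rules used (keep the first character and the domain of an e-mail address, keep the first two characters of a postal code) are assumptions, not the project's actual implementation.

```php
<?php
declare(strict_types=1);

/**
 * Masks an e-mail address, keeping the first character of the local part
 * and the whole domain. The mask length is random on purpose.
 */
function maskEmail(string $email, string $maskChar = '*'): string
{
    $atPosition = mb_strrpos($email, '@');
    if ($atPosition === false || $atPosition === 0) {
        // Not a recognisable address: hide everything
        return str_repeat($maskChar, random_int(5, 10));
    }

    $firstChar = mb_substr($email, 0, 1);
    $domain = mb_substr($email, $atPosition); // includes the "@"

    return $firstChar . str_repeat($maskChar, random_int(3, 8)) . $domain;
}

/**
 * Masks a postal code, keeping only its first two characters.
 */
function maskZipCode(string $zipCode, string $maskChar = '*'): string
{
    $prefix = mb_substr(trim($zipCode), 0, 2);

    return $prefix . str_repeat($maskChar, random_int(3, 5));
}
```

As with the name example earlier, the owner of the address can still recognise it from the visible fragment, while the random mask length hides how long the original value was.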

Code examples

<?php
declare(strict_types=1);
namespace LSB\UtilBundle\Anonymizer;

use Superbrave\GdprBundle\Anonymize\Type\AnonymizerInterface;

/**
* Masks string values
*
* Class MaskAnonymizer
* @package LSB\UtilBundle\Anonymizer
*/
class MaskAnonymizer implements AnonymizerInterface
{
   /**
    * {@inheritdoc}
    */
   public function anonymize($propertyValue, array $options = [])
   {
       return $this->maskProperty($propertyValue);
   }

   /**
    * @param $propertyValue
    * @param string $maskChar
    * @param int|null $prefixLength
    * @return string|null
    */
   private function maskProperty($propertyValue, string $maskChar = '*', ?int $prefixLength = null): ?string
   {
       if (!$propertyValue) {
           return null;
       }

       try {
           $propertyValue = (string) $propertyValue;
       } catch (\Throwable $e) {
           // Values that cannot be cast to string (e.g. objects without __toString()) are fully masked
           return str_repeat($maskChar, rand(5, 15));
       }

       // Check the length of the string
       $propertyLength = mb_strlen($propertyValue);

       if ($prefixLength !== null && $prefixLength > 0) {
           $showChars = $prefixLength;
       } elseif ($propertyLength < 3) {
           $showChars = 0;
       } elseif ($propertyLength < 8) {
           $showChars = 2;
       } else {
           $showChars = 3;
       }

       if ($showChars) {
           $prefix = mb_substr($propertyValue, 0, $showChars);
       } else {
           $prefix = '';
       }

       $maskLength = rand(($propertyLength > 4 ? $propertyLength - 3 : $propertyLength), ($propertyLength + 1));
       $mask = str_repeat($maskChar, $maskLength);

       return $prefix.$mask;
   }
}

Configuring the new anonymizer in the services.yml file:

LSB\UtilBundle\Anonymizer\MaskAnonymizer:
   tags:
       - { name: superbrave_gdpr.anonymizer, type: mask }

After installing the bundle and declaring your own anonymizing classes, you can use annotations with your own type in entities, e.g. mask: 

@GDPR\Anonymize(type="mask")

An example of using annotations in an entity:

/**
* @var string
* @Assert\Length(max=255)
* @ORM\Column(type="string", name="customer_address",  length=255, nullable=true)
* @GDPR\Anonymize(type="mask")
*/
protected $customerAddress;

In this way we indicate which properties of the entity are to be anonymized. Anonymization is not fully automatic: you have to call the anonymize() method of the "superbrave_gdpr.anonymizer" service manually and pass it the object you want to anonymize. An example of using the service in a controller action for an object of the Order class:

$anonymizer = $this->get('superbrave_gdpr.anonymizer');
$anonymizer->anonymize($order);

This way, the anonymized data remains safe even if the "safe" URL is indexed or leaked.
An important feature of the anonymizer is that it supports related objects: dependent objects, or collections of dependent objects, of our entity can also be anonymized automatically, provided their properties are marked with the @GDPR\Anonymize annotation.

For REST APIs and classic web applications using Twig templates, the approach described above significantly improves the security of user data and protects users' privacy against unintentional access.

[Figure: masked JSON response with annotations]
