Trade surveillance, unstructured data and privacy concerns
In recent years, cutting-edge technology has been introduced to the trade surveillance sector, changing both how data is collected and analysed and the nature of the services on offer. However, if we take a step back, many instances of market abuse are planned days, weeks, or even months before they are executed, so trade data alone only tells part of the story.
The introduction of electronic communication data has given regulated firms the ability to take a proactive approach to market abuse and surveillance. When collecting data from various messaging apps, phone calls, emails, and other sources, it is now possible to convert unstructured data and analyse it in a structured fashion. With this, however, there arise issues around data protection and privacy, and how they balance with the legal obligations of financial institutions and regulators.
The role of trade surveillance in financial markets
Before we move to the complexities of unstructured data and privacy concerns, it’s essential to highlight the basic elements of trade surveillance within investment markets.
- Monitoring financial data
- Identifying market manipulation
- Collating relevant data
- Reporting to regulators
This brings us to the critical purpose of trade surveillance: to preserve market integrity, protect investors, and assist with the prosecution of those engaging in market abuses. We live in an era of high-tech trading systems and ever-more innovative market abuse strategies, and the cost of fulfilling regulatory duties continues to rise.
Understanding unstructured data in trade surveillance
The term unstructured data relates to datasets that can be human or machine-generated in a textual or non-textual format. In essence, they aren’t stored in a structured database format, making it difficult (if not impossible) to search and analyse in their original format. The most common forms of unstructured data include:
- Emails
- Word-processing documents
- Audio files
- Video files
- Collaboration software
- Instant messaging
- Text messages
- Phone calls
It is estimated that unstructured data makes up 80% of all enterprise data and this will likely grow in the future. Consequently, if your surveillance activities are based on only 20% of structured data, how much potentially incriminating information are you missing?
There are four critical differences between structured and unstructured data:
- Defined vs Undefined
- Quantitative vs Qualitative
- Easy vs Difficult to analyse
- Predefined format vs Variety of formats
While both structured and unstructured data can be stored in the cloud, unstructured data requires significantly more storage space and is therefore less efficient to hold. On the other hand, unstructured data is less expensive to collect than structured data and can be accumulated much more quickly. Converting it into usable information does require the latest technology, such as AI, ML and Big Data analysis.
Privacy concerns and trade surveillance
There are broader issues when it comes to balancing the requirements of surveillance with those of GDPR, in particular the principle of data minimisation, under which only data that is needed for the stated purpose (here, surveillance) should be collected and processed. Applying this principle helps firms meet the dual requirements of promoting market integrity and protecting data privacy in a consistent, principle-led fashion.
It is estimated that the volume of unstructured data more broadly will reach a staggering 175 billion TB annually by 2025, presenting substantial logistical, analytical and cost challenges. Under UK data protection regulations, companies must make sure that all information collected is:
- Used fairly, lawfully and transparently
- Used for specified, explicit purposes
- Used in a manner which is only necessary for its purpose
- Kept up-to-date where applicable
- Accurate
- Retained for no longer than is necessary (legally or operationally)
Data must also be stored securely, with access limited to relevant individuals and protected from loss, destruction, unlawful use, or damage. This brings us to cybersecurity, a broader topic that companies holding private and personal information must consider.
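The data minimisation and retention principles listed above can be illustrated with a short Python sketch. It is not a compliance tool: the field names in `REQUIRED_FIELDS` and the five-year `RETENTION` period are assumptions chosen purely for illustration, and the right values depend on the regulations that apply to your firm.

```python
from datetime import date, timedelta

# Assumption for illustration: the only fields surveillance actually needs.
REQUIRED_FIELDS = {"timestamp", "sender", "recipient", "body"}

# Assumption for illustration: a five-year retention period.
RETENTION = timedelta(days=5 * 365)


def minimise(record: dict) -> dict:
    """Keep only the fields needed for the surveillance purpose."""
    return {key: value for key, value in record.items() if key in REQUIRED_FIELDS}


def is_expired(collected_on: date, today: date, retention: timedelta = RETENTION) -> bool:
    """Return True when a record has been held longer than the retention period."""
    return today - collected_on > retention


raw_record = {
    "timestamp": "2024-03-01 09:15",
    "sender": "alice",
    "recipient": "bob",
    "body": "call me about the order",
    "home_address": "not needed for surveillance",
}
stored_record = minimise(raw_record)
```

Here `stored_record` drops the `home_address` field before anything is written to storage, and `is_expired` could be run on a schedule to identify records that are due for deletion.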
Addressing privacy concerns in unstructured data surveillance
As a RegTech firm, eflow feels it is essential to address the elephant in the room: employee privacy concerns, especially in the context of unstructured data surveillance. Many people ask whether, as an employee of a company, you need to consent to surveillance of your communications during the working day.
Where an employer can identify a lawful basis for gathering, analysing, and holding private data, consent would not typically be required from the individual(s). In this scenario, it would likely come under the “legitimate interest” basis in the context of the wider business operation and, in this case, abiding by strict regulations. As we mentioned above, there should be no scope to use the data collected for any other purpose, and it must be stored securely and accessed on a strict need-to-use basis.
It’s important to note that employers must make their employees aware of the surveillance systems and the specific scope. When it comes to data and privacy protection for third parties communicating with individuals within your business, this will again be covered by the legitimate interest argument.
Ironically, while we tend to focus on privacy and data protection, US regulators issued close to $2 billion in fines in September 2022 for the opposite problem. These fines related to the failure of several financial institutions to keep records of business communications sent via unauthorised messaging apps, most notably WhatsApp.
Financial institutions are being squeezed from both sides: data privacy regulations restrict the use of the data collected, while regulators are issuing enormous fines for failure to maintain communication records.
Converting unstructured data
As it stands, unstructured data offers little assistance to those looking to identify instances of market abuse. There may be elements you can extract, but because of the format this would be time-consuming at best and inaccurate and potentially misleading at worst. To maximise the benefits of unstructured data, it should be converted into a structured format.
The main stages of this process are:
- Collating the unstructured data
- Creating a preferred data structure
- Cleaning the unstructured data
- Extracting entities
- Storing the data
- Analysing the data
The less structure there is, the greater the challenge of extracting the relevant information and rebuilding it within a predefined structure. To maximise accuracy and meet legal obligations and business responsibilities, data quality checks must be carried out constantly, with tweaks and changes made where applicable. This is not a one-off activity; it should be ongoing at all times to fulfil your regulatory obligations and your responsibilities to employees and clients.
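The stages above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production pipeline: the raw line format (`[timestamp] sender -> recipient: text`), the `MessageRecord` structure and the `$TICKER` convention are all hypothetical, and real message exports vary widely by source.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field
from datetime import datetime

# Hypothetical raw export format: "[2024-03-01 09:15] alice -> bob: text"
LINE_PATTERN = re.compile(
    r"^\[(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2})\]\s+"
    r"(?P<sender>\w+)\s*->\s*(?P<recipient>\w+):\s*(?P<body>.*)$"
)
# Hypothetical ticker convention: a dollar sign followed by 1-5 capital letters.
TICKER_PATTERN = re.compile(r"\$([A-Z]{1,5})\b")


@dataclass
class MessageRecord:
    """The preferred (target) data structure for one message."""
    timestamp: datetime
    sender: str
    recipient: str
    body: str
    tickers: list[str] = field(default_factory=list)


def clean(text: str) -> str:
    """Cleaning step: collapse runs of whitespace and trim the ends."""
    return re.sub(r"\s+", " ", text).strip()


def parse_line(raw: str) -> MessageRecord | None:
    """Map one raw line onto the target structure, or None if it does not fit."""
    match = LINE_PATTERN.match(clean(raw))
    if match is None:
        return None  # better sent for manual review than guessed at
    body = match["body"]
    return MessageRecord(
        timestamp=datetime.strptime(match["ts"], "%Y-%m-%d %H:%M"),
        sender=match["sender"].lower(),
        recipient=match["recipient"].lower(),
        body=body,
        tickers=TICKER_PATTERN.findall(body),  # entity extraction step
    )


raw_lines = [
    "[2024-03-01 09:15]  Alice -> Bob:   buy $ACME   before the announcement ",
    "garbled line with no structure",
]
# Storing step: keep only the lines that fitted the structure.
records = [r for r in map(parse_line, raw_lines) if r is not None]
```

Lines that do not fit the expected pattern are rejected rather than forced into the structure, which is one simple way to build the ongoing data quality check into the conversion itself.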
Regulatory compliance and data protection
As the digital era expands into more areas of everyday life, both business and personal, governments worldwide have been forced to respond to concerns about data protection and privacy. In the UK, we have the Data Protection Act (DPA) 2018, which applies in conjunction with the UK General Data Protection Regulation (UK GDPR), in effect since 1 January 2021.
The EU has similar regulations to the UK, and other countries, such as the US, China, Switzerland, Australia, Brazil, Canada, Hong Kong, and many others, have also introduced privacy and data protection regulations. This is relevant because investment markets are now global; as a consequence of the digital era, you can deal in foreign markets at the touch of a button.
While the specific regulations vary across countries, the UK and EU GDPR give a good idea of their scope. The areas covered include, but are not limited to, the following:
- Lawful, fair and transparent processing
- Limitation of purpose, data and storage
- Data subject rights
- Consent
- Personal data breaches
- Privacy by design
- Data protection impact assessment
- Data transfers
- Appointment of a data protection officer
- Awareness and training
Data storage also presents legal and logistical challenges. For example, Swiss regulations stipulate that data collected in Switzerland must stay (and be processed) in the country. This can add further layers of cost and compliance for financial institutions with operations worldwide.
Technological solutions and ethical considerations
Many would argue that surveillance of electronic communications is a proactive approach to market abuse, whereas using historical trade data to identify suspicious activity is reactive and retroactive. Undoubtedly, AI, ML, and, in particular, Big Data analysis have allowed financial institutions and regulators to identify potential illegal activity much earlier.
The growing use of cloud services has significantly increased the speed of IT service delivery. This has helped firms rebuild unstructured data in a structured format and analyse the information in real time. We know that fraudsters often use keywords and particular phrases that are replicated across connected communications. In normal circumstances, these may not stand out, but they often act as signals between those involved. Incorporating Natural Language Processing (NLP) allows such patterns, phrases, and individual words to be used as triggers.
While NLP has existed for more than 50 years, it has only recently started to play a significant role in trade surveillance. It has helped significantly reduce the number of false positives, allowing companies to focus critical resources and funding on more likely instances of market abuse. These may seem like subtle improvements, but the knock-on effect on efficiently using resources and finances is significant.
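The idea of using recurring phrases as triggers can be shown with a deliberately simple Python sketch. It uses plain phrase matching rather than a full NLP model, and the `TRIGGER_PHRASES` watch list is hypothetical; a real lexicon would be maintained and tuned by a compliance team.

```python
import re

# Hypothetical watch list, for illustration only.
TRIGGER_PHRASES = ["before the announcement", "keep this quiet", "delete this"]


def find_triggers(message: str, phrases: list[str] = TRIGGER_PHRASES) -> list[str]:
    """Return the watch-list phrases found in one message (case-insensitive)."""
    normalised = re.sub(r"\s+", " ", message.lower())
    return [p for p in phrases if re.search(rf"\b{re.escape(p)}\b", normalised)]


def count_triggers(messages: list[str]) -> dict[str, int]:
    """Count how many messages in a conversation contain each trigger phrase."""
    counts: dict[str, int] = {}
    for message in messages:
        for phrase in find_triggers(message):
            counts[phrase] = counts.get(phrase, 0) + 1
    return counts


conversation = [
    "Keep this   quiet please",
    "buy before the announcement",
    "Sell BEFORE the announcement!",
]
counts = count_triggers(conversation)
# Phrases that recur across connected messages are stronger candidates for review.
repeated = [phrase for phrase, n in counts.items() if n >= 2]
```

Requiring a phrase to recur across several connected messages before raising an alert is one simple way to cut down on false positives from a single innocent use.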
Conclusion
The depth and quality of trade surveillance available today have improved dramatically compared to just a decade ago, taking in not just actual trade data but also the monitoring of electronic communications. This creates the potential for a genuinely proactive approach to detecting market abuse, compared to the reactive and retroactive approach associated with monitoring and analysing trade data. However, this addition to the armoury of the financial services industry comes with challenges.
Data protection and privacy regulations have been enhanced in recent years with the introduction of the GDPR in Europe and the UK. Finding a balance between protecting and maintaining privacy while using unstructured data elements to identify illegal activity is challenging. There are strict regulations in both areas, trade regulations and data protection, with financial institutions caught in the middle.
We provide a range of trade and broader surveillance services so financial institutions can fulfil their regulatory obligations. This promotes a healthy relationship between financial institutions and regulators and, more importantly, helps to maintain the integrity of investment markets.
If you would like to discuss your options and our services, please book a consultation.