Managing & Analyzing Huge Datasets For Law Enforcement Investigations

Published on July 30th, 2020 | by SS8 Blog

The 5G world is upon us, and it brings Law Enforcement Agencies (LEAs) a headache: data rates 10X to 100X higher than normal are on the horizon.

LEAs often intercept and analyze communication data when investigating a suspect of interest. To put the new volumes into perspective, there may now be 100 Mbps of traffic per suspect. At a sustained 100 Mbps (12.5 MB/s), just 10 subjects would generate 324 Terabytes (TB) of data over the course of a single 30-day month! With such massive amounts of data, the old method of storing it all in a relational database just won’t work anymore.

Additionally, finding anything in that 324 TB is like looking for a needle in a haystack. Consider that a typical Network Attached Storage (NAS) system reads about 200 MB/s, so a serial search through the full 324 TB would take around 19 days. LEAs can’t wait 19 days for a search; a Monitoring Center must provide results in seconds.
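
The figures above can be checked with a few lines of arithmetic. This is only a back-of-the-envelope sketch: it assumes decimal units (1 TB = 1,000,000 MB), a 30-day month, and traffic sustained at the full rate around the clock, none of which the numbers above state explicitly.

```python
# Back-of-the-envelope check of the data volume and scan time quoted above.
# Assumptions: decimal units, a 30-day month, and a constantly sustained rate.

MBPS_PER_SUBJECT = 100      # megabits per second per suspect
SUBJECTS = 10
SECONDS_PER_DAY = 86_400
DAYS_PER_MONTH = 30         # assumed month length
NAS_READ_MB_PER_S = 200     # megabytes per second for a serial read


def monthly_volume_tb(mbps, subjects, days=DAYS_PER_MONTH):
    """Total intercepted volume in terabytes for the given rate and subjects."""
    megabytes_per_second = mbps / 8 * subjects   # 8 bits per byte
    total_mb = megabytes_per_second * SECONDS_PER_DAY * days
    return total_mb / 1_000_000                  # MB -> TB


def serial_scan_days(volume_tb, read_mb_per_s=NAS_READ_MB_PER_S):
    """Days needed to read the whole volume once from a single device."""
    seconds = volume_tb * 1_000_000 / read_mb_per_s
    return seconds / SECONDS_PER_DAY


volume_tb = monthly_volume_tb(MBPS_PER_SUBJECT, SUBJECTS)
scan_days = serial_scan_days(volume_tb)
print(f"{volume_tb:.0f} TB per month, {scan_days:.2f} days for a serial scan")
```

Under these assumptions the volume works out to 324 TB and the serial scan to just under 19 days, which is why an index, rather than a scan, has to answer the query.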

To make this possible, new Monitoring Centers must:

  1. Have a horizontally scalable database and indexing architecture that can spread the load across multiple servers, and is not limited in capacity. If load increases, you want to be able to just add more servers and/or additional storage and keep right on going.
  2. Be able to handle a significant number of targets. In addition to fixed and mobile lines, a suspect may also have a whole host of Internet of Things (IoT) devices. These could be anything from cameras to vehicles with autonomous driving or other smart features. Such devices reveal a suspect’s “pattern of life” and can help law enforcement solve and prevent crimes.
  3. Ingest traffic at multi-Gigabit rates, classify it in real time, and index the information (in memory) for very fast search results. That requires sophisticated multi-threaded architectures, intelligent classification of applications and services at ingest time, and a very fast pipeline into storage, which can otherwise become a bottleneck.
  4. Provide smart and configurable content filtering. This allows law enforcement to discard unimportant traffic so valuable storage space isn’t used for encrypted data, or a bad guy’s favorite online movie.
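
As a rough illustration of the last item, a configurable content filter can be as simple as a policy object consulted once per classified flow. The `Flow` and `FilterPolicy` types, their fields, and the application labels below are all hypothetical; they are not taken from any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """A classified traffic flow (hypothetical shape, for illustration only)."""
    target_id: str
    application: str     # label assigned by the classifier at ingest time
    encrypted: bool
    size_bytes: int


@dataclass(frozen=True)
class FilterPolicy:
    """Configurable rules describing which content is not worth storing."""
    discard_applications: frozenset = frozenset({"video_streaming"})
    discard_encrypted: bool = True


def should_store_content(flow: Flow, policy: FilterPolicy = FilterPolicy()) -> bool:
    """Return True if the flow's content should be written to storage."""
    if flow.application in policy.discard_applications:
        return False
    if policy.discard_encrypted and flow.encrypted:
        return False
    return True


flows = [
    Flow("target-1", "voip", encrypted=False, size_bytes=1_200),
    Flow("target-1", "video_streaming", encrypted=False, size_bytes=5_000_000),
    Flow("target-1", "web", encrypted=True, size_bytes=800),
]
kept = [f for f in flows if should_store_content(f)]
print(f"storing {len(kept)} of {len(flows)} flows")
```

Keeping the rules in a separate policy object means they can be changed per investigation without touching the ingest code.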

A Monitoring Center Should Also…

Once all that data has been successfully stored, the next important thing is making sure it can be found again. Smart and scalable indexing of the data is key, which means employing a sharding strategy that spreads data across servers efficiently. Doing so facilitates very fast access for the queries analysts use most, such as searching for communication events in real time.
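
To make the idea of spreading data across servers concrete, here is a minimal sketch that routes each event to a shard by hashing a target identifier. The class and function names are hypothetical, and the in-memory dictionaries merely stand in for separate servers.

```python
import hashlib


def shard_for(target_id: str, num_shards: int) -> int:
    """Map a target identifier to a shard number with a stable hash."""
    digest = hashlib.sha256(target_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedEventStore:
    """Toy store: each dict in self.shards stands in for one server."""

    def __init__(self, num_shards: int):
        self.shards = [dict() for _ in range(num_shards)]

    def add(self, target_id: str, timestamp: int, event: str) -> None:
        shard = self.shards[shard_for(target_id, len(self.shards))]
        shard.setdefault(target_id, []).append((timestamp, event))

    def events_for(self, target_id: str):
        """Only one shard is consulted, no matter how many shards exist."""
        shard = self.shards[shard_for(target_id, len(self.shards))]
        return sorted(shard.get(target_id, []), key=lambda item: item[0])


store = ShardedEventStore(num_shards=4)
store.add("target-1", 200, "email")
store.add("target-1", 100, "call")
store.add("target-2", 150, "sms")
print(store.events_for("target-1"))
```

Simple modulo hashing like this moves most keys when the shard count changes; a scheme such as consistent hashing is one common way to reduce that movement when servers are added.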

In addition, one often overlooked aspect of the whole system is making sure it’s efficient and straightforward to delete old or irrelevant data. This is almost as important as being able to add it efficiently. By removing data efficiently, precious CPU and disk resources aren’t spent on data that isn’t needed anymore.
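
One common way to make deletion cheap is to partition data by time, so that expiring old data means dropping whole partitions instead of hunting for individual records. The sketch below assumes daily partitions and a hypothetical `DailyPartitionedStore` class; it illustrates the technique and does not describe any specific system.

```python
from collections import defaultdict
from datetime import date


class DailyPartitionedStore:
    """Toy store that keeps one list of records per calendar day."""

    def __init__(self):
        self.partitions = defaultdict(list)   # date -> list of records

    def add(self, day: date, record: str) -> None:
        self.partitions[day].append(record)

    def drop_older_than(self, cutoff: date) -> int:
        """Remove every partition dated before cutoff; return how many were dropped."""
        expired = [day for day in self.partitions if day < cutoff]
        for day in expired:
            del self.partitions[day]          # whole partition goes at once
        return len(expired)


store = DailyPartitionedStore()
store.add(date(2020, 7, 1), "call A")
store.add(date(2020, 7, 2), "call B")
store.add(date(2020, 7, 30), "call C")
dropped = store.drop_older_than(date(2020, 7, 15))
print(f"dropped {dropped} partitions, {len(store.partitions)} remaining")
```

The cost of the deletion depends on the number of partitions, not on the number of records inside them.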

Searching and querying become more and more challenging as data volumes grow. Finding that “needle in a haystack” is a lot harder when the haystack is 100 times larger than it used to be! The overall strategy here is to aggregate data for a “bird’s eye” view of the results, then allow the analyst to zoom in on specific data of interest. Common ways of zeroing in on key data are to let an analyst specify timelines, location data, or other intelligent queries about what they want to see.
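
The aggregate-then-zoom approach can be sketched in a few lines: first count events per hour for the overview, then list the individual events inside a window the analyst picks. The sample events and function names are made up for illustration.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample events: (timestamp, event type)
events = [
    (datetime(2020, 7, 1, 9, 5), "call"),
    (datetime(2020, 7, 1, 9, 40), "email"),
    (datetime(2020, 7, 1, 9, 55), "sms"),
    (datetime(2020, 7, 1, 14, 10), "call"),
]


def hourly_counts(events):
    """Bird's-eye view: number of events in each hour."""
    return Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts, _ in events
    )


def zoom(events, start, end):
    """Drill-down: the individual events with start <= timestamp < end."""
    return [event for event in events if start <= event[0] < end]


overview = hourly_counts(events)
busiest_hour, count = overview.most_common(1)[0]
details = zoom(events, busiest_hour, busiest_hour.replace(hour=busiest_hour.hour + 1))
print(busiest_hour, count, [kind for _, kind in details])
```

In a real system the overview would come from pre-computed aggregates rather than a pass over the raw events, but the two-step interaction is the same.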

Monitoring is a must. When new events of interest arise, the analyst should be notified immediately and be able to review the event in question without having to search for it. Advanced analytics running in the background also help identify key patterns and bring trends to the analysts’ attention, without the analysts having to search for the data.

Powerful and Easy to Use

So, does all this power and flexibility mean that next-generation, 5G-ready monitoring center applications will be difficult to use? No, not at all!

A good querying engine will provide easy-to-use functionality for analysts of any level, e.g., “Play all phone calls with phone number 421-555-9696.” But it will still support advanced queries for advanced users, e.g., “Find all e-mail events captured within 10 miles of the Great Mall in Milpitas, CA, between 10 pm and 11 pm on Nov. 7th where the device was an iPhone.” Advanced users will also be able to share the queries they create, empowering other users.
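
The advanced example query above combines four filters: event type, distance, time window, and device. The sketch below shows one way such a query could be evaluated over in-memory records. The record layout, the coordinates used for the Great Mall (approximate), and the year 2019 are assumptions made only for this illustration.

```python
import math
from datetime import datetime

EARTH_RADIUS_MILES = 3958.8
GREAT_MALL = (37.4157, -121.8975)   # approximate coordinates, assumed


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in miles."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def find_events(events, kind, center, radius_miles, start, end, device):
    """Return events matching type, distance, time window, and device."""
    return [
        e for e in events
        if e["kind"] == kind
        and e["device"] == device
        and start <= e["time"] < end
        and haversine_miles(e["lat"], e["lon"], *center) <= radius_miles
    ]


# Hypothetical records; the year is an assumption (none is given above).
events = [
    {"kind": "email", "device": "iPhone", "time": datetime(2019, 11, 7, 22, 15),
     "lat": 37.4157, "lon": -121.8975},   # at the mall, inside the window
    {"kind": "email", "device": "iPhone", "time": datetime(2019, 11, 7, 22, 30),
     "lat": 37.7749, "lon": -122.4194},   # San Francisco, too far away
    {"kind": "email", "device": "iPhone", "time": datetime(2019, 11, 7, 23, 30),
     "lat": 37.4157, "lon": -121.8975},   # right place, outside the window
]

matches = find_events(
    events, kind="email", center=GREAT_MALL, radius_miles=10,
    start=datetime(2019, 11, 7, 22, 0), end=datetime(2019, 11, 7, 23, 0),
    device="iPhone",
)
print(len(matches), "matching event(s)")
```

A saved query is then just this set of parameters stored under a name, which is what makes it easy to share with other analysts.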

Beyond scale, new Monitoring Centers must also be able to import and analyze data other than that received via lawful intercept. External data sources can be used to augment the lawful intercept data. Such augmentation may include other communications-oriented data, such as call detail records (CDRs) and internet communication records (ICRs). However, the most flexible systems will also ingest multiple metadata sources and automatically identify their data types and structure, enabling law enforcement to analyze lawful intercept data side-by-side with vehicle registrations, financial records, automatic number plate readers, arrest records, or almost anything else. Being able to combine all these data sources and then query them simultaneously helps analysts assemble results from many pieces of a big puzzle.


5G is dramatically pushing up data volumes, and a next-generation monitoring center is a must-have for law enforcement to keep up. It isn’t just about scale. It’s also about providing tools and analytics to successfully and expeditiously wade through all the data. A Monitoring Center with the latest in functionality will provide actionable intelligence to those who need it, helping to ensure the good guys win the technology war!

About SS8

SS8 provides Lawful Intelligence platforms. They work closely with leading intelligence agencies, communication providers, law enforcement agencies and standards bodies. Their technology incorporates the methodologies discussed in this blog, and the Xcipio® and Intellego® product portfolios are used worldwide for the capture, analysis and delivery of data for the purposes of criminal investigations.
