The Great LinkedIn Data Scrape (And What It Means for UK Businesses)


Executive Summary

  • In June 2021, a hacker on an underground forum claimed to be in possession of the data for 700 million LinkedIn users, including name, gender, email address, job title, industry, and more.
  • The data was accessed using a technique called “data scraping” which, while against LinkedIn’s rules, wasn’t technically a hack in the traditional sense.
  • While nobody purchased the scraped data, it was eventually leaked online for anyone to access, emphasising the need for businesses to understand the risks of data scraping.


Did you know that LinkedIn now has over 700 million members across the world?

That figure of 700 million is interesting not simply because it’s a staggering number of individuals engaging with what was once thought of as the ‘Facebook for professionals’, but also because it’s the same number of users whose LinkedIn data was compromised during a cyber-attack in mid-2021.

This “attack” resulted in the collection of personal details about these particular LinkedIn users, which were subsequently offered for sale before being intentionally leaked. The attack used a particular method known as “data scraping”, but whether it’s truly malicious or not may be up for debate.

Here’s what UK businesses need to know about this leak, data scraping, and what you can do to prevent it happening to you and your team.

What was the great LinkedIn data scrape of 2021?

On June 22nd 2021, a user going by the name TomLiner on RaidForums – a site known for sharing breached information – claimed to be in possession of 700 million records containing the personal details of LinkedIn users.

This collection of records included the following data:

  • Full names
  • LinkedIn IDs
  • Date of birth
  • Workplace address
  • Facebook and Twitter IDs
  • Job titles
  • GPS data

This isn’t an exhaustive list, but you get the idea.

The user TomLiner initially attempted to sell these 700 million records, but it seems nobody took the bait, because he (or his group) then leaked all 187GB of the data later that year, in September, for anyone to download.

While this sounds like a tough situation, you might notice something about the data that was leaked – it’s all technically publicly facing. And, as LinkedIn themselves attested, this wasn’t actually a breach of LinkedIn’s systems.

What is data scraping? Is it a cyber-attack?

Data scraping is a methodology used to extract large amounts of data from websites in a short space of time. It uses automated programs to run through all of the page content from top to bottom – the ‘scrape’ – and then saves it to another location. Over time, this scraping process can create a large repository of almost any type of text content from a website – which is precisely what happened with the LinkedIn scrape.
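
To make the idea concrete, here is a minimal sketch in Python of what a scraper does to a single page. It is an illustration only, not the tool used in the LinkedIn incident: the sample page, its class names, and the ProfileScraper and scrape names are all hypothetical, and the code reads from a hard-coded string rather than a live website.

```python
from html.parser import HTMLParser

# Hypothetical public profile page. The markup and class names are
# invented for this example and do not reflect any real website.
SAMPLE_PAGE = """
<html><body>
  <h1 class="name">Jane Example</h1>
  <p class="title">Operations Manager</p>
  <p class="industry">Logistics</p>
</body></html>
"""


class ProfileScraper(HTMLParser):
    """Collects the text of every element that carries a class attribute."""

    def __init__(self):
        super().__init__()
        self.current_field = None  # class of the element being read, if any
        self.record = {}           # field name -> scraped text

    def handle_starttag(self, tag, attrs):
        # Remember the element's class so its text can be filed under it.
        self.current_field = dict(attrs).get("class")

    def handle_data(self, data):
        text = data.strip()
        if self.current_field and text:
            self.record[self.current_field] = text

    def handle_endtag(self, tag):
        self.current_field = None


def scrape(page_html):
    """Run through one page from top to bottom and return what was found."""
    parser = ProfileScraper()
    parser.feed(page_html)
    return parser.record


# The 'repository' is simply the saved results, one record per page.
repository = [scrape(page) for page in [SAMPLE_PAGE]]
```

Repeated across millions of pages rather than one, the same loop is what builds up a large repository of publicly visible details.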

Is data scraping legal? The surprising answer is generally yes, because scraping targets publicly available data; it simply does so at a scale that no single human being could match. Having said that, there may be specific scenarios where data scraping does break the law. For example, if an attacker were to compile data containing registered trademarks and then try to sell it, the data owner could have a legal case.

Is data scraping a cyber-attack? Well… not strictly.

It doesn’t require any level of ‘hacking’, whether that be brute force attacks or other forms of data breaching. All it’s really doing is taking data which is already online and compiling it. What people do with that data could constitute a breach in some scenarios, but the act of scraping isn’t against the law and doesn’t fit the definition of a cyber-attack.

All of that said, data scraping is almost always against a website’s terms of service. Such policies aren’t generally legally enforceable, however, which limits a website’s defensive options to IP bans, user account bans, and other mitigation measures.

What can businesses and individuals do to protect against data scrapes?

There are a few different ways that your IT support team and your employees can help your organisation prevent data scrapes like the one seen at LinkedIn.

Without legal recourse – or at least confronted with many legal grey areas – businesses must do what they can to prevent their publicly facing data being scraped and used for nefarious purposes.

Here are a few key ideas to get you started:

  • Monitor your website traffic for any large jumps in activity. Data scraping naturally leaves a big footprint on your website, because the data has to be sent from your server to whoever is doing the scraping. If you or your IT support team can catch this activity before the scrape is complete, you may limit the potential risk.
  • Use authenticated content gating. If you have data that you think may be a target for data scrapes – such as publicly exposed customer or user information – you can prevent a good percentage of scrape attempts by requiring users to sign up (ideally with a double opt-in) before they can see the content.
  • Limit the activity users can carry out. Many scrapers rely on the ability to constantly search or request data from a website in order to carry out their task. By limiting the number of requests a single user can make in a certain space of time, you’ll effectively slow down – and perhaps even halt – the progress of a scrape attempt.
  • Use a captcha for certain users. Scrapers are often hosted on well-known web or cloud services, so one option (probably for your website developers to implement) is to show a captcha to visitors coming from these sources, blocking automated programs while letting real people through.
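
For the developers on your team, two of the ideas above (limiting requests and showing a captcha to certain visitors) can be sketched in a few lines of Python. This is a rough illustration under stated assumptions rather than a production setup: the SlidingWindowLimiter and needs_captcha names are hypothetical, the IP ranges are reserved documentation addresses standing in for a real list of cloud hosting ranges, and in practice this kind of control usually lives in a web server, firewall, or content delivery network.

```python
import ipaddress
import time
from collections import defaultdict, deque

# Placeholder ranges only: these are reserved documentation addresses,
# standing in for a real, maintained list of cloud hosting ranges.
HOSTING_RANGES = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


class SlidingWindowLimiter:
    """Allows each client at most max_requests per window_seconds."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.history = defaultdict(deque)  # client id -> recent request times

    def allow(self, client_id, now=None):
        now = time.monotonic() if now is None else now
        timestamps = self.history[client_id]
        # Forget requests that have fallen outside the time window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_requests:
            return False  # over the limit: reject or slow this request
        timestamps.append(now)
        return True


def needs_captcha(ip_text, ranges=HOSTING_RANGES):
    """Returns True when the visitor's IP falls inside a listed range."""
    address = ipaddress.ip_address(ip_text)
    return any(address in network for network in ranges)
```

A site might, for example, create SlidingWindowLimiter(100, 60) to cap each account at 100 requests a minute, call allow() on every request, and call needs_captcha() with the visitor’s IP address before serving pages that list customer or user information. The right limits depend on how real visitors use the site, so they are best tuned against your own traffic.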

Of course, in addition to all of these, one of the most important things any business can do is to review the content they have on their site and whether it represents a scrape risk. That’s probably a good place to start if this is the first time you’re hearing about data scraping.

Keep your business cybersafe with Get Support

If anything we’ve covered here has been helpful, or even surprising, it’s worth knowing that this sort of best practice cyber security due diligence is all part of the service at Get Support.

If you’d like to know more about our IT support packages, including comprehensive cyber security consultation, our team is waiting to hear from you.

You can reach us now on the phone by calling the IT support experts on 01865 594 000 or simply entering your details in the form below.
