[OVERSTATE] VK Leaks 390 Million User Data Records
Summary: Analysis and assessment of the alleged leak of 390 million data records from VK, Russia's largest social media platform
On September 2, 2024, a user named 'hikki-chan' on a hacker forum alleged that VK, Russia's largest social media website, had suffered a data breach exposing hundreds of millions of records containing personal information such as names, genders, photographs, and addresses.
The mirror address of the original data breach post on Darkweb.vc is https://darkweb.vc/details/EkJBS0FER0ZKSxdCEkZFF0FFQ0FFF0EWR0VEQURGEkRZAwYRHRYHXhsHHh9eQ0NDQ0FG
THUD TECHNOLOGY PTE. LTD. (darkweb.vc) analyzed and assessed the data. The compressed data file is 6.55GB in size; once decompressed, it yields a 27.6GB CSV file containing 390,425,719 rows (approximately 390 million). The file was claimed to contain 6 fields: name, surname, sex, img, country, and city. Upon analysis, the actual fields are: First Name, Last Name, Gender, VK Profile Picture URL, Country, and City.
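A row count and field check of this kind can be reproduced in a single streaming pass, without loading the 27.6GB file into memory. The following is only a minimal sketch under stated assumptions, not the procedure actually used in the analysis: it assumes the CSV is comma-delimited and begins with a header row, and the function name `summarize_csv` and the sample rows are hypothetical.

```python
import csv
import io
from typing import Iterable, TextIO

# Field names as claimed for the leaked file (taken from the text above).
CLAIMED_FIELDS = ["name", "surname", "sex", "img", "country", "city"]


def summarize_csv(stream: TextIO, expected_fields: Iterable[str] = CLAIMED_FIELDS) -> dict:
    """Stream a CSV once and return its header, data-row count, and a header check.

    Assumption (not confirmed by the source): the first row is a header
    and the file is comma-delimited.
    """
    reader = csv.reader(stream)
    header = next(reader, [])
    normalized = [h.strip().lower() for h in header]
    row_count = sum(1 for _ in reader)  # one pass, constant memory
    return {
        "header": header,
        "header_matches": normalized == list(expected_fields),
        "rows": row_count,
    }


if __name__ == "__main__":
    # Made-up rows for illustration only; not taken from the leaked data.
    sample = io.StringIO(
        "name,surname,sex,img,country,city\n"
        "Ivan,Ivanov,m,https://example.invalid/a.jpg,Russia,Moscow\n"
        "Anna,Petrova,f,https://example.invalid/b.jpg,,\n"
    )
    print(summarize_csv(sample))
```

For a file on disk, the same function could be called on `open(path, newline="", encoding="utf-8")`, where the path and the encoding are again assumptions.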
All of these fields are user-defined fields on the VK website that do not require verification and are not unique identifiers.
When registering for VK, users are required to verify their mobile phone number and email address, but this information is not publicly displayed on VK. The six data fields involved in this incident, however, are all shown in the user's public profile on the VK website, as illustrated in the figure.
A sampling analysis of the data found that most Russian users appear to use their real names on VK, while some have set arbitrary names, for example:
The sampling analysis also showed that approximately 20% to 30% of users use personal photographs as their profile pictures, while most other users prefer scenery shots, animals, cartoon characters, and similar images. For the country and city fields, only a limited number of users have provided information, and the authenticity of these details is difficult to verify.
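The sampling step can be illustrated with a short sketch. It draws a uniform random sample from a row stream of unknown length using reservoir sampling (Algorithm R) and then measures how often a field such as country or city is filled in. This is an assumed approach for illustration, not the method the analysts describe; the function names `reservoir_sample` and `fill_rate` are hypothetical. Judging whether a profile picture is a personal photograph still requires reviewing the sampled images and is not covered here.

```python
import csv
import io
import random
from typing import Iterable


def reservoir_sample(rows: Iterable[dict], k: int, seed: int = 0) -> list:
    """Uniformly sample up to k rows from a stream of unknown length (Algorithm R)."""
    rng = random.Random(seed)
    sample = []
    for i, row in enumerate(rows):
        if i < k:
            sample.append(row)  # fill the reservoir first
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                sample[j] = row  # replace with probability k / (i + 1)
    return sample


def fill_rate(rows: list, field: str) -> float:
    """Fraction of rows whose field is non-empty after stripping whitespace."""
    if not rows:
        return 0.0
    filled = sum(1 for r in rows if (r.get(field) or "").strip())
    return filled / len(rows)


if __name__ == "__main__":
    # Made-up rows for illustration only; not taken from the leaked data.
    data = io.StringIO(
        "name,surname,sex,img,country,city\n"
        "A,B,x,u1,Russia,Moscow\n"
        "C,D,y,u2,,\n"
        "E,F,x,u3,Russia,\n"
    )
    sampled = reservoir_sample(csv.DictReader(data), k=1000)
    for field in ("country", "city"):
        print(field, fill_rate(sampled, field))
```

Because the reservoir holds only `k` rows at a time, memory use stays constant regardless of the file's size.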
After conducting a sampling analysis of additional data from this event, we have arrived at the following conclusions:
1. It is highly likely that this data is the result of scraping VK's publicly available information, rather than a direct leak from VK's database.
2. This data holds relatively low value, primarily consisting of the correspondence between names and profile pictures (although only a portion of them match).
3. This data does not contain any sensitive information such as phone numbers, email addresses, passwords, or other authenticated or uniquely identifiable information; all of the data consists of publicly available social information from VK.