CodeSecDays 2024 - Join GitGuardian for a full-day exploration of cutting-edge DevSecOps solutions!

Analyzing the Samsung Hack - Thousands of credentials / secrets exposed

We run through the recent Samsung breach by the Lapsus$ group, taking a look at exactly what was leaked and whether any credentials and secrets were exposed because of it (spoiler: thousands were).

First, we take a look at exactly what was leaked from Samsung. Next, we scan it for secrets and look into a couple of examples. Finally, we discuss what is coming next from the Lapsus$ group and how they potentially hacked Samsung.

Video Transcript

Hey everyone, as you're probably aware, Samsung has had a pretty significant data breach, with nearly 200 gigabytes of source code being made public, or involuntarily open sourced.

The Lapsus$ attack on Samsung

The Nvidia breach

This attack was done, again, by the Lapsus$ group, which is known as a ransomware group and which targeted Nvidia recently. The Nvidia breach was actually a lot smaller but contained some pretty significant information: we know that a lot of employees' email addresses were leaked, along with some password hashes that have since been cracked, and we also know that Nvidia's code-signing certificates have been used to sign malware, which can allow it to be installed on Windows devices. So we know that there have been some significant security implications because of this Nvidia breach.

What Samsung info was breached?

So what about Samsung? Are we going to see the same type of information in the Samsung leak? Today we're going to run through exactly what was leaked, we're going to take a dive into it, and we're going to find out if any secrets have been exposed because of the Samsung breach, and what they could mean.

So first let's take a look at what exactly was leaked. The attackers leaked files in three parts.

Part one was really a dump of some of Samsung's security applications: Samsung Knox, their security defense, and their bootloaders. This could be really significant for researchers or malicious actors looking for vulnerabilities within the source code. And I know there's a lot of chatter about the bootloader being exposed, because Samsung phones are notoriously hard to root. So people who want that extra freedom on their own devices haven't really been able to get it. Now that the bootloader has been leaked, it's possible that people will find more ways to do this.

The second part really contains information about Samsung's encryption, and in particular a lot of information about the biometrics that they're using: the fingerprint scanners, the facial recognition... Now again, if this code contains significant vulnerabilities, researchers and malicious actors now have the chance to go in and uncover them.

And the last file contained a whole bunch of Git repositories, a huge number of them. And this is really where I've focused a lot of the analysis, to see what kind of immediate information we can gather.

Now Samsung said that, as a result of the breach, no personal or user information has been exposed. And while this is fundamentally true, I can tell you that a lot of sensitive information has been exposed which could potentially lead to further breaches.

Analyzing the breached source code

High value API keys found in source code

As we did with the Twitch breach, we decided to scan the Samsung source code for sensitive information like secrets.

And we found an awful lot! We found roughly 6,700 keys after filtering down to what we believe to be high value candidates. Now this may seem like a huge number of keys, and don't get me wrong, it is. But this is fairly typical of what we would see in a breach this size. In fact, we found 6,600 keys in a very similar size breach when we analyzed Twitch's source code. The fact that these numbers are so close together is just a bit of a coincidence, I must say, not necessarily a rule of thumb.
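The transcript doesn't say how candidates were filtered down to the high value ones. One widely used heuristic in secret scanning generally is Shannon entropy: randomly generated keys tend to have a flatter character distribution than ordinary identifiers or placeholder passwords. The sketch below is only an illustration of that idea, not the method used for this analysis; the function names, the `threshold` and the `min_length` values are all assumptions chosen for the example.

```python
import math
from collections import Counter


def shannon_entropy(candidate):
    """Bits of entropy per character, based on character frequencies."""
    if not candidate:
        return 0.0
    counts = Counter(candidate)
    total = len(candidate)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_random(candidate, threshold=3.5, min_length=16):
    """Heuristic: long, high-entropy strings are more likely to be real keys.

    threshold and min_length are illustrative values, not tuned ones.
    """
    return len(candidate) >= min_length and shannon_entropy(candidate) > threshold


# Placeholder secret from AWS's own documentation, not a real credential.
print(looks_random("wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"))  # True
# A repetitive placeholder password uses only 8 distinct characters.
print(looks_random("changeme_changeme_changeme"))  # False
```

Entropy alone produces plenty of false positives (hashes, UUIDs, minified code), so in practice it is usually combined with pattern matching and context, such as the file name or the variable the value is assigned to.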

Internal system keys leaked

Now we can actually analyze and look at the types of keys that we found. What is not so surprising is that, in this breach, we found a huge number of internal system keys: private keys, RSA keys... Now these keys can be hugely sensitive. For example, the Samsung signing keys could be in here which, just like with Nvidia, could allow malicious actors to sign malicious pieces of software as if they were safe. But these keys can also give access to internal systems that are quite locked down or very limited in access. So with so many keys, it's hard to know exactly which ones are hugely sensitive and usable by attackers, and which ones potentially aren't.
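Private keys are among the easier things to spot, because PEM-encoded keys start with a recognizable header line. Here is a minimal sketch of that kind of detection. The sample text is a made-up config file, not something from the leak, and `find_private_key_headers` is a hypothetical helper name.

```python
import re

# Matches the opening line of a PEM-encoded private key, for example
# "-----BEGIN RSA PRIVATE KEY-----" or "-----BEGIN PRIVATE KEY-----".
PEM_PRIVATE_KEY = re.compile(r"-----BEGIN (?:[A-Z0-9]+ )*PRIVATE KEY-----")


def find_private_key_headers(text):
    """Return (line_number, header) pairs for each private key header found."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        match = PEM_PRIVATE_KEY.search(line)
        if match:
            hits.append((lineno, match.group(0)))
    return hits


# Hypothetical file content with the key material removed.
sample = "\n".join([
    "# hypothetical config file",
    "signing_key: |",
    "  -----BEGIN RSA PRIVATE KEY-----",
    "  (key material removed)",
    "  -----END RSA PRIVATE KEY-----",
])

print(find_private_key_headers(sample))
# [(3, '-----BEGIN RSA PRIVATE KEY-----')]
```

Finding the header only tells you a key is present. Working out what the key unlocks, and whether it is still valid, is the hard part described above.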

External service API keys leaked

What is interesting is the other type of keys that we have here. Now usually we would see more keys that are associated with external services. But given what Samsung is developing, it makes sense that you wouldn't have so many of these services. Having said that, we still found a large number: for instance, we found 80 AWS keys. And we can take a look at some of these keys. For example, this one here was found in an application.yaml file. This is a file that is used to set up infrastructure, so it isn't uncommon to find keys in here. And here, although we've removed the keys so that you can't use them, we have an AWS key. Now this is exactly the type of file that we would expect to find active AWS keys in. I'm not going to go and check this service and see what it's used for, because that's crossing a line. But a malicious actor wouldn't hold back like that, and could potentially be causing a lot of headaches at Samsung right now.
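AWS access key IDs have a fixed shape (20 characters, with long-term keys starting with `AKIA` and temporary ones with `ASIA`), which makes them straightforward to match. The sketch below scans a hypothetical `application.yaml` for that pattern and redacts what it finds. The YAML layout is invented for illustration, and the key is the placeholder from AWS's own documentation, not a value from the leak.

```python
import re

# 4-character prefix followed by 16 uppercase letters or digits.
AWS_ACCESS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[A-Z0-9]{16}\b")


def find_aws_key_ids(text):
    """Return every substring that has the shape of an AWS access key ID."""
    return AWS_ACCESS_KEY_ID.findall(text)


def redact(key_id):
    """Keep the first and last four characters so a finding can be reported safely."""
    return key_id[:4] + "*" * (len(key_id) - 8) + key_id[-4:]


# Hypothetical application.yaml content.
application_yaml = """\
cloud:
  aws:
    region: eu-west-1
    credentials:
      access-key: AKIAIOSFODNN7EXAMPLE
      secret-key: (removed)
"""

for key_id in find_aws_key_ids(application_yaml):
    print(redact(key_id))
# AKIA************MPLE
```

A match on the pattern only shows that something looks like a key ID. Whether it is active can only be established by the key's owner, which is the line the speaker declines to cross.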

We also found keys to GitHub repositories. These are particularly interesting because here we have an address, a GitHub address for Samsung that's publicly accessible, so I can view it right now. And then we also have the credentials needed to access it: the client ID and the client secret. So it is possible that attackers, or malicious actors, or even a bloke like me, can move laterally into these different systems.

Now hopefully these keys have been rotated since the breach, so they wouldn't grant me access right now, but they certainly could have granted a malicious actor access to these internal repositories at the time of the breach.

So these are other interesting ways in which you can get deeper into the Samsung infrastructure using the keys found in the initial breach.

Now we have lots of keys like this that we can go through. They could provide bypasses for reCAPTCHA, they could provide insight into the inner workings of the systems, and they could allow attackers to get deeper into internal messaging systems, such as with the Slack webhooks, which are high value targets because you can launch internal phishing campaigns that look very legitimate to the employees.

Employee keys

Now in here we also found credentials that are unlikely to be related to Samsung itself, but that could still be interesting to an attacker. For example, we found some Dropbox keys. Now I think it would be unlikely that Samsung would be using a commercial Dropbox solution in their code (perhaps they could be), but in this file we can clearly see that there are some Dropbox consumer keys and secret keys. Why this is significant is that, if this employee is copying sensitive information onto the Samsung server, then access to their Dropbox might be interesting too: perhaps they've copied sensitive information, or sensitive files that could grant additional access into Samsung's infrastructure, onto their personal Dropbox, which we could now gain access to.

Why are there API keys in source code?

So you can see there's a lot of different ways in which attackers can use these keys.

Let's talk about the problem as a whole. Why are all these keys littered in this source code? Is that a problem if your source code isn't leaked? And what can you do about it?

Source code is a leaky asset

This is really a huge problem because source code is a very leaky asset. Even Samsung, whose source code is more sensitive than most, unfortunately fell victim to that source code being leaked out into the public. And that's because source code is used by multiple different parties: different developers have access to it, and when you're dealing with thousands of employees, you just need one to have bad password hygiene, or to be working as an insider, to grant access to that source code. In fact, Lapsus$ has even sent out a recruitment message in their Telegram channel looking for insiders, looking for inside employees, at the following list of companies. Now this is crazy because so many developers have access to their company's internal source code. That in itself may not be a huge problem, but if that source code contains secrets, such as Samsung's, such as Twitch's, such as Nvidia's, then those secrets will leak out with the source code. And even if that employee was never meant to have access to such critical infrastructure and data, the fact that they have access to the source code creates a significant problem.

How Lapsus$ gets initial access

Now, Lapsus$ sending out this kind of message actually gives us an insight into the potential attack paths that these hacking groups are using. They're looking for insiders, they're looking for employees that have access to the source code, that have access to internal systems, to come over to their side. Now while I think that 99 percent of employees would never do this, well... You never know who's having a bad day, who was just berated in front of their colleagues, and who may be willing to switch sides to the attackers. So this is why it's so critical that we make sure our source code doesn't contain any sensitive information. We have to make sure that it doesn't contain these secrets, which are going to allow attackers to persist in different infrastructure, move laterally, elevate privileges, or, god forbid, sign malicious software as authentic.

How can you know if there are secrets in your source code?

Now one of the best ways to do this is by scanning for these secrets. We were able to identify these secrets, the bad guys probably can too, and so can organizations. Automated secret detection is critical to set up. And as developers, you can also put in place tools like Git hooks to detect secrets before they enter your repository.
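To give a feel for how a Git hook can catch a secret before it is committed, here is a minimal sketch of a pre-commit check. It reads the staged diff and looks only at added lines. The two rules and all function names are illustrative assumptions; a dedicated scanner covers far more secret types and checks far more context.

```python
import re
import subprocess

# Illustrative rules only; real scanners ship many more detectors.
RULES = {
    "AWS access key ID": re.compile(r"\b(?:AKIA|ASIA)[A-Z0-9]{16}\b"),
    "PEM private key": re.compile(r"-----BEGIN (?:[A-Z0-9]+ )*PRIVATE KEY-----"),
}


def scan_diff(diff_text):
    """Return (path, rule_name) for every added line that matches a rule."""
    findings = []
    path = None
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("diff --git "):
            in_hunk = False              # header lines of a new file follow
        elif line.startswith("@@"):
            in_hunk = True               # hunk body follows
        elif not in_hunk and line.startswith("+++ "):
            path = line[4:]
            if path.startswith("b/"):
                path = path[2:]
        elif in_hunk and line.startswith("+"):
            for name, pattern in RULES.items():
                if pattern.search(line[1:]):
                    findings.append((path, name))
    return findings


def staged_diff():
    """Diff of what is about to be committed, without context lines."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--unified=0", "--no-color"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def main():
    findings = scan_diff(staged_diff())
    for path, name in findings:
        print(f"possible {name} in {path}")
    return 1 if findings else 0          # non-zero exit makes Git abort the commit
```

To try it, the script could be saved as `.git/hooks/pre-commit`, made executable, given a Python shebang line, and ended with `raise SystemExit(main())`. Because a local hook can be skipped with `git commit --no-verify`, it complements automated scanning on the server side rather than replacing it.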

Now, not that I want to finish on a grim note, but I don't think we are at the end of seeing these types of attacks. In fact, Lapsus$ itself has put out a poll asking their followers whose information they want to see next. And I know Vodafone, one of the companies listed as an option, is currently doing an investigation to find out whether they have been breached and what sensitive information could be part of that. So if you want to hear the analysis of those or other breaches when they happen, make sure you subscribe to our channel. We try our best to provide you with clear breakdowns when breaches happen, of how they've happened and the information they contain. And give this video a thumbs up if you found it useful.

Finally, if you're an organization and you're worried that your source code contains sensitive information, then you can reach out to the team at GitGuardian and they can do a scan of your entire source code history to make sure that you don't leave any nasty surprises for attackers if your source code does leak out.

So thanks for watching and until next time!

I'm MacKenzie, and remember: secure code is good code!