Code signing and securing the software supply chain with Billy Lynch

Billy Lynch from Chainguard explores the importance of application and code signing for a secure supply chain, drawing from his experience at Google and sharing the latest developments in this field.

Video Transcript

GitHub Actions actually runs as the same identity across everything, and so if you just make calls to the GitHub API, you can get a valid signature for the GitHub Actions bot user regardless of what repo you put it in. And so it becomes an interesting path for attack, of dependency confusion, of like, how do you know these things are actually legitimate?

Part of the difficulty is that security for a lot of computer things is, weirdly, a very human problem. So even if you were checking checksums, if Codecov did a legitimate update, the hash would change. So then the question becomes: how do you know that's actually a legitimate change or a malicious change?

That is a small taste of what's coming up in this week's episode with Billy Lynch. We dive into the subject of software signing, or application signing: how can we be sure that the components we're using are legitimate and not malicious? It is a great starting point for understanding how signing can play a critical role in the software supply chain and in making sure it's secure. And I have to say, Billy is an absolute expert in this, and it was a privilege to be able to pick his brains, not only about how this all works and fits together, but about how it potentially could have been used in some real-life attacks that have happened, and how it possibly could have prevented them, along with all the latest developments and new tools that are coming out in this field.

Billy is a staff software engineer at Chainguard, working on developer tools and securing the software supply chain for everyone. He's an active contributor and maintainer to the Sigstore and Tekton projects and is the creator of gitsign. Prior to working at Chainguard, Billy worked for several developer tools teams at Google, including Cloud Build, Cloud Code, and Cloud Source Repositories. So Billy has some fantastic credentials and definitely knows what he's talking about. But enough from me. I want to dive into this episode,
so I'm going to hand it over to my partner in crime on the Security Repo podcast, Dwayne McDaniel, to kick us off on this week's episode.

How did that story go? How did you get into security in the first place?

Yeah, so I got into security by way of developer tools. Before I was at Chainguard, I was at Google for almost eight years. I started on the Google Code team, which became the Cloud Source Repositories team, which became the Cloud Build team, so I had this long, winding path to end up in supply chain security. Back on Cloud Build, one of the things I was working on was basically integrations with other CI platforms, like GitHub, being able to trigger things into Cloud Build. And part of that is you have to think about security: how are you handling people's OAuth credentials, how are you storing them, how are you interacting with these platforms? I was always really interested in signing and cryptography and stuff like that; I took a cryptography course in college. And then when I started getting more involved in the open source side, Dan Lorenc, Chainguard's CEO, was starting the Sigstore project, and I was like, huh, this is really interesting. And then when Dan and Kim and our other co-founders actually went and started Chainguard, that became really compelling and tempting: hey, let's work on this and really invest in this. So that's how I found myself in both Chainguard and Sigstore as well.

Yeah. Could you maybe explain to the audience a little bit about Chainguard, what Chainguard is, but also how that interacts with the Sigstore project you were just talking about, for everyone that's maybe not familiar with it?

Yeah, absolutely. So Chainguard, we're a startup. We've been around for a little over a year now. Our whole mission is to make
securing supply chains the default, and make that as easy as possible. So we do a lot of work around what you are running in production. And to take a step back: what is software supply chain security? What we're really talking about there is, when you think of your typical supply chain, you'll hear on the news, hey, this part in a car is being recalled. How do you know what cars it's in, what dealerships it went to, who actually bought the cars? We're able to trace through the whole entire physical supply chain to know what's affected. And so what we really want to do is the same thing, but for software and vulnerabilities. So knowing, hey, this new CVE drops, it affects this particular library: where am I using it in production? What assets am I using that are actually affected by it? So tying together all that metadata in order to actually know what you're running, what your threat model is, and, when one of these CVEs drops, how you need to respond, or if you're affected at all.

So at Chainguard we do a lot of work to make it easy for people to integrate those into their production platforms. That's our Chainguard Enforce product, which is really just giving visibility and also policy enforcement to make these supply chain decisions: what do I allow to run in production, and what's running in my production cluster that I may not know about?

The other thing that we work on is something called Chainguard Images, which is a set of secure-by-default container images that we build ourselves. These are usually popular open source projects: these could be language images, this could be Prometheus, this could be whatever,
where we basically apply the same sort of supply chain security principles that we want to see out of the entire ecosystem, but we provide them for images that customers want. The goal is that whenever you run a security scan on one of these images, you should get zero CVEs, and we do this by automating a lot of the remediation and patching. It's actually really impressive. One of our images is a Git image, and there was a recent announcement of Git vulnerabilities that went out, and we were able to patch that and push it to production within an hour of the CVE dropping. So it's a lot of cool work that we're doing across the ecosystem.

And then to get back to the original question of how Sigstore fits into this: Sigstore is an open source project, part of the OpenSSF, which is under the Linux Foundation, and Sigstore's whole goal is to make software signing as easy as possible. When we're talking about these software supply chains, part of the difficulty is, one, we need the data to actually link these things together. So we need to know, okay, from your Docker image, what's in there? What software, what libraries, what dependencies do I have? But then the second thing is, how do you actually trust that data? Where is it coming from, who's providing it, is it actually something you can put faith in? And that's really where software signing comes in. The idea is, you have this metadata, you sign it with some key, and then that key basically gives you cryptographic proof that it hasn't been tampered with. Traditionally that was done with GPG keys or other sorts of long-lived keys, but one of the challenges that we saw in the
industry is that key management is a very hard problem, and to expect everyone to do that right is a very, very high bar for an organization to make sure it's in compliance with. And so the idea behind Sigstore was really: okay, what if we can make that easier? What if users didn't have to think about key management? And that's where the idea of keyless signing comes into play. This is something that's part of Sigstore where, instead of having users provision their own long-lived key, have to manage it, rotate it, protect it, what we do instead is generate new keys on the fly. The idea is you generate a key for a signing event, you use it to sign whatever you want, and then you throw it away, never to be used again. By doing that you're basically rotating a key every single time, so you're drastically reducing the window in which it can be leaked and misused.

There's a lot of work that goes into that. Instead of tying things to keys, what we do is tie things to identities. When we generate that key, we send users through an OAuth flow, so your typical sign in with Google, sign in with GitHub, things like that, and then we bind the key that you generate to that OAuth identity with the certificate authority that Sigstore runs.

So that's Sigstore as a whole. We at Chainguard leverage that very, very heavily to basically create these trust chains. Because a lot of this is powered by OIDC, there's a lot of really cool things you can do with automation: if you look at cloud providers, GitHub Actions, GCP, AWS, they actually provide OIDC providers for the compute runtimes that they run. And so if you apply this in those automated environments, you can actually get this for free,
without any user interaction, and you can just sign things as the service account that things are running as, rather than with individual keys that you don't necessarily know who has access to, things like that. So I will pause there for questions.

So, signing things in general: why is this important? There might be developers out there listening to this that haven't run into the signing problems yet. Can you just give us a high-level view: what are we even talking about when we say signing?

Yeah, so there's two sort of paths you can take here. As I mentioned before, signing is really the mechanism to prove that some data hasn't been tampered with and has come from a source that you know. This could be from a developer; often it's from your CI/CD platforms. A common use case is container image signing. Ideally you would only want to deploy things to prod that have come from your production CI cluster, and not just from a random developer's machine. How you go about that is you build your Docker image, but then you sign that container as well with the identity or the key of your CI cluster. And then at deployment, either with a Sigstore project called policy controller, or with something like Kyverno or OPA, you can check these signatures and say, hey, before I actually go and run this, did this actually come from my prod CI cluster? And because of that signing and that cryptography, you can actually prove that yes, it did.

And where Sigstore comes in, as the next step, is that we focus less on the key and more on the identity. So it's not "did this particular key sign this data?" but "did my prod service account for my CI pipeline sign this data?" And so that becomes much more powerful for writing
policies, and for caring less about the key distribution and key management mechanisms of doing that cryptography.

Yeah, it's very interesting, and there's a couple of moments that it brings to mind where signing has raised the red flags, and I think that helps at least me to understand it. One that I'm thinking of is a supply chain attack with a tool in the CI/CD pipeline called Codecov. I don't know if you're aware of it, but Codecov's Bash Uploader script was modified, and how this whole thing ended up being discovered, because it took nearly three months, was that the signatures, and in that case I think it was the hashes, didn't match up. Is that in line with the scenario, or is what you're talking about a level beyond that as well?

It's sort of the next step. With the Codecov incident, if you looked at the instructions for how to install Codecov, it was basically just curl the Codecov script, piped to bash. And that's very problematic, because how do you know you're actually getting the content that you expect? In that case it was redirected to some other source, or I think it might have been served from Codecov but it was injecting bad content from there. So checksums might have helped there, right? If you were pinning to particular versions, you would have been able to detect, hey, something changed. But then the next step with signing is: okay, you know the checksum of the content and whether that's changed or not, but where did it come from? Did it come from Codecov's production pipelines, or did it come from some other source that just happened to have access to whatever S3 bucket or storage bucket happened to host that content? One of the classes of things that we're trying to protect against is, if there
is, you know, if you do happen to get access to some production bucket or stuff like that, just being able to upload content isn't enough; you actually want it to go through the whole entire review process. We need to have that metadata in order to make those decisions, and we need to be able to trust that metadata, and that's really where signing comes in. Because there's also other cases, like we could go into the SolarWinds attack, where even if you were checking checksums, that wouldn't have helped you, because in that case the CI pipeline itself was compromised. And so that also gets into a tricky situation.

Right, got it. But signing would have prevented these from happening?

So, okay, part of the difficulty is that security for a lot of computer things is, weirdly, a very human problem. Even if you were checking checksums, if Codecov did a legitimate update of their script, the hash would change. So then the question becomes: how do you know that's actually a legitimate change or a malicious change? You could go and look through the entire script and see what it was doing. Is that realistic for everyone to do? Probably not. So what signing lets you do is, if you have some additional metadata to say, yes, this has changed, but it has come from a trusted source, that gives us a little bit more protection. It may not be perfect; it may not stop insider threat risks, like if a Codecov employee decided they were going to do something malicious intentionally. But it does help with some of that: hey, there was a change, but it came from this source, and we can trust that source, and so we're going to allow this.

Yeah. So speaking of trusted sources, I want to bring up web-flow. Not Webflow the page builder. Whenever I hear "web flow", just from my background, I automatically think
of the page builder, not the thing that Google named their global signature key. But that was something that kind of surprised me when I went to your session up at cdCon + GitOpsCon: that this is a real problem out there, that Google, or not Google, I'm sorry, GitHub, is signing everything with the same signature across everything.

Yeah, it's a bit surprising. I mean, it's a trade-off; there's definitely legitimate reasons to do it. So for some background here on web-flow.gpg: GitHub has a feature for signed commits, and there's two ways to go about doing signed commits. One is you can generate a GPG key, or an SSH key now, and you can associate it with your GitHub account, and then whenever GitHub sees a signature that matches that key, it'll give you that nice green check mark and it'll say "Verified", and it knows to tie it to your account. The other thing that GitHub will do is for any sort of web operation or API operation. GitHub doesn't have access to your private key, nor should it, but we still want some of those protections around tampering and also, in particular for GitHub, knowing that the commit actually came from the author it said it did. Git is weird in that you can just modify Git commit data somewhat arbitrarily, and this has caused some issues before. You'll see it all the time: people impersonate Linus Torvalds. There was drama a few years ago with GitHub and DMCA takedowns, and someone impersonated the GitHub CEO on GitHub; they just inserted their email and it showed up as coming from the GitHub CEO. So signing is useful in order to tie, you know, did this commit actually come from that account? And so what GitHub does for these API operations is, because it knows it's authorizing you and it's making the commit and applying the commit, what it's
effectively doing is attesting: hey, we saw that this user made this change, and we as GitHub are attesting to that fact. So it does actually give you some guarantees of, did it actually come from this user? However, GitHub uses the same key to do this for every user across the platform. And so, to put my tinfoil hat on, you see issues like, just a couple of months ago, GitHub had to rotate their RSA key, and that caused problems. I'm sure GitHub is doing all the right things to secure their keys and stuff like that, but things do happen; you do need to rotate keys. So the question I always raise is, okay, what happens if this key ever needs to be rotated? And you look around at automated tools. I know Kubernetes uses a tool called Prow to basically handle merges into their system, and Prow will use the web-flow.gpg key, because it's using the API to merge those changes. So if you look at projects like Kubernetes, I think even Sigstore itself, because we often hit the merge button to merge these commits into main, they're all signed by this web-flow.gpg key.

And this creates some weird behavior, because you can do some weird things with it. At my talk at cdCon, I showed off this proof of concept where GitHub Actions actually runs as the same identity across everything, and so if you just make calls to the GitHub API, you can get a valid signature for the GitHub Actions bot user regardless of what repo you put it in. And so it becomes an interesting path for attack, of dependency confusion, of like, how do you know these things are actually legitimate? How much trust should you actually put in these signatures? And then, depending on how you model your threat model, one of
the things we're looking for from signatures is being able to detect compromises, and ideally the signature should be something that we can check against the source of the content. The nice thing about bringing your own key is that even if GitHub is compromised, since GitHub doesn't have your private key, they can't actually forge those signatures, and so you do have another layer of protection. But the question is, if GitHub is ever compromised, do you have the same level of protection with web-flow.gpg? And that becomes a little murkier. I don't want to doom and gloom too much, because all of that is very hypothetical, but it's something that I find very interesting, and it could very much be a problem one day. If something ever happens to that key, it's a very critical piece of infrastructure that we just don't think about day to day.

A pretty big single point of failure, if, and it is a big if, that key gets compromised. And we've seen this happen before with CircleCI: attackers compromised a developer's machine by deploying malware and were then able to compromise CircleCI and get into private... do all kinds of nasty stuff. This is a scenario, potentially, if the right person was targeted, and depending on how GitHub has their key set up. A lot of hypotheticals here, but you can see, based on what has happened before, how it could happen again.

Yeah. Moving on here, there's a tool that you've been working on called Rekor. Could you maybe give us a quick overview of what Rekor is and how it relates to the conversation around potentially adding levels of security in metadata and other areas when it comes to supply chain security?

Yeah, so Rekor is another piece of the Sigstore puzzle. I mentioned before that part of what Sigstore does is run a certificate authority that
knows how to bind keys to user identities. So when we look at web-flow.gpg, one of my dreams, and I've talked to folks at GitHub about this, is: if people are making these API calls to GitHub and they have their OAuth identity already, can we just do this keyless flow with Sigstore in order to mitigate some of that, or at least spread the load off of web-flow.gpg? But part of the challenge when you use the keyless flow is that the keys are only supposed to be valid for a short period of time. Typically, when you do validation with X.509 certs, you're not supposed to trust certificates after they've expired, because you're meant to rotate a new one in. And so that creates a problem when we want to verify these signatures days, weeks, months down the line, and these certs are no longer valid for signing.

So that's really where Rekor comes in. Rekor is a transparency log. If you're familiar with certificate transparency logs, with Let's Encrypt and things like that, it's in a very similar vein, but this one is meant for signing events and attestations and things like that. The idea is, whenever you do a signature with the keyless flow, or even if you bring your own key, Rekor can support this as well, you can upload it to Rekor. And Rekor, because it's backed by a Merkle tree, is append-only, and basically it cannot be tampered with; cryptographically, an entry cannot be tampered with once it has been added to the tree. And as part of including it in its transparency log, it will also include the timestamp of when it was included, similar to a timestamp authority timestamp. So what we can do to verify signatures is look in Rekor and say: hey, does the signature exist with this key, and was it signed at a time when the
certificate was valid? With that, we can go back and verify these signatures even though the certificates themselves are expired. So that's, functionally, from a verification standpoint, where Rekor is useful. The other thing that's really useful is that it becomes this log that you can monitor for everything that happens with your identity. You can go to search.sigstore.dev, punch in my email, and see every single thing I've ever signed with Sigstore. And so if you apply that to your production CI, you can start seeing: hey, what did this sign, when did it sign it, where is this being used? That becomes a very useful auditing tool if things ever start to go awry, of where are my signatures floating around in the wild? So there's a lot of power and value in that as well.

Sigstore itself actually runs a public instance of Fulcio, the certificate authority, and Rekor that any open source project can use. You can use it for free; the only trade-off is you have to be okay with all this information being public in the wild. But of course, Sigstore being an open source project, if you want to run your own, you can. That's one of the things that we do at Chainguard too: we can help run private Sigstore instances. But our hope is for open source projects to adopt this and integrate it into their deployment pipelines, so that everyone can start taking advantage of these public transparency logs and this information and metadata that's freely available.

Right on. Thank you very much for that. I think we'll put a link to Merkle trees in the description here for people that might not be familiar with blockchain technology or are just getting up to speed on it.

So, Merkle trees aren't actually blockchain. Merkle trees are the same technology that powers Git repositories.

I was going to say, the charts look very similar when you look at
what it does. So Matt Biilmann from Netlify, I heard him give a talk a few years ago at a conference I was putting on, saying that if you think about it, Git is really the first broad blockchain that everybody adopted. Except, unlike true blockchain, it's not immutable: you can go back and mess with your history and completely modify it, but if it's shared, that's really hard to do and to convince everybody else to update theirs. So this is kind of blockchain-y.

All right. So to bring things home and wrap things up here: it sounds like the whole name of the game for what you're concerned about is trust. Can we actually believe the signature? Can we believe the person has actually done this? Can we trust the machine we're on? And I guess that's the whole point of security: can we actually trust that this is true, and verify it? So in the long term, how can devs and operations teams, DevOps teams, push toward better establishing that trust overall? What are some tips you can leave for folks? How do we make our teams more trustworthy, how do we make our work more trustworthy, especially if we share it outside of our work?

Yeah, so I think the easiest thing you could start doing is start signing, if you're not doing it already. There's a really robust ecosystem around container signing now. We're also seeing signing pick up in other ecosystems as well: we've been working with folks in the Python community, the Java community, npm. GitHub just announced a public beta for npm provenance, and that's actually backed by Sigstore as well. The first thing we need as an ecosystem is the metadata, regardless of whether people are enforcing it or not. We can't make any policy decisions if we don't have the data. So start being able to produce that data and attach it to our artifacts. There's sort of a carrot and stick here, right? The carrot is that being able to have this
metadata is not only useful for your organization to have more trust in what you're deploying, but also, if you ever need to debug what's going on, you leave the breadcrumbs of, hey, what is actually running in production? What am I using? What libraries are in here? That gives you rich metadata for how to respond and look into things. The stick is that there's a lot of legislation, especially from the US government, that's starting to come down around these things: around SBOMs, around vulnerabilities and vulnerability management. And so signing becomes a very critical part of that. You can generate an SBOM, but where did it come from? How do you trust it? Is it just someone filling out an SPDX document manually, or is it coming from automation? So signing will really help automate a lot of this process, and I think it will be a very critical part of, hey, we need to work with a vendor, and we need to prove, here's the SBOM and everything that we're using as part of the software that we're shipping. That should be very easy and something that we can trust, because we have those signatures, we have those receipts, in order to trust that data and know everything that went into it. So it'll be really interesting to see how a lot of these things develop over time. I think right now we're seeing the early adopters phase, of people that see the value and really buy in, but I think very, very soon we're going to see the more regulatory side of "you must do this", and it'll be interesting to see how companies react to that.

I mean, we're on the date now; you're supposed to be turning in those SBOMs right now. What will happen, how many will get turned in, and what they'll even do with them when they get there, that's a whole other conversation we don't
have time for today.

Yeah, it's super interesting, and part of it is that Log4j happened two years too soon. I think if Log4j happened now, it would be the perfect test to see: did this SBOM legislation actually work? Because we kind of need some kind of catalyst. I remember, in November, the phones were non-stop ringing, everyone trying to figure out, because we're a vendor, what our exposure is so that they know what their exposure is through us. And then you multiply this through every element of your supply chain; no one had any idea, right? If we had some SBOM legislation in now, and better yet, if we were signing these things so that we could actually be confident in these SBOMs, I think, when you put it in that context, it would make these big security catalysts, which are going to happen again, believe me, it's inevitable, a lot easier. But it's a shame that it happened when it did, because it would be an interesting trial right now to see where we are and whether people did turn in their SBOMs.

Just to close out the episode, I want to thank you so much, Billy, for coming on. There's been a lot of technical information in here, and I'm definitely one of these people who's going to need to go and look a little bit deeper into some of these areas. So I want to ask you now: where's a good place to start? And if people wanted to follow you, and I know you speak at conferences, or I don't know if you share more content, what are some good resources, and what is the best way for our listeners to keep track of you and stay up to date with your presentations and what you're up to?

Yeah, so if you're interested in learning more about Sigstore, sigstore.dev is the main site. We have tons of documentation for how to get started with the different tools, Cosign and gitsign. Cosign is for
container signing; gitsign is for Git commit signing; and there are the other tools for the server-side components as well. I would also highly recommend checking out the OpenSSF on GitHub. They have a lot of resources for securing your projects and for supply chain security. We work with them a ton, obviously, with Sigstore being a part of the OpenSSF, but they have a lot more resources for general security practices. For me personally, if you ever want to follow me, I'm on Twitter; WF Lynch is my handle. And if you're ever at a conference and see me, always feel free to come up and say hi. I'm always happy to talk.

Awesome. We'll put those links in the show notes for everyone.

Well, I will confirm: yes, you can just walk up to Billy and talk to him at a conference.

That's awesome. Well, thanks again, Billy. Dwayne, do you have any final words before we click the big red button?

Oh, I want to say thanks again for being on here. Thanks for all the work you do out there trying to make the world more secure. We need more people thinking in those terms, so really appreciate your efforts.

Yeah, thank you. Happy to be on.
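
As a rough illustration of the Rekor check described in the conversation (is there a log entry for this identity, and was it recorded while the short-lived certificate was still valid?), here is a minimal Python sketch. The `ShortLivedCert` and `LogEntry` types, the `integrated_time` field name, the example identity, and the ten-minute validity window are all assumptions made for illustration. This is not the Sigstore or Rekor API, and a real verifier would also check the signature itself and the log's inclusion proof.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class ShortLivedCert:
    """Hypothetical stand-in for a short-lived signing certificate."""
    identity: str          # e.g. an OIDC email or CI service account
    not_before: datetime   # start of the validity window
    not_after: datetime    # end of the validity window


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical stand-in for a transparency-log entry."""
    identity: str
    integrated_time: datetime  # when the log recorded the signature


def signed_while_cert_valid(cert: ShortLivedCert, entry: LogEntry) -> bool:
    """Return True if the entry matches the cert's identity and was
    logged inside the cert's validity window."""
    return (
        entry.identity == cert.identity
        and cert.not_before <= entry.integrated_time <= cert.not_after
    )


# Assumed example values: a cert valid for ten minutes from issuance.
issued = datetime(2023, 6, 1, 12, 0, tzinfo=timezone.utc)
cert = ShortLivedCert("ci@example.com", issued, issued + timedelta(minutes=10))

in_window = LogEntry("ci@example.com", issued + timedelta(minutes=2))
too_late = LogEntry("ci@example.com", issued + timedelta(hours=1))

print(signed_while_cert_valid(cert, in_window))  # True
print(signed_while_cert_valid(cert, too_late))   # False
```

The point of the sketch is that verification time does not matter: the check compares the log's recorded time against the certificate's window, so a signature can still be accepted months after the certificate has expired.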
