What You Need to Know About Code Repository Threats

Jul 22, 2018

Today’s software developers are under ever-increasing pressure to deliver new capabilities in fast iterations. This has led to more frequent use of open source code, which draws on outside developer communities to introduce innovation. As a result, developers increasingly rely on open code repositories to co-develop applications while reducing cost and time.

While open code repositories increase efficiency in R&D, they also raise issues of security vulnerabilities. Mike Pittenger, vice president of strategy for Black Duck, cautions, “Too often there are sparse records about what versions of what open source software is used, leaving corporate security pros guessing when they’re trying to figure out how vulnerable their in-house apps are.”

One of the leading code repositories, GitHub, launched in 2008 and lets developers work on programs together as a team, regardless of their locations. Each repository on the site exists as a public folder designed to hold the software code that a developer is working on. The public folders are copies of the developer’s private folders stored on their own computers. Each developer can work on their own version of a piece of source code, and only commit changes to the public repository when satisfied with them. The project leader, called a maintainer, evaluates the versions of the software held in different repositories, and selects the best ones to become part of the main source code.

Code repositories, such as GitHub, have potential security issues that users should be aware of. Developers posting work on these sites have put private files into their repositories, which are then copied into public repositories and made searchable. Attackers are well aware of how commonly open source code is used. They monitor repositories to see who contributes code and which projects have been identified as problematic.

Examples of leaked user credentials on GitHub found by CyberInt’s Argos™ platform. The potential threat of data leakage is immense, with over 28 million GitHub users as of June 2018.

While the data in an individual code repository may seem of low importance or value, when combined with active reconnaissance techniques it can be leveraged in targeted attacks against employees, such as convincing spear phishing campaigns, or in locating potential targets for attack. These threats can involve infrastructure attacks, social engineering, credential theft, and even a complete shutdown of systems or applications when code is altered. They result not only from malicious attacks but also from developers’ negligence in inadvertently copying and pasting confidential data along with the code itself.

Infrastructure Attacks/Exploits

Sensitive identifying information is valuable to an adversary because it allows them to focus their efforts on finding exposed instances and any associated vulnerabilities in the technologies in use. Furthermore, identifying and collating technical indicators such as hostnames, IP addresses and service configurations allows an adversary to build up a picture of the target organization’s infrastructure, which can be used to determine the high-value targets.
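To see how easily such indicators can be collated, the following is a minimal sketch that pulls IPv4 addresses and internal-looking hostnames out of a block of text, for example a configuration file about to be committed. The function name `find_indicators` and the domain suffixes in `INTERNAL_SUFFIXES` are assumptions made for illustration and would need to be replaced with an organization's own naming conventions.

```python
import ipaddress
import re

# Candidate IPv4 literals; each one is validated with ipaddress below.
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
# Dotted names ending in an alphabetic top-level label.
HOST_RE = re.compile(
    r"\b(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z]{2,}\b", re.IGNORECASE
)
# Hypothetical internal suffixes; replace with your organization's own.
INTERNAL_SUFFIXES = (".corp.example.com", ".internal")


def find_indicators(text):
    """Return (ips, hosts): sorted lists of addresses and internal hostnames."""
    ips = set()
    for candidate in IPV4_RE.findall(text):
        try:
            ips.add(str(ipaddress.ip_address(candidate)))
        except ValueError:
            pass  # not a valid address, e.g. 999.1.1.1
    hosts = {
        name.lower()
        for name in HOST_RE.findall(text)
        if name.lower().endswith(INTERNAL_SUFFIXES)
    }
    return sorted(ips), sorted(hosts)


if __name__ == "__main__":
    sample = "host = db01.corp.example.com\nbind 10.0.0.5:5432\n"
    print(find_indicators(sample))
```

Four-part version strings such as 1.2.3.4 also parse as valid addresses, so the output is best treated as a list of lines to review rather than a verdict.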

Given the ease with which relevant vulnerability information can be obtained from a publicly accessible repository, an adversary does not need to conduct a full code analysis and could potentially leverage this information in an attack against any deployed code. The 2016 Uber data breach was a stark reminder of the damage attackers can cause: in that case, they gained access to a source code repository on GitHub, which opened the way to an attack on the company’s infrastructure. With the personal data of 7 million Uber drivers and 50 million customers compromised, the fallout for the company and for the world of data security was significant.

Social Engineering

Any knowledge acquired, such as the infrastructure or technologies used as well as internal processes or behaviors, can be leveraged by an adversary in targeted social engineering attacks. These attacks can take the form of a convincing phishing email, the creation of a fake website that mimics a legitimately used platform, or posing as another employee. Shared screenshots may appear innocent, but they provide potential insight into the applications installed and their configuration, which could help a social engineer appear plausible through seemingly ‘internal’ knowledge.

Credential Theft

Any identified credentials or service/platform keys can be used to attempt to access the related service, or tried against other platforms within the organization to check for password reuse. While the risks of password reuse are well known, the presence of ‘test accounts’ used in development processes may imply that their credentials are commonly recycled.
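The kind of search an attacker might run for exposed credentials can equally be run by a developer before committing. The sketch below is a minimal, illustrative scanner: the rule names and the `scan_text` function are hypothetical, and the three patterns (the commonly documented `AKIA` prefix format for AWS access key IDs, PEM private key headers, and quoted password-style assignments) are far from exhaustive.

```python
import re

# Illustrative rules only; dedicated scanners ship much larger rule sets.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----"),
    "hardcoded_password": re.compile(
        r"(?i)(?:password|passwd|secret|api_key)\s*[:=]\s*['\"][^'\"]{4,}['\"]"
    ),
}


def scan_text(text):
    """Return a list of (line_number, rule_name) for lines matching a rule."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


if __name__ == "__main__":
    sample = 'user = "test"\npassword = "hunter2-test"\nkey = AKIAIOSFODNN7EXAMPLE\n'
    for lineno, rule in scan_text(sample):
        print(f"line {lineno}: {rule}")
```

A check like this could run as a pre-commit step so that a match blocks the commit until someone has looked at it. Pattern matching of this sort produces both false positives and misses, so it complements rather than replaces review.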

How to protect your organization

Knowing the threat is half the battle; the other half is knowing how to protect yourself and your organization. There are three actions an organization can take to protect itself from threats involving code repositories: security awareness training, audits and monitoring, and penetration testing.

Security Awareness Training

Ensure that contributors are briefed on what is, and is not, appropriate to share on GitHub. For example, ensure that company confidential and personal data is removed or redacted before publication. Additionally, contributors should be mindful when sharing screenshots or posting comments that are publicly viewable, as these can often contain useful intelligence.
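As one illustration of redacting before publication, the sketch below masks email addresses and password-style values in a piece of text, such as a log excerpt about to be pasted into a public comment. The `redact` function and its two patterns are assumptions chosen for this example; they will not catch every form of confidential or personal data.

```python
import re

# Simple email matcher; intentionally loose, not a full address validator.
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
# Keeps the key and separator (group 1) and replaces the value that follows.
SECRET_ASSIGN_RE = re.compile(r"(?i)((?:password|token|secret)\s*[:=]\s*)\S+")


def redact(text):
    """Return text with emails and password-style values masked."""
    text = EMAIL_RE.sub("[REDACTED EMAIL]", text)
    return SECRET_ASSIGN_RE.sub(r"\1[REDACTED]", text)


if __name__ == "__main__":
    print(redact("contact alice@example.com, password=hunter2"))
```

Automated masking of this kind is a safety net; a human read-through before posting is still the more reliable control.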

Audit/Monitoring

An audit or review of existing code repository permissions should be considered to ensure that repositories for internal-only use are configured with appropriate permissions, as distinct from those suitable for public consumption. In some instances, a code repository may be suitable for public sharing although the content and discussions within ‘Issues’ may be considered more sensitive, and it is therefore prudent to exercise control over access to them.
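A permissions review of this kind can be partly automated. The following sketch assumes repository metadata has already been fetched as a list of dictionaries carrying `full_name`, `private` and `has_issues` fields, in the shape the GitHub REST API uses for repository objects. The `audit_repos` function and the approved list are hypothetical names introduced here for illustration.

```python
def audit_repos(repos, approved_public):
    """Return (full_name, reason) pairs for repositories needing manual review.

    repos: iterable of dicts with 'full_name', 'private' and 'has_issues' keys.
    approved_public: set of full names that are intentionally public.
    """
    findings = []
    for repo in repos:
        name = repo["full_name"]
        if repo.get("private", False):
            continue  # private repositories are out of scope for this check
        if name not in approved_public:
            findings.append((name, "public but not on the approved list"))
        elif repo.get("has_issues", False):
            findings.append((name, "public with Issues enabled; review discussions"))
    return findings


if __name__ == "__main__":
    repos = [
        {"full_name": "acme/website", "private": False, "has_issues": True},
        {"full_name": "acme/billing", "private": False, "has_issues": False},
        {"full_name": "acme/secrets", "private": True, "has_issues": True},
    ]
    for name, reason in audit_repos(repos, approved_public={"acme/website"}):
        print(f"{name}: {reason}")
```

Keeping the approved list explicit means that any newly public repository shows up as a finding by default, which suits a periodic audit.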

Intelligence-led Penetration Testing

Request an intelligence-led penetration test, which will typically gather exposed data during the reconnaissance phase, the same data that an adversary could later leverage in a cyber attack. Based on this, and on the reports generated, the organization can gain a greater understanding of potential attacks and threats in order to better prepare for and mitigate them.

Additionally, consider using CyberInt’s Threat Intelligence and digital monitoring products to provide real-time incident alerts should sensitive data be inadvertently shared on platforms such as GitHub, as well as the wider internet including dark and deep web sources.

The author

Gil
