Last week, malicious packages on the Python Package Index (PyPI), a repository used by thousands of companies, started exfiltrating sensitive customer data to an unknown server. The maintainers suspended new user registrations for a day.
It was a ‘multi-stage attack’ designed to steal crypto wallets, sensitive browser data (cookies, extension data, etc.), and other credentials, carried out through a method called typosquatting.
This involves attackers uploading malicious packages with names deceptively similar to popular legitimate packages. Cybersecurity firm Phylum, which tracked the campaign, noted that the attackers published 67 variations of ‘requirements’, 38 variations of ‘Matplotlib’, and dozens of other misspelt variations of widely-used packages.
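To make the idea concrete, here is a minimal sketch of how a name that closely resembles a well-known package could be flagged before installation. The short list of popular names, the function name `possible_typosquat` and the similarity cutoff of 0.8 are all assumptions chosen for illustration; this is not how Phylum or PyPI detect such packages.

```python
import difflib

# Hypothetical, tiny list of well-known names; a real check would use a far larger one.
POPULAR_PACKAGES = ["requests", "requirements", "matplotlib", "numpy", "pandas"]


def possible_typosquat(name, known=POPULAR_PACKAGES, cutoff=0.8):
    """Return the popular package that `name` closely resembles, or None."""
    normalised = name.lower().replace("_", "-")
    if normalised in known:
        return None  # exact match with a known package, not a lookalike
    # get_close_matches ranks candidates by similarity ratio, best first
    matches = difflib.get_close_matches(normalised, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


if __name__ == "__main__":
    for candidate in ["matplotlig", "requirments", "numpy", "flask"]:
        print(candidate, "->", possible_typosquat(candidate))
```

A check like this only catches near-misspellings of names already on the list, so it would complement rather than replace a careful review of a package's source and maintainers.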
For a long time now, software package repositories have been the target of malware attacks. Python’s PyPI, JavaScript’s npm (Node package manager), and Ruby’s RubyGems are all prone to attacks, each more sophisticated than the last.
Researchers who studied malicious code in PyPI said, “Over 50% of malicious code exhibits multiple malicious behaviours, with information stealing and command execution being particularly prevalent. We observed several novel attack vectors and anti-detection techniques.”
According to the study, 74.81% of all malicious packages successfully entered end-user projects through source code installation. Researchers also said the malicious payload employed a persistence mechanism to survive reboots. Yehuda Gelb, Jossef Harush Kadouri, and Tzachi Zornstain led the research.
This is not the first time
The PyPI administrators and the Python community are actively working to combat these malicious attacks on the security of the ecosystem.
As it did last week, PyPI suspended new user registrations in November and December last year. “These temporary suspensions allow the PyPI team to triage the influx of malicious packages and implement additional security measures,” said the researchers.
Moreover, PyPI is taking proactive steps, just like other package registries. The registry now requires two-factor authentication for critical projects and packages, making it harder for attackers to hijack maintainer accounts. The team is also investing in improved malware scanning capabilities to identify and remove malicious packages quickly.
The paper also suggests that end-users should exercise caution when selecting and installing packages using pip and other tools and verify the software packages’ sources and credibility to ensure system security.
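One way to act on that advice is to compare a downloaded package file against a checksum obtained from a trusted source before installing it. The sketch below assumes the expected SHA-256 digest is already known; the function names `sha256_of_file` and `matches_pinned_hash` are hypothetical and not part of pip.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Stream the file in chunks so large archives are not loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_pinned_hash(path, expected_hex):
    """Return True only if the file's SHA-256 digest equals the pinned value."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.lower())
```

pip offers a built-in version of this idea: running `pip install --require-hashes -r requirements.txt` refuses to install any package whose file does not match a `--hash=sha256:...` entry pinned in the requirements file.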
The impact of these attacks on businesses is severe. Last year, malicious Python packages stole sensitive information like AWS credentials and transmitted them to publicly accessible endpoints.
A Cat-and-Mouse Game
Since PyPI has grown in popularity, it has become an increasingly attractive target for attackers seeking to infiltrate the software supply chain. The evolution of its security measures has been a constant game of cat and mouse, with attackers continually refining their tactics and PyPI administrators working to stay one step ahead.
In the early days of PyPI, the repository relied on a largely trust-based model, prioritising ease of contribution for the growing Python community. Over the years, one of the most significant steps forward came with the introduction of two-factor authentication (2FA) for PyPI accounts.
As Donald Stufft, a PyPI administrator and maintainer since 2013, explained, “Two-factor authentication immediately neutralises the risk associated with a compromised password. If an attacker has someone’s password, that is no longer enough to give them access to that account.”
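To see why a stolen password alone is not enough, consider time-based one-time passwords (TOTP, RFC 6238), one of the second factors PyPI accepts. The sketch below is an illustration built on Python’s standard library, not PyPI’s own implementation; the function name `totp` and its parameters are assumptions.

```python
import hashlib
import hmac
import struct
import time


def totp(secret, at_time=None, step=30, digits=6):
    """Derive a one-time code from a shared secret and the current 30-second window."""
    if at_time is None:
        at_time = time.time()
    counter = int(at_time) // step  # number of time steps since the Unix epoch
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation as described in RFC 4226
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)
```

Because the code depends on a secret stored on the user’s device and changes every 30 seconds, an attacker who only knows the password cannot produce a valid one.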
PyPI has also implemented other measures, such as API tokens for more secure package uploads and improved malware scanning tools. However, the sheer volume of packages and the constantly evolving threat landscape mean that PyPI’s security team is always playing catch-up.
Feross Aboukhadijeh, the founder of Socket, a company that provides supply chain security for JavaScript, Python, and Go dependencies, highlighted the scale of the problem: “At Socket, we see about 100 attacks like this every single week.”
Despite the challenges, the Python community has made significant progress in recent years. Stufft noted, “We’ve gotten a lot more confident in our 2FA implementation and in what the impact of enabling it is for both people publishing to PyPI, and to the PyPI team itself.”
The repository has also benefited from increased funding and resources, including the hiring of a dedicated PyPI safety and security engineer.
The impact of these attacks on businesses can be severe, as demonstrated by recent incidents where malicious Python packages stole sensitive information like AWS credentials and transmitted them to publicly accessible endpoints.
This not only puts the affected companies at risk but also exposes their customers to potential security breaches and compromised software releases.
As Aboukhadijeh put it, “Open source is one of the best things. But I think one of the things that we don’t appreciate is just the amount of trust that we place in all open source actors to be good.”