ReDoS Attacks
Written by Yuval Donana on
Introduction
In cybersecurity, attackers constantly find new ways to exploit vulnerabilities in applications. One such technique that has gained attention in recent years is the ReDoS (Regular Expression Denial of Service) attack. ReDoS attacks can disable web applications and bring down servers, making them a significant concern for developers and security professionals.
What is a ReDoS Attack?
ReDoS is an acronym for Regular Expression Denial of Service. It is a cyber attack that takes advantage of inefficient or poorly designed regular expressions in applications. Regular expressions (regex) are powerful tools for pattern matching and data validation, but they can be susceptible to abuse.
In a ReDoS attack, the attacker crafts a malicious input string that exploits the complexity of the regular expression used by the application. When this specially crafted input is processed, the regular expression engine falls into exponential backtracking: each time a partial match fails, the engine steps back and retries another way of applying the pattern to the input, and the number of such attempts grows exponentially with the input length. Processing time rises sharply as a result, and the application becomes slow or unresponsive, denying service to legitimate users.
To understand the potential impact of ReDoS attacks, we will delve into two hypothetical scenarios that illustrate how these attacks can exploit vulnerable regular expressions in real-world applications. In both cases, we will explore how a seemingly innocent piece of code can become a ticking time bomb when faced with a maliciously crafted input. These scenarios will highlight the importance of secure regular expressions and the critical need for vigilant input validation to safeguard against ReDoS attacks.
Server-Side ReDoS
To demonstrate the attack vector, we have set up a simple Flask web application for this scenario. The application’s purpose is to verify that a person’s name is valid and, if it is, to present the name along with a corresponding greeting.
To verify that the person’s name is valid, the following regex check is performed:
^[a-zA-Z]+(([\'\,\.\- ][a-zA-Z ])?[a-zA-Z]*)*$
Let’s break this regex down:
- ‘^’: This caret symbol denotes the start of the string, meaning that the regular expression should match from the beginning of the input.
- ‘[a-zA-Z]+’: This part matches one or more alphabetical characters (both lowercase and uppercase) at the beginning of the name.
- ‘(‘: The opening parenthesis marks the start of a capturing group. A capturing group allows us to group parts of the regular expression and retrieve them later.
- ‘([\'\,\.\- ][a-zA-Z ])?’: This part is an optional inner group that matches exactly two characters: a separator followed by a single letter or space (for example, the first letter of a middle name or initial). The separator can be:
- An apostrophe ‘'’
- A comma ‘,’
- A period ‘.’
- A hyphen ‘-’
- A space
- ‘[a-zA-Z]*’: This part matches zero or more alphabetical characters (lowercase and uppercase), covering the remaining letters of each name part. Because it can also match nothing, a part that ends right after the separator pair, such as an initial, is handled as well.
- ‘)’: The closing parenthesis marks the end of the capturing group.
- ‘*’: This asterisk denotes that the entire capturing group can repeat zero or more times. This allows the regular expression to handle single names and names with middle names or initials.
- ‘$’: This dollar sign denotes the end of the string, meaning that the regular expression should match until the end of the input.
Now let’s break down some examples of valid and invalid person names:
- Names that would be accepted: John Doe, Jane Smith, Mary-Anne Johnson, O'Connor Williams, James T. Kirk, Jean-Luc Picard, Maria del Carmen.
- Names that would not be accepted: 123John (contains a digit), Robert@Smith (contains a special character), Emma Watson! (contains an exclamation mark), Mary , Brown (a comma is not allowed directly after a space), Joan--Davies (multiple consecutive hyphens are not allowed).
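The check described above can be reproduced outside the web application. The following is a minimal sketch, assuming Python's built-in `re` module and a plain ASCII apostrophe in the pattern; the function name `is_valid_name` and the sample lists are illustrative and are not taken from the original Flask application.

```python
import re

# The name-validation pattern from the walkthrough, with a plain ASCII apostrophe.
NAME_RE = re.compile(r"^[a-zA-Z]+(([\'\,\.\- ][a-zA-Z ])?[a-zA-Z]*)*$")


def is_valid_name(name: str) -> bool:
    """Return True when the whole string matches the name pattern."""
    return NAME_RE.match(name) is not None


# Hypothetical sample data, copied from the examples in the text.
accepted = ["John Doe", "Mary-Anne Johnson", "O'Connor Williams", "James T. Kirk"]
rejected = ["123John", "Robert@Smith", "Emma Watson!", "Mary , Brown", "Joan--Davies"]

for name in accepted + rejected:
    verdict = "accepted" if is_valid_name(name) else "rejected"
    print(f"{name!r}: {verdict}")
```

All of these inputs are short, so the pattern answers quickly even for the rejected names.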
This regex works well. However, what would happen when we provide the following string:
aaaaaaaaaaaaaaaaaaaaa!
From what we know so far, this string should be invalid, as exclamation marks are not allowed, and the server rejects it as expected.
If we continue to explore this string, we can attempt to insert an additional ‘a’ character at the beginning of the string and observe the server response to the new string.
Once again, the system does not accept the person’s name, since there is an exclamation mark at the end of the string. However, if we inspect the response time, we can see that the server took almost twice as long to respond. This behavior indicates that the server might be vulnerable to a ReDoS attack. The cause is the structure of the regex: ‘[a-zA-Z]*’ sits inside a group that is itself repeated with ‘*’, so a run of letters can be divided between the two quantifiers in exponentially many ways, and the engine tries them all before rejecting the input. We can exploit this behavior by providing a longer string, placing a heavy load on the server’s CPU and making the system unavailable to legitimate users.
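The growth in response time can be observed locally without a server. The sketch below assumes Python's backtracking `re` engine and times the same pattern against payloads of increasing length; the helper name `time_rejection` and the chosen sizes are illustrative, and absolute timings will differ from machine to machine.

```python
import re
import time

NAME_RE = re.compile(r"^[a-zA-Z]+(([\'\,\.\- ][a-zA-Z ])?[a-zA-Z]*)*$")


def time_rejection(letter_count: int) -> tuple[bool, float]:
    """Match 'a' * letter_count + '!' and return (matched, seconds elapsed)."""
    payload = "a" * letter_count + "!"
    start = time.perf_counter()
    matched = NAME_RE.match(payload) is not None
    return matched, time.perf_counter() - start


if __name__ == "__main__":
    # Sizes are kept small on purpose: larger values can take minutes or longer.
    for letter_count in range(12, 21, 2):
        matched, seconds = time_rejection(letter_count)
        print(f"{letter_count} letters: matched={matched}, {seconds:.4f}s")
```

Every payload is rejected, yet the elapsed time should climb steeply as letters are added, which is the same pattern seen in the server response times.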
Client-Side ReDoS
Regular expressions are widely used in JavaScript running in the browser to perform operations such as form validation, input filtering, and pattern matching. While they are powerful, they can also become a weak point if not designed carefully. Attackers can exploit poorly designed regular expressions to launch client-side ReDoS attacks, leading to significant performance issues or browser crashes.
Let’s explore a simple example of a client-side ReDoS attack. Assume that an attacker crafts a web page containing the following JavaScript code; alternatively, the attacker could inject the same code into a legitimate page by leveraging an XSS vulnerability:
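<html>
<script>
  // Nested quantifiers: the group (a+) is itself repeated by the outer +.
  var myregexp = new RegExp(/^(a+)+$/);
  // The trailing X makes the match fail, so the engine retries every split.
  var mymatch = myregexp.exec("aaaaaaaaaaaaaaaaaaaaaaaaaaaX");
</script>
</html>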
At first glance, this regular expression seems harmless, but it contains a vulnerability that can be exploited. The inner ‘a+’ matches one or more ‘a’ characters, and the capturing group around it is itself repeated one or more times by the outer ‘+’. The pattern therefore accepts only strings made up entirely of ‘a’s, but the nested quantifiers give the engine many different ways to divide the same run of ‘a’s between the inner and the outer repetition.
When the JavaScript code is executed, it tries to match the provided input string, which consists of a long sequence of ‘a’s followed by a single ‘X’. Because of the trailing ‘X’, no match is possible, but the engine only discovers this after trying each way of dividing the ‘a’s among repetitions of the capturing group. This leads to exponential backtracking, which can overwhelm the client’s resources and deny access to the web application’s services.
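To put a number on "all possible combinations", the sketch below counts the ways a pattern shaped like ‘(a+)+’ can divide a run of identical characters into consecutive non-empty groups. It is a counting exercise in Python under that simplified model, not a trace of any particular browser engine, and the name `count_splits` is illustrative.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_splits(run_length: int) -> int:
    """Count the ways to divide a run of `run_length` characters into non-empty groups."""
    if run_length == 0:
        return 1  # nothing left to assign: one complete way of splitting
    # The next group takes 1..run_length characters; the rest is split recursively.
    return sum(count_splits(run_length - size) for size in range(1, run_length + 1))


for run_length in (5, 10, 20, 25):
    print(run_length, count_splits(run_length))
```

The count works out to 2 to the power of (run length minus 1), so each additional ‘a’ doubles the number of candidate splits a backtracking engine may have to rule out before it can report a failed match.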
Mitigation
If you have encountered any of the above scenarios, follow the recommendations below, provided by Clear Gate, to mitigate them immediately and to prevent further ReDoS attacks:
- Secure Regular Expressions: Review and optimize all regular expressions used in your application to ensure they are efficient and resistant to ReDoS attacks. Use more specific patterns whenever possible, and avoid nested repetition operators (a quantified group that contains another quantifier) and excessive use of wildcards.
- Input Validation: Implement strict input validation on both the client and server sides. Validate and sanitize all user inputs to prevent malicious input from reaching the regular expression processing.
- Limit Input Length: Enforce reasonable input length limits for fields processed by regular expressions. This helps prevent attackers from crafting extremely long strings to trigger exponential backtracking.
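As one possible way to combine these recommendations for the name check from the server-side scenario, the sketch below enforces a length limit and uses a restructured pattern. Both are assumptions made for illustration: `MAX_NAME_LENGTH` is an arbitrary value, and `SAFE_NAME_RE` is a rewrite intended to accept the same names as the original pattern, which should be verified against your own test data before use.

```python
import re

# Assumed limit for illustration; choose a value that suits the field.
MAX_NAME_LENGTH = 100

# Every repetition of the group must start with a separator, so a run of
# letters can be consumed in only one way and no quantifiers overlap.
SAFE_NAME_RE = re.compile(r"[a-zA-Z]+(?:[',.\- ][a-zA-Z ][a-zA-Z]*)*")


def is_valid_name(name: str) -> bool:
    """Reject over-long input first, then apply the unambiguous pattern."""
    if len(name) > MAX_NAME_LENGTH:
        return False
    return SAFE_NAME_RE.fullmatch(name) is not None


print(is_valid_name("James T. Kirk"))
print(is_valid_name("a" * 50 + "!"))
```

The length check runs before the regex, so oversized payloads never reach the engine, and the restructured pattern removes the nested repetition that caused the backtracking in the first place.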
Conclusion
ReDoS attacks pose a significant threat to web applications and servers in both server-side and client-side scenarios. Our examples demonstrate that seemingly innocent and straightforward regular expressions can become dangerous when faced with maliciously crafted input, leading to devastating consequences for the application’s performance and availability.
To safeguard against ReDoS attacks, developers and security professionals must prioritize secure regular expressions and robust input validation. Regular expressions should be reviewed and optimized to ensure efficiency and attack resistance. Additionally, strict input validation measures should be implemented on both the client and server sides to prevent malicious input from reaching the regular expression processing stage.
To mitigate these risks, organizations should prioritize security measures such as conducting penetration tests and code reviews.
Clear Gate, a trusted cybersecurity provider, offers comprehensive services to help organizations strengthen their application security and protect valuable data from potential threats.