Published April 24, 2024 (author: java教程视频那个人比较牛)

PHP Regex Bypass Summary

English Answer:

Regular Expression Bypass Techniques in PHP.

Regular expressions (regex) are a powerful tool for matching and
manipulating text. They are used in a wide variety of applications,
from simple text search and validation to complex data extraction and
transformation. However, regex can also be complex and difficult to
understand, and it is easy to write a regex that is vulnerable to
bypass attacks.

A regex bypass occurs when an attacker supplies input that passes a
regex-based check even though it violates the rule the regex was meant
to enforce. This can allow the attacker to bypass input validation,
execute arbitrary code, or access unauthorized data.
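As a minimal sketch of such a mismatch, assume a check that is meant to accept only numeric IDs. The function names below are hypothetical and chosen for illustration; only preg_match is a real PHP function.

```php
<?php
// Hypothetical check: the author intended "digits only".
function acceptsLoosely(string $input): bool
{
    // No anchors, so the pattern succeeds if digits appear anywhere.
    return preg_match('/\d+/', $input) === 1;
}

// Same intent, but \A and \z pin the match to the whole string.
function acceptsStrictly(string $input): bool
{
    return preg_match('/\A\d+\z/', $input) === 1;
}

var_dump(acceptsLoosely('1 OR 1=1'));   // bool(true): the check is bypassed
var_dump(acceptsStrictly('1 OR 1=1'));  // bool(false)
var_dump(acceptsStrictly('42'));        // bool(true)
```

The first pattern matches the input but not the intended meaning: it only proves that the string contains a digit somewhere.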

Several regex features are commonly involved in bypasses, usually
because the author of the pattern misjudged what they match. The most
common ones include:

Escaping characters: Regex characters can be escaped by using a
backslash (\). This allows you to match characters that would
otherwise be interpreted as regex operators. For example, the regex
/\d+/ matches any sequence of one or more digits, whereas the regex
/\d+\./ matches one or more digits followed by a literal period.
Without the second backslash, the . is an operator that matches any
character, so /\d+./ would also accept 12a.

Using character classes: Character classes allow you to match any
character that belongs to a specific set. For example, the character
class [0-9] matches any digit from 0 to 9. The character class
[a-zA-Z] matches any letter from a to z or A to Z.

Using quantifiers: Quantifiers allow you to specify the number of
times a pattern can match. For example, the quantifier * matches the
preceding pattern zero or more times. The quantifier + matches the
preceding pattern one or more times.

Using lookarounds: Lookarounds allow you to require that a specific
pattern appears before or after the current position without including
it in the match. For example, the lookahead assertion (?=pattern)
succeeds only if the specified pattern matches immediately after the
current position. The lookbehind assertion (?<=pattern) succeeds only
if the specified pattern matches immediately before the current
position.
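The features listed above can be tried out with preg_match. The following is an illustrative sketch rather than a catalogue of attacks; the sample strings and the variable names $ahead and $behind are made up for the example.

```php
<?php
// Escaping: an unescaped dot matches any character, an escaped one does not.
var_dump(preg_match('/\d+./', '12a'));   // int(1)
var_dump(preg_match('/\d+\./', '12a'));  // int(0)
var_dump(preg_match('/\d+\./', '12.'));  // int(1)

// Quantifiers: * accepts zero repetitions, + requires at least one.
var_dump(preg_match('/\A[a-z]*\z/', ''));  // int(1): empty input passes
var_dump(preg_match('/\A[a-z]+\z/', ''));  // int(0)

// Lookahead: "px" must follow the digits but is not part of the match.
preg_match('/\d+(?=px)/', 'width: 100px', $ahead);
var_dump($ahead[0]);   // string(3) "100"

// Lookbehind: "$" must precede the digits but is not part of the match.
preg_match('/(?<=\$)\d+/', 'cost: $42', $behind);
var_dump($behind[0]);  // string(2) "42"
```

The first and second groups show how a small slip (a missing backslash, * instead of +) widens what a pattern accepts, which is the kind of gap a bypass relies on.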

Regex bypass attacks can be difficult to detect and prevent. However,
there are a number of steps that you can take to reduce the risk of
your applications being vulnerable to these attacks.

Use a well-maintained regex engine. In PHP, use the PCRE-based preg_*
functions and keep PHP up to date, so that known engine bugs are
already fixed.

Validate your input data. Before applying a regex, check that the
input has the expected type, length, and encoding, so that the pattern
only ever sees the kind of data it was written for.

Use a whitelist instead of a blacklist. A whitelist is a list of
allowed patterns. A blacklist is a list of disallowed patterns.
Whitelists are more secure than blacklists because anything you did
not explicitly allow is rejected by default, including inputs you did
not anticipate.

Be careful when using regular expressions. Regex can be complex and
difficult to understand. Anchor each pattern to the whole input where
that is the intent, and test it against inputs that should be rejected
as well as inputs that should be accepted.
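The steps above can be combined in a short sketch. The function name isValidUsername and its rules (3 to 16 letters, digits, or underscores) are assumptions made for illustration, not a recommended policy.

```php
<?php
// Hypothetical allow-list validator; the name and the rules are illustrative.
function isValidUsername($input): bool
{
    // Validate type and length first: rejects arrays (e.g. from ?name[]=x)
    // and caps the size before the regex engine runs.
    if (!is_string($input) || strlen($input) > 16) {
        return false;
    }
    // Whitelist: only the listed characters are allowed, and \A ... \z
    // anchors the pattern to the entire string.
    // preg_match returns 1, 0, or false on error; comparing with === 1
    // treats an engine error as a rejection.
    return preg_match('/\A[a-zA-Z0-9_]{3,16}\z/', $input) === 1;
}

var_dump(isValidUsername('alice_01'));     // bool(true)
var_dump(isValidUsername("alice\n"));      // bool(false)
var_dump(isValidUsername(['alice']));      // bool(false)
var_dump(isValidUsername('a; rm -rf /'));  // bool(false)

// For contrast: by default $ also matches before a final newline.
var_dump(preg_match('/^[a-z]+$/', "alice\n"));  // int(1)
```

The last line shows why the sketch uses \z rather than $: with the default settings, $ lets an input with one trailing newline through.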
