65. Regular Expressions

小白量化 2025-12-19 06:11:32

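1. Meaning
A regular expression is a pattern that describes a set of strings. It is a powerful tool for string processing.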

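2. Features
(1) The syntax is complex, so patterns can be hard to read
(2) Highly portable: almost every modern programming language supports regular expressions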

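3. Basic usage
(1) Import the re module
(2) Use match() to perform a match

re.match(pattern, string, flags=0): tries to match the regular expression starting at the beginning of the string. On success it returns a Match object; otherwise it returns None.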

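# Syntax
re.match(pattern, string, flags=0)

pattern - the regular expression pattern to match;

string - the string to search in;

flags - optional; controls matching behavior (for example ignoring case or multi-line matching);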
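
As a small sketch of the flags parameter described above, the following compares a default match with one that passes re.IGNORECASE, a standard flag in Python's re module. The sample string is made up for illustration.

```python
import re

text = "PYTHON is fun"  # illustrative sample string

# Without flags the match is case-sensitive, so "python" does not match "PYTHON".
strict = re.match(r"python", text)

# re.IGNORECASE makes the pattern match regardless of letter case.
relaxed = re.match(r"python", text, flags=re.IGNORECASE)

print(strict)            # None
print(relaxed.group())   # PYTHON
```

Several flags can be combined with the | operator, for example re.IGNORECASE | re.MULTILINE.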

# Example
import re

res = re.match("p", "python")
print(res)    # <re.Match object; span=(0, 1), match='p'> (match succeeded)

# Example
import re

# Check whether the string starts with "Hello"
text = "Hello, World!"
result = re.match(r'Hello', text)

if result:
    print("Match succeeded!")
    print(f"Matched text: {result.group()}")  # Output: Hello
else:
    print("Match failed!")

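(3) When the match succeeds, use group() to extract the matched data.

# Syntax
match.group([group1, ...])

Parameters:

group1, ... (optional):

Can be an integer (starting from 0)

Can be a string (the name of a named group)

Multiple arguments may be passed

Defaults to 0 (returns the entire match)

Return value:

Single argument: returns a string

Multiple arguments: returns a tuple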
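
The argument forms listed above can be sketched as follows. The date pattern and the group name "month" are invented for this illustration; (?P<name>...) is the standard Python syntax for a named group.

```python
import re

# Illustrative pattern: group 1 is numbered, "month" is a named group.
m = re.match(r"(\d{4})-(?P<month>\d{2})", "2025-12-19")

whole = m.group()            # same as m.group(0): the entire match
year = m.group(1)            # first parenthesized group
month = m.group("month")     # named group
pair = m.group(1, "month")   # several arguments -> tuple

print(whole)   # 2025-12
print(year)    # 2025
print(month)   # 12
print(pair)    # ('2025', '12')
```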

# Example 1
import re

res = re.match("p", "python")
print(res)    # <re.Match object; span=(0, 1), match='p'> (match succeeded)
print(res.group())   # p

# Example 2
import re

res = re.match("s", "python")
print(res)         # None
print(res.group())
# The match failed, so calling group() raises an error:
# AttributeError: 'NoneType' object has no attribute 'group'
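
One way to avoid the AttributeError above is to check for None before calling group(). The helper below is a minimal sketch; the name first_match and its default parameter are hypothetical and not part of the re module.

```python
import re

def first_match(pattern, text, default=None):
    """Return the text matched at the start of `text`, or `default` on failure."""
    m = re.match(pattern, text)
    # re.match() returns None when nothing matches, so guard before group().
    return m.group() if m is not None else default

print(first_match("p", "python"))            # p
print(first_match("s", "python"))            # None
print(first_match("s", "python", "n/a"))     # n/a
```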

# Example
import re

res = re.match("o", "python")  # match() only tries to match at the first character
print(res)   # None
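
To make the "starts at the first character" behavior concrete, this sketch contrasts re.match() with re.search(), which is also part of the standard re module and scans the whole string. The post itself only uses match() and findall(), so search() is shown here purely for comparison.

```python
import re

text = "python"

at_start = re.match("o", text)    # only tries position 0, so the result is None
anywhere = re.search("o", text)   # scans forward until a match is found

print(at_start)           # None
print(anywhere.group())   # o
print(anywhere.start())   # 4
```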

# Example: . matches any single character except a newline
import re

res = re.match(".", "%1python")
print(res.group())   # %

res = re.match(".", "哈\n\t%1python")
print(res.group())   # 哈
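
The examples above never put the newline first, so the one character that . rejects by default is not visible. A minimal sketch, using the standard re.DOTALL flag and a made-up input string:

```python
import re

text = "\npython"  # illustrative string that begins with a newline

default_dot = re.match(".", text)             # . does not match "\n" by default
dotall_dot = re.match(".", text, re.DOTALL)   # with re.DOTALL, . matches "\n" too

print(default_dot)               # None
print(repr(dotall_dot.group()))  # '\n'
```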

(2) []: character set. Matches any one of the characters it contains.

# Example 1
import re

res = re.match("[pP]", "python")  # [pP]: matches p or P
print(res.group())   # p

res = re.match("[pP]", "Python")  # [pP]: matches p or P
print(res.group())   # P

res = re.match("[0-9]", "509210")   # [0-9]: matches any single digit from 0 to 9
print(res.group())   # 5

res = re.match("[a-z]", "zxa509210")   # [a-z]: matches any single lowercase letter from a to z
print(res.group())   # z

res = re.match("[A-Z]", "Zxa509210")   # [A-Z]: matches any single uppercase letter from A to Z
print(res.group())   # Z

res = re.match("[a-zA-Z]", "PpZzxa509210")   # [a-zA-Z]: matches any single letter, a to z or A to Z
print(res.group())   # P

# Example 2
import re

# Goal: match any single digit except 5
res = re.match("[0-46-9]", "5909210")   # [0-46-9]: matches any single digit from 0 to 4 or from 6 to 9
print(res)    # None, because the string starts with 5
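
As a quick check of the two-range set above, this sketch tries every digit against [0-46-9] and collects the ones that match. The variable name accepted is just for illustration.

```python
import re

# Try every single digit against the set and keep the ones that match.
accepted = [d for d in "0123456789" if re.match("[0-46-9]", d)]

print(accepted)   # ['0', '1', '2', '3', '4', '6', '7', '8', '9']
```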

(3) \d: matches any single digit; \D matches any single non-digit character

# Example
import re

res = re.match(r"\d", "5PpZzxa509210")
print(res.group())   # 5

res = re.match(r"\D", "5PpZzxa509210")
print(res)   # None; res.group() would raise AttributeError: 'NoneType' object has no attribute 'group'

res = re.match(r"\d", "PpZzxa509210")
print(res)    # None; res.group() would raise the same AttributeError

res = re.match(r"\D", "哈t\nPpZzxa509210")
print(res.group())   # 哈

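# Example: \s matches a whitespace character; \S matches a non-whitespace character
import re

res = re.match(r"\s", "\n\t ")
print(res.group())   # prints a blank line (the matched "\n")

res = re.match(r"\S", "\n\t ")
print(res)    # None; res.group() would raise AttributeError: 'NoneType' object has no attribute 'group'

res = re.match(r"\S", "函数U@1")
print(res)    # <re.Match object; span=(0, 1), match='函'>

res = re.match(r"\s", "函数U@1")
print(res)    # None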

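# Example: \w matches a word character (letter, digit, or underscore); \W matches a non-word character
import re

res = re.match(r"\w", "_1Ts函数U@1")
print(res.group())   # _

res = re.match(r"\W", "_1Ts函数U@1")
print(res)   # None

res = re.match(r"\W", "@\n_1Ts函数U@1")
print(res.group())   # @

res = re.match(r"\w", "@\n_1Ts函数U@1")
print(res)   # None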
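
The strings above mix Chinese characters with ASCII. For str patterns in Python 3, \w is Unicode-aware by default, so Chinese characters count as word characters. The sketch below contrasts that with the standard re.ASCII flag, which limits \w to [a-zA-Z0-9_]; the input string is reused from the example above.

```python
import re

text = "函数U@1"

unicode_word = re.match(r"\w+", text)            # default: Unicode-aware \w
ascii_word = re.match(r"\w+", text, re.ASCII)    # \w limited to [a-zA-Z0-9_]

print(unicode_word.group())   # 函数U
print(ascii_word)             # None
```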

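# Example: * matches the preceding item zero or more times
import re

res = re.match(r"\w*", "_wwb3fgh#2xccb@")
print(res.group())   # _wwb3fgh

res = re.match(r"\w*", "#2xccb@")   # \w*: matches any number of word characters, including none
print(res.group())   # "" (empty string)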

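# Example: + matches the preceding item one or more times
import re

res = re.match(r"\w+", "123f#2xccb@")   # \w+: matches at least one word character
print(res.group())   # 123f

res = re.match(r"\w+", "#2xccb@")   # \w+: matches at least one word character
print(res)   # None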

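# Example: ? matches the preceding item zero or one time
import re

res = re.match(r"\w?", "123f#2xccb@")   # \w?: matches at most one word character
print(res.group())   # 1

res = re.match(r"\w?", "#2xccb@")   # \w?: matches at most one word character
print(res.group())    # "" (empty string)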

# Example: {m} matches the preceding item exactly m times
import re

res = re.match(r"\w{2}", "s123f#2xccb@")   # \w{2}: matches exactly 2 word characters
print(res.group())   # s1

res = re.match(r"\w{2}", "s#2xccb@")   # \w{2}: matches exactly 2 word characters
print(res.group())   # AttributeError: 'NoneType' object has no attribute 'group'

(5) {m,n}: matches the preceding item at least m times and at most n times

# Example
import re

res = re.match(r"\w{2,4}", "哈哈ssfg6354r#2xccb@")   # \w{2,4}: matches 2 to 4 word characters
print(res.group())   # 哈哈ss

res = re.match(r"\w{2,4}", "ss#2xccb@")   # \w{2,4}: matches 2 to 4 word characters
print(res.group())   # ss

res = re.match(r"\w{2,4}", "s#2xccb@")   # \w{2,4}: matches 2 to 4 word characters
print(res.group())    # AttributeError: 'NoneType' object has no attribute 'group'
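
To compare the quantifiers covered so far side by side, this sketch runs each one against the same made-up input and records what it matched. The dictionary name results is only for illustration.

```python
import re

text = "ab12#cd"  # illustrative input: four word characters, then "#"
results = {}

for pattern in (r"\w*", r"\w+", r"\w?", r"\w{2}", r"\w{2,4}"):
    m = re.match(pattern, text)
    # Store the matched text, or None when the pattern does not match.
    results[pattern] = m.group() if m else None
    print(f"{pattern:10} -> {results[pattern]!r}")
```

Quantifiers are greedy by default, which is why \w{2,4} takes all four available word characters rather than stopping at two.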

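4.3 Matching the start and end [the effect is hard to see with match(), so try findall() as well]

# Example: ^ matches the start of the string
import re

res = re.match(r"^\d", "123abc456")
print(res.group())   # 1

res = re.findall(r"^\d", "abc456")
print(res)   # []

# Note: inside [], a leading ^ negates the set
res = re.match("[^pP]", "python")  # [^pP]: matches any single character except p and P
print(res)   # None; res.group() would raise AttributeError: 'NoneType' object has no attribute 'group'

res = re.match("[^pP]", "#cPython")  # [^pP]: matches any single character except p and P
print(res.group())   # #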

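(2) $: matches the end of the string

# Example
import re

res = re.match(r"\w+", "123abc#")
print(res.group())   # 123abc

res = re.match(r"\w+$", "123abc#")
print(res)   # None, because "#" sits between the word characters and the end

res = re.findall(r"\d$", "123abc456")
print(res)   # ['6']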
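
The anchors are easier to see with findall() on multi-line text. The sketch below uses the standard re.MULTILINE flag, under which ^ and $ also anchor at the start and end of each line. The sample text is invented for illustration.

```python
import re

text = "1 apple\nbanana 2\n3 cherry 4"

# By default ^ and $ anchor only at the start and end of the whole string.
first_digit = re.findall(r"^\d", text)
last_digit = re.findall(r"\d$", text)

# With re.MULTILINE they also anchor at the start and end of every line.
line_starts = re.findall(r"^\d", text, re.MULTILINE)
line_ends = re.findall(r"\d$", text, re.MULTILINE)

print(first_digit)   # ['1']
print(last_digit)    # ['4']
print(line_starts)   # ['1', '3']
print(line_ends)     # ['2', '4']
```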

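4.4 Alternation and grouping

# Example: | matches either the expression on its left or the one on its right
import re

res = re.match(r"\d|\s", "\n123anc")
print(res.group())   # prints a blank line (the matched "\n")

res = re.match(r"\d|\s", "@\n123anc")
print(res.group())   # AttributeError: 'NoneType' object has no attribute 'group'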

(2) (): treats the characters inside the parentheses as one group

# Example
import re

res = re.match(r"\w+@(qq|126|163)\.com", "917848283@qq.com")
print(res.group())   # 917848283@qq.com
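
Parentheses also capture what they matched, so individual parts can be read back by number. In this sketch a second pair of parentheses is added around \w+ (a change from the pattern above, made only for illustration), and the gmail address is a made-up counterexample.

```python
import re

pattern = r"(\w+)@(qq|126|163)\.com"

m = re.match(pattern, "917848283@qq.com")
print(m.group(0))   # 917848283@qq.com  (the whole match)
print(m.group(1))   # 917848283         (first group: the user name)
print(m.group(2))   # qq                (second group: the provider)
print(m.groups())   # ('917848283', 'qq')

# A provider outside the (qq|126|163) alternatives does not match.
no_match = re.match(pattern, "someone@gmail.com")
print(no_match)     # None
```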

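# Example: .* also matches non-ASCII text such as Chinese characters
import re

res = re.match(r".*", "哈哈")
print(res)   # <re.Match object; span=(0, 2), match='哈哈'>

res = re.match(r".*", "登录")
print(res)   # <re.Match object; span=(0, 2), match='登录'>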
