LeetCode 10. Regular Expression Matching

Given an input string (s) and a pattern (p), implement regular expression matching with support for '.' and '*'.

'.' Matches any single character.
'*' Matches zero or more of the preceding element.
The matching should cover the entire input string (not partial).

Note:

s could be empty and contains only lowercase letters a-z.
p could be empty and contains only lowercase letters a-z, and characters like . or *.

Example 1:

Input:
s = "aa"
p = "a"
Output: false
Explanation: "a" does not match the entire string "aa".

Example 2:

Input:
s = "aa"
p = "a*"
Output: true
Explanation: '*' means zero or more of the preceding element, 'a'. Therefore, by repeating 'a' once, it becomes "aa".

Example 3:

Input:
s = "ab"
p = ".*"
Output: true
Explanation: ".*" means "zero or more (*) of any character (.)".

Example 4:

Input:
s = "aab"
p = "c*a*b"
Output: true
Explanation: c can be repeated 0 times, a can be repeated 1 time. Therefore, it matches "aab".

Example 5:

Input:
s = "mississippi"
p = "mis*is*p*."
Output: false

The task is to implement a regular-expression matching function that supports only '*' and '.'.

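Approach 1: build an NFA and simulate it directly

  1. For each non-'*' character, create a new state and add an edge to it labeled with that character.
  2. For a '*' character, add an edge from the current state to itself (labeled with the preceding character).
  3. Important: because '*' can match zero characters, also add an edge from the [previous state] to the [current state] that [consumes no character].
  4. Then keep track of the set of current states and brute-force the transitions. The worst-case complexity is O(n*m), where n = len(s) and m = len(p).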
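The steps above can be sketched compactly if the NFA states are represented by positions in the pattern instead of explicit state objects. This is only an illustrative sketch, not the implementation discussed in this post: the function name `is_match_nfa` and the helper `closure` are made up for the example, and the pattern is assumed to be well formed (every '*' has a preceding character).

```python
def is_match_nfa(s: str, p: str) -> bool:
    """Simulate the pattern as an NFA whose states are positions 0..len(p)."""
    m = len(p)

    def closure(states):
        # Follow the edges that consume no character: at position j,
        # an element "x*" may be skipped, which leads to position j + 2.
        stack, seen = list(states), set(states)
        while stack:
            j = stack.pop()
            if j + 1 < m and p[j + 1] == '*' and j + 2 not in seen:
                seen.add(j + 2)
                stack.append(j + 2)
        return seen

    states = closure({0})
    for c in s:
        nxt = set()
        for j in states:
            if j < m and p[j] in (c, '.'):
                if j + 1 < m and p[j + 1] == '*':
                    nxt.add(j)      # self-loop of "x*": stay on the same element
                else:
                    nxt.add(j + 1)  # ordinary edge: move past the character
        states = closure(nxt)

    # Accept if the position just past the end of the pattern is reachable.
    return m in states
```

Each input character touches at most m + 1 states, so the simulation stays within the O(n*m) bound mentioned in step 4.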

A very clumsy implementation (experts, please go easy on me) (56 ms):

from collections import defaultdict


class State:
    def __init__(self):
        # Maps an edge label to the list of target states. The label '*'
        # marks an edge that consumes no input character.
        self.next = defaultdict(list)

    def get_next(self, c):
        ret = []
        if c == '*':
            return self.next[c]
        if '.' in self.next:
            ret.extend(self.next['.'])
        if c in self.next:
            ret.extend(self.next[c])
        return ret


class Solution:
    def isMatch(self, s: str, p: str) -> bool:
        if not p:
            return not s
        start_state = State()

        now_state = start_state
        prev_state = start_state
        for i, c in enumerate(p):
            next_state = State()
            if c == '*':
                # Self-loop labeled with the preceding character.
                pc = p[i-1]
                now_state.next[pc].append(now_state)
            else:
                if i != 0 and p[i-1] == '*':
                    # The starred element may match zero characters.
                    prev_state.next['*'].append(now_state)
                now_state.next[c].append(next_state)

                # Advance to the new state only on a non-'*' character.
                prev_state = now_state
                now_state = next_state

        if p[-1] == '*':
            prev_state.next['*'].append(now_state)

        last_state = now_state

        def void_transition(states):
            # Collect every state reachable through edges that consume no character.
            if not states:
                return []
            ret = []
            for st in states:
                for nxt in st.get_next('*'):
                    ret.append(nxt)
            ret.extend(void_transition(ret))
            return ret

        states = [start_state]
        states.extend(void_transition(states))
        for c in s:
            new_states = []
            for st in states:
                new_states.extend(st.get_next(c))
            new_states.extend(void_transition(new_states))
            new_states = list(set(new_states))  # drop duplicate states
            states = new_states

        return any(st == last_state for st in states)

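Approach 2: DP, which is fairly simple (following a YouTube video)

dp[i][j] indicates whether s[:i] matches p[:j]

  1. dp[0][0] = 1 (the empty string matches the empty pattern)
  2. dp[i][j] = dp[i-1][j-1] if s[i-1] == p[j-1] or p[j-1] == '.' (single-character match)
  3. dp[i][j] = dp[i][j-2] if p[j-1] == '*' (the starred element is skipped entirely)
  4. dp[i][j] |= dp[i-1][j] if p[j-1] == '*' and (s[i-1] == p[j-2] or p[j-2] == '.') (the starred element repeats at least once)
  5. The answer is dp[n][m], where n = len(s) and m = len(p)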
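The recurrence above can also be read top-down. The following is a minimal sketch using memoized recursion, assuming a well-formed pattern; the names `is_match_memo` and `match` are made up for this illustration and are not part of the solution in this post.

```python
from functools import lru_cache


def is_match_memo(s: str, p: str) -> bool:
    @lru_cache(maxsize=None)
    def match(i: int, j: int) -> bool:
        """Return True if s[:i] matches p[:j]."""
        if j == 0:
            # An empty pattern matches only an empty string.
            return i == 0
        if p[j - 1] == '*' and j >= 2:
            # Rule 3: skip the starred element entirely.
            if match(i, j - 2):
                return True
            # Rule 4: the starred element absorbs s[i-1], and p[:j] stays in play.
            return i > 0 and p[j - 2] in (s[i - 1], '.') and match(i - 1, j)
        # Rule 2: single-character match.
        return i > 0 and p[j - 1] in (s[i - 1], '.') and match(i - 1, j - 1)

    return match(len(s), len(p))
```

There are (n+1)*(m+1) distinct (i, j) pairs and each is evaluated at most once, so the cost is the same O(n*m) as filling the table bottom-up.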
class Solution:
    def isMatch(self, s: str, p: str) -> bool:
        if not p:
            return not s

        n = len(s)
        m = len(p)
        # dp[i][j] == 1 means s[:i] matches p[:j]
        dp = [[0]*(m+1) for _ in range(n+1)]
        dp[0][0] = 1

        # An empty string can only match patterns such as "a*b*c*".
        for j in range(2, m+1):
            if p[j-1] == '*':
                dp[0][j] |= dp[0][j-2]

        for i in range(1, n+1):
            for j in range(1, m+1):
                if s[i-1] == p[j-1] or p[j-1] == '.':
                    dp[i][j] = dp[i-1][j-1]
                if p[j-1] == '*' and j >= 2:
                    dp[i][j] = dp[i][j-2]  # match 0 times
                    if s[i-1] == p[j-2] or p[j-2] == '.':
                        # match one or more times
                        dp[i][j] |= dp[i-1][j]

        return bool(dp[n][m])

Bonus

LeetCode 44. Wildcard Matching

The same idea immediately solves problem 44, which is a simpler version of this one.

Given an input string (s) and a pattern (p), implement wildcard pattern matching with support for '?' and '*'.

'?' Matches any single character.
'*' Matches any sequence of characters (including the empty sequence).
The matching should cover the entire input string (not partial).

Note:

s could be empty and contains only lowercase letters a-z.
p could be empty and contains only lowercase letters a-z, and characters like ? or *.
class Solution:
    def isMatch(self, s: str, p: str) -> bool:
        if not p:
            return not s

        n = len(s)
        m = len(p)
        # dp[i][j] == 1 means s[:i] matches p[:j]
        dp = [[0]*(m+1) for _ in range(n+1)]
        dp[0][0] = 1

        # An empty string can only match a pattern made of '*' alone.
        for j in range(1, m+1):
            if p[j-1] == '*':
                dp[0][j] = dp[0][j-1]

        for i in range(1, n+1):
            for j in range(1, m+1):
                if s[i-1] == p[j-1] or p[j-1] == '?':
                    dp[i][j] = dp[i-1][j-1]
                if p[j-1] == '*':
                    dp[i][j] = dp[i][j-1]  # match 0 characters
                    dp[i][j] |= dp[i-1][j]  # match 1+ characters
        return bool(dp[n][m])
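One way to see why problem 44 is the simpler version: every wildcard pattern can be rewritten as a problem-10 pattern by turning '?' into '.' and '*' into '.*'. The sketch below illustrates this, using Python's `re.fullmatch` as a stand-in for a problem-10 matcher. The helper names `wildcard_to_regex` and `wildcard_match` are made up for this example, and the pattern is assumed to contain only lowercase letters, '?' and '*'.

```python
import re


def wildcard_to_regex(p: str) -> str:
    """Rewrite a problem-44 wildcard pattern as a problem-10 style pattern."""
    out = []
    for ch in p:
        if ch == '?':
            out.append('.')    # any single character
        elif ch == '*':
            out.append('.*')   # any sequence, including the empty one
        else:
            out.append(ch)     # lowercase letters need no escaping
    return ''.join(out)


def wildcard_match(s: str, p: str) -> bool:
    # fullmatch requires the whole string to match, as the problem demands.
    return re.fullmatch(wildcard_to_regex(p), s) is not None
```

This is meant as a sanity check on small inputs rather than a submission: a backtracking regex engine can become very slow on patterns with many '*' characters, which the DP above avoids.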