sample:<a rpos="" cpos="title" href="http://xxx.com/a.html" style="font-family:Arial,SimSun,sans-serif;font-size:16px;color:#0000cc; text-decoration:none;" target="_blank">
I want to extract http://xxx.com/a.html from it. How should I write the regular expression?
That's easy enough:
import re

text = '<a rpos="" cpos="title" href="http://xxx.com/a.html" style="font-family:Arial,SimSun,sans-serif;font-size:16px;color:#0000cc; text-decoration:none;" target="_blank">'
# Capture only the URL: [^"]+ stops at the closing quote, and \. matches a literal dot.
urlPattern = r'href="(http://[^"]+\.html)"'
pattern = re.compile(urlPattern)
match = pattern.findall(text)  # ['http://xxx.com/a.html']
/<a[^>]*href="([^"]+)"[^>]*>/
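In case it helps, here is a rough sketch of how the pattern above could be used from Python. The `/` delimiters are dropped because Python's `re` module does not use them. The names `sample`, `link_pattern`, and `url` are my own placeholders, and `sample` is assumed to hold the markup from the question.

```python
import re

# Assumption: sample holds the anchor tag posted in the question.
sample = (
    '<a rpos="" cpos="title" href="http://xxx.com/a.html" '
    'style="font-family:Arial,SimSun,sans-serif;font-size:16px;'
    'color:#0000cc; text-decoration:none;" target="_blank">'
)

# Same pattern as above without the / delimiters.
# Group 1 captures everything between the quotes of the href attribute.
link_pattern = re.compile(r'<a[^>]*href="([^"]+)"[^>]*>')

match = link_pattern.search(sample)
url = match.group(1) if match else None  # None when no <a href="..."> is found
print(url)
```

Note that this only handles double-quoted `href` values, as in the sample; single-quoted or unquoted attributes would need a different pattern.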
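from bs4 import BeautifulSoup
# sample is the HTML snippet from the question, as a string
soup = BeautifulSoup(sample, 'html.parser')  # name the parser explicitly so the result does not depend on which parsers are installed
url = soup.find('a').get('href')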
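If installing bs4 is not an option, roughly the same idea can be sketched with the standard library's `html.parser`. This is only an illustration under my own assumptions: `HrefCollector` is a made-up class name, and `sample` is again assumed to hold the markup from the question.

```python
from html.parser import HTMLParser


class HrefCollector(HTMLParser):
    """Hypothetical helper that collects the href of every <a> start tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value is None for bare attributes.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


# Assumption: sample holds the anchor tag posted in the question.
sample = (
    '<a rpos="" cpos="title" href="http://xxx.com/a.html" '
    'style="font-family:Arial,SimSun,sans-serif;font-size:16px;'
    'color:#0000cc; text-decoration:none;" target="_blank">'
)

collector = HrefCollector()
collector.feed(sample)
collector.close()
print(collector.hrefs)
```

Unlike the regex answers, a parser does not care about attribute order or quoting style, which is the main reason to prefer it on real pages.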