I have a string:
s = "I. INTRODUCTIONIN analogy with the radio-frequency or microwave antennas,an optical antenna facilitates energy transfer from guided-wave modes of an optical waveguide to optical free-space modesand vice versa. Recently, optical antennas have received signif-icant attention due to their ability to control the light emissionwithin a nano-scale footprint [1]–[15]. Optical antennas have thecapability to boost the efficiency of photodetection [8], [9], sens-ing [10], heat transfer [11], [12] and spectroscopy [13]. Also,very directive optical antennas with electronically controlled ra-diation pattern are the subject of great interest for applicationssuch as planar imaging [16] and LIDAR [17]"
Is there a way to break it down into something like the following result?
("Recently, optical antennas have received signif-icant attention due to their ability to control the light emissionwithin a nano-scale footprint.", "[1]–[15]")
("Optical antennas have thecapability to boost the efficiency of photodetection", "[8], [9]")
("sens-ing", "[10]")
("heat transfer", "[11], [12]")
("and spectroscopy", "[13]")
("Also,very directive optical antennas with electronically controlled ra-diation pattern are the subject of great interest for applicationssuch as planar imaging", "[16]")
("and LIDAR", "[17]")
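First, your string contains a special character, the ligature ﬁ (a single code point), in the word efficiency. If you replace it with the two plain letters fi, my code below works.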
#!/usr/bin/env python3
# -*- coding: UTF-8 -*-
# author: zhaoyingnan
import re

s = "I. INTRODUCTIONIN analogy with the radio-frequency or microwave antennas,an optical antenna facilitates energy transfer from guided-wave modes of an optical waveguide to optical free-space modesand vice versa. Recently, optical antennas have received signif-icant attention due to their ability to control the light emissionwithin a nano-scale footprint [1]–[15]. Optical antennas have thecapability to boost the efficiency of photodetection [8], [9], sens-ing [10], heat transfer [11], [12] and spectroscopy [13]. Also,very directive optical antennas with electronically controlled ra-diation pattern are the subject of great interest for applicationssuch as planar imaging [16] and LIDAR [17]"

# Group 1: the text that follows '.', ',' or 'and' plus one whitespace character.
# Group 2: the bracketed citation, optionally extended by an en-dash range or by ", [n]".
listMatch_1 = re.findall(r'(?:\.|\,|and)\s([\w\s,-]+)([\[\d\]]+(?:–[\[\d\]]+|,\s[\[\d\]]+)?)', s, re.I)
for i in listMatch_1:
    print(i)
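The ligature replacement mentioned above does not have to be done by hand. A minimal sketch, assuming Unicode compatibility normalization (NFKC) is acceptable for your text; the helper name `normalize_ligatures` is made up for illustration. NFKC also rewrites other compatibility characters (full-width letters, superscripts and so on), so check that this side effect is fine for your data.

```python
import unicodedata


def normalize_ligatures(text):
    """Expand compatibility characters (e.g. the U+FB01 ligature) into plain letters."""
    # NFKC applies compatibility decomposition, then canonical composition.
    return unicodedata.normalize("NFKC", text)


# "ef" + U+FB01 + "ciency" is how the word appears in text copied from a PDF.
sample = "boost the ef\ufb01ciency of photodetection [8], [9]"
print(normalize_ligatures(sample))
```

Run this on `s` before applying the regular expression.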
Does the text in front of each citation follow any recognizable pattern?
re.split(r'(\[[^a-zA-Z]+\])', s)
A simple but not rigorous answer.
For text this irregular, split is more convenient than a matching regex.
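The split idea can be sketched as follows. This is only an illustration under stated assumptions: the helper name `pair_text_with_citations` is made up, the pattern is the one from the `re.split` answer, and stripping leading or trailing spaces, periods and commas is my own choice. Because the pattern has a capturing group, `re.split` keeps the citations in the result, so text and citations alternate in the list.

```python
import re

# Same pattern as in the re.split answer: a bracketed run that contains no ASCII letters.
CITATION = re.compile(r'(\[[^a-zA-Z]+\])')


def pair_text_with_citations(text):
    """Return (text, citation) tuples; any text after the last citation is dropped."""
    parts = CITATION.split(text)
    pairs = []
    # Even indexes hold plain text, odd indexes hold the captured citations.
    for chunk, cite in zip(parts[0::2], parts[1::2]):
        pairs.append((chunk.strip(" .,"), cite))
    return pairs


sample = ("emission within a nano-scale footprint [1]–[15]. Optical antennas boost "
          "photodetection [8], [9], sens-ing [10], heat transfer [11], [12] and spectroscopy [13].")
for pair in pair_text_with_citations(sample):
    print(pair)
```

Unlike the expected result in the question, the first tuple keeps everything before the first citation, including any earlier sentences. That is one way in which this approach is not rigorous.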