Python. Join specific lines on 1 line
Let's say I have this file:
1
17:02,111
problem report related
router

2
17:05,223
restarting systems

3
18:02,444
must erase hard disk
due compromised data
I want this output:
1
17:02,111
problem report related router

2
17:05,223
restarting systems

3
18:02,444
must erase hard disk due compromised data
I've been trying in bash and got a fairly close solution, but I don't know how to carry it out in Python.
Thanks in advance.
If you want to remove the extra lines:
For this you can check two conditions for each line: either the line is not followed by an empty line, or the line is preceded by a line that matches the following regex: ^\d{2}:\d{2},\d{3}\s$
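As a quick check of what that regex accepts, here is a minimal sketch. The sample lines are made up for illustration; each one ends with a newline, as it would when read from a file.

```python
import re

# Pattern from the answer: a line that holds only a timestamp such as 17:02,111
TIMESTAMP = re.compile(r'^\d{2}:\d{2},\d{3}\s$')

# Hypothetical sample lines, each ending with a newline as when read from a file
samples = ['17:02,111\n', 'router\n', '\n', '1\n']

# True where the line is a timestamp line
flags = [bool(TIMESTAMP.match(line)) for line in samples]
print(flags)
```

Only the first sample is flagged. The trailing \s in the pattern consumes the newline, so a timestamp without a trailing whitespace character would not match.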
So, to access the next line in each iteration, you can create a second iterator from the main file object, named temp, using itertools.tee, and apply the next function to it so that it stays one line ahead. Then use re.match to match the regex.
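from itertools import tee
import re

with open('ex.txt') as f, open('new.txt', 'w') as out:
    # temp is a second iterator over the same lines, kept one line ahead of f
    temp, f = tee(f)
    next(temp)
    pre = ''  # previous line; empty before the first iteration
    try:
        for line in f:
            # keep the line if the next line is not empty,
            # or if the previous line was a timestamp line
            if next(temp) != '\n' or re.match(r'^\d{2}:\d{2},\d{3}\s$', pre):
                out.write(line)
            pre = line
    except StopIteration:
        # temp runs out on the last line, so the last line is not written
        pass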
Result:
1
17:02,111
problem report related

2
17:05,223
restarting systems

3
18:02,444
must erase hard disk
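To see the lookahead trick on its own, here is a small sketch on an in-memory list instead of a file. The list contents are hypothetical, and zip is used here in place of the try/except only to keep the example short.

```python
from itertools import tee

# Hypothetical in-memory lines standing in for a file object
lines = ['a\n', 'b\n', '\n', 'c\n']

current, ahead = tee(lines)
next(ahead)  # advance the second iterator so it runs one line ahead

# Pair each line with the line that follows it; zip stops when `ahead`
# runs out, so the last line gets no pair
pairs = list(zip(current, ahead))
print(pairs)
```

Each pair is (line, next line), which is the information the first condition needs. The last line never gets a pair, which mirrors why the file version stops before writing the final line.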
If you want to concatenate the rest to the third line:
And if you want to concatenate the lines after the third line onto the third line, you can use the following regex to find each block that is followed by \n\n or by the end of the file ($):
r"(.*?)(?=\n\n|$)"
Then split each block on the line that is in the date format and write the parts to the output file. Note that you need to replace the newlines within the third part with a space:
ex.txt:
1
17:02,111
problem report related
router
line

2
17:05,223
restarting systems

3
18:02,444
must erase hard disk
due compromised data
line 5
line 6
line 7
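Demo:
import re

def splitter(s):
    # yield each non-empty block that is followed by a blank line or the end of the text
    for x in re.finditer(r"(.*?)(?=\n\n|$)", s, re.DOTALL):
        # drop the separator newlines that can end up attached to a block
        g = x.group(0).strip('\n')
        if g:
            yield g

with open('ex.txt') as f, open('new.txt', 'w') as out:
    for block in splitter(f.read()):
        # the capturing group keeps the timestamp line in the split result
        first, second, third = re.split(r'(\d{2}:\d{2},\d{3}\n)', block)
        # join the text lines with spaces and end the block with a blank line
        out.write(first + second + third.replace('\n', ' ') + '\n\n')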
Result:
1
17:02,111
problem report related router line

2
17:05,223
restarting systems

3
18:02,444
must erase hard disk due compromised data line 5 line 6 line 7
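The split step can be tried on a single block without any files. This is only a sketch; the block string below is a hypothetical example in the same shape as ex.txt.

```python
import re

# Hypothetical block in the same shape as ex.txt: index, timestamp, then text lines
block = '3\n18:02,444\nmust erase hard disk\ndue compromised data'

# The capturing group makes re.split keep the timestamp line in the result
first, second, third = re.split(r'(\d{2}:\d{2},\d{3}\n)', block)

# Only the third part has its newlines replaced, so the first two lines stay separate
joined = first + second + third.replace('\n', ' ')
print(joined)
```

Without the parentheses in the pattern, re.split would drop the timestamp line and return only two parts.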
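Note:
In this answer the splitter function returns a generator, which is efficient when dealing with huge files because it yields one block at a time instead of storing unusable lines in memory. The text passed to it by f.read() is still loaded in full, though.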
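If the file is too large to read at once, a line-by-line variant is possible. The sketch below is not the method from the answer: it assumes every block has its index on the first line and its timestamp on the second, so it does not use the regex at all. The names iter_blocks and join_block are hypothetical, and io.StringIO stands in for an open file.

```python
import io

def iter_blocks(lines):
    """Yield one block (a list of lines without newlines) per blank-line-separated group."""
    block = []
    for line in lines:
        line = line.rstrip('\n')
        if line:
            block.append(line)
        elif block:
            yield block
            block = []
    if block:  # last block when the input does not end with a blank line
        yield block

def join_block(block):
    """Keep the first two lines (index, timestamp) and join the rest with spaces."""
    head, rest = block[:2], block[2:]
    return '\n'.join(head + ([' '.join(rest)] if rest else []))

# Hypothetical input; io.StringIO stands in for an open file object
sample = io.StringIO(
    '1\n17:02,111\nproblem report related\nrouter\n\n'
    '2\n17:05,223\nrestarting systems\n'
)
result = '\n\n'.join(join_block(b) for b in iter_blocks(sample))
print(result)
```

Only one block is held in memory at a time here. The final join builds the whole result as a string for the demo; with a real file you would write each joined block to the output as it is produced.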