Python. Join specific lines on 1 line


Let's say I have a file:

1
17:02,111
problem report related
router

2
17:05,223
restarting systems

3
18:02,444
must erase hard disk
due compromised data

I want this output:

1
17:02,111
problem report related router

2
17:05,223
restarting systems

3
18:02,444
must erase hard disk due compromised data

I've been trying in bash and got kind of close to a solution, but I don't know how to carry it out in Python.

Thanks in advance.

If you want to remove the extra lines:

For this aim you can check 2 conditions for each line: whether the line is not followed by an empty line, or whether it is preceded by a line that matches the regex ^\d{2}:\d{2},\d{3}\s$. A line is kept if either condition holds.

To access the next line in each iteration, you can create a second iterator named temp from the main file object using itertools.tee and apply the next function to it. Then use re.match to match the regex.

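from itertools import tee
import re

with open('ex.txt') as f, open('new.txt', 'w') as out:
    # temp is a second iterator over the same file, kept one line ahead of f
    temp, f = tee(f)
    next(temp, None)
    pre = ''
    try:
        for line in f:
            # keep the line unless it is followed by a blank line,
            # except when the previous line is a timestamp
            if next(temp) != '\n' or re.match(r'^\d{2}:\d{2},\d{3}\s$', pre):
                out.write(line)
            pre = line
    except StopIteration:
        # temp is exhausted on the last line, which is therefore not written
        pass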

Result:

1
17:02,111
problem report related

2
17:05,223
restarting systems

3
18:02,444
must erase hard disk
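The same look-ahead idea can be tried without touching any files. The following is only a sketch: the function name drop_extra_lines and the sample list are made up for illustration, and unlike the code above it treats the end of the input as a blank line, so the final line is judged by the same two conditions instead of being dropped unconditionally.

```python
from itertools import tee
import re

TIMESTAMP = re.compile(r'^\d{2}:\d{2},\d{3}\s*$')

def drop_extra_lines(lines):
    """Yield lines, skipping any line that is followed by a blank line
    unless it comes right after a timestamp line."""
    current, lookahead = tee(lines)
    next(lookahead, None)          # advance the copy so it runs one line ahead
    previous = ''
    for line in current:
        # Assumption of this sketch: end of input counts as a blank line.
        following = next(lookahead, '\n')
        if following != '\n' or TIMESTAMP.match(previous):
            yield line
        previous = line

# Hypothetical sample data in the same layout as the question's file.
sample = [
    '1\n', '17:02,111\n', 'problem report related\n', 'router\n', '\n',
    '2\n', '17:05,223\n', 'restarting systems\n',
]
print(''.join(drop_extra_lines(sample)), end='')
```

Here 'router' is skipped because it is followed by a blank line and does not come right after a timestamp, while 'restarting systems' is kept because the line before it is a timestamp.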

If you want to concatenate the rest to the third line:

If you want to concatenate the lines after the third line onto the third line, you can use the following regex to find all blocks that are followed by \n\n or by the end of the file ($):

r"(.*?)(?=\n\n|$)" 

Then split each block on the line that is in the timestamp format and write the parts to the output file. Note that you need to replace the new lines within the 3rd part with a space:

ex.txt:

1
17:02,111
problem report related
router
line

2
17:05,223
restarting systems

3
18:02,444
must erase hard disk
due compromised data
line 5
line 6
line 7

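Demo:

import re

def splitter(s):
    # yield each block of text that ends at a blank line or at the end of input
    for x in re.finditer(r"(.*?)(?=\n\n|$)", s, re.DOTALL):
        g = x.group(0).strip('\n')
        if g:  # skip empty matches and leftover newlines
            yield g

with open('ex.txt') as f, open('new.txt', 'w') as out:
    for block in splitter(f.read()):
        # the capturing group keeps the timestamp line in the split result
        first, second, third = re.split(r'(\d{2}:\d{2},\d{3}\n)', block, maxsplit=1)
        out.write(first + second + third.replace('\n', ' ') + '\n\n')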

Result:

1
17:02,111
problem report related router line

2
17:05,223
restarting systems

3
18:02,444
must erase hard disk due compromised data line 5 line 6 line 7
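The split step depends on re.split keeping the text matched by a capturing group, which is why the timestamp line survives as the second part. A small check on a single block, where the block string is just illustrative data in the layout of ex.txt:

```python
import re

# One block as it would look after being cut out of the file (illustrative data).
block = "3\n18:02,444\nmust erase hard disk\ndue compromised data\nline 5"

# With a capturing group, re.split returns the separator as its own list item.
parts = re.split(r"(\d{2}:\d{2},\d{3}\n)", block, maxsplit=1)
first, second, third = parts

# Only the text after the timestamp has its newlines replaced by spaces.
joined = first + second + third.replace("\n", " ")
print(parts)
print(joined)
```

If a block contains no timestamp line, re.split returns a single item and the three-way unpacking raises ValueError, so this assumes every block follows the number, timestamp, text layout.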

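Note:

In this answer the splitter function returns a generator, so blocks are produced one at a time instead of being collected into a list, which avoids keeping unneeded lines around when dealing with huge files. Keep in mind that f.read() still loads the whole file content into memory first.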
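For input that should not be read into memory in one piece, the blocks can also be built line by line. The sketch below swaps the regex for itertools.groupby, so it is a different technique from the answer's; the name join_blocks is made up, and it assumes every block consists of a number line, a timestamp line, and at least one text line.

```python
from itertools import groupby
import io

def join_blocks(lines):
    """Yield one joined string per block, reading the input one line at a time."""
    # groupby splits the stream into alternating runs of blank and non-blank lines.
    for is_blank, group in groupby(lines, key=lambda line: not line.strip()):
        if is_blank:
            continue
        block = [line.rstrip('\n') for line in group]
        # Assumed layout: number line, timestamp line, then the text lines.
        number, stamp, text = block[0], block[1], block[2:]
        yield '\n'.join([number, stamp, ' '.join(text)])

# io.StringIO stands in for an open file here; a real file object works the same way.
source = io.StringIO(
    '1\n17:02,111\nproblem report related\nrouter\n\n'
    '2\n17:05,223\nrestarting systems\n'
)
print('\n\n'.join(join_blocks(source)))
```

Only the lines of the current block are held at any time, so memory use depends on the largest block rather than on the file size.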

