regex - How to remove all except the first 3 and last of a specific character with sed


I've looked all over the place and can't find an answer. I've used sed before, so I'm familiar with the syntax, but this one has me stumped.

I want to remove all except the first 3 instances and the last instance of a specific character. Here is a specific example:

input.csv:

"first", "some text "quote" blaw blaw", 1
"second", "some more text "another quote" blaw blaw", 3

I want to remove all the quotes (") on each line except the first 3 and the last 1, so that it looks like this:

output.csv:

"first", "some text quote blaw blaw", 1
"second", "some more text another quote blaw blaw", 3

Any pointers? Thanks.

$ sed -r ':a; s/([^"]*"[^"]*"[^"]*")([^"]*)"([^"]*")/\1\2\3/; ta' input.csv
"first", "some text quote blaw blaw", 1
"second", "some more text another quote blaw blaw", 3

How it works

The code works by looking for the first 5 quotes on a line and removing the fourth. The process is repeated in a loop until only 4 quotes are left on the line: the first 3 and the last.
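To make the looping concrete, here is a small sketch that runs the same substitution once without the loop and then with it. The sample line is made up for illustration, and -E is used in place of -r on the assumption that your sed accepts it (BSD sed and recent GNU sed versions do, as far as I know).

```shell
# A made-up line with six quotes: two extra ones inside the second field.
line='"a", "b "c" d", 1'

# One pass, no loop: only the fourth quote is removed.
once=$(printf '%s\n' "$line" |
  sed -E 's/([^"]*"[^"]*"[^"]*")([^"]*)"([^"]*")/\1\2\3/')

# With the :a ... ta loop, the substitution repeats until it stops matching.
looped=$(printf '%s\n' "$line" |
  sed -E -e ':a' -e 's/([^"]*"[^"]*"[^"]*")([^"]*)"([^"]*")/\1\2\3/' -e 'ta')

printf '%s\n' "$once"     # "a", "b c" d", 1
printf '%s\n' "$looped"   # "a", "b c d", 1
```

After the first pass five quotes remain, so the pattern matches once more. After the second pass only four remain, the pattern can no longer match, and the loop ends.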

  • :a

    This defines a label, a.

  • s/([^"]*"[^"]*"[^"]*")([^"]*)"([^"]*")/\1\2\3/

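    This looks for the first 3 quotes and any text that precedes them, and saves them in group 1. It then looks for the next run of non-quote characters and saves them in group 2. Next it looks for the following double quote, which is the fourth on the line. Finally it looks for any non-quote characters followed by the fifth quote and saves them in group 3. The whole match is replaced with the 3 groups, thereby omitting the fourth quote.

    Let's break that down more explicitly: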

    • ([^"]*"[^"]*"[^"]*")

      This looks for the first 3 quotes and any text that precedes them. It is saved in group 1.

    • ([^"]*)

      This looks for the next run of non-quote characters. It is saved in group 2.

    • "

      This matches the fourth quote on the line.

    • ([^"]*")

      This matches the next run of non-quote characters followed by the fifth quote on the line. It is saved in group 3.

    The replacement text \1\2\3 has the effect of removing the fourth quote of the 5 quotes found.

  • ta

    If a substitution was made, this loops back to label a. If not, we are done with this line.
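To see what each group actually captures, one option is to wrap the groups in braces in the replacement instead of simply concatenating them. This is only a debugging sketch run on the first sample line; the braces are arbitrary markers, and -E is assumed to be supported by your sed.

```shell
line='"first", "some text "quote" blaw blaw", 1'

# Same pattern as above, but each group is wrapped in braces so its
# boundaries are visible. The fourth quote sits between groups 2 and 3
# and is still dropped.
groups=$(printf '%s\n' "$line" |
  sed -E 's/([^"]*"[^"]*"[^"]*")([^"]*)"([^"]*")/{\1}{\2}{\3}/')

printf '%s\n' "$groups"
# {"first", "}{some text }{quote"} blaw blaw", 1
```

Group 1 ends at the third quote, group 2 holds the text up to the fourth quote, and group 3 ends at the fifth quote. Everything after the fifth quote is outside the match and is left as it was.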

BSD or Mac OS X

Try:

sed -E -e ':a' -e 's/([^"]*"[^"]*"[^"]*")([^"]*)"([^"]*")/\1\2\3/' -e 'ta' input.csv 
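If the same script has to run on both GNU and BSD systems, a small wrapper along these lines may help. It is a sketch, not a tested portability guarantee: strip_inner_quotes is a hypothetical name, the third input line is added for illustration, and it assumes your sed accepts -E and separate -e expressions.

```shell
# Hypothetical helper: reads lines on stdin, writes cleaned lines to stdout.
strip_inner_quotes() {
  sed -E -e ':a' -e 's/([^"]*"[^"]*"[^"]*")([^"]*)"([^"]*")/\1\2\3/' -e 'ta'
}

# The two lines from input.csv plus one line that has only four quotes.
result=$(printf '%s\n' \
  '"first", "some text "quote" blaw blaw", 1' \
  '"second", "some more text "another quote" blaw blaw", 3' \
  '"third", "no inner quotes", 5' | strip_inner_quotes)

printf '%s\n' "$result"
# "first", "some text quote blaw blaw", 1
# "second", "some more text another quote blaw blaw", 3
# "third", "no inner quotes", 5
```

The third line passes through unchanged because the pattern needs at least five quotes to match.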
