Raw Input
A,B,"C,D",E,"F,G",H,"I,J,K"
"Chang, Yao-Jen",33,MIS,"Taiwan, Taipei",M
Desired Output
A|B|"C,D"|E|"F,G"|H|"I,J,K"
"Chang, Yao-Jen"|33|MIS|"Taiwan, Taipei"|M
Script and Comments
Script1
:loop
s/^\([^"]*\("[^"]*"[^"]*\)*\),/\1|/
t loop
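
To try Script1 on the raw input above, something along these lines should work. This is a minimal sketch: the function name csv2pipe is a hypothetical wrapper chosen for illustration, and the three script lines are passed as separate -e expressions rather than read from a file.

```shell
# Hypothetical wrapper around Script1; the name csv2pipe is not from the original.
csv2pipe() {
  sed -e ':loop' \
      -e 's/^\([^"]*\("[^"]*"[^"]*\)*\),/\1|/' \
      -e 't loop'
}

# Feed the two sample lines from the Raw Input section through the script.
result=$(printf '%s\n' \
  'A,B,"C,D",E,"F,G",H,"I,J,K"' \
  '"Chang, Yao-Jen",33,MIS,"Taiwan, Taipei",M' | csv2pipe)

printf '%s\n' "$result"
```

Because the regular expression is greedy, each pass replaces the last remaining comma outside double quotes, and the t command jumps back to the label until no substitution succeeds.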
Comments
  1. Number the double quotes from left to right, starting at 1.
  2. The double quotes numbered 2k-1 and 2k form a pair, for every positive integer k.
  3. We can regard the contents of a line as an alternating sequence of type-A and type-B blocks, where
    • type-A block:
      consists of characters inside a pair of double quotes, including the enclosing double quotes.
    • type-B block:
      any maximal run of characters outside double quotes.
    Using A and B to denote a block of type-A and type-B, respectively, the possible forms of a line are: A, B, AB, BA, ABA, BAB, and so on.
    As regular expressions over blocks, these are A?(BA)*B? or, equivalently, B?(AB)*A?.
  4. For any comma NOT inside a pair of double quotes, the possible forms of the sequence of characters before it are: B, AB, BAB, ABAB, and so on, that is, B?(AB)*.
  5. The equivalent REs of type-A and type-B blocks are "[^"]*" and [^"]*, respectively.
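  6. For versions of sed that support neither '?' nor '{0,1}', we use B*(AB)* instead of B?(AB)*. Because [^"]* already matches the empty string, B* reduces to [^"]*, which is what Script1 uses at the start of the pattern.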
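
The way the comments assemble into the pattern can be sketched by building it from the two block expressions. This is only an illustration, not part of Script1: it assumes a sed that accepts the -E option for extended regular expressions (GNU and BSD sed do), and the variable names A, B and prefix as well as the test line are made up for the example.

```shell
# Building blocks from comment 5, written as ERE fragments.
A='"[^"]*"'   # type-A block: a quoted run, quotes included
B='[^"]*'     # type-B block: a possibly empty run of non-quote characters

# Prefix before an unquoted comma: B(AB)*. B may be empty, so no '?' is needed.
prefix="$B($A$B)*"

# Same substitution loop as Script1, in ERE syntax (assumes sed -E is available).
result=$(printf '%s\n' '"x,y",1,"z"' |
  sed -E -e ':loop' -e "s/^($prefix),/\\1|/" -e 't loop')

printf '%s\n' "$result"
```

The comma inside "x,y" is left alone because no prefix of the form B(AB)* can end inside a quoted block.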