Description
  • Given a file containing parentheses, each parenthesis is assigned a number called its `depth'.
  • We use a counter and the following procedure to determine the depth of a parenthesis:
    • The initial value of the counter is zero.
    • Scanning is performed:
      • from left to right in a line, and
      • from top to bottom in the file.
    • If an opening parenthesis is reached, increase the counter by one; the new value of the counter is then used as the depth of that parenthesis.
    • If a closing parenthesis is reached, the current value of the counter is used as its depth, then decrease the counter by one.
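  • In the following example, the depths of the parentheses in scanning order are 1 2 3 on the first line, 3 3 4 4 4 4 on the second, 3 2 1 1 2 on the third, 3 3 2 on the fourth, and 1 on the fifth: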
    A ( B ( C ( D
    ) E ( F ( G ) H ( I )
    J ) K ) L ) M ( N (
    O ( P ) R )
    S ) T
    
    and we want to extract data enclosed by parentheses of depth 3.
  • For a single line version, please visit Single Line Version.
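The counter procedure described above can be sketched in a few lines of Python. This is only an illustration of the depth rule, not part of the sed solution; the function name `paren_depths` and the constant `EXAMPLE` are assumptions introduced for this sketch.

```python
# Illustrative sketch of the depth rule; names here are hypothetical.
EXAMPLE = (
    "A ( B ( C ( D\n"
    ") E ( F ( G ) H ( I )\n"
    "J ) K ) L ) M ( N (\n"
    "O ( P ) R )\n"
    "S ) T"
)

def paren_depths(text):
    """Return (parenthesis, depth) pairs in scanning order."""
    counter = 0          # the initial value of the counter is zero
    depths = []
    # Iterating over the text scans left to right, top to bottom.
    for ch in text:
        if ch == "(":
            counter += 1                  # increase first ...
            depths.append((ch, counter))  # ... then use the value as the depth
        elif ch == ")":
            depths.append((ch, counter))  # use the current value as the depth ...
            counter -= 1                  # ... then decrease
    return depths

if __name__ == "__main__":
    print(" ".join(str(d) for _, d in paren_depths(EXAMPLE)))
```

With this rule, an opening parenthesis and its matching closing parenthesis always receive the same depth.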
Raw Input
A ( B ( C ( D
) E ( F ( G ) H ( I )
J ) K ) L ) M ( N (
O ( P ) R )
S ) T
Desired Output
( D )
( F ( G ) H ( I ) J )
( P )
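To cross-check the desired output independently of sed, the extraction can be sketched directly in Python. This is a reference sketch under the assumption that a line break inside an extracted span is rendered as a single space, as in the output above; the names `extract_at_depth` and `RAW_INPUT` are hypothetical.

```python
# Reference sketch, not the sed method; names here are hypothetical.
RAW_INPUT = (
    "A ( B ( C ( D\n"
    ") E ( F ( G ) H ( I )\n"
    "J ) K ) L ) M ( N (\n"
    "O ( P ) R )\n"
    "S ) T"
)

def extract_at_depth(text, target=3):
    """Collect every span enclosed by a pair of parentheses of depth `target`."""
    counter = 0
    start = None      # index of the pending opening parenthesis at `target`
    results = []
    for i, ch in enumerate(text):
        if ch == "(":
            counter += 1
            if counter == target:
                start = i
        elif ch == ")":
            if counter == target and start is not None:
                span = text[start:i + 1]
                # Assumption: a line break inside a span becomes one space.
                results.append(" ".join(span.split("\n")))
                start = None
            counter -= 1
    return results

if __name__ == "__main__":
    for item in extract_at_depth(RAW_INPUT):
        print(item)
```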
Script and Comments
Script1
[ 1] /\n/!s/.*/\n&\n/
[ 2] :loop
[ 3] s/\n([^()\n]*)/\1\n/
[ 4] /\n\n/{
[ 5] /\n$/d
[ 6] $d
[ 7] N
[ 8] s/\n(\n#+)(\n[^\n]*)$/ \2\1/
[ 9] b loop
[10] }
[11] /\n\(/{
[12] s/$/#/
[13] /\n#{3}$/s/^[^\n]*\n/\n/
[14] }
[15] /\n\)/{
[16] /\n#{3}$/{
[17] s/\n\)/)\n\n/
[18] s/#$//
[19] P
[20] D
[21] }
[22] s/#$//
[23] }
[24] s/\n([()])/\1\n/
[25] b loop
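Comments (Script1 uses extended regular expressions; run it with sed -r)
  1. At the beginning of a cycle, if the pattern space does not contain a newline character yet, Step [1] adds two newline characters to the line:
    • The first one is used to `mark' the parenthesis to be processed, no matter whether it is an opening or a closing one.
    • The last one is used to separate the original data from the counter, which is kept as a run of # characters (one # per unit).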
  2. Step [3] moves the first newline character to the left of the next parenthesis to be processed. If no parenthesis remains, the two newline characters become adjacent and the RE \n\n of Step [4] matches. In that case, there are two situations to consider:
    • If the value of the counter is zero, every opening parenthesis has been paired with a closing one, and the remaining data of this line is no longer required. Step [5] deletes the line and starts a new cycle.
    • Otherwise, some opening parentheses have not been paired yet, so more data must be read:
      • If the current line is the last one of the file, all we can do is terminate sed with Step [6];
      • Otherwise, Step [7] appends the next line to the pattern space (PS). Step [8] then joins the two lines with a space and moves the counter back to the end of PS.
  3. If the parenthesis to be processed is an opening one, Steps [11] through [14] are executed: the counter is increased, and if it has reached 3, everything before the parenthesis is discarded. Otherwise, Steps [15] through [23] are executed: if the counter is 3, the data collected up to this closing parenthesis is printed and removed from PS; in either case the counter is decreased.
  4. Finally, Step [24] moves the newline character to the right of the parenthesis just processed.
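To follow the comments step by step, the script can be mimicked in Python with the pattern space held in a string. This is a rough emulation under stated assumptions, not sed itself: `run_script1` and `RAW_INPUT` are hypothetical names, `\Z` stands in for sed's `$`, and `N`, `P`, `D` and `d` are imitated by hand.

```python
import re

# Rough emulation of Script1 for study purposes; names are hypothetical.
RAW_INPUT = (
    "A ( B ( C ( D\n"
    ") E ( F ( G ) H ( I )\n"
    "J ) K ) L ) M ( N (\n"
    "O ( P ) R )\n"
    "S ) T"
)

def run_script1(text):
    lines = text.split("\n")
    nxt = 0          # index of the next unread input line
    out = []         # what sed would print
    ps = None        # pattern space; None means "read a new line"
    while True:
        if ps is None:
            if nxt >= len(lines):
                break
            ps, nxt = lines[nxt], nxt + 1
        if "\n" not in ps:                                   # [1]
            ps = "\n" + ps + "\n"
        while True:                                          # [2] :loop
            ps = re.sub(r"\n([^()\n]*)", r"\1\n", ps, count=1)   # [3]
            if "\n\n" in ps:                                 # [4]
                if ps.endswith("\n"):                        # [5] counter is zero: d
                    ps = None
                    break
                if nxt >= len(lines):                        # [6] last line: $d
                    ps = None
                    break
                ps, nxt = ps + "\n" + lines[nxt], nxt + 1    # [7] N
                ps = re.sub(r"\n(\n#+)(\n[^\n]*)\Z", r" \2\1", ps, count=1)  # [8]
                continue                                     # [9] b loop
            if re.search(r"\n\(", ps):                       # [11]
                ps += "#"                                    # [12]
                if re.search(r"\n#{3}\Z", ps):               # [13]
                    ps = re.sub(r"^[^\n]*\n", "\n", ps, count=1)
            if re.search(r"\n\)", ps):                       # [15]
                if re.search(r"\n#{3}\Z", ps):               # [16]
                    ps = re.sub(r"\n\)", ")\n\n", ps, count=1)   # [17]
                    ps = re.sub(r"#\Z", "", ps)              # [18]
                    head, ps = ps.split("\n", 1)             # [19] P and [20] D
                    out.append(head)
                    break            # D restarts the cycle without reading input
                ps = re.sub(r"#\Z", "", ps)                  # [22]
            ps = re.sub(r"\n([()])", r"\1\n", ps, count=1)   # [24], then [25] b loop
    return out

if __name__ == "__main__":
    print("\n".join(run_script1(RAW_INPUT)))
```

Printing `repr(ps)` at the top of the inner loop shows how the marker newline walks through the data and how the run of # characters grows and shrinks.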