Description
A file contains logs of automated jobs, with each line consisting of several fields separated by blanks:
  1. Mark: `@' for successfully completed, `X' otherwise.
  2. The name of a program.
  3. Time when the job was finished.
  4. One or more arguments, separated by blanks, passed to the program.
We want to find the failed instances that have no later successful one. Two instances are treated as the same if both their program names and their arguments are identical.
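To make the notion of "the same instance" concrete, the following sketch strips the mark and the time from each line, leaving only the program name and the arguments as a comparison key. It is only an illustration, assuming GNU sed with `-r'; the variable name `keys' is made up for this example and is not part of the script discussed later.

```shell
# Reduce each log line to "<program> <arguments>", the part that
# decides whether two instances are the same.
keys=$(printf '%s\n' \
  'X dd.sh 01:04:22 arg07' \
  '@ dd.sh 03:31:28 arg07' \
  'X dd.sh 02:22:45 arg08 arg09' |
  sed -r 's/^. ([^ ]+) [^ ]+ (.*)$/\1 \2/')
printf '%s\n' "$keys"
```

The first two lines reduce to the same key, so the failure at 01:04:22 is covered by the success at 03:31:28; the third line has a different key and stays uncovered.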
Raw Input
X aa.sh 01:02:50 arg01 arg02
@ bb.sh 01:02:56 arg03
X cc.sh 01:03:05 arg04 arg05 arg06
X dd.sh 01:04:22 arg07
X dd.sh 02:22:45 arg08 arg09
@ aa.sh 03:22:56 arg10 arg11
X cc.sh 03:30:30 arg12 arg13 arg14
@ dd.sh 03:31:28 arg07
@ cc.sh 04:15:35 arg12 arg13 arg14
@ aa.sh 04:22:52 arg01 arg02
Desired Output
X cc.sh 01:03:05 arg04 arg05 arg06
X dd.sh 02:22:45 arg08 arg09
Script and Comments
Script1
[ 1] :loop0
[ 2] $!{
[ 3] N
[ 4] b loop0
[ 5] }
[ 6] :loop1
[ 7] /^X ([^ ]+) [^ ]+ ([^\n]+)\n(.*\n)?@ \1 [^ \n]+ \2(\n|$)/{
[ 8] s/^[^\n]+\n//
[ 9] b loop1
[10] }
[11] /^X/P
[12] D
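One way to try Script1 is sketched below, assuming GNU sed. The file names `jobs.log' and `script1.sed' and the variable `result' are made up for this example. The script body is the listing above without the step numbers, with the RE written as `\n(.*\n)?@' so that a successful instance on the very next line also matches.

```shell
# Save the sample input (the file name is an assumption of this example).
cat > jobs.log <<'EOF'
X aa.sh 01:02:50 arg01 arg02
@ bb.sh 01:02:56 arg03
X cc.sh 01:03:05 arg04 arg05 arg06
X dd.sh 01:04:22 arg07
X dd.sh 02:22:45 arg08 arg09
@ aa.sh 03:22:56 arg10 arg11
X cc.sh 03:30:30 arg12 arg13 arg14
@ dd.sh 03:31:28 arg07
@ cc.sh 04:15:35 arg12 arg13 arg14
@ aa.sh 04:22:52 arg01 arg02
EOF

# Save the script without the step numbers.
cat > script1.sed <<'EOF'
:loop0
$!{
N
b loop0
}
:loop1
/^X ([^ ]+) [^ ]+ ([^\n]+)\n(.*\n)?@ \1 [^ \n]+ \2(\n|$)/{
s/^[^\n]+\n//
b loop1
}
/^X/P
D
EOF

# -r selects EREs; -f reads the commands from the script file.
result=$(sed -r -f script1.sed jobs.log)
printf '%s\n' "$result"
```

With the sample input this should print the two lines shown under Desired Output.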
Comments
  1. The `-r' option of GNU sed is required to make sed interpret REs as EREs.
  2. The Pattern Space is abbreviated to `PS'.
  3. The first loop, consisting of Steps [1] thru [5], is used to copy all lines of the datafile to PS.
  4. If the first line of PS is a failed instance:
    • if there is a later successful instance, the RE of Step [7] matches;
      Step [8] removes this failed instance, then sed goes back to Step [6] for a new iteration;
    • otherwise, this instance is first printed by Step [11], then deleted by Step [12].
      sed restarts at Step [1] without reading new input; since the last line has
      already been read, Steps [2] thru [5] are skipped and sed continues at Step [6].
  5. If the first line of PS is a successful instance, it is deleted by Step [12].
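The effect of the first loop can be checked on its own. The sketch below, assuming GNU sed, runs only Steps [1] thru [5] on three short lines and then uses the `l' command to display PS unambiguously. The input lines and the variable `ps' are made up for this example.

```shell
# Gather all input lines in PS, then list PS once with the l command.
# In the listing, \n marks an embedded newline and $ marks the end of PS.
ps=$(printf 'one\ntwo\nthree\n' | sed -n ':loop0
$!{
N
b loop0
}
l')
printf '%s\n' "$ps"
```

The listing shows a single PS holding all three lines joined by embedded newlines, which is the state Step [6] starts from.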