Description
  • We are given a datafile in which each line is of the form KEY=value.
  • There may be several lines with a given key. In this case, we want to delete all but the last line with that key.
Raw Input
javascript=20
php=15
perl=30
algorithm=18
javascript=14
perl=39
java=88
data structure=40
operating systems=70
algorithm=15
perl=55
operating systems=100
Desired Output
operating systems=100
perl=55
algorithm=15
data structure=40
java=88
javascript=14
php=15
Script and Comments
Script1
[ 1] 1{
[ 2] h
[ 3] d
[ 4] }
[ 5] G
[ 6] s/^(([^=]+)=[^\n]*)(.*)\n\2=[^\n]*/\1\3/
[ 7] $!{
[ 8] x
[ 9] d
[10] }
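A minimal sketch of how Script1 might be run on the raw input above. It assumes GNU sed (for the -r option and for \n inside a bracket expression). The file names dedup1.sed and data.txt and the scratch directory are hypothetical, and the bracketed step numbers are left out because they are not part of the sed script itself.

```shell
# Hypothetical file names inside a scratch directory; GNU sed is assumed.
workdir=$(mktemp -d)

# Script1, without the bracketed step numbers.
cat > "$workdir/dedup1.sed" <<'EOF'
1{
h
d
}
G
s/^(([^=]+)=[^\n]*)(.*)\n\2=[^\n]*/\1\3/
$!{
x
d
}
EOF

# The raw input from the example.
cat > "$workdir/data.txt" <<'EOF'
javascript=20
php=15
perl=30
algorithm=18
javascript=14
perl=39
java=88
data structure=40
operating systems=70
algorithm=15
perl=55
operating systems=100
EOF

# -r enables extended regular expressions; -f reads the script from a file.
script1_output=$(sed -r -f "$workdir/dedup1.sed" "$workdir/data.txt")
printf '%s\n' "$script1_output"

rm -r "$workdir"
```

With GNU sed this should print the seven lines listed under Desired Output.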
Comments (run with the -r option)
  1. A line can be printed only if no later line has the same key. Checking this directly is difficult and inefficient, so a different approach is used:
    • HS (the hold space) keeps the lines that are `safe' to print given the input read before the current line.
    • If one of the kept lines has the same key as the current line, that kept line is deleted.
    • The current line is added to the kept lines.
    • The kept lines are printed in the last cycle.
    These are implemented by Steps [5], [6] and [8]. Because Step [5] appends the kept lines after the current line, the most recently read key comes first in the output.
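  2. In the RE of Step [6], ^(([^=]+)=[^\n]*)(.*)\n\2=[^\n]*:
    • the first pair of parentheses (\1) captures the whole current line, and the second pair (\2) captures its `KEY'.
    • \n\2=[^\n]* matches the kept line that has the same key; (.*) (\3) captures the kept lines in front of it.
    • the replacement \1\3 therefore drops only the matched kept line.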
  3. Note that
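The RE of Step [6] can also be tried on its own. The sketch below assumes GNU sed; the function name apply_step6 is hypothetical. It joins all input lines into one pattern space, imitating what Step [5] produces, and then applies the substitution once. The second call shows that the key java does not remove javascript=14, because \2 must be followed directly by =.

```shell
# Hypothetical helper: read every input line into the pattern space
# (the :a / N / $!ba loop), then apply the substitution of Step [6] once.
# GNU sed is assumed; the input must have at least two lines.
apply_step6() {
  sed -r -e ':a' -e 'N' -e '$!ba' \
      -e 's/^(([^=]+)=[^\n]*)(.*)\n\2=[^\n]*/\1\3/'
}

# Current line perl=39, followed by kept lines that include an older perl=30.
removed=$(printf '%s\n' 'perl=39' 'javascript=14' 'algorithm=18' 'perl=30' 'php=15' | apply_step6)

# Current line java=88; the kept line javascript=14 has a different key.
kept=$(printf '%s\n' 'java=88' 'javascript=14' | apply_step6)

printf '%s\n' "$removed"
printf '%s\n' "$kept"
```

In the first result the line perl=30 is gone and the other lines keep their order; the second result is unchanged.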
Script2
[ 1] 1h
[ 2] G
[ 3] s/^(([^=]+)=[^\n]*)(.*)\n\2=[^\n]*/\1\3/
[ 4] $!{
[ 5] x
[ 6] d
[ 7] }
Comments
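  1. A neat version: Step [1] copies the first line to HS without deleting it, so Step [2] appends a duplicate of that line, which Step [3] removes like any other kept line with the same key.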
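A short sketch comparing the two scripts, written as one-liners and assuming GNU sed. The function names script1 and script2 and the shortened sample input are hypothetical. On input with several lines both scripts give the same result. On a file with a single line they differ: Script1 deletes line 1 before anything can be printed, while Script2 prints it.

```shell
# Script1 and Script2 as one-liners; GNU sed is assumed.
script1() {
  sed -r -e '1{h;d}' -e 'G' \
      -e 's/^(([^=]+)=[^\n]*)(.*)\n\2=[^\n]*/\1\3/' \
      -e '$!{x;d}'
}

script2() {
  sed -r -e '1h' -e 'G' \
      -e 's/^(([^=]+)=[^\n]*)(.*)\n\2=[^\n]*/\1\3/' \
      -e '$!{x;d}'
}

# A shortened sample: javascript and perl each appear twice.
sample() {
  printf '%s\n' 'javascript=20' 'php=15' 'perl=30' 'javascript=14' 'perl=39'
}

multi1=$(sample | script1)
multi2=$(sample | script2)

# Single-line input: line 1 is also the last line.
single1=$(printf '%s\n' 'php=15' | script1)
single2=$(printf '%s\n' 'php=15' | script2)

printf '%s\n' "$multi2"
printf 'script1 on one line: [%s]\n' "$single1"
printf 'script2 on one line: [%s]\n' "$single2"
```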