Perl Script to Parse IRC Logs for Use in MegaHAL.trn

MegaHAL is an interesting AI engine that has no knowledge of what words mean; it only reassembles sentences and abuses the ability of the human mind to read order into chaos. If you run MegaHAL IRC bots, the following Perl code might be handy:

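#!/usr/bin/perl
while ( <> ) {
    s/^<[^>]*> *//;       # remove the leading <nick> tag (anchored, stops at the first ">")
    s/^[^ :]+: +//;       # remove a leading "nick:" highlight
    print unless m/^\*/;  # skip lines starting with "*" (actions, server messages)
}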

This is for mIRC-style logs without timestamps, for example:

<Tim> This is an example.
* Tim explains

Line 3 of the script removes the nickname from each line. If the first word in a line is followed by a colon, line 4 removes that word and the colon. This is useful because people often address each other in this style, and you do not want those nicknames in the training data. Line 5 prints the resulting line unless it starts with a *, so CTCP ACTIONs ("/me") and server messages are not printed. To use the Perl script, redirect its output to the MegaHAL training file:

$ perl logfix.pl SomeNet-somechan.log > megahal.trn
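
To check what the filter does before feeding a whole log to MegaHAL, the same steps can be tried on a few lines in isolation. The following is only a sketch: the helper name clean_line and the sample lines are made up for illustration, and it uses anchored forms of the two substitutions so that only a nickname at the very start of a line is touched.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the original script): returns the cleaned
# message text, or undef for lines that should be skipped.
sub clean_line {
    my ($line) = @_;
    $line =~ s/^<[^>]*> *//;       # drop the leading <nick> tag
    $line =~ s/^[^ :]+: +//;       # drop a leading "nick:" highlight
    return if $line =~ /^\*/;      # skip actions and server messages
    return $line;
}

# Made-up sample lines in the mIRC-style format shown above.
my @samples = (
    "<Tim> This is an example.\n",
    "<Tim> Bob: did you see that?\n",
    "* Tim explains\n",
);

for my $raw (@samples) {
    my $clean = clean_line($raw);
    print $clean if defined $clean;
}
```

Only the two spoken lines should come out, without the nickname and without the "Bob:" highlight; the action line is dropped.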

1 Comment »

  1. Thanks, that helped me.
    Mind to share your IRC megahal code?
    (If you can share it, please send it to my mail)

    Thanks, Nadav

    Comment by Nadav — April 5, 2008 @ 7:36 pm
