Perl Script to Parse IRC Logs for Use in MegaHAL.trn
MegaHAL is an interesting AI engine that has no knowledge of what words mean; it only reassembles sentences and exploits the human mind's tendency to read order into chaos. If you are into MegaHAL IRC bots, the following Perl code might be handy:
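#!/usr/bin/perl
while ( <> ) {
    s/^<[^>]*>//;          # strip the leading <nick>
    s/^ ?[^ ]*://;         # strip a "nick:" highlight at the start of the message
    print unless m/^\*/;   # skip lines starting with * (actions, server messages)
}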
This is for mIRC-style logs without timestamps, for example:
<Tim> This is an example.
* Tim explains
Line 3 removes the nickname from each line. If the first word of the message is followed by a colon, line 4 removes that word and the colon. This is useful because people often address each other this way, and you do not want those nicknames in the training data. Line 5 prints the resulting line unless it starts with a *, so CTCP ACTIONs ("/me") and server messages are not printed. To use the Perl script, redirect its output to the MegaHAL training file:
$ perl logfix.pl SomeNet-somechan.log > megahal.trn
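To see the three steps in isolation, here is a minimal sketch that applies the same idea to a few sample lines. The clean_line helper and the sample lines are made up for illustration and are not part of the script above. It also assumes the log puts exactly one space after the closing > of the nickname, and it checks for the leading * before stripping anything, which is a small variation on the script rather than a copy of it.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the cleaned message text, or nothing
# (undef in scalar context) for lines that should be skipped.
sub clean_line {
    my ($line) = @_;
    return if $line =~ /^\*/;    # "* Tim explains", joins, other server lines
    $line =~ s/^<[^>]*> ?//;     # drop the leading "<nick> "
    $line =~ s/^[^ ]*: ?//;      # drop a leading "nick: " highlight
    return $line;
}

# Made-up sample lines in the mIRC style described above.
my @sample = (
    "<Tim> This is an example.\n",
    "<Ann> Tim: nice example\n",
    "* Tim explains\n",
);

for my $line (@sample) {
    my $clean = clean_line($line);
    print $clean if defined $clean;
}
```

Run on the three sample lines, this prints "This is an example." and "nice example" and drops the action line. Because the * test runs on the raw log line, a chat message that itself begins with an asterisk is kept.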

Thanks, that helped me.
Would you mind sharing your IRC MegaHAL code?
(If you can share it, please send it to my email address.)
Thanks, Nadav
Comment by Nadav — April 5, 2008 @ 7:36 pm