I know, I know. I have enough projects already and I never seem to get any of them done... But I just can't help myself. Since I finished most of the wiki parsing stuff a while ago, I got to thinking about applying it to transcript statistics.
A couple years ago there was a website (it probably still exists, I'll look for it in my bookmarks later) that did a really good job of comparing State of the Union Addresses. You could take a word, like Terrorism or Iraq, or whatever phrase you wanted, and compare how many times it shows up across multiple transcripts.
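Just to make the idea concrete, here is a rough sketch of that kind of comparison in Python. This is only my guess at the basic approach, not how that site actually worked: the function names `count_phrase` and `compare_transcripts` are made up for illustration, matching is case-insensitive and on whole words only, and the sample text is placeholder filler rather than a real speech.

```python
import re


def count_phrase(text, phrase):
    """Count case-insensitive, whole-word occurrences of `phrase` in `text`.

    Assumption: "whole word" means the phrase starts and ends on a word
    boundary, so "terrorist" does not match inside "terrorists".
    """
    pattern = r"\b" + re.escape(phrase) + r"\b"
    return len(re.findall(pattern, text, flags=re.IGNORECASE))


def compare_transcripts(transcripts, phrases):
    """Build a {title: {phrase: count}} table for every transcript.

    `transcripts` maps a title (hypothetical, e.g. "speech_a") to its text.
    """
    return {
        title: {phrase: count_phrase(text, phrase) for phrase in phrases}
        for title, text in transcripts.items()
    }


if __name__ == "__main__":
    # Placeholder text, not real transcripts.
    samples = {
        "speech_a": "Terrorism is a threat. We will fight terrorism in Iraq.",
        "speech_b": "Iraq, Iraq, and the economy.",
    }
    for title, counts in compare_transcripts(samples, ["terrorism", "Iraq"]).items():
        print(title, counts)
```

Raw counts like these would favor longer speeches, so a fairer comparison would probably divide by the total word count of each transcript.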