Sarah Palin's words get data mined

Whoa, there's data here? Business analysis gets to work on VP transcripts

The exercise was performed using Microsoft’s BI tools by a Microsoft employee at Redmond and the results were kindly made available to The Register. The transcripts are freely available and many of you have access to analysis tools, so why not have a go?

The transcripts were passed through a process that not only counts the number of times each word is used but also assigns a ‘tf-idf’ weighting (term frequency–inverse document frequency), which gives some indication of how important each word is to the document: a word scores highly when it crops up often in one transcript but rarely in the rest of the collection.
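For readers who want to have a go without the Microsoft toolset, here is a minimal sketch of a tf-idf weighting in plain Python. It uses the textbook formula (term frequency multiplied by the log of the inverse document frequency), which may well differ from whatever the BI tools computed. It also scores single words only, whereas the analysis above evidently picked out phrases such as “good guy”. The function names and the three toy sentences are invented for illustration and are not taken from the transcripts.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def tf_idf(documents):
    """Return one {term: weight} dict per document.

    tf  = occurrences of the term in the document / total terms in the document
    idf = log(number of documents / number of documents containing the term)
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Toy documents, made up for this example (NOT real transcript text).
docs = [
    "the good guy helps the bad guy",
    "the economy and the tax plan",
    "the war and the economy",
]
weights = tf_idf(docs)

# List the terms of the first document, most distinctive first.
for term, weight in sorted(weights[0].items(), key=lambda kv: -kv[1]):
    print(f"{term:<8} {weight:.3f}")
```

A word that appears in every document (“the” in the toy data) gets a weight of zero, which is the point of the inverse-document-frequency part: it pushes down words that everyone uses all the time.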

The (totally and absolutely not scientific) results are fascinating. Top of the list of the early Palin transcript is the term “good guy” with “bad guy” in 26th position. In the later debate she doesn’t use these terms at all. In the interests of fairness it is worth pointing out that Joe Biden (the opposition Vice Presidential candidate) uses the term “bad guy” once in the debate, so perhaps he needs some coaching too.

If we try to look for warm fuzzy patriotic words and compare their position in the early interview and then in the later debate, we find:

Word               Interview   Debate
Good guy           1           Never appears
Alaska             13          40
Freedom            15          80
Democratic value   16          Never appears
Face               22          170
Bad guy            26          Never appears

Now suppose we look for words that might be considered to be more presidential – words which give the impression of a potential world leader:

Word          Interview       Debate
Afghanistan   125             2
People        40              4
Economy       Never appears   8
Iraq          29              9
Job           144             12
Tax           Never appears   13
War           68              17
Government    166             21
Nation        Never appears   25
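Rank tables like the two above can be produced from any pair of term-weight listings. The sketch below is one way to do it, assuming each transcript has already been reduced to a dictionary that maps terms to weights. The helper names and the weights are hypothetical, chosen only to show the mechanics, and the ranks they produce are not the figures reported here.

```python
def rank_terms(weights):
    """Map each term to its 1-based rank, highest weight first (ties broken alphabetically)."""
    ordered = sorted(weights, key=lambda term: (-weights[term], term))
    return {term: position for position, term in enumerate(ordered, start=1)}


def compare_ranks(terms, first, second):
    """Return (term, rank in first, rank in second) rows; None means the term never appears."""
    first_ranks = rank_terms(first)
    second_ranks = rank_terms(second)
    return [(term, first_ranks.get(term), second_ranks.get(term)) for term in terms]


# Made-up weights for illustration only (NOT the values behind the tables).
interview = {"good guy": 0.9, "alaska": 0.5, "freedom": 0.4}
debate = {"economy": 0.8, "alaska": 0.3, "freedom": 0.1}

rows = compare_ranks(["good guy", "alaska", "economy"], interview, debate)
for term, early, late in rows:
    print(f"{term:<10} {early or 'Never appears':<15} {late or 'Never appears'}")
```

Using None for a missing term keeps “never appears” distinct from a very low rank, which matters when the interesting result is that a word has vanished altogether.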

So have we proved our premise? Given that we have already hammered home the lack of science here, we leave it to you, gentle reader, to decide if there is enough ‘evidence’ here.

Of course, if we were being fair, we would (as the original analysis did) take a look at the comparable Joe Biden transcripts. But we aren’t trying to be fair or unfair. We aren’t trying to score any particular political points; we are trying to show that BI techniques can be applied to any data, not just business data. Which brings us to you.

  • Do you feel that your pointy-haired boss magically changed his or her position on some policy?
  • Do you have access to the minutes of the meetings?
  • And access to BI tools?

If you can answer “yes” to these three questions, then what are you waiting for? Hurry to the BI bonanza and get mining.

However, although BI is broadly applicable, there are data sets to which it would be entirely inappropriate to apply these techniques - for example, the work of fine, upstanding journalists such as those employed at Vulture Central. For reasons that are too technical to go into here, this data is not amenable to analysis of this kind. Which is a shame, because we know that we are always consistent. ®
