Data mining requires and justifies huge investments. The smallest part is the data mining software itself. A much bigger part is the investment in data warehouse technology, a subject I’ve been posting about extensively on DBMS 2.com. But there’s yet another part to the picture, namely investing in actually gathering data for analysis. I’ve written about that before, most recently in a blog post I published elsewhere and am now copying below.
Analytic business processes — or the areas of overlap between analytics and business process — are poorly understood. Business Activity Monitoring and Operational BI? Great buzzwords, but there’s way too little thought put into figuring out exactly which metrics are most useful for making which kinds of business decisions. Continuous planning/budgeting? The surface has only been scratched. A numerate, “one-truth” enterprise culture? Hah. When we identify an enterprise that truly has a pervasive numbers-oriented culture, it usually is one that winds up pathologically managing to a purely short-term set of goals. (But some exceptions to that rule are among the great corporations of the world.)
One area that really needs more consideration is data capture. You can’t analyze data you don’t have. Certain industries have indeed recognized this. E.g., travel and gaming have been hugely successful with loyalty cards; indeed, casino giant Harrah’s probably gets over 100% of its profits via targeted marketing based on the mining of its loyalty card data. Credit transaction data and the like are of course also heavily exploited. I made this whole case in a Computerworld column a year ago, and if you missed it I suggest still checking that column out today.
But that’s all transactional data. The story for text data is much worse. Indeed, survey forms typically try to force people away from just saying what they think, instead giving them endless checklists that bring back unhappy memories of SATs and #2 pencils. Yet text mining technology now exists that makes it possible to glean crucial information from free-form text. If you haven’t already checked it out, you should.
Particularly interesting, I think, are some examples in the area of text data and analytics.