October 4, 2006

Data mining requires data

Data mining requires and justifies huge investments. The smallest part is the data mining software itself. A much bigger part is the investment in data warehouse technology, a subject I’ve been posting about extensively on DBMS2.com. But there’s yet another part to the picture, namely investing in actually gathering data for analysis. I’ve written about that before, most recently in a post I published elsewhere and am copying below.

Analytic business processes (or the areas of overlap between analytics and business process) are poorly understood. Business Activity Monitoring and Operational BI? Great buzzwords, but there’s way too little thought put into figuring out exactly which metrics are most useful for making which kinds of business decisions. Continuous planning/budgeting? The surface has only been scratched. A numerate, “one-truth” enterprise culture? Hah. When we identify an enterprise that truly has a pervasive numbers-oriented culture, it usually is one that winds up pathologically managing to a purely short-term set of goals. (But some exceptions to that rule are among the great corporations of the world.)

One area that really needs more consideration is data capture. You can’t analyze data you don’t have. Certain industries have indeed recognized this. E.g., travel and gaming have been hugely successful with loyalty cards; indeed, casino giant Harrah’s probably gets over 100% of its profits via targeted marketing based on the mining of its loyalty card data. Credit transaction data and the like is of course also heavily exploited. I made this whole case in a Computerworld column a year ago, and if you missed it I suggest still checking that column out today.

But that’s all transactional data. The story for text data is much worse. Indeed, survey forms typically try to force people away from just saying what they think, instead giving them endless checklists that bring back unhappy memories of SATs and #2 pencils. Yet text mining technology now exists that makes it possible to glean crucial information from free-form text. If you haven’t already checked it out, you should.

Particularly interesting, I think, are some examples in the area of text data and analytics.


Comments

2 Responses to “Data mining requires data”

  1. Will Dwinnell on November 8th, 2006 9:54 pm

    In most cases, data mining does not require “huge investments”. The biggest investment necessary for data mining is in paying for someone qualified to do the data mining. Assuming the data to be analyzed already exists in some sort of database, all that is needed is a decent PC (at most about $3,000) and software ($2,500 or less). This is what I’ve used for several years to build predictive models used to manage several billion dollars’ worth of risk. One can pay more, but I’m not sure what benefit that provides.

  2. Curt Monash on November 22nd, 2006 2:45 am

    Will,

    The investment I was referring to was in building and maintaining the data stores, which get up to 100s of terabytes these days in some cases. (Petabytes get mentioned occasionally too, but I don’t know of a single instance where data mining is truly carried out on that scale.)

    But yes, that’s more true in some businesses — especially ones with LOTS of customers or prospects — than others.

    Thanks for your comment,

    CAM
