<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>The Monash Report &#187; Data mining</title>
	<atom:link href="http://www.monashreport.com/category/analytic-technologies/data-mining/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.monashreport.com</link>
	<description>Technology ... politics ... marketing ... strategy ... life</description>
	<lastBuildDate>Mon, 19 Jul 2010 07:49:14 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.8.4</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>Usability engineering is crucial</title>
		<link>http://www.monashreport.com/2007/11/19/usability-engineering-is-crucial/</link>
		<comments>http://www.monashreport.com/2007/11/19/usability-engineering-is-crucial/#comments</comments>
		<pubDate>Mon, 19 Nov 2007 15:37:14 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Data mining]]></category>
		<category><![CDATA[Blaze Advisor]]></category>
		<category><![CDATA[Fair Isaac]]></category>
		<category><![CDATA[usability]]></category>
		<category><![CDATA[user interface]]></category>

		<guid isPermaLink="false">http://www.monashreport.com/2007/11/19/usability-engineering-is-crucial/</guid>
		<description><![CDATA[From a review of the rather powerful Fair Isaac Blaze Advisor, which will surely be far less successful than its functionality deserves:
 But employing a usability expert when designing the tools and observing how [...]]]></description>
			<content:encoded><![CDATA[<p>From <a href="http://www.infoworld.com/article/07/11/19/47TC-blaze-advisor-brms_1.html" onclick="javascript:pageTracker._trackPageview('/www.infoworld.com');">a review of the rather powerful Fair Isaac Blaze Advisor</a>, which will surely be far less successful than its functionality deserves:</p>
<blockquote><p><span class="artText"> But employing a usability expert when designing the tools and observing how users interact with them would go a long way toward improving their usefulness.</span></p></blockquote>
<p>My mind utterly boggles each time I discover that a large software vendor still doesn&#8217;t seem to have realized this.  Or maybe Fair Isaac did do usability engineering, but entrusted it to a blithering incompetent.  That frankly would be more reassuring than them not having tried at all.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.monashreport.com/2007/11/19/usability-engineering-is-crucial/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Three ways to market analytics-related technology</title>
		<link>http://www.monashreport.com/2007/03/19/three-ways-to-market-analytics-related-technology/</link>
		<comments>http://www.monashreport.com/2007/03/19/three-ways-to-market-analytics-related-technology/#comments</comments>
		<pubDate>Mon, 19 Mar 2007 08:57:55 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Business intelligence]]></category>
		<category><![CDATA[Data mining]]></category>

		<guid isPermaLink="false">http://www.monashreport.com/2007/03/19/three-ways-to-market-analytics-related-technology/</guid>
		<description><![CDATA[“Decision support”, “information centers”, “business intelligence”, “analytic technology”, and “information services” have been around, in one form or other, for 35+ years.   For most of that time, there have been two fundamental ways to sell, market, and position them:

Access to information
Application software

More recently – especially the past five years – there’s been a [...]]]></description>
			<content:encoded><![CDATA[<p class="MsoNormal">“Decision support”, “information centers”, “business intelligence”, “analytic technology”, and “information services” have been around, in one form or other, for 35+ years.   For most of that time, there have been two fundamental ways to sell, market, and position them:</p>
<ul>
<li class="MsoNormal">Access to information</li>
<li class="MsoNormal">Application software</li>
</ul>
<p class="MsoNormal">More recently – especially the past five years – there’s been a third way:</p>
<ul>
<li class="MsoNormal">Infrastructure upgrade</li>
</ul>
<p class="MsoNormal">as early-generation implementations get replaced by newer ones.</p>
<p class="MsoNormal">At the 50,000 foot level, here’s some of what I see going on:</p>
<ul>
<li class="MsoNormal"><em>Classical BI marketing is floundering.</em> BI vendors don’t know whether they’re in      the business of quick/easy information access, analytic apps, or      better-enterprise-system-software.</li>
<li class="MsoNormal"><em>A few areas of analytic application are being packaged and marketed      well,</em> with solid business-process stories and good customer acceptance      of same.  The biggies are <em>budgeting/planning </em>and<em> CRM analytics</em>.   On the whole, however, <em>analytic apps are floundering,</em> or      else are little more than reporting front-ends on operational systems (e.g.,      in network management).</li>
<li class="MsoNormal"><em>Data warehouse software startups are on a roll. </em> Especially at the high end, this is a pure infrastructure-upgrade business.  There’s still plenty of room for improvement, but multiple vendors are each doing a good job of marketing on the basis of:
<ul>
<li class="MsoNormal">Speeds and feeds</li>
<li class="MsoNormal">Ease of deployment</li>
<li class="MsoNormal">Ease of administration</li>
<li class="MsoNormal">Price</li>
<li class="MsoNormal">Credibility</li>
</ul>
</li>
<li><em>Data integration is mainly an infrastructure improvement      play. </em>After all, that      integration COULD be hand-coded.  Automating      the process is usually a better-infrastructure story.</li>
<li><em>Text search is still an information-access story. </em>There are multiple niches where      search is booming.   But in all      cases the story is information access.       Evidently the technology and/or market aren’t mature enough yet for      strong infrastructure stories.  And      in the limited cases where text search gets integrated into general      application software packages, it’s usually just for information access      rather than a real business process.</li>
<li><em>Data mining and predictive analytics are mainly information access      plays.</em> Yes, the information      being accessed is calculated rather than raw.  Yes, I believe that the heart of the data      mining market is <a href="http://www.monashreport.com/2006/09/08/where-does-data-mining-succeed-and-why/" >continuous      process improvement</a>.  Even so, what      users buy from the vendors is usually little more than information      toolkits.</li>
<li><em>Text analytics is mainly an information access play. </em>Text mining and information extraction have two main uses right now.  Either they resemble – and indeed often feed into – data mining, or else they are used to enhance search and search-like document access.</li>
<li><em>Information services have always been an information access      play. </em>When you think about it,      the financial-quote-machine business is a huge part of the whole decision      support market.  Lexis/Nexis is no      slouch either.  And they’re all      about providing information access.</li>
</ul>
<p class="MsoNormal"><em><strong>Related links</strong></em></p>
<ul style="margin-top: 0in" type="disc">
<li class="MsoNormal">This three-headed taxonomy of strategies is similar to <a href="http://www.monashreport.com/2006/04/06/microsoft-underscores-its-core-paradigm/" >one I previously postulated for Microsoft, SAP, and IBM/Oracle</a>.</li>
<li class="MsoNormal">I covered analytic      business processes at length in a <a href="http://www.monash.com/whitepapers.html" onclick="javascript:pageTracker._trackPageview('/www.monash.com');">November, 2004 white paper</a>.  Unfortunately, industry progress since      then has been relatively slow.</li>
<li class="MsoNormal">I’ve written voluminously      about data warehouse software startups on <em><a href="http://www.dbms2.com/category/relational-database-management-systems/rolap/" onclick="javascript:pageTracker._trackPageview('/www.dbms2.com');">DBMS2</a></em>.</li>
<li class="MsoNormal">One example of infrastructure      focus is the <a href="http://www.monashreport.com/2007/03/16/have-analytics-vendors-rediscovered-ease-of-deployment/" >ease-of-deployment      trend</a>.</li>
<li class="MsoNormal">Web search and generic      enterprise search aren’t the only search areas to focus on information      access.  (And yes, they’re most      definitely <a href="http://www.texttechnologies.com/2007/01/22/41-differences-between-web-and-enterprise-search/" onclick="javascript:pageTracker._trackPageview('/www.texttechnologies.com');">separate      areas</a>.)  Even <a href="http://www.texttechnologies.com/2007/02/15/inquira-mercado-structured-search/" onclick="javascript:pageTracker._trackPageview('/www.texttechnologies.com');">customer-facing      structured search</a> does; the information is just tailored according to      different criteria. <img src='http://www.monashreport.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.monashreport.com/2007/03/19/three-ways-to-market-analytics-related-technology/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Have analytics vendors rediscovered ease-of-deployment?</title>
		<link>http://www.monashreport.com/2007/03/16/have-analytics-vendors-rediscovered-ease-of-deployment/</link>
		<comments>http://www.monashreport.com/2007/03/16/have-analytics-vendors-rediscovered-ease-of-deployment/#comments</comments>
		<pubDate>Sat, 17 Mar 2007 01:11:57 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Business intelligence]]></category>
		<category><![CDATA[Computing appliances]]></category>
		<category><![CDATA[DBMS vendors and technologies]]></category>
		<category><![CDATA[Data mining]]></category>
		<category><![CDATA[Usability and UI]]></category>

		<guid isPermaLink="false">http://www.monashreport.com/2007/03/16/have-analytics-vendors-rediscovered-ease-of-deployment/</guid>
		<description><![CDATA[Business intelligence (BI) used to be characterized by speed and cost-effectiveness &#8212; short sales cycles, low-cost departmental purchases and deployments, evasion of IT departments&#8217; strangleholds on data, and so on and so forth.  That focus has blurred, as BI vendors have increasingly focused on analytic applications or enterprise-wide standardization sales.  But increasingly I&#8217;m [...]]]></description>
			<content:encoded><![CDATA[<p>Business intelligence (BI) used to be characterized by speed and cost-effectiveness &#8212; short sales cycles, low-cost departmental purchases and deployments, evasion of IT departments&#8217; strangleholds on data, and so on and so forth.  That focus has blurred, as BI vendors have increasingly focused on analytic applications or enterprise-wide standardization sales.  But increasingly I&#8217;m seeing signs that the pendulum has swung at least partway back.  For example:</p>
<ul>
<li>Business Objects and Netezza have announced <a href="http://www.businessobjects.com/news/press_release.asp?id=20070313_006264" onclick="javascript:pageTracker._trackPageview('/www.businessobjects.com');">a mid-range BI appliance</a>.</li>
<li>Ingres is <a href="http://www.dbms2.com/2007/03/08/ingres-tries-to-become-relevant-again/" onclick="javascript:pageTracker._trackPageview('/www.dbms2.com');">headed in the same direction</a>.</li>
<li>QlikTech is enjoying great growth for its <a href="http://www.dbms2.com/2007/02/13/qliktech-qlikview-overview/" onclick="javascript:pageTracker._trackPageview('/www.dbms2.com');">fast-deploying BI technology</a>.</li>
<li>KXEN and Verix offer <a href="http://www.monashreport.com/2006/10/04/kxen-and-verix-try-to-disrupt-the-data-mining-market/" >&#8220;easy&#8221; data mining technology</a>.</li>
<li><a href="http://www.texttechnologies.com/2007/02/01/what%e2%80%99s-interesting-about-the-fast-venture-in-bi/" onclick="javascript:pageTracker._trackPageview('/www.texttechnologies.com');">Search-based BI</a> is trying to circumvent the data warehouse deployment process.</li>
</ul>
<p>It&#8217;s about time.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.monashreport.com/2007/03/16/have-analytics-vendors-rediscovered-ease-of-deployment/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The problem with dashboards, and business intelligence segmented</title>
		<link>http://www.monashreport.com/2006/10/05/dashboard-business-intelligence-bi-segmentation/</link>
		<comments>http://www.monashreport.com/2006/10/05/dashboard-business-intelligence-bi-segmentation/#comments</comments>
		<pubDate>Fri, 06 Oct 2006 01:02:55 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Business intelligence]]></category>
		<category><![CDATA[Data mining]]></category>
		<category><![CDATA[Usability and UI]]></category>

		<guid isPermaLink="false">http://www.monashreport.com/2006/10/05/dashboard-business-intelligence-bi-segmentation/</guid>
		<description><![CDATA[It is becoming ever clearer that dashboards aren’t working out too well, any more than predecessor technologies like EIS (Executive Information Systems) did.  The recurring problem with these technologies is that if they’re mind-numbingly simple, people don’t find them very useful; but if they’re not, people are overwhelmed and still don’t find them useful. [...]]]></description>
			<content:encoded><![CDATA[<p class="MsoNormal">It is becoming ever clearer that dashboards aren’t working out too well, any more than predecessor technologies like EIS (Executive Information Systems) did.  The recurring problem with these technologies is that if they’re mind-numbingly simple, people don’t find them very useful; but if they’re not, people are overwhelmed and still don’t find them useful.  <a href="http://www.computerworld.com/action/article.do?command=viewArticleBasic&amp;articleId=9003869&amp;" onclick="javascript:pageTracker._trackPageview('/www.computerworld.com');">This column by Sandra Gittlen</a> does a good job of spelling the problem out.</p>
<p class="MsoNormal">I think there are lots of problems like that in BI, and what we need to do is step back and consider all the different kinds of BI that enterprises value and need.  More precisely, let’s consider the major kinds of <em>use</em> of BI, because it seems that each calls for different kinds of technological support.  Here’s one possible list:</p>
<ul>
<li class="MsoNormal">Early warning of situations that require action.</li>
<li class="MsoNormal">Communication of company results.</li>
<li class="MsoNormal">Deep analysis and decision support.</li>
<li class="MsoNormal">Operational analytics.</li>
</ul>
<p class="MsoNormal">Here’s what I mean by each category.</p>
<p><span id="more-116"></span></p>
<p class="MsoNormal"><strong>Early warning of situations that require action.</strong> This is the classic image of BI.  People get reports or graphs on paper or on screen, see that some numbers are out of whack, and react accordingly.  Dashboards can be a prettier version of the same thing, or they can be more focused on alarms and alerts.  Nowadays, alarms and alerts can also arrive by IM, email, text message, pager, and so on.</p>
<p class="MsoNormal">On the one hand, a large fraction of the economic value in the history of BI has been generated in this area.  On the other, technology for doing so is continually perceived as inadequate.  I continue to think that KPI management, alerting technology, and so on are advancing much more slowly than they should be.   Even the buzz around Business Activity Monitoring doesn’t seem to have accelerated things much.</p>
<p class="MsoNormal">Nor does this change when <a href="http://www.monashreport.com/2006/09/08/where-does-data-mining-succeed-and-why/" >the warnings are the product of text or data mining</a>.  For example, despite a very interesting approach to generating alerts, at this point in its development <a href="http://www.monashreport.com/2006/10/04/kxen-and-verix-try-to-disrupt-the-data-mining-market/" >Verix</a> delivers them in uninspired ways.</p>
<p class="MsoNormal">Mainly, the supporting technology here is the standard query/reporting stack, including new dashboard/scorecard/alerting tools.  It’s also the main place where data mining and text mining should be more integrated into standard BI than they are – i.e., to define and populate metrics and KPIs.</p>
<p class="MsoNormal"><strong>Communication of company results.</strong> Back in the 1980s, the conventional wisdom was that half the benefit of reporting tools was for actual analytics, while half was just for communicating among enterprise employees.  That’s probably still valid.</p>
<p class="MsoNormal">BI vendors had the good idea a few years ago to build out their collaboration capabilities, but generally didn’t follow through on it, <a href="http://www.monashreport.com/2006/01/20/the-power-of-portals/" >SAP somewhat excepted</a>.   Bad choice, in my opinion.  <a href="http://www.texttechnologies.com/2006/09/01/why-the-bi-vendors-are-integrating-with-google-onebox/" onclick="javascript:pageTracker._trackPageview('/www.texttechnologies.com');">The use of text search to get at BI results</a> is something of a plausibility argument for my views in this area.</p>
<p class="MsoNormal">Again, the supporting technology here is largely the standard reporting stack.  Portal/collaboration tools should be more involved than they are.</p>
<p class="MsoNormal"><strong>Deep analysis and decision support.</strong> Routine, scheduled reporting was covered in my first two categories.   But this third one is where the bulk of <em>ad hoc</em> query and <a href="http://www.monashreport.com/2006/09/11/my-actual-column-on-data-mining/" >data mining</a> fall.  Generally, it’s where lots of specialized and/or calculation-intensive analytic technology comes into play.  It’s also where the drilldown aspect of standard reporting shows up.  Also, this is the area that is driving much of the recent transformation and disruption in the data warehouse market, because <a href="http://www.dbms2.com/2006/10/04/data-mining-data-warehousing/" onclick="javascript:pageTracker._trackPageview('/www.dbms2.com');">different kinds of BI need different kinds of data warehousing technology.</a></p>
<p class="MsoNormal"><strong>Operational analytics.</strong> In operational analytics, a small amount of analysis is done real-time or near-real-time, in connection with an operational business process.  The most technically demanding examples are probably the customer-facing ones, such as call centers or personalized e-commerce sites.  This is where the buzz around “active/enterprise data warehousing” is concentrated.  There may also be interesting messaging aspects.  And <a href="http://www.dbms2.com/2006/10/04/data-mining-data-warehousing/" onclick="javascript:pageTracker._trackPageview('/www.dbms2.com');">data mining scoring</a> may be a consideration.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.monashreport.com/2006/10/05/dashboard-business-intelligence-bi-segmentation/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>KXEN and Verix try to disrupt the data mining market</title>
		<link>http://www.monashreport.com/2006/10/04/kxen-and-verix-try-to-disrupt-the-data-mining-market/</link>
		<comments>http://www.monashreport.com/2006/10/04/kxen-and-verix-try-to-disrupt-the-data-mining-market/#comments</comments>
		<pubDate>Wed, 04 Oct 2006 10:09:04 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Data mining]]></category>
		<category><![CDATA[Software as a service]]></category>
		<category><![CDATA[Usability and UI]]></category>
		<category><![CDATA[Verix]]></category>
		<category><![CDATA[KXEN]]></category>

		<guid isPermaLink="false">http://www.monashreport.com/2006/10/04/kxen-and-verix-try-to-disrupt-the-data-mining-market/</guid>
		<description><![CDATA[Data mining is hugely important, but it does have issues with accessibility.  The traditional model of data mining goes something like this:

Data is assembled in a data warehouse from transactional information, with all the effort and expense that requires. [...]]]></description>
			<content:encoded><![CDATA[<p class="MsoNormal"><a href="http://www.monashreport.com/2006/09/11/my-actual-column-on-data-mining/" >Data mining is hugely important</a>, but it does have issues with accessibility.  The traditional model of data mining goes something like this:</p>
<ol style="margin-top: 0in" type="1">
<li class="MsoNormal">Data is assembled in a data warehouse from transactional information, with all the effort and expense that requires.  Maybe more data is even <a href="http://www.monashreport.com/2006/10/04/data-mining-requires-data/" >deliberately gathered</a>.  Or maybe the data is in large part acquired, at moderate cost, from third-party providers like credit bureaus.</li>
<li class="MsoNormal">The database experts fire up long-running, expensive data extraction processes to select data for analysis.  Often, special data warehousing technology is used just for that purpose.</li>
<li class="MsoNormal">The statistical experts pound away at the data in their dungeons, torturing it until it reveals its secrets.</li>
<li class="MsoNormal">The results are made available to business operating units, both as reports and in the form of executable models.</li>
</ol>
<p class="MsoNormal">Each in its own way, KXEN and Verix (the imminent new name of the company now called <a href="http://www.b-events.com/" onclick="javascript:pageTracker._trackPageview('/www.b-events.com');">Business Events</a>) want to change all that.</p>
<p><span id="more-115"></span></p>
<p class="MsoNormal"><a href="http://www.kxen.com/" onclick="javascript:pageTracker._trackPageview('/www.kxen.com');">KXEN</a> believes they have found a one-size-fits-all set of data mining models and algorithms.  This is <em>not</em> an SVM (Support Vector Machine), which they actually don’t offer any more, but rather something else from the fertile brain of SVM co-inventor Vladimir Vapnik, called Structural Risk Minimization.  While the details have been published, they asked me not to write about them anyway for some kind of security-through-obscurity competitive reason.  So let’s just say that these are <em>not</em> just the linear models they previously were or seemed to be stuck with.  (For a small company with limited footprint, there sure is a lot of false information out there about how the whole thing works.)</p>
<p class="MsoNormal">A limited set of models lets one design a fairly simple user interface, especially when the models are good at helping one zoom through what otherwise can be annoying steps (like variable reduction, in which you choose which 80-90%+ of the data columns to disregard).  Based on that relative simplicity, KXEN wants to let business users data mine directly, without being dependent on statistical specialists and their machinery.  They position this as providing better results, because it allows rapid-cycle-time data exploration.</p>
<p class="MsoNormal">They also have a pickier statistical point to make, which is that their model-building process is streamlined and automated enough that it’s realistic to build lots of parallel “local” models, e.g. for each store or region in a retail chain.  By way of contrast, in traditional data mining one would normally have one model used for all localities, but perhaps with additional variables indicating which locality the model was currently referring to.  KXEN confidently believes that its way is superior, but in a recent discussion didn’t actually provide me with much beyond hand-waving to back that claim up.</p>
<p class="MsoNormal">I don’t actually have a good feel for how well these pitches are being received by the market.  KXEN’s biggest sales successes seem to be via partnerships with various other analytics players, and it’s tough to judge whether that’s due more to price or to embeddability or to the fundamental merits of their overall case.</p>
<p class="MsoNormal">Business Events, imminently to be renamed Verix, is a raw start-up with a story even more extreme than KXEN’s:  Sophisticated analytic results just delivered on a SaaS basis, with no thinking required by the customer at all.   Obviously, this can only make sense if the universe of possible results is rather limited, and indeed it is.</p>
<p class="MsoNormal">Verix’s approach assumes a classical star set-up:  A single measure/fact table and a complexly hierarchical set of dimensions.   Verix looks exhaustively at time series on the facts, pulls out all series that are showing anomalies in two or more dimensions at once, and isolates exactly the point in the dimension network where the anomaly is occurring.  If sales of frobalizing widgets in Houston are off plan, it identifies whether this is really a Houston issue or a Texas one, and whether it’s a problem just for frobalizers or – gulp – for the entire widget category.</p>
<p class="MsoNormal">The company claims that some insights you get this way just wouldn’t have been found by conventional BI.  E.g., if frobalizers are down in Houston but up in Dallas, and the analysis stopped at Texas, nobody (not even the Houston district manager?) would ever know of the great Houstonian frobalizer downturn.</p>
<p class="MsoNormal">The company sounds like they’re working on all the right things to generalize this model.   Initial interest in what they have seems to be concentrated in the pharmaceutical and CPG (Consumer Packaged Goods) industries, although there are a couple of paying telecom customers as well.  One thing pharma and CPG have in common is that a lot of your raw data comes from third parties, such as IMS, and so your sales data are visible to your competitors anyway.  Given that, it’s easy to believe that the SaaS nature of the service isn’t causing a lot of customer discomfort.</p>
<p class="MsoNormal">And by the way, IT departments aren’t involved in the Verix buying process whatsoever.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.monashreport.com/2006/10/04/kxen-and-verix-try-to-disrupt-the-data-mining-market/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Data mining requires data</title>
		<link>http://www.monashreport.com/2006/10/04/data-mining-requires-data/</link>
		<comments>http://www.monashreport.com/2006/10/04/data-mining-requires-data/#comments</comments>
		<pubDate>Wed, 04 Oct 2006 05:16:47 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Data mining]]></category>

		<guid isPermaLink="false">http://www.monashreport.com/2006/10/04/data-mining-requires-data/</guid>
		<description><![CDATA[Data mining requires and justifies huge investments.  The smallest part is the data mining software itself.  A much bigger part is the investment in data warehouse technology, a subject about which I&#8217;ve been posting extensively recently on DBMS2.com.  But there&#8217;s yet another part to the picture, namely investing in actually gathering data [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.monashreport.com/2006/09/11/my-actual-column-on-data-mining/" >Data mining</a> requires and justifies huge investments.  The smallest part is the data mining software itself.  A much bigger part is the investment in data warehouse technology, a subject about which I&#8217;ve been posting extensively recently on <a href="http://www.dbms2.com/category/relational-technology/" onclick="javascript:pageTracker._trackPageview('/www.dbms2.com');">DBMS2.com</a>.  But there&#8217;s yet another part to the picture, namely investing in actually gathering data for analysis, that I&#8217;ve written about, most recently in a blog I posted elsewhere and am now copying below.<br />
<span id="more-114"></span></p>
<blockquote><p>Analytic business processes &#8212; or the areas of overlap between analytics and business process &#8212; are poorly understood. Business Activity Monitoring and Operational BI? Great buzzwords, but there&#8217;s way too little thought put into figuring out exactly which metrics are most useful for making which kinds of business decisions. Continuous planning/budgeting? The surface has only been scratched. A numerate, &#8220;one-truth&#8221; enterprise culture? Hah. When we identify an enterprise that truly has a pervasive numbers-oriented culture, it usually is one that winds up pathologically managing to a purely short-term set of goals. (But some exceptions to that rule are among the great corporations of the world.)</p>
<p>One area that really needs more consideration is data capture. You can&#8217;t analyze data you don&#8217;t have. Certain industries have indeed recognized this. E.g., travel and gaming have been hugely successful with loyalty cards; indeed, casino giant Harrah&#8217;s probably gets over 100% of its profits via targeted marketing based on the mining of its loyalty card data. Credit transaction data and the like is of course also heavily exploited. I made this whole case in a <a href="http://www.computerworld.com/action/article.do?command=viewArticleBasic&#038;articleId=103054" onclick="javascript:pageTracker._trackPageview('/www.computerworld.com');">Computerworld column</a> a year ago, and if you missed it I suggest still checking that column out today.</p>
<p>But that&#8217;s all transactional data. The story for text data is much worse. Indeed, survey forms typically try to force people away from just saying what they think, instead giving them endless checklists that bring back unhappy memories of SATs and #2 pencils. Yet <a href="http://www.texttechnologies.com/category/text-mining/" onclick="javascript:pageTracker._trackPageview('/www.texttechnologies.com');">text mining</a> technology now exists that makes it possible to glean crucial information from free-form text. If you haven&#8217;t already checked it out, you should.</p></blockquote>
<p>Particularly interesting, I think, are some examples in the area of <a href="http://www.texttechnologies.com/2006/06/16/data-capture-for-the-sake-of-text-mining/" onclick="javascript:pageTracker._trackPageview('/www.texttechnologies.com');">text data and analytics</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.monashreport.com/2006/10/04/data-mining-requires-data/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>My actual column on data mining</title>
		<link>http://www.monashreport.com/2006/09/11/my-actual-column-on-data-mining/</link>
		<comments>http://www.monashreport.com/2006/09/11/my-actual-column-on-data-mining/#comments</comments>
		<pubDate>Tue, 12 Sep 2006 03:59:10 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Data mining]]></category>

		<guid isPermaLink="false">http://www.monashreport.com/2006/09/11/my-actual-column-on-data-mining/</guid>
		<description><![CDATA[In a couple of recent posts about data mining, I referenced a Computerworld column due to run September 11.  Wonder of wonders, they got it posted on the very first day.  Here&#8217;s a link.
]]></description>
			<content:encoded><![CDATA[<p>In a couple of recent posts about <a href="http://www.monashreport.com/2006/09/08/where-does-data-mining-succeed-and-why/#more-110" >data</a> <a href="http://www.monashreport.com/2006/09/02/further-information-on-data-mining/" >mining</a>, I referenced a <em>Computerworld</em> column due to run September 11.  Wonder of wonders, they got it posted on the very first day.  Here&#8217;s a <a href="http://www.computerworld.com/action/article.do?command=viewArticleBasic&#038;articleId=112733" onclick="javascript:pageTracker._trackPageview('/www.computerworld.com');">link</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.monashreport.com/2006/09/11/my-actual-column-on-data-mining/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Where does data mining succeed, and why?</title>
		<link>http://www.monashreport.com/2006/09/08/where-does-data-mining-succeed-and-why/</link>
		<comments>http://www.monashreport.com/2006/09/08/where-does-data-mining-succeed-and-why/#comments</comments>
		<pubDate>Sat, 09 Sep 2006 03:16:54 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Data mining]]></category>

		<guid isPermaLink="false">http://www.monashreport.com/2006/09/08/where-does-data-mining-succeed-and-why/</guid>
		<description><![CDATA[As previously noted, I have a Computerworld column coming out next week on data mining. The heart of the column is an enumeration of markets where data mining applications are having genuine success. Before I sat down to actually write the column, my list went something like this:

There’s      a  [...]]]></description>
			<content:encoded><![CDATA[<p class="MsoNormal">As <a href="http://www.monashreport.com/2006/09/02/further-information-on-data-mining/" >previously noted</a>, I have a <em>Computerworld</em> column coming out next week on data mining. The heart of the column is an enumeration of markets where data mining applications are having genuine success. Before I sat down to actually write the column, my list went something like this:</p>
<ul style="margin-top: 0in" type="disc">
<li class="MsoNormal">There’s <a href="http://www.texttechnologies.com/2006/07/27/application-processes-in-text-mining-%25e2%2580%2593-finding-warning-signs/" onclick="javascript:pageTracker._trackPageview('/www.texttechnologies.com');">a large set of “early warning” apps where text mining is being deployed</a>. Many of those same apps are addressed by data mining of tabular data too – antifraud, to start with, and also warranty tracking and indeed most of the rest.</li>
<li class="MsoNormal">Data mining has been huge in CRM.</li>
<li class="MsoNormal">The use of data mining in manufacturing to do failure analysis, improve quality, etc. is really on the rise. This goes at least somewhat beyond what one could reasonably pigeonhole as “early warning.”</li>
<li class="MsoNormal">Data mining plays a big role in the life sciences, and is being applied to a broad range of other sciences as well.</li>
<li class="MsoNormal">Data mining is a huge part of R&amp;D at search engine and antispam vendors.</li>
</ul>
<p><span id="more-110"></span></p>
<p class="MsoNormal">By the time I submitted the column, the list had morphed into:</p>
<ul style="margin-top: 0in" type="disc">
<li class="MsoNormal"><strong>Customer offer targeting</strong>.</li>
<li class="MsoNormal">Other CRM applications, often of text mining, such as <strong>reputation management</strong> or just <strong>sentiment tracking</strong>.</li>
<li class="MsoNormal"><strong>National security, antifraud, and crime prevention.</strong></li>
<li class="MsoNormal">Purer <strong>portfolio/risk management</strong> applications.</li>
<li class="MsoNormal"><strong>Defect tracking.</strong></li>
<li class="MsoNormal"><strong>Health care and scientific research.</strong></li>
</ul>
<p class="MsoNormal">For lots of examples and explanation of the categories, please see the column when available. (Theoretically that should be on the inauspicious date of September 11. In practice, it could be any time next week. I’ll post a link here when I know of one that works.)</p>
<p class="MsoNormal">While the latter version of the list may be slicker and more precise, which is why I went with it in the column, I think the former is more useful for a discussion of <em>why</em> those particular apps are the ones that get adopted.  Simply put, data mining apps are concentrated at two extremes:</p>
<ol style="margin-top: 0in" type="1">
<li class="MsoNormal">Seeking “gold nuggets” of insight.</li>
<li class="MsoNormal">Continuous process improvement.</li>
</ol>
<p class="MsoNormal">What’s more, if I had to pick just one of those categories, I’d pick #2. The annals of BI are replete with examples of insights that just leapt out of reports and danced straight to the bottom line. But those stories are generally about reports and OLAP analyses, not full-blown statistical workups. Don’t get me wrong; I’m sure there are plenty of cases of data mining producing hugely valuable sudden insights. But, uh, I can’t think of any right now, at least not in the mainstream statistical analyses we usually think of when we hear “data mining.” (Perhaps some kindly product vendors will help me out with examples. If nothing else, there should be examples in the life sciences, forensics, product quality, etc. – i.e., in applications where there was only ever a single answer to discover in the first place.)</p>
<p class="MsoNormal">Where data mining does succeed all the time is in areas such as marketing efficiency improvement – mailing smarter, better targeting customer offers, and of course avoiding “bad guy” customers such as fraud or default risks in the first place. Text mining is something of an exception to that rule – but then, despite its name, it’s not clear that all of text mining should be classified as data mining anyway. Some of it is just “knowledge/fact/information extraction”, which generally is used to inform analytic technologies of some sort or other. But those can be regular BI or text search or whatever, with data mining being just one of the candidate technologies for consuming the extracted facts.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.monashreport.com/2006/09/08/where-does-data-mining-succeed-and-why/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Further information on data mining</title>
		<link>http://www.monashreport.com/2006/09/02/further-information-on-data-mining/</link>
		<comments>http://www.monashreport.com/2006/09/02/further-information-on-data-mining/#comments</comments>
		<pubDate>Sat, 02 Sep 2006 14:11:31 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Data mining]]></category>

		<guid isPermaLink="false">http://www.monashreport.com/2006/09/02/further-information-on-data-mining/</guid>
		<description><![CDATA[My September Computerworld column (I’ll post a link, no sooner than September 11) is about data mining.  As promised in that column, here are some links and guides to further work on the subject.


I have      posted extensively on text mining      over on the Text [...]]]></description>
			<content:encoded><![CDATA[<p class="MsoNormal">My September <em>Computerworld</em> column (I’ll post a link, no sooner than September 11) is about data mining.  As promised in that column, here are some links and guides to further work on the subject.</p>
<ul type="disc" style="margin-top: 0in">
<li class="MsoNormal">I have posted extensively on <a href="http://www.texttechnologies.com/category/text-mining/" onclick="javascript:pageTracker._trackPageview('/www.texttechnologies.com');">text mining</a> over on the <em>Text Technologies</em> blog.</li>
<li class="MsoNormal">In particular, much of the column was based on a post in which I discussed “<a href="http://www.texttechnologies.com/2006/07/27/application-processes-in-text-mining-%25e2%2580%2593-finding-warning-signs/" onclick="javascript:pageTracker._trackPageview('/www.texttechnologies.com');">early warning</a>” applications of text mining.</li>
<li class="MsoNormal">The research was informed by a trip to the KDD 2006 conference, about which I’ve <a href="http://www.monashreport.com/2006/09/02/kdd-2006-data-mining/" >blogged separately</a>.</li>
<li class="MsoNormal"><a href="http://www.sas.com/technologies/analytics/datamining/index.html" onclick="javascript:pageTracker._trackPageview('/www.sas.com');">SAS</a> is the world’s biggest vendor of this stuff, so if you want to know what the applications are, you might want to start with their website.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.monashreport.com/2006/09/02/further-information-on-data-mining/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>KDD 2006 conference on data mining and knowledge discovery</title>
		<link>http://www.monashreport.com/2006/09/02/kdd-2006-data-mining/</link>
		<comments>http://www.monashreport.com/2006/09/02/kdd-2006-data-mining/#comments</comments>
		<pubDate>Sat, 02 Sep 2006 14:06:27 +0000</pubDate>
		<dc:creator>Curt Monash</dc:creator>
				<category><![CDATA[Analytic technologies]]></category>
		<category><![CDATA[Data mining]]></category>

		<guid isPermaLink="false">http://www.monashreport.com/2006/09/02/kdd-2006-data-mining/</guid>
		<description><![CDATA[I went to the KDD 2006 (Knowledge Discovery in Databases) conference in Philadelphia last week.  It was an interesting, if weird, experience.   The conference had been billed to me as the place where all the world’s great data mining/KDD experts gather.  This turns out to have been old news; the conference [...]]]></description>
			<content:encoded><![CDATA[<p class="MsoNormal">I went to the KDD 2006 (Knowledge Discovery in Databases) conference in Philadelphia last week.  It was an interesting, if weird, experience.   The conference had been billed to me as the place where all the world’s great data mining/KDD experts gather.  This turns out to have been old news; the conference has apparently fallen off somewhat over the past 2-3 years.   What is left is an academic conference and a small trade show that seem to be only loosely coupled.  Here’s what I experienced at each.</p>
<p class="MsoNormal"><span id="more-108"></span></p>
<p class="MsoNormal">At the academic part, I didn’t actually experience all that much.  In part, this was because of the accents.  I don’t think I’ve ever been around as many people with Indian surnames whose English was so incomprehensible.  And while one would typically be prepared for thick accents from Chinese-named folks, usually there are a LOT more exceptions to that stereotype than there were in this particular case.  Even keynote addresses were not immune.  So I didn’t actually attend very many sessions.</p>
<p class="MsoNormal">That said, I attended a few talks, browsed a lot of papers, chatted with a number of attendees, and came away with a few observations, including:</p>
<ul type="disc" style="margin-top: 0in">
<li class="MsoNormal">Text mining is a big area of research.</li>
<li class="MsoNormal">So is network/link/graph analysis.</li>
<li class="MsoNormal">Knowledge is terribly stovepiped.  Graduate students working on fast algorithms didn’t think about parallel processing.  A Yahoo presenter with a small state-transition matrix didn’t think about Markov chains.</li>
<li class="MsoNormal">Notwithstanding the stovepiping, there’s great interest in applying data mining to other disciplines, especially academic ones.</li>
<li class="MsoNormal">The handwringing about our educational system has merit.  Very few of the young people there were actually Americans, and the young Americans I did talk to seemed to have more personality than actual grasp of their material.</li>
</ul>
<p class="MsoNormal">The vendor part consisted mainly of a few usual suspects, led by SAS, SPSS, and Oracle, and a few outfits there to hire researchers, led by Google, Microsoft, and Yahoo.  I had the chance to talk for hours each with Oracle and SAS, which heavily informed a column I just submitted to <em>Computerworld</em>.  (Watch this blog for a link on or soon after September 11.)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.monashreport.com/2006/09/02/kdd-2006-data-mining/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>
