Thursday, March 11, 2010

Breaches of data confidentiality can be costly

In a post last year I mentioned a particularly nasty and blatant breach of confidentiality involving fixed line telephony data. The update is that Optus recently won a Federal Court case allowing it to seek damages against Telstra;,optus-wins-telstra-confidentiality-breach-ruling.aspx

This news seemed to slip past the major national newspapers, which is quite surprising as it is likely to involve significant amounts of money. To be honest I’m not concerned with the consequences, but as a data miner it does interest me how data *is* used, and how it *could* be used.

As technology advances I’m certain the general public will see more examples of invasions of personal privacy and breaches of data confidentiality that enable organisations to gain the upper hand (unless or until they are caught). Keep it honest, people!


Anonymous said...

Hi there. Excuse me, I'm not commenting on the original post. I have a question about Clementine and think you can help me. I have a couple of C5.0 and NN models generated in Clementine 12.
I want to deploy them in Excel 2003 sheets for real-time scoring and share them (the end users don't need to know anything about data mining or how their inputs are processed by the model). I can't find a good tutorial on how to do this and I've tried everything. Please help! Note: I have basic VBA skills. Thanks very much.

Tim Manns said...

There is a development add-on to Clementine named Clementine Solution Publisher (well, not sure what it is named these days, probably PASWASP...).

Solution Publisher allows you to take all the data processing and model scoring and wrap it up as a single executable (or as embeddable .dll files that can be used in VB or VB.NET).

You can then simply call the executable from any application (or develop a solution using the DLLs).

There are other options using PMML, but I am not familiar with them, and they probably require more development work. You could export a model as PMML (an XML-based standard), but you would need to write a lot of code to score the model and do the data preparation etc.
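
To give a feel for what "writing code to score a PMML model" involves, here is a minimal sketch in Python. It is not taken from Clementine or Solution Publisher: the PMML fragment is hand-written for illustration, the field name `calls_to_support` and the helper names are made up, and it assumes a decision tree with numeric `SimplePredicate` tests only. A real export would also need data preparation, missing-value handling and the other predicate types.

```python
import operator
import xml.etree.ElementTree as ET

# Hypothetical, hand-written PMML fragment (not an actual Clementine export).
PMML = """<PMML xmlns="http://www.dmg.org/PMML-3_2" version="3.2">
  <TreeModel functionName="classification">
    <Node score="stay">
      <True/>
      <Node score="churn">
        <SimplePredicate field="calls_to_support" operator="greaterThan" value="3"/>
      </Node>
      <Node score="stay">
        <SimplePredicate field="calls_to_support" operator="lessOrEqual" value="3"/>
      </Node>
    </Node>
  </TreeModel>
</PMML>"""

# PMML comparison operators mapped to Python functions.
OPS = {
    "equal": operator.eq,
    "notEqual": operator.ne,
    "lessThan": operator.lt,
    "lessOrEqual": operator.le,
    "greaterThan": operator.gt,
    "greaterOrEqual": operator.ge,
}


def local(tag):
    """Strip the XML namespace, e.g. '{http://...}Node' -> 'Node'."""
    return tag.rsplit("}", 1)[-1]


def matches(node, record):
    """Return True if the node's predicate holds for the input record."""
    for child in node:
        name = local(child.tag)
        if name == "True":
            return True
        if name == "SimplePredicate":
            compare = OPS[child.get("operator")]
            # Assumption: all tested fields are numeric.
            return compare(float(record[child.get("field")]),
                           float(child.get("value")))
    return False


def score(node, record):
    """Walk down to the deepest matching node and return its score."""
    for child in node:
        if local(child.tag) == "Node" and matches(child, record):
            return score(child, record)
    return node.get("score")


def score_record(pmml_text, record):
    """Score one record; the root node's own predicate is assumed to be <True/>."""
    root = ET.fromstring(pmml_text)
    tree = next(el for el in root.iter() if local(el.tag) == "TreeModel")
    top = next(el for el in tree if local(el.tag) == "Node")
    return score(top, record)


if __name__ == "__main__":
    print(score_record(PMML, {"calls_to_support": 5}))
    print(score_record(PMML, {"calls_to_support": 1}))
```

Even this toy version needs its own tree walker, which is why a packaged runtime such as Solution Publisher is usually the less painful route.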

When I worked at SPSS (5 years ago) I wrote some examples in VB.NET. These might still be on the public SPSS FTP site at;
-> I can't access FTP from work, so I can't check whether this still exists...



i2 said...

Hi, me again. Thanks for your answer. I've run Publisher and now have 3 files (.xml, .par & .pim). No executable or DLL, and no options available to generate such files.
I could download SP Runtime, but it throws errors whenever I hit the "RUN" button, even with the examples.
Anyway, I would prefer to embed it in Excel rather than use a standalone application. Any ideas? Thanks very, very much.
