Tuesday, August 26, 2008

101 reasons not to upgrade to Clementine 12

Why 101 reasons?
-> because that's close to how many bugs have been added :(
Ok, I'm exaggerating…a little.

We recently 'upgraded' (and I use that term similarly to how a Windows XP user 'upgrades' to Vista) from Clementine 10.1 to version 12.0.2.

For what's new in Clementine 12 see;
http://www.spss.com/clementine/whats_new.htm

I agree the new version has some really cool features, but after using it for a few weeks I have also found that it has obviously been released far too early in the development and testing cycle. The UI design is not up to the usual highly polished standard, and there are notably more bugs (granted, Clementine usually has *very few* bugs or problem areas). Clementine is still incredibly stable; the problems I have reported relate to UI design and the interface. I've yet to find a problem or fault on the data processing side. Some of the problems are minor but have been around for 4+ years, and it's frustrating that they are still not fixed. Kinda leaves customers thinking there's little point in providing product feedback at all...

It's been a few years since I left SPSS, so I'm not privy to what's going on in Development. In my view Clementine is still easily the best data mining software out there, but SPSS have clearly rushed this release out the door.

SPSS have replied to me and appear to be taking my criticism onboard. Fingers crossed that an update in the coming months will resolve the issues I have raised.

A few of the main points I raised;
- can no longer save the stream whilst data execution is occurring. Nasty loss of feature. I consider this one quite serious. Users should *always* be able to save the stream at any time.
- Partial outer joins no longer auto-tick the first table connected to a merge node if the order of the connected tables is different. This is a change in default behaviour (always dangerous), and will affect old streams opened in the new version 12 (so your join condition could be different – beware!)
- new pop-up ‘info’ windows that have no purpose and cannot be turned off. Really bad UI design, and not akin to Clementine’s usual interface.
- Charts always prompt to be deleted. Like the pop-up windows, this is a new behaviour and quite annoying. There is no way to prevent “Are you sure you want to delete this chart” pop-up messages. Didn't they learn from the old version 7.0? (oldies will remember the "Are you sure you want to exit Clementine" message...)
- Quality Node has gone and there is no replacement functionality. Sure, just delete something from the software for no good reason…
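
To show why the partial outer join point matters, here is a toy sketch in Python (not Clementine's merge node, and the tables and field names are made up). It assumes the ticked table is the one whose records are all kept, so ticking a different table returns a different set of rows from the same two inputs.

```python
def partial_outer_join(kept, other, key):
    """Keep every row of `kept`; add fields from `other` where the key matches.

    Assumes `key` is unique within `other` (hypothetical toy data only).
    """
    lookup = {row[key]: row for row in other}
    # Rows of `kept` with no match in `other` are still returned, just unextended.
    return [{**lookup.get(row[key], {}), **row} for row in kept]


# Hypothetical example tables
customers = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bob"}]
orders = [{"id": 2, "total": 30}, {"id": 3, "total": 45}]

# Same two tables, different table "ticked" as the one to keep in full
keep_customers = partial_outer_join(customers, orders, "id")
keep_orders = partial_outer_join(orders, customers, "id")

print([row["id"] for row in keep_customers])
print([row["id"] for row in keep_orders])
```

The first result keeps customers 1 and 2, the second keeps orders 2 and 3, so an old stream that silently changes which table is kept in full can change its output.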

Granted, I use Clementine 6 hours a day and am probably going to encounter problems other users wouldn't, but some of these issues are glaringly obvious.

- Tim

Monday, August 18, 2008

Stratified Sampling in SQL

If you use SPSS Clementine as I do, then you are probably familiar with the Balance node. It performs the function of selectively and randomly sampling your data based upon the values of a field or number of fields. Also known as stratified sampling!

If your data is managed by a data warehouse, then Clementine has this cool behaviour of automatically converting functions into SQL, so the data processing can be performed by the database and less data needs to be extracted and duplicated on another file system.

Unfortunately the Balance node isn't one of the functions automatically converted into SQL. In order to perform stratified sampling you have to take a different approach and selectively pick the values of your target column/field and sample them individually.

On KDKEYS.NET I attached one Clementine version 12.0.2 stream (balance node.str) as one example of how to do this. By using a select condition, followed by a random sample, followed by a union (append) it is possible to easily obtain a stratified sample from a huge dataset efficiently.

I have also pasted below an example of the type of simple SQL that gets processed;

SELECT *
FROM (
    SELECT *
    FROM (
        SELECT *
        FROM IPSHARE.TMANNS_DRUG4n
        WHERE (Drug = 'drugA')
        SAMPLE 0.5
    ) AS TimTemp1
    UNION ALL
    SELECT *
    FROM (
        SELECT *
        FROM IPSHARE.TMANNS_DRUG4n
        WHERE (Drug = 'drugX')
        SAMPLE 0.2
    ) AS TimTemp2
) AS TimTable
;

- sorry, I couldn't work out how to format the SQL properly in this blog :(
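
For anyone who wants to try the select / sample / append idea outside the database, here is a minimal sketch in Python. It is only an illustration of the pattern, not what Clementine or the database does internally. The function name, the toy rows and the seed are my own assumptions, and the fractions simply mirror the 0.5 and 0.2 in the SQL above.

```python
import random


def stratified_sample(rows, field, fractions, seed=None):
    """Select each stratum, sample it at its own rate, then append the results."""
    rng = random.Random(seed)
    sample = []
    for value, fraction in fractions.items():
        # Select: keep only the rows that belong to this stratum
        stratum = [row for row in rows if row[field] == value]
        # Sample: draw a fixed fraction of the stratum without replacement
        k = round(len(stratum) * fraction)
        # Append: the equivalent of UNION ALL
        sample.extend(rng.sample(stratum, k))
    return sample


# Hypothetical toy data: 100 drugA rows and 200 drugX rows
rows = [{"Drug": "drugA", "id": i} for i in range(100)]
rows += [{"Drug": "drugX", "id": i} for i in range(100, 300)]

sample = stratified_sample(rows, "Drug", {"drugA": 0.5, "drugX": 0.2}, seed=42)
print(len(sample))
```

Values of the field that are not listed in the fractions (drugB, drugC and so on) are simply left out of the sample, just as they would be in the SQL version.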

Cheers

Tim

Wednesday, August 13, 2008

iPhone update (what bill shock?)

An update to my previous post: subsequent monitoring of iPhone users is showing that most are within their data download limits. Although the new 3G iPhones are showing slightly more data download than their 2G counterparts, it is doubtful that mobile customers are going to be receiving unexpected bills with excess data charges.

I've resisted getting an iPhone so far, but my colleagues keep tempting me...

The pros;
- it has a great user interface. The scrolling nature and design of the UI is amazing. The concept of momentum that exists when you scroll through menus and music library is very cool.
- Optimum size. It's not that small, but yes it has a screen you can actually see. It fits in your back pocket.
- some versions have decent sized storage (16GB etc) for music and video.
- most importantly it has little apps such as the StarWars LightSaber. This app uses the momentum / gyro within the iPhone to react as you wave the iPhone around. It sounds just like you have a LightSaber. Being able to turn on a LightSaber during a meeting when a colleague makes a dim-witted comment and chop them into pieces is priceless...
They just took this off the apps library, but they will be replacing it with an official one (hopefully still free)
see: http://blog.laptopmag.com/best-most-useless-iphone-application-phonesaber
and: http://macenstein.com/default/archives/1559

The cons;
- battery life isn't so good. The screen uses a lot of power.
- battery cannot easily be replaced, can't carry a replacement for emergencies.
- no support for video calls
- no support for picture messaging

For me long battery life is quite important, and although I could send videos via email etc using the iPhone, I'm surprised it only supports the basic forms of mobile communication (especially considering it's a 3G phone).

But the LightSaber app is really cool... :)