Posts Tagged ‘warehouse’

Advancing big data technologies for the end users

November 19, 2012

Last week we formally announced and GA’d a slew of new core offerings in our big data platform (which is why I have been quiet/offline), and I thought it would be a great time to share them with you all. We started the discussion of our new technologies at the Information on Demand conference at the end of October – but they are now all fully baked and in the marketplace.

The 3 new offerings are BigInsights 2.0, InfoSphere Streams 3.0, and InfoSphere Data Explorer 8.2.

I plan to dig deeper into each one of these offerings over the next few posts – but in summary, we are building out a platform portfolio that is unmatched in the world of big data. We are making it easier for organizations of all sizes to leverage and exploit ‘big data’ to make better decisions.

On the Hadoop front, BigInsights not only includes the latest versions of, and support for, the Apache Hadoop initiatives but also starts implementing technologies from across the big data platform. In addition to making Hadoop more enterprise-ready (rather than a standalone, open source project), BigInsights 2.0 offers a slew of advanced visualizations and tools for users across the organization.

With InfoSphere Streams 3.0, making decisions in real time has just become easier. While Streams has always incorporated a rich programming language (SPL), not every user has had the time to master it on the fly. With version 3.0, InfoSphere Streams now incorporates a visual ‘drag-and-drop’ interface to program your own streams… and yes, that interface also generates the proper code, so you can alter and enhance it at a granular level.

Last but not least, the Vivisimo acquisition in late spring has already been integrated into the portfolio with the new InfoSphere Data Explorer 8.2 (formerly known as Vivisimo Velocity). This offers fast ROI and minimizes risk by helping customers understand their big data assets and unlock their value – including through federated search, which leaves data where it is BEFORE you determine whether you are going to use or analyze it.

Yeah – I’m a product guy, and well – new things are cool – so I get a little nerded out when we release offerings like this. In my next blog installments, I’ll drill into each one of these offerings and show you why we added the features and capabilities that we did (yes, we did listen) – and most importantly how they help you with your big data initiatives.


Sorting Through the Clutter

August 14, 2012 37 comments

Entering into this world of Big Data headfirst, I am overwhelmed with the amount of buzz and hype surrounding the topic. The other day I read the article ‘How Big Data Became so Big’ by Steve Lohr via the NY Times website and it really set the stage for the world’s challenge of Big Data. You know something has hit the big time when Dilbert references it in passing.

Per my previous post, I do not view Big Data as a product (or as a group of products), but instead as a challenge that organizations face in their journey to analyze ALL of the data made available to them to make better decisions. Hadoop is one tool to get there – yet not the only one. Over the years we have gone from machine readable punch cards to petabytes of data stored on an array of different disk types – commodity through high performance solid state.

Great – lots of storage for data – more clutter – just like my email account. This could end up being an episode of Hoarders for techies.

It’s not just the analysis of the data that is important (think a superfast data warehouse appliance cranking through queries – à la Netezza) but also the determination of whether the data is actually worth storing. It is like one big garage sale. There is so much to dig through, so many items old and new. You sure as heck are not going to take it all home with you, as most of the items are garbage and not needed. They would just sit around in your house (warehouse, that is) and waste premium storage – and perhaps trip you up on the way to the car (or to your analytical appliance) that you have revved and raring to go.

This is where Hadoop comes in handy. Hadoop sorts through your ‘digital exhaust’ (as well as any other massive load of data) and culls insight or information from it. This result can then be sent to the data warehouse for analysis. It does not have to be sent there, but I’m assuming that in most cases folks would like to include the new insights in their analytics.

Think customer churn models: if Hadoop were able to determine 1 or 2 other hidden or unknown traits of a customer segment from, let’s say, web click-through routines (the exhaust), the analysis would be much more accurate and would theoretically save (or make) the organization money.
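To make the churn idea a bit more concrete, here is a minimal sketch in plain Python of the map-and-reduce pattern that Hadoop applies at scale. It is not Hadoop code and not how any particular product does it. The click records, the `/cancel-account` page, and the function names are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical click-through records (the 'exhaust'): (customer_id, page).
CLICKS = [
    ("c1", "/pricing"),
    ("c1", "/cancel-account"),
    ("c2", "/home"),
    ("c1", "/cancel-account"),
    ("c3", "/support"),
    ("c2", "/cancel-account"),
]

# Assumption for this sketch: visiting this page hints at churn.
CHURN_PAGE = "/cancel-account"


def map_click(record):
    """Map step: emit (customer_id, 1) only for clicks that hint at churn."""
    customer_id, page = record
    if page == CHURN_PAGE:
        yield customer_id, 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts per customer."""
    totals = defaultdict(int)
    for customer_id, count in pairs:
        totals[customer_id] += count
    return dict(totals)


def cull_churn_signals(clicks):
    """Run the map step over every click, then reduce to one row per customer."""
    mapped = (pair for record in clicks for pair in map_click(record))
    return reduce_counts(mapped)


if __name__ == "__main__":
    # The small result, not the raw clicks, is what would go to the warehouse.
    print(cull_churn_signals(CLICKS))
```

The point of the sketch is the shape of the work: most of the raw clicks are thrown away, and only a small per-customer signal survives to be loaded into the warehouse and joined to the churn model.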

There are many ways that Hadoop technologies can be a part of your enterprise data warehouse or big data platform – this was just one simple example that I like to use to get my head around the technology.

At the end of the day, Hadoop enables analysis of Big Data problems. It might not answer them all on its own – but it is a key player (if not ‘the’ key player) in Big Data Analytics.

Back into the swing of things, …now with a Big Data focus

August 10, 2012 3 comments

It has been a busy summer so far; I just got back home from San Diego the other day after last week’s TDWI World Conference. As usual it was a great event, but one thing that was quite apparent was the advent of ‘Big Data’. I don’t mean just the addition of Hadoop-based companies to the mix (i.e. Hortonworks and the like); almost every vendor had some sort of Big Data story to tell. Even SAS had its booth plastered with Big Data messages, yet it does not offer any specific ‘Big Data’ product.

For full disclosure, I’m partial to this growth of Big Data promotion in the marketplace, as I recently migrated my professional focus at IBM to this specific area. There is a level of excitement that surrounds Big Data that I have not seen since the early days of Linux adoption. Folks are clamoring to get in on this technology and the surrounding buzz. From developers through consultants, many of my warehouse discussions had some sort of ‘Big Data’ piece to them.

So what is ‘Big Data’? That is a question that I hear left and right – not just at the conference, but in general day-to-day business. In my limited, layman-type approach to the definition, I refer to ‘Big Data’ as the challenge organizations and professionals face in starting to use ALL of the data available to them to make better decisions. Are you using all of it – even the stuff you normally throw away?

Does Hadoop help this? – sure – it is a key enabler in the movement.

Does streaming technology? – absolutely – decisions in motion – awesome.

Data Warehousing? Uh, of course – what do you do with all of this Hadoop-based data once you sort through it and deem it ‘relevant’?

Look, the list goes on and on here, but the fact is that, in my opinion, ‘Big Data’ is a part of a larger information ecosystem. It is the challenge of leveraging all of the data available to you – not just the items that are placed in front of you, but also the ‘digital/data exhaust’ (a great term that I openly support) that, up to this point, you have not had time to analyze.

I would suggest that part of the difficulty in understanding is that every company that has a dog in this fight has crafted the message to suit its own needs. Many vendors only have a subset of the products that make up an exhaustive Big Data platform and skew the definition to support this.

At the end of the day – regardless of definition, the question that you have to ask yourself is ‘Would you be better off leveraging ALL of the information that is available to you, to make better decisions?’ If you desire to be an analytical competitor, your answer is definitely yes.  This is why Big Data is big …and why the hype is warranted.

Video: IBM Smart Analytics System 5710 Installed

About a month or so ago, we installed one of our Smart Analytics System 5710s in our executive briefing center here in Raleigh, North Carolina. Having obtained my video camera the day before for my current trip to Nanning, China, I decided to give it a test run with the installation.

Remarkably, the installation took under 30 minutes. We crafted a 2-minute (or so) video that walks through highlights of the installation and some actual screenshots of the dashboards and reports that you get right out of the box.

Take a look and let me know what you think: