The challenge presented by Big Data and what it means for the self-service kiosk industry
James Oladujoye, CEO of GWD Media, gives his thoughts on how kiosk software providers can help their customers overcome the Big Data challenge.
“Big Data” is a term used to refer to data sets whose size is beyond the ability of standard software tools to process in a reasonable timeframe. Until recently, Big Data has been something to avoid rather than something to be dealt with, but as technology leaps forward our attitude to Big Data is changing.
The issue of how to handle Big Data has come to the fore due to an increasing need for just about everything in the world to be monitored, measured and managed. The problem is starting to hit home in our own industry (self-service kiosks), where kiosk managers need up-to-the-minute information on what is going on across their estate. An average estate might have 250 self-service kiosks, so when you multiply that by an average of 500 transactions per kiosk every day, it is easy to see how data is generated faster than the technology can store, process and manage it.
The following facts illustrate the unique problems and opportunities that are being created as a result of these developments.
- IDC research findings show that the amount of information created, captured or replicated first exceeded available storage in 2007.
- The size of the digital universe in 2012 will be ten times what it was in 2007.
- Unstructured data makes up approximately 80% of all new data and requires some form of management.
- Data generated about individuals is increasing in spite of their wishes to the contrary: there is now more data on the internet created about a specific person than is created by that person.
The challenge, therefore, is to find innovative ways to manage data. Because humans need to be able to understand data in ways that are relevant to a specific function, thought needs to be given not just to its management but also to its presentation. Organisations will increasingly need to find ways to funnel and optimise their data, and then gain an edge on their competition through smarter data management. The true cost of this is hugely underestimated.
Many organizations are trying to tackle this issue by throwing ever-increasing amounts of storage at the problem. Companies purchase state-of-the-art storage facilities, which become outdated within 6-12 months, and the whole process has to be repeated. The net effect is excessive and unnecessary expenditure, followed by undue stress for IT infrastructure managers. The lack of an obvious solution for Big Data is a cause of great inefficiency.
To add to this, a key factor contributing to the Big Data problem is duplication. A large proportion of a company’s stored data will be replicated many times over. One of our customers, a leading self-service company, generates 100 terabytes of data. Data from the kiosks is requested and stored (copied) by 12 different divisions, each of which adds 5 terabytes of additional synthesized data. Consequently they have to maintain a total of over a petabyte of data, of which less than 150 terabytes is unique. Yet the entire petabyte is backed up, moved to a disaster recovery site, and consumes power and space for storage. This is bad practice.
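The arithmetic behind that example can be laid out explicitly. This is a rough back-of-envelope sketch using the figures quoted above, assuming each of the 12 divisions stores a full copy of the 100-terabyte source set plus its own synthesized data:

```python
# Figures from the customer example above (all in terabytes).
unique_source_tb = 100            # data generated by the kiosks themselves
divisions = 12                    # divisions that each store their own copy
synthesized_per_division_tb = 5   # derived data each division adds

copies_tb = unique_source_tb * divisions                   # 1200 TB of replicated source data
synthesized_tb = synthesized_per_division_tb * divisions   # 60 TB of derived data
total_stored_tb = copies_tb + synthesized_tb               # total that must be stored and backed up

print(f"Total stored: {total_stored_tb} TB (~{total_stored_tb / 1000:.2f} PB)")
```

Roughly 1,260 terabytes end up under management, of which only a small fraction is unique, which is how “100 terabytes generated” balloons into “over a petabyte maintained”.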
Although pioneers have developed architectures designed to store and process large amounts of data, there is still work to be done to manage the whole Big Data lifecycle in a smarter, more resource-efficient way.
- Step 1: For starters, reduce the data to a unique set; this will drastically reduce the storage overheads.
- Step 2: Use virtualization technology: organizations must virtualize this unique data set so that multiple applications can reuse the same data and the data footprint can be stored more efficiently.
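Step 1 is essentially what storage engineers call deduplication: keep each distinct block of data once, and let every consumer hold only a lightweight reference to it. A minimal Python sketch of the idea, using hypothetical sample data (the kiosk record shown is invented for illustration):

```python
import hashlib

def deduplicate(blocks):
    """Store each distinct block exactly once, keyed by its content hash.
    Returns (store, refs): the unique-block store, plus one hash reference
    per original block so every logical copy can still be resolved."""
    store = {}   # content hash -> block bytes (each unique block kept once)
    refs = []    # one lightweight reference per original block
    for block in blocks:
        digest = hashlib.sha256(block).hexdigest()
        store.setdefault(digest, block)  # only the first copy is stored
        refs.append(digest)
    return store, refs

# Twelve "divisions" each holding a copy of the same transaction record:
blocks = [b"kiosk-42:txn-0001:GBP 2.50"] * 12
store, refs = deduplicate(blocks)
print(len(blocks), "logical copies ->", len(store), "unique block stored")
```

Twelve logical copies collapse to one stored block; the virtualization layer in Step 2 then serves that single copy back to every application that references it.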
Reducing the data footprint, virtualizing the reuse and storage of the data, and centralizing management of the data set will ultimately convert Big Data into small data that can be managed as virtual data. With a smaller data footprint, organizations will dramatically improve data management in three key areas:
- Faster processing speeds
- Data security through centralized management
- Accuracy of data
The answer to the Big Data problem, therefore, is virtualization: it offers better results at lower cost and removes many of the headaches traditionally associated with data management.
This is the approach employed by the Genkiosk kiosk management software. Our touchscreen kiosk software processes millions of transactions every hour, so the benefits of this approach are immediately evident.