The Internet of Things (IoT) is expected to deliver a tidal wave of data. By one estimate, the average smart car generates 25 gigabytes of data every hour, which will be a major challenge for any operator of a fleet of smart cars. It is conceivable that all cars will eventually generate this much data, and in 2015 there were roughly 18 million vehicles in Australia alone. This is just one example among a myriad of potential IoT applications.
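To get a feel for the scale, the two figures quoted above can be multiplied out. The following is a rough back-of-envelope sketch, not a forecast: it assumes decimal units (1 PB = 1,000,000 GB) and, as an upper bound, that every vehicle generates data at the smart-car rate at the same time. The function name `fleet_petabytes_per_hour` is purely illustrative.

```python
# Back-of-envelope estimate using the figures quoted in the text.
GB_PER_CAR_PER_HOUR = 25             # gigabytes per smart car per hour
VEHICLES_IN_AUSTRALIA = 18_000_000   # approximate 2015 vehicle count

# Assumption: decimal (SI) units, so 1 petabyte = 10^6 gigabytes.
GB_PER_PB = 1_000_000


def fleet_petabytes_per_hour(cars: int, gb_per_car_hour: int = GB_PER_CAR_PER_HOUR) -> float:
    """Petabytes generated per hour if every car in the fleet is running."""
    return cars * gb_per_car_hour / GB_PER_PB


national_pb_per_hour = fleet_petabytes_per_hour(VEHICLES_IN_AUSTRALIA)
print(f"All Australian vehicles running at once: {national_pb_per_hour:.0f} PB per hour")
```

Under these assumptions the national fleet would produce about 450 petabytes for every hour that all vehicles are on the road. Real usage would be lower, since cars are not all driven simultaneously, but the order of magnitude makes the point.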
One project that is set to generate more data than almost any other in the world is the Square Kilometer Array, an international astronomy project currently being built in Australia and South Africa. High-frequency radio telescopes are being installed in South Africa, and low- and medium-frequency radio telescopes are being installed in Western Australia. Together they will form a coordinated system of 3000 radio telescopes with a combined area of one square kilometer. The system is going to be operational by 2020, and the computing infrastructure needed to handle the data it will produce does not yet exist.
The Square Kilometer Array is expected to generate 30 petabytes of data a day, and no cloud service can currently accommodate that volume. You may have seen charts that compare the growth of IoT data generation with the production of storage devices; the two curves are diverging quite significantly. The world is already generating much more data than it can afford to store.
Another example of such large data volumes is the Large Hadron Collider, which generates about one tenth as much, roughly three petabytes of data per day. Even so, its operators cannot afford to store all of that data and process it offline. Physicists sometimes complain that they could miss important, even revolutionary, discoveries simply because the data cannot be kept.
So the world is becoming familiar with data measured in petabytes (10^15 bytes), and spy agencies around the world are reportedly collecting data in the realm of yottabytes (10^24 bytes). However, experts are now saying that the IoT will force us to think in terms of brontobytes (10^27 bytes). For comparison, three exabytes (3 x 10^18 bytes) is the amount of data contained in half a million libraries the size of the US Library of Congress, which is considered the largest library in the world.
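The unit names are easier to compare when reduced to powers of ten. The sketch below uses only the figures given in this article (three exabytes per half a million Library of Congress equivalents, and 30 petabytes per day for the Square Kilometer Array). The helper `to_bytes` and the table name are illustrative, and the per-library figure is simply what the article's comparison implies, not an independent measurement.

```python
# Powers of ten for the units mentioned in the text (decimal convention).
# "brontobyte" is taken at the value the text gives, 10^27.
UNIT_EXPONENT = {
    "petabyte": 15,
    "exabyte": 18,
    "yottabyte": 24,
    "brontobyte": 27,
}


def to_bytes(value: int, unit: str) -> int:
    """Convert a quantity in the named unit to bytes."""
    return value * 10 ** UNIT_EXPONENT[unit]


three_exabytes = to_bytes(3, "exabyte")

# Size of one Library of Congress implied by the article's comparison.
bytes_per_library = three_exabytes // 500_000

# Days for the Square Kilometer Array (30 PB/day) to produce three exabytes.
ska_bytes_per_day = to_bytes(30, "petabyte")
days_to_three_exabytes = three_exabytes // ska_bytes_per_day

print(f"Implied size of one library: {bytes_per_library / 10**12:.0f} terabytes")
print(f"Days for the array to generate 3 EB: {days_to_three_exabytes}")
print(f"Petabytes in one brontobyte: 10^{UNIT_EXPONENT['brontobyte'] - UNIT_EXPONENT['petabyte']}")
```

On these numbers, the comparison implies about 6 terabytes per library, and the array alone would produce the equivalent of half a million such libraries in roughly 100 days.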
These large projects, and the IoT generally, are driving a paradigm shift towards new architectures for data analytics.
- From a presentation titled Harnessing the IoT Data Flood, by Arkady Zaslavsky, Data61, CSIRO