How to intelligently aggregate approximations
The growth of low-cost storage platforms has allowed many companies to actively seek out new external data sets and combine them with internal historical data that goes back over a very long time frame. As both the variety and the volume of data continue to grow, the challenge for many businesses is how to process this ever-expanding pool of data and, at the same time, make timely decisions based on all the available data.
(Image above courtesy of http://taneszkozok.hu/)
In previous posts I have discussed whether an approximate answer is just plain wrong and whether approximate answers really are the best way to analyze big data. As with the vast majority of data analysis, at some point there is going to be a need to aggregate a data set to get a higher-level view across various dimensions. When working with results from approximate queries, dealing with aggregations can get a little complicated because it is not possible to “reuse” a...