Enterprise messaging technology lays the foundation of communication for an organization. Different applications, and both external and internal systems, use this technology to ensure a proper flow of information within and between organizations and individuals. As enterprises continue to communicate with a large number of individuals and customers, there is a growing need for the resulting data to be skilfully analysed and reported. One of the biggest challenges organizations face today is not a lack of data but a dearth of actionable insights. As a result, organizations increasingly need to analyse historical data going back at least two years. While earlier we would archive data older than, say, three months, it now needs to stay active for querying. This can amount to terabytes, perhaps petabytes, of information available for active querying, depending on how actively an organization has communicated in the past.
While traditional technologies such as relational databases are adequate for querying or processing small amounts of data, they do not return results in a reasonable period of time when the data has accrued over many months. Big data technologies, which allow organizations to examine large volumes of data and uncover hidden patterns, gained mainstream prominence throughout 2017. The promise of big data analytics inspired ACL Mobile to venture into this paradigm and bring the same thorough, real-time analysis to the field of enterprise messaging.
However, the hard part was selecting the right technology mix, one that would be easy to integrate with the existing platform without disturbing the workflow. At the same time, the technology also had to provide the ability to view the details of each individual message, produce aggregated results, and run advanced text-based searches in a reasonable time.
A number of alternative technologies, such as Hadoop, Spark and GPU-based systems, were explored, and each came with its own mix of advantages and challenges. On the GPU side, vendors were limited in number and mostly based in the USA. Since it was an upcoming technology, there were not enough examples of commercial deployments. Hence there was no clear-cut winner, and it was not possible to judge the options without implementing them.
Among other things, ACL wanted a standard SQL interface, which would offer ease of integration with our base system. The availability of this interface, together with a minimal hardware footprint and the capability for both real-time and historical analysis, led ACL to choose GPU-based big data technology.
Integrating any new technology has its own set of challenges, and this integration was no different. Even after more than six months of working with this technology, we are still learning its nuances and exploring newer, innovative ways of streamlining the data. This is still the initial stage of the integration, but the days when a query would not return in a reasonable period of time, or when older data had to be restored from tape for offline querying, may soon become a thing of the past!