Classic data analysis was performed on data at rest, stored mostly in relational databases or plain-text files. As a natural consequence, the analysis is conducted either on the whole collected data set or on a batch accumulated over a period of time. With the mass adoption of portable devices, traditional storage methods became insufficient for the enormous amount of continuously flowing data. New technologies such as streaming analytics emerged to overcome these limitations.
Streaming analytics processes never-ending data originating from connected devices (IoT), people networks (social media), and interrelated complex systems (autonomous platforms), among others. Some goals of streaming analytics are to facilitate real-time statistical analysis, to perform machine learning analysis and training, and to interact with other frameworks for permanent data storage. Real-time analysis refers not only to analyzing data as it arrives, but also to the analysis of data batches collected over short periods of time, ranging from seconds to minutes. In addition, these systems must be able to hold the information in temporary storage for a specific period of time and should also be able to store more than one temporal batch.
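The idea of short temporal batches held in temporary storage can be illustrated with a minimal sketch in plain Python. This is not how Kafka or the cluster described here implements it. The class name WindowBuffer, the function batch_means, and the parameters window_seconds and max_batches are hypothetical names chosen for this example, and the sketch assumes events arrive in non-decreasing timestamp order.

```python
from collections import deque


class WindowBuffer:
    """Group timestamped events into fixed-length windows (temporal batches)
    and keep only the most recent closed batches in memory."""

    def __init__(self, window_seconds, max_batches):
        self.window_seconds = window_seconds
        # Temporary storage: once max_batches is reached, the oldest batch is dropped.
        self.batches = deque(maxlen=max_batches)
        self.current_start = None
        self.current = []

    def add(self, timestamp, value):
        # Assumption: timestamps are in seconds and never decrease.
        start = timestamp - (timestamp % self.window_seconds)
        if self.current_start is None:
            self.current_start = start
        elif start != self.current_start:
            # The event belongs to a later window, so close the current batch.
            self.batches.append((self.current_start, self.current))
            self.current_start = start
            self.current = []
        self.current.append(value)

    def flush(self):
        # Close the batch in progress, for example at shutdown.
        if self.current:
            self.batches.append((self.current_start, self.current))
        self.current_start = None
        self.current = []


def batch_means(buffer):
    """Return (window start, mean value) for every retained batch."""
    return [(start, sum(values) / len(values)) for start, values in buffer.batches]


if __name__ == "__main__":
    buf = WindowBuffer(window_seconds=10, max_batches=2)
    for ts, value in [(0, 1.0), (4, 3.0), (12, 10.0), (25, 4.0), (29, 6.0)]:
        buf.add(ts, value)
    buf.flush()
    print(batch_means(buf))  # [(10, 10.0), (20, 5.0)]
```

With a 10-second window and room for two batches, the first batch (starting at second 0) is discarded once the third one closes, which mirrors the requirement of storing more than one temporal batch for a limited time.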
CCSE-DCR currently provides a cluster with 2 Kafka nodes, which can be expanded as needed. This is a shared resource; for more information, contact CCSE-DRC.
Version:
Kafka: 0.10