G-STAT BRAINs™ [Big-data Recommendations for Actionable Insights] applications are built on advanced big-data analytics technologies.
The Use of the Spark Engine
G-STAT BRAINs applications use the Spark engine to run thousands of processes in parallel on a Spark cluster, enabling high scalability and superior performance. Automated data management, modeling, machine learning, and batch and real-time scoring run on Spark SQL, Spark ML and Spark Streaming, with embedded R components. In this way the Spark engine performs all data management tasks, transformations, calculations and modeling in-memory rather than in a relational database.
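To illustrate the kind of pipeline described above (this is a hypothetical sketch, not G-STAT's actual code; all table names, paths and columns are invented), a single SparkSession can drive Spark SQL transformations and a Spark ML model fit, with all work distributed across the cluster:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.feature.VectorAssembler

// One SparkSession is the entry point for SQL, ML and streaming alike.
val spark = SparkSession.builder
  .appName("propensity-pipeline")
  .getOrCreate()

// Spark SQL: transformations execute as parallel tasks on the cluster.
val customers = spark.read.parquet("hdfs:///data/customers")  // hypothetical path
customers.createOrReplaceTempView("customers")
val features = spark.sql(
  """SELECT customer_id, tenure_months, avg_monthly_spend, responded
     FROM customers WHERE active = true""")

// Spark ML: assemble a feature vector and fit a propensity model in-memory.
val assembler = new VectorAssembler()
  .setInputCols(Array("tenure_months", "avg_monthly_spend"))
  .setOutputCol("features")
val model = new LogisticRegression()
  .setLabelCol("responded")
  .fit(assembler.transform(features))
```

A streaming variant of the same flow would swap `spark.read` for `spark.readStream` to score events in real time.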
G-STAT Multi-Segment-Modeling Technology
G-STAT's patent-pending Multi-Segment-Modeling technology is the "secret sauce" that allows propensity models to be designed and deployed on customer segments rather than on whole populations. With this technology, G-STAT customers achieve model lifts up to 50% higher than manually developed models built with conventional data mining tools. The result is higher response rates in outbound and inbound campaigns, increased revenue from targeted campaigns, and visible ROI after only one or two campaigns.
G-STAT BRAINs applications use the Spark engine's in-memory analytical processing. All model development and deployment runs in-memory, without writing to system tables. This yields significant savings in disk space and processing time compared with traditional modeling methods, which rely on common machine-learning tools and create a multitude of large, temporary tables throughout the process.
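As a hypothetical sketch of the in-memory approach (paths and column names are invented), Spark's `cache()` and temporary views keep intermediate results in executor memory, so no physical temporary tables are materialized in a database:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.appName("in-memory-scoring").getOrCreate()
val scored = spark.read.parquet("hdfs:///data/scored")  // hypothetical path

scored.cache()                              // pin the working set in memory
scored.createOrReplaceTempView("scored")    // a logical view, not a physical table
val topProspects = spark.sql(
  "SELECT customer_id, score FROM scored ORDER BY score DESC LIMIT 1000")
```

Each intermediate DataFrame here is a logical plan held and evaluated in memory; in a table-based workflow each step would instead write and re-read a large temporary table on disk.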
Connectivity to Common Big Databases
The Spark engine embedded in G-STAT BRAINs applications can read data files from any HDFS resource and can connect to most common databases.
G-STAT BRAINs applications also work with common relational databases through a set of connectors (for Oracle, Exadata, Teradata, SQL Server, Netezza, Vertica, MySQL). Because no huge data sets are transferred between servers and environments, the database's analytical processing power is used optimally.
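For illustration (a sketch only; the host, credentials and query are hypothetical, and the specific connectors G-STAT ships are not shown here), Spark's standard JDBC data source is one way such connectivity can work, reading a relational table directly and pushing the query down to the database rather than bulk-copying data between environments:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.appName("jdbc-read").getOrCreate()

// Read from an Oracle database via JDBC; the subquery runs inside the
// database, so only its result set reaches the Spark cluster.
val orders = spark.read
  .format("jdbc")
  .option("url", "jdbc:oracle:thin:@//dbhost:1521/ORCL")          // hypothetical URL
  .option("dbtable", "(SELECT customer_id, total FROM orders) t") // pushed-down query
  .option("user", "analyst")                                      // hypothetical user
  .option("password", sys.env("DB_PASSWORD"))
  .load()
```

Reading HDFS files works the same way through `spark.read.parquet("hdfs://...")` or `spark.read.csv("hdfs://...")`, with no intermediate staging area required.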