“High performance data analytics are critical to a range of key use cases like click stream data, social media sentiment, buying behavior, and more,” said John Fowler, Executive Vice President of Systems, Oracle. “Through our Software in Silicon Developer Program, developers can now apply our DAX technology to a broad spectrum of previously unsolvable challenges in the analytics space because we have integrated data analytics acceleration into processors, enabling unprecedented data scan rates of up to 170 billion rows per second.”
With the release of the 32-core, 256-thread SPARC M7 processor, Oracle created a number of Software in Silicon features by building higher-level software functions into the processor design. One of the most exciting of these new capabilities is the Data Analytics Accelerator (DAX), which delivers unprecedented analytics efficiency.
Data Analytics Accelerator on SPARC M7
DAX adds processing capability that runs a selected set of functions (Scan, Extract, Select, and Translate) at very high speeds. The SPARC M7 DAX accelerates these analytics primitives on a dedicated physical unit separate from the standard compute cores.
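To make the four primitives more concrete, the sketch below models them in plain Python. It only illustrates what operations with these names typically do in a columnar scan engine. The function names, signatures, and exact semantics are assumptions for illustration, not the DAX API, and nothing here runs on the accelerator.

```python
from typing import List, Sequence

def scan(column: Sequence[int], lo: int, hi: int) -> List[int]:
    """Return a bit vector: 1 where lo <= value <= hi, else 0."""
    return [1 if lo <= v <= hi else 0 for v in column]

def select(column: Sequence[int], bits: Sequence[int]) -> List[int]:
    """Keep only the elements whose corresponding bit is set."""
    return [v for v, b in zip(column, bits) if b]

def extract(packed: bytes, width_bits: int, count: int) -> List[int]:
    """Unpack `count` fixed-width values from a bit-packed buffer.

    Assumes values are stored most-significant-bit first (hypothetical layout).
    """
    as_int = int.from_bytes(packed, "big")
    total_bits = len(packed) * 8
    mask = (1 << width_bits) - 1
    return [
        (as_int >> (total_bits - (i + 1) * width_bits)) & mask
        for i in range(count)
    ]

def translate(column: Sequence[int], table: Sequence[int]) -> List[int]:
    """Use each value as an index into a lookup table of bits."""
    return [table[v] for v in column]

# Hypothetical data: find prices in the range 7..18, then pull them out.
prices = [5, 12, 7, 30, 18]
bits = scan(prices, 7, 18)
matches = select(prices, bits)
```

In this model, `scan` and `translate` produce bit vectors that a later `select` consumes, which is why such primitives are usually chained rather than used in isolation.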
Initial software development enabled DAX for Oracle Database 12c and all the applications built on it. Open APIs now extend analytics acceleration to all Oracle, ISV, and customer applications.
Large-scale scan and filter operations are made trivial by transparent use of 32 dedicated DAX co-processors in the SPARC microprocessor, which operate at memory bus speeds of up to 160 GB/s between cache and DRAM. These accelerators, implemented on-chip for the first time for the highest level of performance and efficiency, can now be used by developers through APIs in Oracle Solaris 11 and applied to a variety of use cases.
As one notable example of Data Analytics Accelerator integration into machine learning and Big Data use cases, Oracle engineers have shown that the DAX can significantly accelerate Apache Spark, which has become one of the most popular methods for processing large data sets. Through this project, engineers used the DAX with Apache Spark to take one billion rows of data in memory and filter it into a 3D cube so fast that interactive data analytics are now possible.
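As a rough picture of what filtering rows into a 3D cube involves, the sketch below filters a small in-memory table and counts the matching rows along three dimensions. It is plain Python with made-up column names and data, not Spark or DAX code, and it only shows the shape of the computation rather than how Oracle's engineers implemented it.

```python
from collections import Counter
from typing import Dict, List, Tuple, Union

Row = Dict[str, Union[str, int]]

# Hypothetical sales rows; the three cube dimensions are region, product, month.
rows: List[Row] = [
    {"region": "EU", "product": "A", "month": 1, "amount": 40},
    {"region": "EU", "product": "A", "month": 1, "amount": 75},
    {"region": "US", "product": "B", "month": 2, "amount": 90},
    {"region": "EU", "product": "A", "month": 1, "amount": 60},
    {"region": "US", "product": "A", "month": 2, "amount": 10},
]

def build_cube(data: List[Row], min_amount: int) -> Counter:
    """Filter rows by amount, then count them per (region, product, month) cell."""
    cube: Counter = Counter()
    for r in data:
        if r["amount"] >= min_amount:  # the filter step an accelerator would offload
            key: Tuple = (r["region"], r["product"], r["month"])
            cube[key] += 1
    return cube

cube = build_cube(rows, 50)
```

Each key of `cube` is one cell of the cube. An interactive tool would rebuild or slice these cells each time the user changes the filter, which is why the speed of the filter step matters.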
SPARC M7 and DAX design advantages include:
- Industry-leading delivered memory bandwidth: at 160 GB/s, the SPARC M7 processor provides enough memory bandwidth to feed both the DAX units and the processor cores.
- DAX offload: frees the processor cores for other processing.
- Efficient decompression combined with in-memory processing: performing decompression in the DAX unit is much faster than software implementations. Because decompression is designed together with scanning, needless back-and-forth memory transfers are avoided. Results from the DAX are placed in the CPU cache for better CPU efficiency.
- DAX range comparisons: many real-world database analytics queries look for data that falls within a range, such as transactions between certain dates or products within a cost range. The DAX processes range comparisons at the same rate as individual comparisons. Other processors require additional computational time for each comparison.
- Avoiding cache pollution: the DAX does much of its computation without needing to store intermediate data in a cache, freeing the CPU’s cache for other processing.
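The range-comparison and combined decompress-and-scan points in the list above can be modeled in software. The following is a minimal sketch, assuming a run-length-encoded column. The encoding, the function name, and the data are hypothetical and chosen only for illustration, since the source does not say which compression formats DAX handles. The predicate is evaluated while decoding, so the decompressed column is never written back to memory.

```python
from typing import Iterable, List, Tuple

def scan_rle_range(runs: Iterable[Tuple[int, int]], lo: int, hi: int) -> List[int]:
    """Scan a run-length-encoded column for lo <= value <= hi.

    `runs` is a sequence of (value, run_length) pairs. The range test is
    applied once per run during decoding, and only the matching row
    indices are produced; no decompressed copy of the column is built.
    """
    hits: List[int] = []
    row = 0
    for value, length in runs:
        if lo <= value <= hi:  # two comparisons in software
            hits.extend(range(row, row + length))
        row += length
    return hits

# Hypothetical order dates stored as YYYYMMDD integers, run-length encoded.
order_days = [(20160101, 3), (20160215, 2), (20160402, 4)]
q1_rows = scan_rle_range(order_days, 20160101, 20160331)
```

In this software model the range test costs two comparisons where an equality test costs one. The claim in the list above is that the DAX hardware evaluates a range at the same rate as a single comparison.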
Partnerships with Developer Community and Leading Higher Education Institutions
Oracle continues to deliver traditional processor enhancements that improve the performance of traditional workloads, with more than 20 world-record results over the competition. Software in Silicon can deliver the previously unattainable step-function improvements required in areas such as security and data analytics by embedding the functions that handle particular algorithms in the processor, with greater performance and efficiency.
Oracle has also published several use cases with code samples to maximize developer productivity and expedite projects, as well as a detailed example of DAX integration with Apache Spark. Resources can be accessed now via the Oracle Software in Silicon Cloud, a freely available cloud service for developers and researchers that provides direct access to this technology. Additionally, Oracle is partnering with leading higher education institutions, such as Brown University, on innovative research projects with Software in Silicon.
“We are currently working on characterizing the performance of DAX across a suite of modern in-memory data layout schemes. After completing this study, we will work on the optimal use of DAX in accelerating interactive data exploration and visualization with the Tupleware main-memory database system and S-Store real-time stream processing system,” stated Ugur Cetintemel, Chairman of the Computer Science Department, Brown University. “Through these studies, we will quantify the performance and scalability of M7 and DAX on real workloads involving sophisticated search and machine learning over large data sets.”
Open APIs for Oracle’s Data Analytics Accelerator are now available for free via the Software in Silicon Cloud. Developers can join the community now to get started on developing the next generation of big data and analytics applications.