You can use the Spark SQL data source API to access data in other relational databases. Access to SAP Vora from other relational databases may be achieved through the JDBC access layer.
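As a rough illustration of the data source route mentioned above, the sketch below reads a table from an external relational database through Spark's built-in `jdbc` data source. It is generic Spark usage, not Vora-specific code. The host, database, table, user, and driver names are placeholders invented for the example, and the two helper functions are hypothetical names.

```python
def build_jdbc_options(url, table, user, password, driver=None):
    """Collect the options understood by Spark's built-in "jdbc" data source."""
    options = {"url": url, "dbtable": table, "user": user, "password": password}
    if driver is not None:
        # Optional: fully qualified class name of the JDBC driver to load.
        options["driver"] = driver
    return options


def read_jdbc_table(session, options):
    """Load a remote table as a DataFrame via the Spark SQL data source API.

    `session` is expected to be a SparkSession (or an SQLContext on
    Spark 1.4 and later); both expose the same `read` entry point.
    """
    return session.read.format("jdbc").options(**options).load()


# Placeholder connection details (assumptions, not taken from the article).
example_options = build_jdbc_options(
    url="jdbc:postgresql://db.example.com:5432/sales",
    table="public.orders",
    user="report_user",
    password="change-me",
    driver="org.postgresql.Driver",
)
```

With a live Spark session, `read_jdbc_table(spark, example_options)` would return a DataFrame that can be registered as a table and queried with SQL, provided the matching JDBC driver JAR is on the Spark classpath.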
To quote from the official SAP Press Release:
“SAP HANA Vora is a new in-memory query engine which leverages and extends the Apache Spark execution framework to provide enriched interactive analytics on Hadoop. As companies take part in their digital transformation journey, they face complex hurdles in dealing with distributed Big Data everywhere, compounded by the lack of business process awareness across enterprise apps, analytics, Big Data and IoT sources.”
In short, SAP HANA Vora extends Hadoop with in-memory computing, enterprise functionality and data science, helping customers simplify their IT landscape and bring context to data lakes. It also extends the HANA platform with the ability to store Big Data and has data temperature management to move data from HANA to Vora.
Vora will run on anything that meets the requirements. For our simple development systems we have used Amazon 8GB or 16GB systems, and at one customer we are currently deploying 2000 cores, 20TB of DRAM and 1PB of disk for a HANA/Vora cluster, thanks to Lenovo System X and EMC Isilon.
Vora is an extension to the Hadoop platform and includes the following features in its first version:
The following features are planned in the near future:
No, Vora is a completely new code base. It is built by the same engineering group as SAP HANA, though, so many concepts and ideas have been borrowed from HANA, as you can see from the feature list.
That’s all quite technical, but in short, Vora will run on just about anything. My team has it running at various customers on the Cloudera, Hortonworks and Amazon EMR Hadoop distributions, and on SUSE, RedHat and Ubuntu Linux. It will also run on Windows and Mac. One of the things that I like about Vora is that it goes back to SAP’s roots of being platform-independent. It does have some core dependencies. For Vora 1.0 they are: HDFS 2.6, ZooKeeper 3.4.6, Spark 1.4, ProtoBuf 2.6.0, gcc 4.7, Apache Ambari, and either YARN or Spark Standalone for cluster management.
This flexibility is important, because many customers have already made a choice of Linux and Hadoop distributions. Vora runs on all of them.
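The Vora 1.0 dependency list above lends itself to a quick pre-installation check. The sketch below compares the component versions found on a cluster against that list. It assumes the listed versions are minimums, which the article does not state explicitly, and the function names and the sample `installed` dictionary are hypothetical.

```python
import re

# Versions quoted in the article for Vora 1.0.
# Assumption: they are treated here as minimum versions.
VORA_1_0_DEPENDENCIES = {
    "HDFS": "2.6",
    "ZooKeeper": "3.4.6",
    "Spark": "1.4",
    "ProtoBuf": "2.6.0",
    "gcc": "4.7",
}


def parse_version(text):
    """Turn the leading dotted-number part of a version string into a tuple."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"no numeric version found in {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def version_at_least(found, minimum):
    """Compare two versions, padding with zeros so that 2.6 equals 2.6.0."""
    a, b = parse_version(found), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def unmet_dependencies(installed, required=VORA_1_0_DEPENDENCIES):
    """Return one message per component that is missing or too old."""
    problems = []
    for name, minimum in required.items():
        found = installed.get(name)
        if found is None:
            problems.append(f"{name}: not found (need {minimum})")
        elif not version_at_least(found, minimum):
            problems.append(f"{name}: found {found}, need {minimum}")
    return problems
```

Calling `unmet_dependencies` with a dictionary of detected versions returns an empty list when every listed component is present at or above the quoted version. Apache Ambari and the cluster manager are left out because the article gives no version for them.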
SAP Vora offers three licensing options to support different requirements:
SAP Vora supports familiar programming languages including SQL, Java, Scala, Python, and C++.
Vora runs both in the cloud and on-premise. SAP’s platform strategy is “Cloud-first”, so Vora will be available on the SAP HANA Cloud Platform (HCP) and on Amazon’s cloud. Both will provide single-click provisioning of Vora systems later this year.
You can of course run Vora On-premise, on any hardware platform, and as noted above, almost any Linux and Hadoop distribution. The only thing to note is that some Hadoop platforms are designed as cold data lakes, and Vora is performance-centric, so it needs more spindles-per-core and more memory to run at its best.
SAP Vora supports major Hadoop distributions, including Cloudera, Hortonworks, MapR, and SAP Cloud Platform Big Data Services (formerly known as Altiscale Data Cloud).
Absolutely not. Vora is a standalone piece of software. There are use cases where SAP HANA won’t be installed alongside Vora, and SAP HANA and Vora run most efficiently on different types of hardware.
That said, it’s certainly true that there are a number of integration scenarios for HANA and Vora – tiered data for SAP S/4HANA, and very large in-memory databases, and SAP will build deep
Vora is a bridge between SAP HANA and Hadoop, and as such there are many interesting use cases.
One straightforward use case is around data tiering for existing SAP customers. It will be possible to use commodity Hadoop clusters to store colder data for SAP ERP, like sales documents, pricing and billing conditions. These require occasional analysis and are read-only. We expect SAP to adjust its data tiering strategy for SAP S/4HANA to include Vora.
But as has been the case for SAP HANA, whilst those straightforward use cases are good, SAP HANA Vora is capable of much more exciting and differentiating things.
We are working with customers on use cases such as analyzing huge amounts of manufacturing test data to make better real-time packaging decisions, and analyzing fitness data from connected devices in real time.
Where Vora is particularly interesting is when you need to “bring compute to the data” – running complex algorithms against in-memory data.
In addition, a lot of customers will do a straight rip-and-replace of Teradata with HANA and Vora, because it is substantially more cost-effective and an order of magnitude faster.
SAP Vora can be deployed with SAP Cloud Platform Big Data Services. You can also deploy SAP Vora in production to any cloud infrastructure running one of the supported Hadoop distributions using a bring-your-own-license model. In addition, SAP Vora is available through AWS Marketplace with community support.
Vora 1.0 is planned for release to customers on September 18th, with general availability coming later in the year. Other pieces of functionality, such as the automatic data tiering components, will also follow later in the year.
SAP Vora delivers OLAP and SQL on Hadoop by enhancing Spark SQL to provide built-in enterprise functions such as hierarchy processing, currency conversions, and a graphical Web-based modeling environment for OLAP and drill-down analysis. In addition, because SAP Vora is an in-memory computing solution, it enables you to gain insights faster compared to solutions based on MapReduce or batch processing. And finally, SAP Vora provides an integrated framework that combines relational processing with time series, JSON documents, and graph processing.
Vora was designed to run on any distributed file system, and doesn’t necessarily require Hadoop. We will see in the future what that means, but the important point is that if another distributed file system becomes popular, Vora can adapt. If you look at what happened in the RDBMS market over the last 30 years, SAP’s ability to adapt R/3 to different databases was key to its longevity.
You can get started with SAP Vora today by signing up for a test-drive or for the developer edition.
SAP Vora runs on Hadoop. It is a computing solution that provides interactive analytics on data stored in Hadoop systems.
No. You can deploy SAP Vora on an existing Hadoop installation as long as that installation meets the prerequisites stated in the installation manual for SAP Vora.