Optimize the Data Warehouse

Reduce strain on your data warehouse by offloading less frequently used data, and the corresponding transformation workloads, to Hadoop without hand coding, legacy scripts, or the limitations of traditional ETL products.

Hadoop Made Simple, Accessible, and 15x Faster

Pentaho simplifies offloading data and transformation workloads to Hadoop and speeds development and deployment time by as much as 15x versus hand-coding approaches. Complete visual integration tools eliminate the need for hand coding in SQL or Java-based MapReduce jobs.

Save on data costs and boost analytics performance

  • Intuitive, graphical big data integration with no coding required
  • Access to every data source – from operational to relational to NoSQL technologies
  • Support for every major Hadoop distribution with a future-proof adaptive big data layer
  • Higher processing performance with Pentaho MapReduce when running in-cluster
  • 100% Java, fast and efficient

As part of the Pentaho Business Analytics Platform, the Optimize the Data Warehouse blueprint provides a quick and cost-effective way to get immediate value from data through integrated reporting, dashboards, data discovery, and predictive analytics.

Example of how Optimize the Data Warehouse may look within an IT landscape:

  • A company leverages data from disparate sources including CRM and ERP systems
  • A Hadoop cluster has been implemented to offload less frequently used data from the existing data warehouse
  • The company saves on storage costs and speeds up query performance and access to its analytic data mart
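The offload step in the scenario above can be sketched in a few lines of code. This is a hypothetical illustration, not Pentaho's implementation: the table, column names, and retention cutoff are invented for the example, SQLite stands in for the warehouse, and date-partitioned CSV files stand in for HDFS. Pentaho's visual tools generate equivalent logic without any code.

```python
# Hypothetical sketch of the offload pattern: move rows older than a
# retention cutoff out of the warehouse and into date-partitioned files
# (standing in for a Hadoop cluster). All table and column names are
# illustrative only.
import sqlite3
from pathlib import Path

RETENTION_CUTOFF = "2013-01-01"  # rows older than this are "cold"

def offload_cold_rows(conn, archive_dir):
    """Copy cold rows to partitioned CSV files, then delete them."""
    archive_dir = Path(archive_dir)
    rows = conn.execute(
        "SELECT id, order_date, amount FROM orders WHERE order_date < ?",
        (RETENTION_CUTOFF,),
    ).fetchall()
    for row_id, order_date, amount in rows:
        # Partition by year, mirroring a typical HDFS directory layout.
        part = archive_dir / f"year={order_date[:4]}"
        part.mkdir(parents=True, exist_ok=True)
        with open(part / "orders.csv", "a") as f:
            f.write(f"{row_id},{order_date},{amount}\n")
    conn.execute("DELETE FROM orders WHERE order_date < ?",
                 (RETENTION_CUTOFF,))
    conn.commit()
    return len(rows)
```

After the cold rows land in the cluster, the warehouse serves only the hot data, which is what drives the storage savings and faster queries described above.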

The Results

  • Staff savings and productivity: Pentaho’s Visual MapReduce graphical user interface and big data integration let existing data warehouse developers move data between the data warehouse and Hadoop without coding
  • Time to value: MapReduce development time is reduced by up to 15x compared with hand-coding approaches
  • Faster job execution: Pentaho MapReduce runs faster in-cluster than code-generating scripting tools

Leading Global Network Storage Company

Big Data Goal:

Scaling machine data management to enhance product performance and customer success.

  • Affordably scale machine data from storage devices for customer applications
  • Predict device failure
  • Enhance product performance

Architecture Example:

Pentaho Benefits:

  • Easy-to-use ETL and analysis for Hadoop, HBase, and Oracle data sources
  • 15x data cost improvement
  • Stronger performance against customer service level agreements