Pentaho Big Data Developer Toolkit
Technology and Tools to Operationalize Your Big Data
To help your big data analytics project succeed, Pentaho provides these key resources to get you started.
Pentaho for big data 30-day free trial
Free 30-day trial to test drive our analytics and data integration solution across any big data source. Download now
Pentaho Big Data Community
Collection of documentation, how-tos, best practices, use cases and forums for Hadoop, MapR, Cassandra and MongoDB. Visit the community
How-tos to get started
Access our collection of technical guides, articles and videos for integrating Pentaho with:
Hadoop
How to:
- Configure Pentaho for Cloudera and Other Hadoop Versions
- Loading Data into a Hadoop Cluster: HDFS, Hive, HBase
- Transforming Data within a Hadoop Cluster using Pentaho MapReduce, Hive, and Pig
- Using Pentaho MapReduce to Parse Weblog Data
- Using Pentaho MapReduce to Generate an Aggregate Dataset
- Transforming Data within Hive
- Transforming Data with Pig
- Extracting Data from the Hadoop Cluster
- Extracting Data from HDFS to Load an RDBMS
- Extracting Data from Hive to Load an RDBMS
- Extracting Data from HBase to Load an RDBMS
- Reporting on Data in Hadoop: HDFS File Data, HBase Data, Hive Data
- Unit Test Pentaho MapReduce Transformation
- Advanced Pentaho MapReduce
- Using Compression with Pentaho MapReduce
- Using a Custom Partitioner in Pentaho MapReduce
- Using a Custom Input or Output Format in Pentaho MapReduce
MapR
How to:
- Configure Pentaho for MapR
- Loading Data into a MapR Cluster: CLDB, MapR Hive, MapR HBase
- Transforming Data within a MapR Cluster
- Using Pentaho MapReduce to Parse Weblog Data in MapR
- Using Pentaho MapReduce to Generate an Aggregate Dataset in MapR
- Transforming Data within Hive in MapR
- Transforming Data with Pig in MapR
- Extracting Data from the MapR Cluster
- Extracting Data from CLDB to Load an RDBMS
- Extracting Data from Hive to Load an RDBMS in MapR
- Extracting Data from HBase to Load an RDBMS in MapR
- Reporting on Data in the MapR Cluster: CLDB File Data, HBase Data in MapR, Hive Data in MapR
Cassandra
How to:
MongoDB
How to:
