Platform features
| Data management and analytics functions | Projects & components | Cloudera Private Cloud Base Edition | Enterprise Data Hub | HDP Enterprise Plus |
| Distributed batch processing of large data sets | Apache Hadoop | |||
| Database for structured data storage of large tables | Apache HBase +conn, +indx | |||
| Data warehouse summarization & ad hoc querying | Apache Hive | |||
| Metadata store for Hive tables | Hive Metastore (HMS) | |||
| Workflow scheduler to manage Hadoop jobs | Apache Oozie | |||
| Columnar storage format for Hadoop ecosystem | Apache Parquet | |||
| Fast compute engine for ETL, ML, stream processing | Apache Spark | |||
| Bulk data transfer between Hadoop and structured datastores | Apache Sqoop | |||
| Job scheduling and cluster resource management | YARN | |||
| Coordination service for distributed applications | Apache ZooKeeper | |||
| Store and manage large data sets across a cluster | Apache Accumulo | |||
| Metadata management, governance & data catalog | Apache Atlas | |||
| OLTP and real-time SQL access of large datasets | Apache Phoenix | |||
| Manage data security across the Hadoop ecosystem | Apache Ranger | |||
| Smallest, fastest columnar storage for Hadoop | Apache ORC | |||
| Data-flow framework for batch, interactive use-cases | Apache Tez | |||
| Fast analytical queries on event-driven data | Apache Druid | |||
| Perimeter security governing access to Hadoop | Apache Knox | |||
| Easy interaction with Spark clusters via REST interface | Apache Livy | |||
| Cryptographic key management | Ranger KMS | |||
| Notebook for interactive analytics | Apache Zeppelin | |||
| Data serialization system | Apache Avro | |||
| Manage and control Hadoop ecosystem functions | Cloudera Manager | |||
| SQL workbench for data warehouses | Hue | |||
| Distributed MPP SQL query engine for Hadoop | Apache Impala | |||
| Cryptographic key management | Key Trustee Server | |||
| Column-oriented data store for fast data analytics | Apache Kudu | |||
| Enterprise search platform | Apache Solr | |||
| Key Trustee Server hardware security integration | Key HSM | |||
| Transparently encrypts and secures data at rest | Navigator Encrypt | |||
| Real-time streaming data pipelines and apps | Apache Kafka | |||
| Distributed object store for Hadoop | Apache Ozone | |||
| Streams Messaging for data ingestion and buffering | Apache Kafka | |||
| Monitoring and management of Kafka clusters | Streams Messaging Manager | |||
| Replication of cross-cluster Kafka data | Streams Replication Manager | |||
| Integrate with data sources from Kafka | Kafka Connect | |||
| Governance and management of metadata and schemas | Schema Registry | |||
| Auto-balancing of Kafka clusters | Cruise Control | |||
| Lightweight stream processing engine for Kafka | Kafka Streams | |||
| High-performance format for huge analytic tables | Apache Iceberg | |||
| Disaster recovery & backups | Iceberg Replication | |||
