Yandex Data Processing component interfaces and ports
Some Yandex Data Processing components, such as Hadoop, Spark, YARN, and Zeppelin, provide custom web interfaces on a cluster's master host. These interfaces can be used:
- YARN Resource Manager and HDFS Name Node: To manage and monitor cluster resources.
- Spark History and JobHistory: To view job statuses and debug jobs.
- Apache Zeppelin: For collaboration, experiments, or ad-hoc operations.
Yandex Data Processing enables you to create clusters accessible from the internet or only from a cloud network. However, we recommend making service component interfaces inaccessible from outside Yandex Cloud in any configuration. You can connect to Yandex Data Processing component interfaces using either UI Proxy or an intermediate virtual machine.
UI Proxy is a tool that proxies cluster component interfaces with HTTP traffic encryption and authentication via Yandex Cloud IAM. To access the interfaces, the user must be logged in to Yandex Cloud and have both cluster view permissions and the dataproc.user role.
UI Proxy is disabled by default. To use it, enable it when creating or configuring a cluster; you can then view the list of web interfaces available for connection.
Warning
You may need to additionally set up security groups to use UI Proxy.
Components and ports
Service | Port |
---|---|
HBase Master | 16010 |
HBase REST | 8085 |
HDFS Name Node | 9870 |
Hive Server2 | 10002 |
Livy | 8998 |
MapReduce Application History | 19888 |
Oozie | 11000 |
Spark History | 18080 |
YARN Application History | 8188 |
YARN Resource Manager | 8088 |
Zeppelin | 8890 |
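
When connecting through an intermediate virtual machine, one common approach is SSH local port forwarding. The following is a minimal sketch, not an official procedure: it assumes the intermediate VM accepts SSH connections and can reach the master host on the ports listed above. The `ssh_tunnel_command` helper and the host names and user name in the usage note are hypothetical placeholders. Only the port numbers come from the table.

```python
import shlex

# Port numbers copied from the "Components and ports" table.
COMPONENT_PORTS = {
    "HBase Master": 16010,
    "HBase REST": 8085,
    "HDFS Name Node": 9870,
    "Hive Server2": 10002,
    "Livy": 8998,
    "MapReduce Application History": 19888,
    "Oozie": 11000,
    "Spark History": 18080,
    "YARN Application History": 8188,
    "YARN Resource Manager": 8088,
    "Zeppelin": 8890,
}


def ssh_tunnel_command(service, master_host, jump_host, user, local_port=None):
    """Build an `ssh -L` command that forwards a local port to a component UI.

    Hypothetical helper: master_host, jump_host, and user are values you
    supply yourself; nothing here queries Yandex Cloud.
    """
    remote_port = COMPONENT_PORTS[service]
    # Reuse the component's port locally unless the caller picks another one.
    local = remote_port if local_port is None else local_port
    args = [
        "ssh",
        "-N",  # do not run a remote command, only forward ports
        "-L", f"{local}:{master_host}:{remote_port}",
        f"{user}@{jump_host}",
    ]
    return shlex.join(args)
```

For example, `ssh_tunnel_command("YARN Resource Manager", "master.internal.example", "203.0.113.10", "ubuntu")` builds a command that, once run, would make the YARN Resource Manager interface reachable at `http://localhost:8088` for as long as the SSH session stays open. Pass `local_port` if the default local port is already in use.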