Google is a technology company that develops next-generation technologies impacting billions of users. They are seeking a Software Engineer to work on Cloud Dataproc, focusing on enhancing open-source data analytics workloads in the cloud and developing high-impact features for their products.
Build high-impact customer-facing features that make Cloud Dataproc the best place to run Hadoop and Spark in the cloud
Drive technical design and execution for differentiated performance and lakehouse features and enhancements in an ambiguous problem space
Review code developed by other developers and provide feedback to ensure best practices (e.g., style guidelines, checking code in, accuracy, testability, and efficiency)
Enhance Apache Spark for performance, reliability, security, and monitoring, and enhance lakehouse technologies such as Iceberg, Hudi, or Delta Lake for performance, security, and monitoring
Contribute to and adapt existing documentation or educational content based on product and program updates and user feedback
Extend open-source technologies such as Apache Spark, Hive, and Trino to improve their debuggability, observability, and supportability
Qualifications
Required
Bachelor's degree or equivalent practical experience
2 years of experience with performance optimization, systems data analysis, visualization tools, or debugging
Experience developing with Spark, Hive, or similar processing frameworks
Experience with open-source software
Preferred
Master's degree or PhD in Computer Science or a related technical field
Experience developing frameworks such as Apache Spark, Trino, or Flink
Knowledge of open-source big-data performance optimization problems
Ability to work across boundaries in a distributed team