Responsible for building and maintaining data pipelines and transformation logic using
PySpark/Databricks, enabling analytics and reporting use cases.
Roles & Responsibilities
- Develop ETL/ELT data pipelines in PySpark/Spark using Lakeflow Declarative Pipelines.
- Perform data transformation, cleansing, and enrichment.
- Optimize Spark jobs for performance and cost efficiency.
- Build datasets for analytics, reporting, and BI consumption.
- Implement query federation using the AWS query federation SDK or the Databricks data source API.
- Work with multiple data sources (S3, APIs, etc.).
- Support data validation, reconciliation, and testing.
- Collaborate with architects, analysts, and downstream teams.
Qualifications
- 3-8 years of experience in data engineering or analytics.
- Hands-on development experience in PySpark/Databricks and Python.
- Strong experience in PySpark and Spark SQL.
- Strong experience in Python.
- Strong experience in Databricks.
- Strong experience in SQL.
- Experience in data transformation and pipeline development.
Skills
ETL, PySpark, SQL, Databricks, Python