Design, develop, and optimize ETL and data pipelines for large-scale data processing
Write, optimize, and performance-tune complex SQL queries, including joins, subqueries, CTEs, and window functions
Perform data transformation, validation, cleansing, and data quality checks
Build and maintain scalable data ingestion pipelines using APIs, SFTP, Kafka, and related integration tools
Work with large distributed datasets and ensure efficient data processing
Design and implement graph data models including nodes, edges, and relationships
Work with RDF data structures and semantic data frameworks
Develop and optimize SPARQL queries for graph data retrieval and analysis
Build and maintain knowledge graphs and ontology-driven data systems
Automate data workflows and operational tasks using Python scripting
Monitor, troubleshoot, and optimize pipeline performance and reliability
Required Skills & Experience
Strong hands-on experience with Python, including Pandas, NumPy, and automation scripting
Advanced SQL expertise including query optimization, CTEs, subqueries, joins, and window functions
Strong understanding of ETL concepts, data engineering practices, and pipeline architecture
Experience working with large-scale distributed data processing environments
Solid understanding of graph data concepts including nodes, edges, and relationships
Experience with RDF data models and SPARQL querying
Experience with data ingestion and integration tools such as APIs, Kafka, and SFTP
Strong understanding of data transformation, cleansing, and performance optimization techniques
Nice to Have
Experience with graph databases such as Neo4j or Blazegraph
Exposure to cloud platforms including AWS, Azure, or GCP
Experience with workflow orchestration tools such as Airflow
Understanding of data lake and data warehousing concepts
Preferred Profile
Strong analytical and problem-solving skills
Ability to work in complex data-driven environments
Experience collaborating with cross-functional engineering and data teams
Self-driven mindset with strong attention to detail and performance optimization
Data Engineer (Python + SQL / SPARQL + Graph Data) at Test Yantra