Lead/Architect Data Engineer
Job title: Lead/Architect Data Engineer in Indiana at APN Consulting
Company: APN Consulting
Job description: APN Consulting, Inc. is a progressive IT staffing and services company offering innovative business solutions to improve client business outcomes. We focus on high-impact technology solutions in ServiceNow, Fullstack, Cloud & Data, and AI/ML. Due to our globally expanding service offerings, we are seeking top talent to join our teams and grow with us.

Role: Lead/Architect Data Engineer
Location: Remote (US-based; anywhere in the US)
Duration: Contract

Job Summary
We are looking for a results-driven Lead Data Engineer (Contractor) to architect, develop, and guide the implementation of modern data pipelines and cloud-native analytics solutions. The ideal candidate will lead end-to-end delivery across engineering, analytics, and product teams, bringing deep experience in Databricks, PySpark, and the Azure cloud platform. This role also requires strong hands-on experience in Databricks architecture, administration, and performance optimization.

Key Responsibilities
- Lead the architecture, design, and development of scalable ETL/ELT pipelines using Databricks, PySpark, and SQL across distributed data environments.
- Architect and manage Databricks workspaces, including provisioning and maintenance of clusters, cluster policies, and job compute environments in accordance with enterprise standards.
- Collaborate with platform and infrastructure teams to define Databricks architecture strategy and ensure secure, scalable, and cost-effective implementation.
- Define and enforce cluster policies to ensure proper resource utilization, cost control, and access control based on workload patterns and team requirements.
- Lead performance tuning of Spark jobs, Databricks SQL queries, and notebooks, ensuring optimal execution and minimizing latency.
- Build modular, reusable Python libraries using Pandas, NumPy, and PySpark for scalable data processing.
- Develop optimized Databricks SQL queries and views to power:
  - Tableau dashboards
  - React and .NET-based applications
  - Ad-hoc and real-time analytics use cases
- Work closely with frontend and backend development teams to deliver use-case-specific, query-optimized datasets.
- Leverage Unity Catalog for fine-grained access control, data lineage, and metadata governance.
- Drive DevOps best practices using Azure DevOps, Terraform, and CI/CD automation pipelines.
- Mentor junior engineers and perform architectural reviews to ensure consistency and alignment with best practices.
Required Qualifications
- Deep hands-on experience with Databricks architecture, workspace administration, and cluster management.
- Experience defining and managing cluster policies, pools, and autoscaling strategies.
- Strong knowledge of Spark performance tuning and job optimization.
- Proven expertise in Databricks SQL, PySpark, Delta Lake, and large-scale data pipelines.
- Skilled in building reusable Python libraries with Pandas, openpyxl, XlsxWriter, and PySpark.
- Practical experience working with Unity Catalog for security and governance.
- Strong collaboration experience with front-end and back-end development teams, including back-end integration.
- Strong SQL expertise and hands-on experience with PostgreSQL, SQL Server, or similar.
- DevOps expertise with tools like Azure DevOps, Git, and pipeline automation.
- Excellent communication skills with the ability to lead discussions with cross-functional teams and stakeholders.
Tech Stack
- Big Data & Analytics: Databricks, PySpark, Delta Lake, Databricks SQL, Spark Connect, Delta Live Tables
- Programming & Frameworks: Python, Pandas, PySpark, Flask
- Visualization & BI: Tableau
- App Integration: React, .NET, REST APIs
- DevOps & CI/CD: Azure DevOps, Git
- Databases: Databricks SQL, Azure SQL DB, or similar
Location: Indiana
Apply for the job now!