KAFKA AZURE DATA LAKE ENGINEER WITH SERVICENOW EXPERIENCE (W2) - Virisha LLC

Remote, USA | Full-time | Posted 2025-11-02

Kafka and Data Lake Engineer

100% REMOTE

Long term (12-month contract)

A Kafka and Data Lake Engineer is a data engineer who designs, builds, and manages data infrastructure using Apache Kafka for real-time data streaming and a data lake for storing large volumes of data. This role is vital for organizations that need to process and analyze both real-time streaming data and historical data to gain insights. For the VA, this includes streaming ServiceNow data into the data lake via the Kafka bus.

Responsibilities

Design data pipelines: Build robust, scalable, and secure data pipelines to ingest, process, and move data from various sources into the data lake using Kafka.

Administer Kafka clusters: Deploy, configure, and maintain Kafka clusters and related ecosystem tools, such as Kafka Connect and Schema Registry, ensuring high availability and performance (a topic-administration sketch follows this list).

Manage the data lake: Oversee the architecture and governance of the data lake, including managing data storage (e.g., in AWS S3 or ADLS), security, and metadata.

Develop data processing applications: Create producers and consumers to interact with Kafka topics using programming languages like Python, Java, or Scala (a minimal producer/consumer sketch follows this list).

Perform stream processing: Use tools like Kafka Streams, Apache Flink, or ksqlDB to perform real-time data transformations and analytics (a consume-transform-produce sketch follows this list).

Ensure data quality and security: Implement data quality checks, manage data lineage, and enforce security controls such as encryption, access controls (ACLs), and compliance (e.g., GDPR).

Monitor and troubleshoot: Set up monitoring and alerting for Kafka and data lake infrastructure and respond to incidents to ensure operational reliability.

Collaborate with teams: Work closely with data scientists, analysts, and other engineering teams to understand data requirements and deliver reliable data solutions.
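As a minimal sketch of routine topic administration, assuming the confluent-kafka Python client and a local broker; the topic name, partition count, and replication factor are illustrative placeholders:

```python
# Hypothetical topic-creation sketch using confluent-kafka's AdminClient.
from confluent_kafka.admin import AdminClient, NewTopic

admin = AdminClient({"bootstrap.servers": "localhost:9092"})

# create_topics() returns a dict mapping topic name -> future.
futures = admin.create_topics(
    [NewTopic("servicenow.incidents", num_partitions=6, replication_factor=3)]
)
for topic, future in futures.items():
    try:
        future.result()  # Raises on failure (e.g., topic already exists)
        print(f"Created topic {topic}")
    except Exception as exc:
        print(f"Failed to create {topic}: {exc}")
```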
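A minimal producer/consumer sketch, again assuming the confluent-kafka Python client; the broker address, topic, key, and consumer group are illustrative placeholders:

```python
import json
from confluent_kafka import Producer, Consumer

conf = {"bootstrap.servers": "localhost:9092"}

# Producer: publish a JSON-encoded record to a topic.
producer = Producer(conf)
producer.produce("servicenow.incidents", key="INC0012345",
                 value=json.dumps({"state": "open", "priority": 2}))
producer.flush()  # Block until delivery is confirmed

# Consumer: read records back from the same topic.
consumer = Consumer({**conf, "group.id": "datalake-ingest",
                     "auto.offset.reset": "earliest"})
consumer.subscribe(["servicenow.incidents"])
msg = consumer.poll(timeout=5.0)
if msg is not None and msg.error() is None:
    print(msg.key(), json.loads(msg.value()))
consumer.close()
```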
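Kafka Streams is a JVM library and ksqlDB is SQL-based, so as a language-neutral sketch of the same consume-transform-produce pattern, here is a hand-rolled Python loop; the topic names, group id, and filter rule are illustrative placeholders:

```python
import json
from confluent_kafka import Consumer, Producer

consumer = Consumer({"bootstrap.servers": "localhost:9092",
                     "group.id": "priority-filter",
                     "auto.offset.reset": "earliest"})
producer = Producer({"bootstrap.servers": "localhost:9092"})
consumer.subscribe(["servicenow.incidents"])

while True:
    msg = consumer.poll(timeout=1.0)
    if msg is None or msg.error():
        continue
    record = json.loads(msg.value())
    # Transform: forward only high-priority incidents to a derived topic.
    if record.get("priority", 5) <= 2:
        producer.produce("incidents.high-priority", key=msg.key(),
                         value=json.dumps(record))
        producer.poll(0)  # Serve delivery callbacks
```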

Essential skills and qualifications

Experience: Proven experience designing and managing data platforms with Apache Kafka and big data technologies.

Programming: Strong proficiency in languages like Python, Java, or Scala.

Big data technologies: Expertise in big data processing frameworks, such as Apache Spark and Apache Flink.

Cloud platforms: Hands-on experience with cloud environments (AWS, Azure, or Google Cloud Platform) and relevant services like S3, Glue, or Azure Data Lake Storage.

Data lake architecture: A solid understanding of data lake design principles, including storage formats (e.g., Delta Lake, Apache Iceberg), data modeling, and governance (see the Delta Lake write sketch after this list).

Databases: Experience with various database systems, including both SQL and NoSQL.

Infrastructure management: Familiarity with infrastructure-as-code tools like Terraform or Ansible and containerization with Docker and Kubernetes.
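A minimal sketch of landing records in a Delta Lake table, assuming the open-source deltalake (delta-rs) Python package and pandas; the ADLS URI, account name, and schema are illustrative placeholders, and credentials would come from the environment or storage_options in practice:

```python
import pandas as pd
from deltalake import write_deltalake

# A small batch of ServiceNow-style records (illustrative schema).
df = pd.DataFrame({"number": ["INC0012345"],
                   "state": ["open"],
                   "priority": [2]})

# Append the batch to a Delta table in Azure Data Lake Storage.
write_deltalake(
    "abfss://lake@myaccount.dfs.core.windows.net/servicenow/incidents",
    df,
    mode="append",
    storage_options={"azure_storage_account_name": "myaccount"},
)
```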

Professionals in this field can advance from entry-level data engineering positions to senior roles, and then to Big Data Architect or Solutions Architect positions, where they oversee large-scale data infrastructure.

Relevant certifications

Pursuing certifications can validate your expertise and boost your career.

For Kafka:

Confluent Certified Administrator for Apache Kafka (CCAAK)

Confluent Certified Developer for Apache Kafka (CCDAK)

For Data Lake and Cloud:

Databricks Certified Data Engineer

AWS Certified Data Engineer

Microsoft Certified: Azure Data Engineer Associate


