Tech Lead Databricks Job at Robosoft Technologies

Tech Lead Databricks

Job Summary

We are seeking a skilled Databricks Architect to design, implement, and optimize scalable data solutions within our cloud-based data platform. This role requires extensive knowledge of Databricks (Azure/AWS), data engineering, and a deep understanding of data architecture principles, with the ability to drive strategy, best practices, and hands-on implementation for high-performance data processing and analytics solutions.


Responsibilities:
  • Solution Architecture:
    • Design and architect end-to-end data solutions using Databricks and Azure/AWS, including data ingestion, processing, and storage.
  • Delta Lake Implementation:
    • Leverage Delta Lake and Lakehouse architecture to create robust, unified data structures that support advanced analytics and machine learning (see the sketch after this list).
  • Data Processing Development:
    • Develop, design, and automate large-scale, high-performance data processing systems (batch and/or streaming) to drive business growth and enhance the product experience.
  • Performance Tuning:
    • Ensure optimal performance of data pipelines and workloads by implementing best practices for resource management, auto-scaling, and query optimization in Databricks.
  • Engineering Best Practices:
    • Advocate for high-quality software engineering practices in building scalable data infrastructure and pipelines.
  • Architecture/Solution Development:
    • Develop architectures and solutions for large-scale data projects using Databricks.
  • Project Leadership:
    • Lead data engineering projects to ensure pipelines are reliable, efficient, testable, and maintainable.
  • Data Modeling:
    • Design data models optimized for storage, retrieval, and critical product and business requirements.
  • Logging Architecture:
    • Understand and influence logging to support data flow, implementing logging best practices as needed.
  • Standardization and Tooling:
    • Contribute to shared data engineering tools and standards to boost productivity and quality for Data Engineers across the company.
  • Collaboration:
    • Work closely with leadership, engineers, program managers, and data scientists to understand and meet data needs.
  • Partner Education:
    • Use data engineering expertise to identify gaps and improve existing logging and processes for partners.
  • Data Governance:
    • Collaborate with stakeholders to build data lineage, data governance, and data cataloging using Unity Catalog.
  • Agile Project Management:
    • Lead projects using agile methodologies.
  • Communication:
    • Communicate effectively with stakeholders at all organizational levels.
  • Team Development:
    • Recruit, retain, and develop team members, preparing them for increased responsibilities and challenges.
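
For illustration only (a sketch, not part of the role description): the Delta Lake and Lakehouse responsibilities above might look roughly like the following PySpark snippet. It assumes a Databricks workspace with Unity Catalog enabled, where `spark` is preconfigured; the `main.sales.orders` table name is hypothetical.

```python
# Minimal Delta Lake sketch. Assumes a Databricks runtime (or any Spark
# session with Delta Lake configured); catalog, schema, and table names
# below are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # already provided on Databricks

orders = spark.createDataFrame(
    [(1, "2024-01-05", 120.0), (2, "2024-01-06", 75.5)],
    ["order_id", "order_date", "amount"],
)

# Persist as a managed Delta table in the Lakehouse.
orders.write.format("delta").mode("overwrite").saveAsTable("main.sales.orders")

# Delta tables support ACID upserts via MERGE, central to Lakehouse designs.
updates = spark.createDataFrame([(2, "2024-01-06", 80.0)], orders.columns)
updates.createOrReplaceTempView("order_updates")
spark.sql("""
    MERGE INTO main.sales.orders AS t
    USING order_updates AS s
    ON t.order_id = s.order_id
    WHEN MATCHED THEN UPDATE SET *
    WHEN NOT MATCHED THEN INSERT *
""")
```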

Requirements:
  • 10+ years of relevant industry experience.
  • ETL Expertise:
    • Skilled in custom ETL design, implementation, and maintenance.
  • Data Modeling:
    • Experience in developing and designing data models for reporting systems.
  • Databricks Proficiency:
    • Hands-on experience with Databricks SQL workloads.
  • Data Ingestion:
    • Expertise in ingesting data from offline files (e.g., CSV, TXT, JSON) as well as from APIs, databases, and CDC feeds; prior hands-on delivery of such projects is expected (see the ingestion sketch after this list).
  • Pipeline Observability:
    • Skilled in setting up robust observability across end-to-end pipelines and Databricks workloads on Azure/AWS.
  • Database Knowledge:
    • Proficient in relational databases and SQL query authoring.
  • Programming and Frameworks:
    • Experience with Java, Scala, Spark, PySpark, Python, and Databricks.
  • Cloud Platforms:
    • Cloud experience required (Azure/AWS preferred).
  • Data Scale Handling:
    • Experience working with large-scale data.
  • Pipeline Design and Operations:
    • Proven experience in designing, building, and operating robust data pipelines.
  • Performance Monitoring:
    • Skilled in deploying high-performance pipelines with reliable monitoring and logging.
  • Cross-Team Collaboration:
    • Able to work effectively across teams to establish overarching data architecture and provide team guidance.
  • ETL Optimization:
    • Ability to optimize ETL pipelines to reduce data transfer and storage costs.
  • Auto Scaling:
    • Skilled in using Databricks SQL's auto-scaling feature to adjust the number of workers based on workload.
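
As a rough sketch of the file-based and incremental ingestion patterns listed above (paths, schema, and table names are hypothetical; assumes a Databricks runtime, since Auto Loader's `cloudFiles` source is Databricks-specific):

```python
# Ingestion sketch: batch load of offline CSV files, plus incremental pickup
# of new files with Databricks Auto Loader. All paths and names hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, StringType, DoubleType

spark = SparkSession.builder.getOrCreate()  # preconfigured on Databricks

schema = StructType([
    StructField("customer_id", StringType()),
    StructField("txn_amount", DoubleType()),
])

# One-off batch ingestion of offline files (JSON/TXT readers are analogous).
batch_df = (
    spark.read.option("header", "true")
    .schema(schema)
    .csv("/mnt/landing/transactions/*.csv")
)
batch_df.write.format("delta").mode("append").saveAsTable("bronze.transactions")

# Incremental ingestion: Auto Loader tracks which files it has already seen.
stream_df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "csv")
    .option("header", "true")
    .schema(schema)
    .load("/mnt/landing/transactions/")
)
(
    stream_df.writeStream
    .option("checkpointLocation", "/mnt/checkpoints/transactions")
    .toTable("bronze.transactions")
)
```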

Tech Stack:
  • Cloud Platform:
    • Azure/AWS.
  • Databricks (Azure/AWS):
    • Databricks SQL Serverless, Databricks SQL, Databricks workspaces, Databricks notebooks, Databricks job scheduling, Data Catalog.
  • Data Architecture:
    • Delta Lake, Lakehouse concepts.
  • Data Processing:
    • Spark Structured Streaming.
  • File Formats:
    • CSV, Avro, Parquet.
  • CI/CD:
    • CI/CD for ETL pipelines (see the test sketch after this list).
  • Governance Model:
    • Databricks' unified governance model (Unity Catalog) across clouds, supporting open formats and APIs.
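
One hedged illustration of "CI/CD for ETL pipelines": factoring transformation logic into pure functions and unit-testing them with pytest against a local Spark session, so a CI job can gate deployments. The function and column names below are hypothetical.

```python
# CI/CD sketch: a unit-testable transformation plus a pytest check that a
# pipeline's CI job could run before deployment. Names are hypothetical.
import pytest
from pyspark.sql import DataFrame, SparkSession
from pyspark.sql import functions as F


def daily_revenue(orders: DataFrame) -> DataFrame:
    """Aggregate order amounts per day; a pure function, easy to test in CI."""
    return orders.groupBy("order_date").agg(F.sum("amount").alias("revenue"))


@pytest.fixture(scope="module")
def spark():
    # Small local session; no cluster needed for CI.
    return SparkSession.builder.master("local[1]").appName("ci").getOrCreate()


def test_daily_revenue(spark):
    orders = spark.createDataFrame(
        [("2024-01-05", 10.0), ("2024-01-05", 5.0), ("2024-01-06", 7.5)],
        ["order_date", "amount"],
    )
    result = {r["order_date"]: r["revenue"] for r in daily_revenue(orders).collect()}
    assert result == {"2024-01-05": 15.0, "2024-01-06": 7.5}
```
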
Experience Required:

Minimum 10 Years

Vacancies:

2-4 Hires
