Tech Lead Databricks Job at Robosoft Technologies
Tech Lead Databricks
Robosoft Technologies
2 weeks ago
- Mumbai, Maharashtra
- Not Disclosed
- Full-time
Job Summary
We are seeking a skilled Databricks Architect to design, implement, and optimize scalable data solutions within our cloud-based data platform. This role requires extensive knowledge of Databricks (Azure/AWS), data engineering, and a deep understanding of data architecture principles, with the ability to drive strategy, best practices, and hands-on implementation for high-performance data processing and analytics solutions.
Responsibilities:
- Solution Architecture: Design and architect end-to-end data solutions using Databricks on Azure/AWS, including data ingestion, processing, and storage.
- Delta Lake Implementation: Leverage Delta Lake and the Lakehouse architecture to create robust, unified data structures that support advanced analytics and machine learning (see the ingestion sketch after this list).
- Data Processing Development: Develop, design, and automate large-scale, high-performance data processing systems (batch and/or streaming) to drive business growth and enhance the product experience.
- Performance Tuning: Ensure optimal performance of data pipelines and workloads by implementing best practices for resource management, auto-scaling, and query optimization in Databricks.
- Engineering Best Practices: Advocate for high-quality software engineering practices in building scalable data infrastructure and pipelines.
- Architecture/Solution Development: Develop architectures and solutions for large data projects using Databricks.
- Project Leadership: Lead data engineering projects to ensure pipelines are reliable, efficient, testable, and maintainable.
- Data Modeling: Design data models optimized for storage, retrieval, and critical product and business requirements.
- Logging Architecture: Understand and influence logging to support data flow, implementing logging best practices as needed.
- Standardization and Tooling: Contribute to shared data engineering tools and standards to boost productivity and quality for data engineers across the company.
- Collaboration: Work closely with leadership, engineers, program managers, and data scientists to understand and meet data needs.
- Partner Education: Use data engineering expertise to identify gaps in, and improve, existing logging and processes for partners.
- Data Governance: Collaborate with stakeholders to build data lineage, data governance, and data cataloging using Unity Catalog.
- Agile Project Management: Lead projects using agile methodologies.
- Communication: Communicate effectively with stakeholders at all organizational levels.
- Team Development: Recruit, retain, and develop team members, preparing them for increased responsibilities and challenges.
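For illustration only, a minimal PySpark sketch of the kind of Delta Lake ingestion referenced above. It assumes a Databricks notebook (where `spark` is already provided); the source path, column name, and table name are hypothetical placeholders:

```python
# Minimal batch-ingestion sketch for a Delta Lake table (illustrative only).
# Assumes a Databricks notebook where `spark` is already defined; paths and
# table names below are placeholders, not values from this posting.
from pyspark.sql import functions as F

raw_path = "/mnt/raw/orders/"          # hypothetical landing zone with CSV drops
target_table = "main.sales.orders"     # hypothetical catalog.schema.table name

orders = (
    spark.read
    .option("header", "true")
    .option("inferSchema", "true")
    .csv(raw_path)
)

# Light cleanup plus an ingestion timestamp for downstream auditing.
# Assumes the source files contain an `order_id` column.
orders_clean = (
    orders
    .dropDuplicates(["order_id"])
    .withColumn("ingested_at", F.current_timestamp())
)

# Write as a managed Delta table; Delta is the default table format on Databricks.
(
    orders_clean.write
    .format("delta")
    .mode("overwrite")
    .saveAsTable(target_table)
)
```

A production pipeline would typically add schema enforcement, data-quality checks, and incremental loads (e.g., MERGE for CDC) rather than a full overwrite.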
Requirements:
- 10+ years of relevant industry experience.
- ETL Expertise: Skilled in custom ETL design, implementation, and maintenance.
- Data Modeling: Experience in developing and designing data models for reporting systems.
- Databricks Proficiency: Hands-on experience with Databricks SQL workloads.
- Data Ingestion: Expertise in ingesting data from offline files (e.g., CSV, TXT, JSON) as well as from APIs, databases, and CDC feeds; should have delivered such projects previously.
- Pipeline Observability: Skilled in setting up robust observability for complete pipelines and Databricks on Azure/AWS.
- Database Knowledge: Proficient in relational databases and SQL query authoring.
- Programming and Frameworks: Experience with Java, Scala, Spark, PySpark, Python, and Databricks.
- Cloud Platforms: Cloud experience required (Azure/AWS preferred).
- Data Scale Handling: Experience working with large-scale data.
- Pipeline Design and Operations: Proven experience in designing, building, and operating robust data pipelines.
- Performance Monitoring: Skilled in deploying high-performance pipelines with reliable monitoring and logging.
- Cross-Team Collaboration: Able to work effectively across teams to establish overarching data architecture and provide team guidance.
- ETL Optimization: Ability to optimize ETL pipelines to reduce data transfer and storage costs.
- Auto Scaling: Skilled in using Databricks SQL's auto-scaling feature to adjust worker numbers based on workload (a configuration sketch follows this list).
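As a rough illustration of workload-based scaling, the sketch below shows a job-cluster definition in the shape used by the Databricks Jobs API rather than a SQL warehouse; all runtime versions, node types, and worker counts are assumptions, and Databricks SQL warehouses expose an analogous min/max scaling setting:

```python
# Illustrative job-cluster spec with autoscaling (assumed values throughout).
# Databricks adds or removes workers between min_workers and max_workers
# based on load; SQL warehouses have a comparable min/max cluster setting.
job_cluster_spec = {
    "spark_version": "14.3.x-scala2.12",   # assumed runtime version
    "node_type_id": "Standard_DS3_v2",     # assumed Azure node type
    "autoscale": {
        "min_workers": 2,                  # floor for quiet periods
        "max_workers": 8,                  # ceiling for peak workloads
    },
}
```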
Tech Stack:
- Cloud Platform: Azure/AWS.
- Databricks (Azure/AWS): Databricks SQL Serverless, Databricks SQL, Databricks workspaces, Databricks notebooks, Databricks job scheduling, Data Catalog.
- Data Architecture: Delta Lake, Lakehouse concepts.
- Data Processing: Spark Structured Streaming (see the streaming sketch after this list).
- File Formats: CSV, Avro, Parquet.
- CI/CD: CI/CD for ETL pipelines.
- Governance Model: Databricks' unified governance model (Unity Catalog) across clouds, supporting open formats and APIs.
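To make the Structured Streaming item above concrete, here is a minimal sketch of an incremental CSV-to-Delta stream using Databricks Auto Loader; the paths, checkpoint location, and table name are hypothetical, and `spark` is assumed to come from a Databricks notebook:

```python
# Minimal Structured Streaming sketch (illustrative): incrementally load new CSV
# files from a landing path into a Delta table using Databricks Auto Loader.
# Assumes a Databricks notebook where `spark` is already defined.
source_path = "/mnt/raw/events/"                 # hypothetical landing zone
checkpoint_path = "/mnt/checkpoints/events/"     # stream progress + schema tracking
target_table = "main.analytics.events"           # hypothetical catalog.schema.table

events_stream = (
    spark.readStream
    .format("cloudFiles")                        # Auto Loader source (Databricks-only)
    .option("cloudFiles.format", "csv")
    .option("cloudFiles.schemaLocation", checkpoint_path)
    .option("header", "true")
    .load(source_path)
)

(
    events_stream.writeStream
    .option("checkpointLocation", checkpoint_path)
    .trigger(availableNow=True)                  # process the backlog, then stop
    .toTable(target_table)
)
```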
Experience Required: Minimum 10 years
Vacancy: 2 - 4 hires