Big Data Engineer Process Analytics Group Job in Tridiagonal Solutions Pvt. Ltd.
Big Data Engineer Process Analytics Group
- Pune, Pune Division, Maharashtra
- Not Disclosed
- Full-time
- Permanent
Must have skills:
- Engineer/analyst with a strong grasp of engineering and science, and strong mathematical, statistical, and analytical skills
- Willingness to travel to customer sites as projects require; the deployment location depends on the target customer
- Experience with AWS / Azure IIoT services such as AWS IoT Core, Greengrass, SiteWise, Azure IoT Hub, and others
- Expertise in big data processing/movement and in architecting data governance
- Designing and implementing highly performant data ingestion pipelines from multiple sources using Apache Spark and/or Azure Databricks / AWS Redshift / RDS
- Experience in building ETL / data warehouse transformation processes
- Delivering and presenting proofs of concept of key technology components to project stakeholders
- Hands-on experience designing and delivering solutions using the Azure Data Analytics platform (Cortana Intelligence Platform), including Azure Storage, Azure SQL Data Warehouse, Azure Data Lake, Azure Cosmos DB, and Azure Stream Analytics
- Developing scalable and reusable frameworks for ingesting geospatial data sets
- Integrating the end-to-end data pipeline that takes data from source systems to target data repositories, ensuring that data quality and consistency are maintained at all times
- Working with event-based / streaming technologies to ingest and process data
- Working with other members of the project team to support delivery of additional project components (API interfaces, Search)
- Evaluating the performance and applicability of multiple tools against customer requirements
- Working within an Agile delivery / DevOps methodology to deliver proofs of concept and production implementations in iterative sprints
- Strong knowledge of data management principles
- Direct experience of building data pipelines using Azure Data Factory and Apache Spark (preferably Databricks)
- Microsoft Azure Big Data Architecture certification
- Experience with Apache Kafka / NiFi for streaming and event-based data
- Experience with other open source big data products, e.g. Hadoop (incl. Hive, Pig, Impala)
- Experience with open source non-relational / NoSQL data repositories (incl. MongoDB, Cassandra, Neo4j)
- Studio Team Services, Chef, Puppet or Terraform
- At least 2 years of work on Databricks Cloud / Azure Databricks / Databricks Delta
- Highly experienced in big data solutions, particularly Hadoop and Spark
- Good understanding of Lambda Architecture
- Advanced expertise in at least one of the programming languages Java, Python, or Scala
- Experience integrating data from multiple data sources
- Knowledge of building real-time data processing solutions using Kafka and Spark Streaming
- Fair knowledge of NoSQL databases such as MongoDB, HBase, Cassandra, etc.
- Fair knowledge of any RDBMS and of data warehousing concepts
- Experience in translating the business vision into an ETL/ELT solution architecture
- Identifying solutions to key technical challenges in data integration projects
- Experience implementing the following tools across all platforms: Data Ingestion (like Spark/Storm etc.), Data Scheduling (like Apache), Data Processing (Hadoop YARN), Data Search & Index (Solr), Data Storage (Hadoop HDFS, Apache HBase, Kudu), Cluster & Configuration Management (Apache Ambari)
- Effective mining and analysis of data to build performance models and decision support frameworks such as predictive/diagnostic analytics
- Knowledge of databases and database languages such as SQL, MySQL, MongoDB, MariaDB
- Proficient with statistical (process) and time series / time-stamped data (DOE, manufacturing / Historian)
- Knowledge of creating data visualizations with Superset, Tableau, Zeppelin, etc.
- Knowledge of big data tools like Hive, HBase, ZooKeeper or Pig
Good to have skills:
- Hands-on experience with at least one deployable solution such as Seeq, Azure, or AWS for providing analytical solutions
- Innovative in presenting client demonstrations (case studies), upskilling the client-side team (Digitization), and training the centralized team on the developed use case
- Knowledge of development languages and statistical programming languages like R, Scala, PySpark and RSpark or Python
- Handling data from multiple sources and formats: sensors, logs, structured data from an RDBMS, video, text, etc.
- Ready to relocate as per the client's requirement, whether on-site or off-site, and to travel or make site visits as needed
- Ability to influence stakeholders from various disciplines and at different levels of seniority across the organization
- Experienced in understanding problems, collecting data, establishing facts and drawing valid conclusions
- Providing expertise in the deployment of high-end technological initiatives and digital solutions like historian, data analytics, BI, workflow automation tools, and Data Analytics (using partner platforms)
- Collaborating closely with the Service Delivery team and the Marketing team
- Successful track record in leading and delivering projects, large programs, and applications across companies and domains
- Results-oriented leadership and management skills with strong communication, people management and problem-solving skills
- Self-starter and quick learner with experience working in multi-cultural environments across different geographies
- Understanding of ML/AI model development and deployment
Fresher
2 - 4 Hires