Big Data Engineer

2 weeks ago


Riyadh, Saudi Arabia Insights Advisory Full time

**Job Summary**:
We are looking for a Data Engineer with in-depth experience in Cloudera, Informatica, and Alteryx to design, implement, and manage robust data engineering solutions. In this technical role, you will work with large-scale data processing systems, build high-performance ETL pipelines, and ensure the smooth integration of data from multiple sources. This position requires proficiency in big data technologies, data integration platforms, and automation tools, along with a strong ability to optimize workflows for performance and scalability.

**Key Responsibilities**:
1. Design, implement, and optimize data pipelines for batch and real-time data processing using Cloudera (Hadoop, Hive, Spark, Impala) and Informatica (PowerCenter, Cloud Data Integration).

2. Build data extraction, transformation, and loading (ETL) workflows using Informatica PowerCenter for large-scale data integration from source systems (e.g., relational databases, flat files, APIs) into Cloudera Data Lake or data warehouse environments.

3. Implement Spark jobs on Cloudera for distributed data processing and optimization of data workflows.

4. Leverage Informatica for orchestrating ETL workflows, including data extraction, cleansing, transformation, and loading into data repositories (HDFS, Hive, SQL databases, etc.).

5. Create Alteryx workflows to automate data preparation, cleansing, and transformation, making data available for downstream analysis or reporting.

6. Leverage Alteryx's native connectors to integrate with external data sources (e.g., SQL databases, APIs, cloud services).

7. Optimize the Informatica and Alteryx workflows to minimize runtime, ensure smooth data integration, and maintain high data quality.

8. Utilize Hadoop and Spark on Cloudera to process large datasets and implement data transformations using MapReduce, Spark SQL, and PySpark.

9. Leverage Impala for low-latency SQL queries on Hadoop, ensuring real-time access to processed data.

10. Implement partitioning, bucketing, and indexing strategies in Hive and HBase to improve query performance on large datasets.

11. Implement and enforce data quality rules within Informatica and Alteryx workflows, ensuring that all transformations meet the required standards for completeness, consistency, and accuracy.

12. Ensure compliance with data governance and security protocols (e.g., encryption, masking, access control) in accordance with industry best practices.

13. Automate and schedule ETL workflows using Informatica and Alteryx Server, integrating with Airflow, NiFi, or other workflow orchestration tools for scheduling and monitoring jobs.

14. Utilize Cloudera Navigator for monitoring and auditing data processes within the Hadoop ecosystem.

15. Perform regular tuning of the ETL pipelines, data flows, and SQL queries to ensure optimal performance.

**Required Qualifications**:
1. Education: Degree in Computer Science or a related field.

2. Years of experience: 4+

3. Cloudera Platform Experience: Proven experience with the Cloudera Distribution of Hadoop (CDH), including expertise in HDFS, Hive, Impala, Spark, and HBase.

4. Informatica Expertise: Strong hands-on experience with Informatica PowerCenter (ETL), EDC, IDQ, B2B, and Axon.

5. Alteryx Expertise: Proficiency in developing and automating data workflows using Alteryx Designer and Alteryx Server for end-to-end data transformation, integration, and reporting automation.

6. Big Data & ETL Knowledge: Deep understanding of ETL best practices, data pipelines, and distributed computing technologies such as Spark, MapReduce, PySpark, and Hadoop ecosystem components.

7. SQL Proficiency: Advanced SQL skills for data manipulation, aggregation, optimization, and reporting across relational and non-relational data stores (e.g., SQL Server, MySQL, PostgreSQL, Hive, Impala).

8. Programming Skills: Experience in Python and SQL.

9. Data Warehousing: Strong background in data warehousing principles and data modeling, including dimensional modeling (star schema, snowflake schema) and OLAP/OLTP considerations.

