Big Data Engineer
- Working knowledge of Python or other languages for building data pipelines and processing data
- Build high-performance algorithms, predictive models, and prototypes.
- Ensure that all systems meet business requirements and follow industry practices.
- Identify, design, and implement internal process improvements: automating manual processes, optimizing data delivery, re-designing infrastructure for greater scalability, etc.
- Build the infrastructure required for optimal extraction, transformation, and loading of data from a wide variety of data sources using SQL and ‘big data’ technologies.
- Build analytics tools that utilize the data pipeline to provide actionable insights into customer acquisition, operational efficiency and other key business performance metrics.
- Work with data and analytics experts to strive for greater functionality in our data systems.
- Build processes supporting data transformation, data structures, metadata, dependency and workload management.
- Proficient understanding of distributed computing principles
- Management of a Hadoop cluster, including all of its services
- Ability to solve any ongoing issues with operating the cluster
- Proficiency with Hadoop v2 or higher, MapReduce, HDFS
- Experience building stream-processing systems, using solutions such as Storm or Spark Streaming
- Good knowledge of Big Data querying tools, such as Pig, Hive, and Impala
- Experience with Spark
- Experience with integration of data from multiple data sources
- Experience with NoSQL databases, such as HBase, Cassandra, MongoDB
- Knowledge of various ETL techniques and frameworks, such as Flume