Job description

Data Engineer - Senior Associate

Job Category:   Analytics|Information Technology
Line of Service:   Assurance
Location(s):   FL-Tampa
Travel Requirements:   0-20%
Level:   Senior Associate
Job ID:   103236BR
PwC/LOS Overview
PwC is a network of firms committed to delivering quality in assurance, tax and advisory services.

We help resolve complex issues for our clients and identify opportunities.

At PwC, we develop leaders at all levels. The distinctive leadership framework we call the PwC Professional provides our people with a road map to grow their skills and build their careers. Our approach to ongoing development shapes employees into leaders, no matter the role or job title.

Are you ready to build a career in a rapidly changing world? Developing as a PwC Professional means that you will be ready to create and capture opportunities to advance your career and fulfill your potential.

What will you do if you work in Assurance at PwC?
You'll ask questions and test assumptions. You'll help determine if companies are reporting information that investors and others can rely on. You'll help businesses solve complex issues faced by management and boards. You'll serve the public interest and the capital markets by conducting quality audits.

The world is changing quickly, and PwC is adapting just as quickly. We're capitalizing on trends that will impact corporate reporting.

Our focus is on globalization, technology, sustainability and environmental reporting, population shifts and regulation. We combine skills and experience to help our clients address their challenges.

Job Description
The Assurance Innovation group is developing capabilities that leverage the latest open-source technologies to automate and accelerate our client engagements across the enterprise. We are focused on incorporating the latest in machine learning, Big Data, NoSQL, cutting-edge development languages, and advanced data processing techniques for both structured and unstructured information, all within a loosely coupled ecosystem that delivers a technology platform positioning PwC for the future.

As a Data Engineer, you will work on a team with Data Scientists, Software Engineers and Product Managers to drive innovative technical solutions into the practice.

Data Engineers focus on the design and build-out of data models, the codification of business rules, the mapping of structured and unstructured data sources to those models, the engineering of scalable ETL pipelines, the development of data quality solutions, and the continuous evaluation of new technologies to enhance the capabilities of the Data Engineering team and the broader Innovation group.
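To give candidates a concrete feel for this kind of work, here is a minimal sketch of an ETL step in Python. The feed, field names, and cleansing rules are invented for the example, and a plain list stands in for a warehouse target:

```python
import csv
import io

# Hypothetical raw feed: extraction often starts from a CSV export.
RAW = """id,amount,currency
1, 100.50 ,usd
2,,usd
3, 75.00 ,eur
"""

def extract(text):
    """Parse the raw CSV into dictionaries (the 'extract' step)."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Cleanse and normalize: strip whitespace, drop rows missing an
    amount, and upper-case currency codes (the 'transform' step)."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # data-quality rule: amount is required
        cleaned.append({
            "id": int(row["id"]),
            "amount": float(amount),
            "currency": row["currency"].strip().upper(),
        })
    return cleaned

def load(rows, target):
    """Append cleaned rows to a target store (the 'load' step)."""
    target.extend(rows)
    return target

warehouse = []
load(transform(extract(RAW)), warehouse)
print(warehouse)
```

In production the same extract/transform/load shape would be scheduled and scaled by the pipeline tooling listed below, with the transform rules driven by documented data quality requirements.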

Position/Program Requirements
Minimum Year(s) of Experience: 2.5

Minimum Degree Required: Bachelor's degree

Knowledge Preferred:

Demonstrates thorough knowledge and/or a proven record of success in the following areas:

- Python and experience with data extraction, data cleansing and data wrangling;

- SQL and experience with relational databases;

- Codification of business rules (analytics) in one of the programming languages listed above;

- Working with business teams to capture and define data models and data flows to enable downstream analytics;

- Data modeling, data mapping, data governance and the processes and technologies commonly used in this space;

- Data integration tools (e.g. Talend, SnapLogic, Informatica) and data warehousing / data lake tools;

- Systems development life cycles such as Agile and Scrum methodologies; and,

- API based data acquisition and management.
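As an illustration of the "codification of business rules" item above (the rule, threshold, and field names here are invented for the example), codifying a rule typically means turning a written policy into a small, testable function:

```python
from datetime import date

# Hypothetical policy: flag any transaction over a materiality
# threshold, or any transaction posted on a weekend, for review.
MATERIALITY_THRESHOLD = 10_000  # assumed threshold for the example

def needs_review(amount, posted_on):
    """Return True when a transaction matches either review rule."""
    over_threshold = amount > MATERIALITY_THRESHOLD
    on_weekend = posted_on.weekday() >= 5  # 5 = Saturday, 6 = Sunday
    return over_threshold or on_weekend

print(needs_review(15_000, date(2023, 3, 1)))  # over the threshold
print(needs_review(500, date(2023, 3, 4)))     # posted on a Saturday
print(needs_review(500, date(2023, 3, 1)))     # neither rule fires
```

Expressing the rule as code rather than prose makes it repeatable across millions of records and directly unit-testable.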

Skills Preferred:

Demonstrates thorough abilities and/or a proven record of success in the following areas:

- Object-oriented and functional scripting languages such as Python, R, C/C++, Java, Scala, etc.;

- Relational SQL, distributed SQL and NoSQL databases, including but not limited to:

  - MSSQL, PostgreSQL, MySQL;

  - MemSQL, CrateDB;

  - MongoDB, Cassandra;

  - Neo4j, AllegroGraph, ArangoDB;

- Big data tools such as Hadoop, Spark, Kafka, etc.;

- Data modeling tools such as ERWin, Enterprise Architect, Visio, etc.;

- Data integration tools such as Talend, Informatica, SnapLogic, etc.;

- Data pipeline and workflow management tools such as Azkaban, Luigi, Airflow, etc.;

- Business Intelligence Tools such as Tableau, PowerBI, Zoomdata, Pentaho, etc.;

- Cloud technologies such as SaaS, IaaS and PaaS within Azure, AWS or Google and the associated data pipeline tools;

- Linux and proven comfort level with bash scripting; and,

- Docker, Puppet, and agile development processes.

Demonstrates thorough abilities and/or a proven record of success in the following areas:

- Building enterprise data pipelines and the ability to craft code in SQL, Python, and/or R;

- Building batch data pipeline with relational and columnar database engines as well as Hadoop or Spark, and understanding their respective strengths and weaknesses;

- Building scalable and performant data models;

- Possessing computer science fundamentals: data structures, algorithms, programming languages, distributed systems, and information retrieval;

- Presenting technical and non-technical information to various audiences;

- Working with large data sets and deriving insights from data using various BI and data analytics tools;

- Thinking outside of the box to solve complex business problems;

- Understanding of the security requirements for handling data both in motion and at rest, such as communication protocols, encryption, authentication, and authorization;

- Understanding of Graph databases and graph modeling; and,

- Understanding of the requirements of data science teams.
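As a small, hedged sketch of the data modeling and SQL skills listed above (the schema and values are invented for the example), here is a star-schema fragment built with Python's built-in sqlite3 module, where an in-memory database stands in for a warehouse:

```python
import sqlite3

# A star schema keeps a narrow fact table keyed to descriptive
# dimension tables; an in-memory SQLite database stands in for a
# warehouse here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_client (
    client_id   INTEGER PRIMARY KEY,
    client_name TEXT NOT NULL
);
CREATE TABLE fact_engagement (
    engagement_id INTEGER PRIMARY KEY,
    client_id     INTEGER NOT NULL REFERENCES dim_client(client_id),
    hours         REAL NOT NULL
);
INSERT INTO dim_client VALUES (1, 'Acme Corp'), (2, 'Globex');
INSERT INTO fact_engagement VALUES
    (10, 1, 120.0), (11, 1, 80.0), (12, 2, 40.0);
""")

# Downstream analytics join facts back to their dimensions.
rows = conn.execute("""
    SELECT c.client_name, SUM(f.hours) AS total_hours
    FROM fact_engagement f
    JOIN dim_client c USING (client_id)
    GROUP BY c.client_name
    ORDER BY total_hours DESC
""").fetchall()
print(rows)
```

The same dimensional pattern scales up to the relational, columnar, and distributed engines named earlier; what changes is the engine and the volume, not the modeling discipline.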
