Content and procedure
As the diversity and volume of data grow, so do the challenges of processing and managing (big) data. This workshop gives an overview of the resulting job profile and typical activities of a data engineer, and introduces the concepts and processes of Big Data starting from the classical world of the Data Warehouse and structured data. Participants first become familiar with the terminology, the underlying technologies, and possible application areas of Data Warehousing and Business Intelligence, drawing on examples from their daily work. Hands-on exercises build a solid basic understanding of how SQL is used in this context.
In addition to clarifying traditional data organizations and models, the workshop discusses the limits of these approaches and introduces Big Data processing and architectures. It is aimed at employees and managers from the classic areas of data warehousing, BI, and data engineering, and provides an introduction to both traditional and new methods for modeling and managing structured and unstructured data.
- Speaker: Simon Kaltenbacher
- Language: English
- 19 March 2019
- 10:00 – 17:15
- Sofitel Hotel Munich Bayerpost, Bayerstraße 12, 80335 Munich, Germany
- Insight into the world of a data engineer
- Introduction to the management and modeling of structured and unstructured data
- Hands-on exercises provide a basic understanding of SQL
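As a taste of the hands-on part, here is a minimal sketch of the kind of SQL used in a warehouse context, run from Python against an in-memory SQLite database. The table and column names are purely illustrative and not taken from the workshop material:

```python
import sqlite3

# In-memory database with a tiny illustrative fact table
# (table and column names are made up for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sales (
        region  TEXT,
        product TEXT,
        amount  REAL
    )
""")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("South", "A", 100.0), ("South", "B", 50.0), ("North", "A", 75.0)],
)

# A typical warehouse-style aggregation: total revenue per region.
rows = conn.execute("""
    SELECT region, SUM(amount) AS revenue
    FROM sales
    GROUP BY region
    ORDER BY revenue DESC
""").fetchall()

for region, revenue in rows:
    print(region, revenue)
```

The `GROUP BY` aggregation shown here is the bread and butter of reporting queries against a Data Warehouse.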
Simon Kaltenbacher is Head of Technology at Alexander Thamm GmbH, where he advises customers on building data platforms and supports them in implementing data pipelines. He has followed the Apache Spark project closely since version 0.9 and has given several training courses and talks on the technology.
- Basic concepts around data
- Introduction to data models and databases
- Data Warehouse Architecture – Layers, Components and Characteristics
- SQL for Data Warehousing
- Normalized and denormalized data models
- Introduction to Big Data
- How many V’s?
- Limits of traditional approaches
- Concepts (ACID, BASE, CAP)
- Data models (Key-Value, Document, Column-based, Graph)
- Batch processing models
- Hadoop, Map-Reduce, HDFS
- Pipelining with Spark
- Data Lake architectures
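The MapReduce model from the agenda can be sketched without a Hadoop cluster: a map phase emits key-value pairs, a shuffle groups them by key, and a reduce phase aggregates each group. The classic word-count example, here as a toy illustration of the model in plain Python (not Hadoop itself):

```python
from collections import defaultdict

def map_phase(documents):
    # Map: emit a (word, 1) pair for every word in every document.
    for doc in documents:
        for word in doc.split():
            yield word, 1

def shuffle(pairs):
    # Shuffle: group all emitted values by key, as the framework would.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    # Reduce: sum the counts for each word.
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["big data big plans", "big data"]
counts = reduce_phase(shuffle(map_phase(docs)))
print(counts)  # {'big': 3, 'data': 2, 'plans': 1}
```

In a real Hadoop job the three phases run distributed across a cluster, with HDFS holding the input and output; the programming model, however, is exactly this map-shuffle-reduce shape.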
Basic knowledge of statistics, mathematics, and computer science is required. A laptop is needed for the practical exercises.