The purpose of the DSSC Track Core Course is to introduce students of diverse backgrounds to the basic concepts and skills required for data analysis and numerical simulation. This should enable the students to take other advanced courses in the DSSC track, communicate across disciplinary boundaries, and do their thesis research either in data- or simulation-intensive fields of the natural and life sciences, or in the development of new DSSC methodologies for computer science and statistics.

More specifically, the course has the following goals: provide hands-on experience and scientific insight into different DSSC problems and methodologies; teach the vocabulary and conceptual framework that different fields developed to construct, analyze, and evaluate models; build a community of computational / data students through project work; and practice the following skills: handling data, extracting knowledge from data, creating models, running numerical simulations, identifying and understanding sources of error, working in mixed-background teams, and written and oral communication.

Target group: None

Prerequisites: This course is intended primarily for students who wish to do thesis-related research in groups that participate in the Data Science and Scientific Computation (DSSC) track; typically, these will be students with computer science, machine learning, statistics, physics, bioinformatics, or applied math backgrounds. Ideally, the students should have an interest in working with real or realistic (simulated) data, and background in the following two areas: (1) undergraduate mathematics: linear algebra, calculus, and probability; (2) basic procedural programming in a language of your choice, and the ability to understand C and Python code. The above prerequisites are detailed in the Appendix. The students should be familiar with the enumerated concepts, even if they don't remember all the details at this moment. By the beginning of the course, the students should be able to work with these concepts and answer (most of) the mock exam without having to consult an external source (too much).
The DSSC core course is not an introductory course in data analysis for life science students. It has been successfully taught to students with non-formal backgrounds, provided they were either very familiar with the required math but lacked much prior coding experience (e.g., only at the level of the programming service courses given at IST), or fluent in coding but lacking some of the required math background. Lacking one of the two prerequisites requires a substantially higher time investment from the student, but is feasible. We strongly advise, however, against taking the course with insufficient background in both math and coding.

Evaluation: Evaluation is based on homeworks and/or written or oral (mini-)project reports at the end of each segment. Segments are weighted equally.
The two segments with a data analysis and numerical simulation focus will usually be graded 50% from homeworks and 50% from a mini-project / extended homework at the end of the segment. Typically, each of these segments will contain around three small homeworks, to ensure that students regularly apply the methodologies learned in class and build up the scripts that they need to solve the mini-projects. At the end of each segment, students present the mini-projects and/or hand in short reports, as agreed with the instructor and the TAs.
The integrative / project-driven segment will be graded based on project idea, execution, participation, and presentations.

Teaching format: The course consists of three segments of approximately 4-5 weeks each, taught by three instructors. Each year, the course instructors will announce in advance which segments will be taught that year, making sure to balance the data analysis and numerical simulation aspects of the track representatively. Typically, this balance will be achieved by combining one segment focused on data analysis, one segment focused on numerical simulation, and a third, usually the last, segment focused on integrative and interactive project-driven work, with specific emphasis on visualization and presentation skills. In all segments, the emphasis is on dealing with data and computations in a hands-on fashion.

ECTS: 6

Year: 2019

Track segment(s):
DSSC-CORE Data Science and Scientific Computing - track core course

Instructor(s):
Bernd Bickel, Gasper Tkacik, Christopher Wojtan

Teaching assistant(s):
Michal Hledik
