Course overview

SF2935 MODERN METHODS OF STATISTICAL LEARNING

KTH Mathematics  


Modern Methods of Statistical Learning Theory SF2935

The aim of the course is to introduce some of the basic algorithms and methods of statistical learning theory at an intermediate level. These are essential tools for making sense of the vast and complex data sets (cf. big data) that have emerged in recent decades in fields ranging from biology to marketing to astrophysics. The course presents some of the most important modeling and prediction techniques, along with some relevant applications. Topics include classification, artificial neural networks with exponential families of distributions, Bayesian learning, resampling methods, tree-based methods, clustering, and high-dimensional data. This covers a good part of the background required for a career in data analytics. The course is taught and examined in English.

Recommended prerequisites:

  • SF 1901 or equivalent course of the type 'a first course in probability and statistics (for engineers)'
  • Multivariate normal distribution
  • Basic differential and integral calculus, basic linear algebra.
  • Proficiency in R (optional)

Lecturers:

  • Timo Koski (examiner) homepage and contact information

  • Daniel Berglund email

  • Jimmy Olsson email

  • Tetyana Pavlenko email

Course literature:

    • G. James, D. Witten, T. Hastie, R. Tibshirani: An Introduction to Statistical Learning (abbreviated ISL below), Springer; web page for the book by the publisher
    • some sections of: Avrim Blum, John Hopcroft and Ravindran Kannan: Foundations of Data Science pdf from the authors
    • Supplementary reading and material from the lectures web page

    The textbook ISL can be bought at THS Kårbokhandel, Drottning Kristinas väg 15-19.


    Examination:

    • Computer homework (3.0 cu): there are two compulsory computer projects, each to be submitted as a written report. Each report should be produced by a group of two (2) students. The reports are examined at the Project presentation seminars on TBA of November and TBA of December, 2017. The computer homework is graded Pass/Fail.
    • There will be a written exam (4.5 cu), consisting of five (5) assignments, on Thursday 11th of January, 2018, 08.00-13.00 hrs.

    • Bonus for summaries of the guest lectures and papers: An individually written summary (max. 2 A4 pages) of the scientific content of a guest lecture (2 x EA, LK, SV) earns one (1) bonus point for the exam. In addition, bonus points can be gained by written summaries of at most two scientific articles (TBA). Each summary is expected to be based on the student's own notes taken during the lecture or while reading the paper. The summaries must be submitted by the deadline, Fri 16th of December at 15 hrs. The bonus points are valid for the ordinary exam on Thursday 11th of January, 2018, and for the re-examination on (TBA). The maximum number of bonus points that can be gained is five (5).


    • Important: Students who are admitted to the course and who intend to attend it need to activate themselves in Rapp. Log in there using your KTH-id and click on "activate" (aktivera). The codename for SF2935 in Rapp is statin17.
      Registration for the written examination via "mina sidor"/"my pages" is required.
      Grades are set according to the quality of the written examination. Grades are given in the range A-F, where A is the best and F means failed. Fx means that you have the right to a complementary examination (to reach the grade E). The criteria for Fx are a grade F on the exam and that an isolated part of the course can be identified where you have shown a particular lack of knowledge, such that the grade E can be given after a complementary examination on this part.

    • Supervision for computer projects
      Teaching assistant Daniel Berglund will be available for advice and supervision for computer projects at times to be announced.

      Plan of lectures
      KTH Social .
      (TK = Timo Koski, JO = Jimmy Olsson, TP = Tetyana Pavlenko, DB = Daniel Berglund, EA = Erik Aurell, LK = Lukas Käll, SV = Sara Väljamets, ISL = the textbook, FoDSc = Foundations of Data Science)

      The addresses of the lecture halls and guiding instructions are found by clicking on the Hall links below


      | Day | Date | Time | Hall | Topic | Lecturer |
      |-----|------|------|------|-------|----------|
      | Tue | 31/10 | 13-15 | Q2 | Lecture 1: Introduction to statistical learning (perceptrons, feedforward neural nets) and the course work. Introduction to computer projects. Chapter 2 in ISL. | TK |
      | Thu | 02/11 | 08-10 | Q2 | Lecture 2: Supervised Learning Part I. Chapter 4 in ISL. | TP |
      | Fri | 03/11 | 10-12 | Q2 | Lecture 3: Supervised Learning Part II. Chapter 4 in ISL. | TP |
      | Tue | 07/11 | 14-16 | Q2 | Lecture 4: Bootstrap. | TP |
      | Thu | 09/11 | 08-10 | Q2 | Lecture 5: Introduction to R in a computer class. Chapter 2 in ISL. | DB |
      | Fri | 10/11 | 10-12 | Q2 | Lecture 6: Feedforward neural networks as statistical models I. Handouts. | TK |
      | Tue | 14/11 | 13-15 | Q2 | Lecture 7: Feedforward neural networks as statistical models II; Support vector machines (SVM) I. Chapter 9 in ISL. | TK |
      | Thu | 16/11 | 08-10 | Q2 | Lecture 8: SVM II. Chapter 9 in ISL. | TK |
      | Fri | 17/11 | 08-10 | Q2 | Lecture 9: Bayesian Learning I. Handouts. | TK |
      | Tue | 21/11 | 13-15 | D3 | Lecture 10: Project presentation seminar 1. | TK |
      | Thu | 23/11 | 08-10 | Q2 | Lecture 11: Bayesian Learning II. Handouts. | TK |
      | Fri | 24/11 | 10-12 | E3 | Lecture 12: Guest Lecture: TBA. | SV |
      | Tue | 28/11 | 13-15 | E3 | Lecture 13: Unsupervised learning part I. Chapter 10 in ISL. | TK |
      | Thu | 30/11 | 08-10 | Q2 | Lecture 14: Unsupervised learning part II. Chapter 10 in ISL. | TK |
      | Fri | 01/12 | 10-12 | E3 | Lecture 15: Guest Lecture: An insight into computational and statistical mass spectrometry-based proteomics. | LK |
      | Tue | 05/12 | 13-15 | E3 | Lecture 16: Random Trees and Classification. Chapter 8 in ISL. | JO |
      | Thu | 07/12 | 08-10 | Q2 | Lecture 17: Geometry of High-Dimensional Spaces, Gaussians in High Dimensions, Johnson-Lindenstrauss Lemma, Separating Gaussians, Part I. Chapter 2 in FoDSc. | TK |
      | Fri | 08/12 | 10-12 | Q2 | Lecture 18: Guest Lecture: Inferring protein structures from many protein sequences I. | EA |
      | Tue | 12/12 | 13-15 | E3 | Lecture 19: Guest Lecture: Inferring protein structures from many protein sequences II. | EA |
      | Fri | 14/12 | 08-10 | Q2 | Lecture 20: Geometry of High-Dimensional Spaces, Gaussians in High Dimensions, Johnson-Lindenstrauss Lemma, Separating Gaussians, Part II. Chapter 2 in FoDSc. | TK |
      | Fri | 16/12 | 10-12 | E51 | Lecture 21: Project presentation seminar 2. | TK, TP |
      | Thu | 11/01/2018 | TBA | Q24, Q26, Q22 | Exam | TK |
      | Xy | xx/xx/2018 | TBA | TBA | Re-exam | TK |

      Welcome, we hope you will enjoy the course (and learn (sic) a lot)!

      Tetyana, Jimmy & Timo






      Published by: Timo Koski
      Updated: 2017-10-12
