Machine Learning

The newest version of the course is on the wiki (in Russian only).

Semester 1:

  1. Introduction (18.10.2007):
    basic notions of learning from examples, examples of applied problems, empirical risk minimization, maximum likelihood estimation, overfitting, cross-validation.
  2. Bayesian theory of classification (16.04.2008):
    optimality of the Bayesian classifier, non-parametric density estimation, Parzen windows, quadratic discriminant, Fisher's linear discriminant, mixture density models, the EM algorithm, radial basis function networks.
  3. Similarity-based classification (16.04.2008):
    the nearest neighbor classifier and its generalizations, Parzen windows again, the method of potential functions, object weight optimization, object filtering.
  4. Clustering (22.11.2007):
    graph-based approaches, statistical clustering (EM and k-means), the agglomerative algorithm, dendrograms.
  5. Regression (21.12.2007):
    non-parametric regression, robust non-parametric regression (LOWESS), linear regression, singular value decomposition, regularization and ridge regression, nonlinear regression, logistic regression, generalized additive models, orthogonalization and stepwise regression, lasso and least angle regression.
  6. Generalization and model assessment (13.12.2006):
    cross-validation, Vapnik-Chervonenkis theory, overfitting, structural risk minimization, the Akaike information criterion, the Bayesian information criterion.
  7. Feature selection (21.12.2007):
    internal and external criteria, complexity optimization concepts, add-del and stepwise methods, depth-first and breadth-first search strategies, the iterative GMDH procedure (group method of data handling), genetic feature selection, adaptive stochastic search.
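The clustering lecture above mentions k-means. As a minimal illustration, here is a pure-Python sketch of Lloyd's algorithm (the function name, initialization by sampling, and the toy data are my own choices, not taken from the course materials):

```python
import random

def kmeans(points, k, iters=100, seed=0):
    """Lloyd's algorithm: alternate an assignment step and a centroid
    update step until assignments stabilize. Points are coordinate tuples."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)          # initialize from the data
    for _ in range(iters):
        # assignment step: each point goes to its nearest center
        clusters = [[] for _ in range(k)]
        for p in points:
            j = min(range(k),
                    key=lambda c: sum((a - b) ** 2
                                      for a, b in zip(p, centers[c])))
            clusters[j].append(p)
        # update step: each center moves to the mean of its cluster
        new_centers = [
            tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else centers[j]
            for j, cl in enumerate(clusters)
        ]
        if new_centers == centers:           # assignments stabilized
            break
        centers = new_centers
    return centers

# two well-separated toy clusters
pts = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9)]
centers = sorted(kmeans(pts, 2))
```

In the EM view discussed in the same lecture, k-means is the limiting case of a Gaussian mixture with hard assignments and fixed spherical covariances.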

Semester 2:

  1. Neural networks (18.10.2007):
    the single-layer perceptron, the Rosenblatt, Hebb, and delta (ADALINE) learning rules, Novikov's theorem, stochastic gradient descent, the multi-layer perceptron, back-propagation, weight decay, optimal brain damage, many heuristics against overfitting, paralysis and slow convergence. Kohonen networks, WTA, WTM and CWTA strategies, Kohonen maps and the art of interpreting them.
  2. Support Vector Machine (18.10.2007):
    the optimal separating hyperplane, hard- and soft-margin hyperplanes, the kernel trick, ten rules for building kernels, links with two-layer networks, RBF networks and the EM algorithm.
  3. Compositions (22.11.2007):
    simple, majority, seniority and weighted voting, heuristic algorithms for learning simple- and seniority-voting compositions, boosting, bagging, the random subspace method, mixture of experts, hierarchical mixture of experts.
  4. Logical (discrete) classification (21.12.2007):
    basic notions of predicates, rules and regularities, informativity criteria, forms of regularities (conjunctions, balls and hyperplanes in low-dimensional subspaces), stochastic local search for learning conjunctions, decision lists, decision trees, ID3, pre-pruning and post-pruning, look-ahead and anytime algorithms, depth-first and breadth-first search strategies (the KORA and TEMP algorithms), weighted voting of rules via boosting, Zhuravlev's algorithm of estimate calculation, association rules and the Apriori algorithm.
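The neural networks lecture covers Rosenblatt's rule and Novikov's convergence theorem. A minimal sketch of the rule (the function, learning rate, and toy AND-style data are illustrative assumptions, not from the course):

```python
def perceptron(samples, labels, epochs=100, lr=1.0):
    """Rosenblatt's rule: for each misclassified sample (x, y), y in {-1, +1},
    shift the weights by lr * y * x. On linearly separable data Novikov's
    theorem guarantees convergence in a finite number of updates."""
    w = [0.0] * len(samples[0])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, y in zip(samples, labels):
            s = sum(wi * xi for wi, xi in zip(w, x)) + b
            if y * s <= 0:                       # misclassified (or on the boundary)
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
                errors += 1
        if errors == 0:                          # a full error-free pass: done
            break
    return w, b

# linearly separable toy data (logical AND with +/-1 labels)
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
Y = [-1, -1, -1, 1]
w, b = perceptron(X, Y)
```

Sweeping the samples one at a time is exactly the stochastic-gradient flavour mentioned in the lecture; batching the corrections instead would give the classical batch delta rule.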

The course programme (in Russian)
Practical work (in Russian)

Scientific research directions

Full text here (in Russian).

  1. Bounding the probability of overfitting, improving the generalization ability of learning algorithms, Vapnik-Chervonenkis theory, computational learning theory, shell bounds.
  2. Combinatorial statistics, exact statistical tests, non-parametric statistics.
  3. Multiple classifier systems, ensemble learning, classifier fusion, mixture of experts.
  4. Collaborative filtering and client environment analysis, web usage mining, personalization, customer relationship management.