A KDD'16 Tutorial

1pm - 5pm on Saturday, the 13th

Imperial Ballroom A&B

From understanding the structure of data to classification and topic modeling, graphical models are core tools in machine learning and data mining. They combine probability theory and graph theory to form a compact representation of probability distributions. Over the last decade, as data stores have grown larger and higher-dimensional, traditional algorithms for learning graphical models from data have failed to scale and have become less and less usable, directly limiting the benefits of this core technology. To scale graphical modeling techniques to the size and dimensionality of most modern data stores, data science researchers and practitioners now have to meld the most recent advances in several specialized fields, including graph theory, statistics, pattern mining and graphical modeling.

This tutorial will cover the core building blocks that are necessary to build and use scalable graphical modeling technologies on large and high-dimensional data.

Time | Duration | Content | Specific skills |
---|---|---|---|
13:00 | 20min | Introduction | Definition and usefulness of graphical models |
13:20 | 25min | Graphical Models 101 | Main families of graphical models. Issues for scaling |
13:45 | 20min | Graph theory | What are decomposable models? Why are they useful? Standard associated algorithms. |
14:05 | 20min | Scoring decomposable models | How to score decomposable models? |
14:30 | 30min | Break | How good the coffee is in SF |
15:00 | 40min | Efficient search | What are clique-graphs? How to perform greedy search over graphical models with 1,000+ variables? |
15:40 | 40min | The nitty-gritty | How do we make it really work? How to count efficiently? How not to do the same thing twice? |
16:20 | 15min | Use cases | What can we do once we have graphical models for high-dimensional data? |
16:35 | 10min | Wrapping up! | Summary and description of the main remaining issues in the field |