Cambridge Machine Learning Colloquium and Seminar Series


The Cambridge Machine Learning Colloquium and Seminar Series is organized by Microsoft Research New England, Harvard University, the Massachusetts Institute of Technology, and the Istituto Italiano di Tecnologia.

See the calendar and the list of organizers. For updated information and announcements, please subscribe to the CambridgeML mailing list.

 

Seminar:

TALK: How users evaluate things and each other in social media
Speaker: Jure Leskovec
Speaker Affiliation: Stanford University
Host: Sham Kakade
Host Affiliation: Microsoft Research New England

Date: Wednesday, September 5, 2012.
Time: 4:00 PM - 5:00 PM
Location: Microsoft Research New England. First Floor Conference Center. One Memorial Drive, Cambridge, MA.

In a variety of domains, mechanisms for evaluation allow one user to say whether he or she trusts another user, likes the content they produced, or wants to confer special levels of authority or responsibility on them. We investigate a number of fundamental ways in which user and item characteristics affect evaluations in online settings. For example, evaluations are not unidimensional but include multiple aspects that together contribute to a user's overall rating. We investigate methods for modeling attitudes and attributes from online reviews that help us better understand users' individual preferences. We also examine how to create a composite description of evaluations that accurately reflects some type of cumulative opinion of a community. Natural applications of these investigations include predicting evaluation outcomes based on user characteristics and estimating the chance of a favorable overall evaluation from a group knowing only the attributes of the group's members, but not their expressed opinions.
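
For readers who want a concrete handle on the prediction task mentioned above, the following is a minimal, purely illustrative sketch: it fits a logistic model to predict a favorable evaluation from hypothetical, synthetic user features. It is not the speaker's method, only a toy stand-in for "predicting evaluation outcomes based on user characteristics."

    # Illustrative sketch only -- synthetic data, hypothetical features;
    # not the models described in the talk.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    n = 1000
    # Hypothetical evaluator/target features: evaluator tenure, target's
    # prior average rating, evaluator-target similarity.
    X = rng.normal(size=(n, 3))
    # Synthetic outcomes: favorable evaluations driven mostly by similarity.
    y = (X @ np.array([0.2, 0.8, 1.5]) + rng.normal(size=n) > 0).astype(int)

    model = LogisticRegression().fit(X, y)
    print("training accuracy:", model.score(X, y))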

Jure Leskovec is an assistant professor of Computer Science at Stanford University, where he is a member of the InfoLab and the AI Lab. His research focuses on mining large social and information networks, with problems motivated by large-scale data, the Web, and online media. This research has won several awards, including best paper awards at KDD (2005, 2007, 2010), WSDM (2011), ICDM (2011), and the ASCE Journal of Water Resources Planning and Management (2009), the ACM KDD dissertation award (2009), a Microsoft Research Faculty Fellowship (2011), an Alfred P. Sloan Fellowship (2012), and an NSF Early Career Development (CAREER) Award (2011). He received his bachelor's degree in computer science from the University of Ljubljana, Slovenia, his Ph.D. in machine learning from Carnegie Mellon University, and his postdoctoral training at Cornell University. You can follow him on Twitter at @jure.


Upcoming Seminar:

TALK: Theoretical and Algorithmic Foundations of Online Learning
Speaker: Sasha Rakhlin
Speaker Affiliation: University of Pennsylvania
Host: Tomaso Poggio, Lorenzo Rosasco
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT

Date: Wednesday, October 17th, 2012.
Time: 4:00 PM - 5:00 PM
Location: McGovern Seminar Room, MIT 46-3189

Within the framework of sequential prediction (online learning), data arrive in a stream, and the learner is tasked with making a sequence of decisions. Such a basic scenario has been studied in Information Theory, Decision Theory, Game Theory, Statistics, and Machine Learning. The learning protocol and the non-i.i.d. (or even adversarial) nature of observed data constitute a big departure from the well-studied setting of Statistical Learning Theory. In the latter, many important tools and complexity notions of the hypothesis class have been developed, starting with the pioneering work of Vapnik and Chervonenkis. In contrast, the theoretical understanding of online learning has been lacking, as most results are obtained on a case-by-case basis. In this talk, we first focus on no-regret online learning and develop the relevant notions of complexity in a surprising parallel to Statistical Learning Theory. We characterize online learnability through finiteness of the sequential versions of combinatorial dimensions, random averages, and covering numbers. This non-constructive study of inherent complexity is then augmented with a recipe for developing online learning algorithms via a notion of a relaxation. To demonstrate the utility of our approach, we develop a new family of randomized methods and new algorithms for the matrix completion problem. We then discuss extensions of our techniques beyond no-regret learning, including Blackwell approachability and calibration of forecasters. Finally, we present open problems and directions of further research.
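
To make the setting concrete, here is a minimal sketch of a classical no-regret method, the exponentially weighted average (Hedge) forecaster, run against an arbitrary bounded loss sequence. It illustrates the notion of regret discussed above; the talk's own relaxation-based algorithms are not reproduced here.

    # Hedge (exponential weights): a standard no-regret algorithm.
    import numpy as np

    def hedge_regret(losses, eta):
        """losses: (T, K) array of losses in [0, 1] for K experts over T rounds."""
        T, K = losses.shape
        log_w = np.zeros(K)
        total = 0.0
        for t in range(T):
            p = np.exp(log_w - log_w.max())
            p /= p.sum()                       # current distribution over experts
            total += p @ losses[t]             # learner's expected loss this round
            log_w -= eta * losses[t]           # multiplicative weight update
        best = losses.sum(axis=0).min()        # best single expert in hindsight
        return total - best                    # regret, O(sqrt(T log K))

    rng = np.random.default_rng(0)
    L = rng.random((1000, 10))
    print("regret:", hedge_regret(L, eta=np.sqrt(8 * np.log(10) / 1000)))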

Sasha Rakhlin is an Assistant Professor in the Department of Statistics at the Wharton School, University of Pennsylvania.


Colloquium:

TALK: From Rosenblatt's learning model to the model of learning with nontrivial teacher.
Speaker: Vladimir Vapnik
Speaker Affiliation: Royal Holloway University of London
Host: Tomaso Poggio, Lorenzo Rosasco
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT

Date: September 26th, 2012
Time: 4:00 PM - 5:00 PM
Location: MIT Bldg 46-3002 Singleton Auditorium

Vladimir Naumovich Vapnik is one of the main developers of the Vapnik-Chervonenkis theory. He received his master's degree in mathematics from the Uzbek State University, Samarkand, Uzbek SSR, in 1958 and his Ph.D. in statistics from the Institute of Control Sciences, Moscow, in 1964. He worked at that institute from 1961 to 1990 and became Head of the Computer Science Research Department. At the end of 1990, he moved to the USA and joined the Adaptive Systems Research Department at AT&T Bell Labs in Holmdel, New Jersey. The group later became the Image Processing Research Department of AT&T Laboratories when AT&T spun off Lucent Technologies in 1996. Vapnik left AT&T in 2002 and joined NEC Laboratories in Princeton, New Jersey, where he currently works in the Machine Learning group. He has also held a position as Professor of Computer Science and Statistics at Royal Holloway, University of London, since 1995, and as Professor of Computer Science at Columbia University, New York City, since 2003. He was inducted into the U.S. National Academy of Engineering in 2006. He received the 2005 Gabor Award, the 2008 Paris Kanellakis Award, the 2010 Neural Networks Pioneer Award, the 2012 IEEE Frank Rosenblatt Award, and the 2012 Benjamin Franklin Medal in Computer and Cognitive Science.

Upcoming Colloquium:

TALK: Some Mathematics of Immunology and of Protein Folding.
Speaker: Steve Smale
Speaker Affiliation: City University of Hong Kong
Host: Tomaso Poggio, Lorenzo Rosasco
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT

Date: November 28th, 2012
Time: 4:00 PM - 5:30 PM
Location: MIT Bldg 46-3002 Singleton Auditorium

A geometrical picture of certain current activities in biology will be presented. In particular, proposals for "good kernels" will be suggested for sets of amino acid strings as well as for sequences of Ramachandran angle pairs.
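
As a point of reference for kernels on amino acid strings, here is a minimal sketch of the k-spectrum kernel (Leslie et al.), which counts shared k-mers between two sequences. The specific kernel proposals of the talk are not reproduced, and the sequences below are hypothetical.

    # k-spectrum kernel: inner product of k-mer count vectors.
    from collections import Counter

    def spectrum_kernel(s, t, k=3):
        cs = Counter(s[i:i + k] for i in range(len(s) - k + 1))
        ct = Counter(t[i:i + k] for i in range(len(t) - k + 1))
        return sum(cs[m] * ct[m] for m in cs if m in ct)

    print(spectrum_kernel("MKTAYIAKQR", "MKTAYLAKQR", k=3))   # counts shared 3-mers
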
Stephen Smale is an American mathematician from Flint, Michigan. He was awarded the Fields Medal in 1966 and spent more than three decades on the mathematics faculty of the University of California, Berkeley (1960-1961 and 1964-1995). Since 2002 Smale has been a Professor at the Toyota Technological Institute at Chicago, and since August 1, 2009, he has also been a Distinguished University Professor at the City University of Hong Kong. In 2007, Smale was awarded the Wolf Prize in mathematics.

Seminar:

TALK: Deep Architectures and Deep Learning: Theory, Algorithms, and Applications.
Speaker: Pierre Baldi
Speaker Affiliation: University of California, Irvine
Host: Tomaso Poggio, Lorenzo Rosasco
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT

Date: December 13th, 2012
Time: 3:30 PM - 4:30 PM
Location: MIT Bldg 32 Seminar Room G449 (Patil/ Kiva)

Deep architectures are important for machine learning, for engineering applications, and for understanding the brain. In this talk, we will provide a brief historical overview of deep architectures from their 1950s origins to today. Motivated by this overview, we will study and prove several theorems regarding deep architectures and one of their main ingredients--autoencoder circuits--in particular in the unrestricted Boolean and unrestricted probabilistic cases. We will show how these analyses lead to a new family of learning algorithms for deep architectures--the deep target (DT) algorithms. The DT approach converts the problem of learning a deep architecture into the problem of learning many shallow architectures by providing learning targets for each deep layer. Finally, we will present simulation results and applications of deep architectures and DT algorithms to the protein structure prediction problem.
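
To make the "autoencoder circuits" ingredient concrete, here is a minimal sketch of a single-hidden-layer autoencoder with tied weights, trained by plain gradient descent on toy data. It is a generic illustration of the building block, not the deep target (DT) algorithm itself.

    # Tiny tied-weight autoencoder: minimize ||tanh(X W) W^T - X||^2.
    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.random((200, 20))                  # toy data, 20-dim inputs
    W = rng.normal(scale=0.1, size=(20, 5))    # encoder weights; decoder is W.T
    lr = 0.05

    for _ in range(2000):
        H = np.tanh(X @ W)                     # hidden code
        err = H @ W.T - X                      # reconstruction error
        dH = err @ W * (1 - H ** 2)            # backprop through tanh encoder
        W -= lr * (X.T @ dH + err.T @ H) / len(X)

    print("reconstruction MSE:", np.mean((np.tanh(X @ W) @ W.T - X) ** 2))
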
Pierre Baldi is Chancellor's Professor in the Department of Computer Science, Director of the Institute for Genomics and Bioinformatics, and Associate Director of the Center for Machine Learning and Intelligent Systems at the University of California, Irvine. He received his Ph.D. degree from the California Institute of Technology. His research work is at the interface of the computational and life sciences, in particular the application of artificial intelligence and statistical machine learning methods to problems in chemoinformatics, genomics, systems biology, and computational neuroscience. He is credited with pioneering the use of Hidden Markov Models (HMMs), graphical models, and recursive neural networks in bioinformatics. His group has developed widely used databases, software, and web servers, including the ChemDB database and chemoinformatics portal for the prediction of molecular properties and applications in chemical synthesis and drug discovery, the SCRATCH suite for protein structure prediction, the Cyber-T program for the differential analysis of gene expression data using Bayesian statistical methods, the MotifMap system for charting transcription factor binding sites on a genome-wide scale, and the CRICK expert system for analyzing molecular networks and pathways in healthy and diseased systems. Dr. Baldi has published over 260 peer-reviewed research articles and four books. He is the recipient of the 1993 Lew Allen Award, the 1999 Laurel Wilkening Faculty Innovation Award, a 2006 Microsoft Research Award, and the 2010 E. R. Caianiello Prize for research in machine learning. He is also a Fellow of the American Association for the Advancement of Science (AAAS), the Association for the Advancement of Artificial Intelligence (AAAI), and the Institute of Electrical and Electronics Engineers (IEEE). Through his consulting company IDLAB, Inc., he has consulted for government agencies, publishers, and several companies in the information technology and biotechnology industries. In the 1990s he was co-founder and CEO of Net-ID, Inc., a company focused on the application of machine learning methods to fingerprint recognition and bioinformatics. More recently, he co-founded Reaction Explorer, a company developing interactive expert systems for chemical education, and Group IV Biosystems, a synthetic biology company.

Seminar:

TALK: Transportation Distances and their Application in Machine Learning: New Problems
Speaker: Marco Cuturi
Speaker Affiliation: Kyoto University
Host: Tomaso Poggio, Lorenzo Rosasco
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT

Date: December 12th, 2012
Time: 4:00 PM - 5:00 PM
Location: MIT Bldg 32 Seminar Room G449 (Patil/ Kiva)

Marco Cuturi received his Ph.D. in 2005 from the Ecole des Mines de Paris, under the supervision of Jean-Philippe Vert. He has worked at the Institute of Statistical Mathematics in Tokyo and at Princeton University, and is now an Associate Professor at Kyoto University.
In this talk I will present two new research topics related to the optimal transportation distance (also known as the Earth Mover's or Wasserstein distance) and its application in machine learning to compare histograms of features. I will first discuss the ground metric learning problem, that is, the problem of automatically tuning the parameters of transportation distances using labeled histogram data. After providing some reminders on optimal transportation, I will argue that learning transportation distances is akin to learning an L1 distance on the simplex, namely a distance with polyhedral level sets, and I will draw some parallels with Mahalanobis distances, the L2 distance, and elliptic level sets. I will then introduce our algorithm (arXiv:1110.2306) and more recent extensions. In the second part of my talk, I address the fact that transportation distances are not Hilbertian by showing that they can be cast as positive definite kernels through the "generating function trick". We prove that this trick, which uses the generating function of the transportation polytope to define a similarity, rather than focusing exclusively on the optimal transport to define a distance, leads to a positive definite kernel between histograms (arXiv:1209.2655).
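
For readers who want to compute such a distance directly, here is a minimal sketch that solves the underlying transportation linear program between two histograms with scipy; the absolute-difference ground metric below is an arbitrary choice for illustration (the first part of the talk is precisely about learning it from data).

    # Optimal transport distance between histograms via linear programming.
    import numpy as np
    from scipy.optimize import linprog

    def transport_distance(r, c, M):
        """Minimal transport cost between histograms r, c under cost matrix M."""
        n, m = M.shape
        A_eq = np.zeros((n + m, n * m))
        for i in range(n):
            A_eq[i, i * m:(i + 1) * m] = 1.0   # row marginals of the plan
        for j in range(m):
            A_eq[n + j, j::m] = 1.0            # column marginals of the plan
        res = linprog(M.ravel(), A_eq=A_eq, b_eq=np.concatenate([r, c]),
                      bounds=(0, None), method="highs")
        return res.fun

    bins = np.arange(4.0)
    M = np.abs(bins[:, None] - bins[None, :])  # illustrative ground metric
    r = np.array([0.5, 0.5, 0.0, 0.0])
    c = np.array([0.0, 0.0, 0.5, 0.5])
    print(transport_distance(r, c, M))         # all mass moves two bins: 2.0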

Seminar:

TALK: Regularized Learning in Reproducing Kernel Banach Spaces.
Speaker: Jun Zhang
Speaker Affiliation: Department of Psychology, University of Michigan, Ann Arbor
Host: Tomaso Poggio, Lorenzo Rosasco
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT

Date: Wednesday, April 17th, 2013
Time: 4:00 PM - 5:00 PM
Location: MIT Bldg 32 Seminar Room G449 (Patil/ Kiva)

Regularized learning is the contemporary framework for learning to generalize from finite samples (classification, regression, clustering, etc.). Here the problem is to learn an input-output mapping f: X -> Y given finite samples {(xi, yi), i=1,...,N}. With minimal structural assumptions on X, the class of functions under consideration is assumed to form a Banach space of functions B. The learning-from-data problem is then formulated as an optimization problem in such a function space, with the desired mapping as an optimizer to be sought, where the objective function consists of a loss term L(f) capturing its goodness-of-fit (or the lack thereof) on the given samples {(f(xi), yi), i=1,...,N}, and a penalty term R(f) capturing its complexity based on prior knowledge about the solution (smoothness, sparsity, etc.). This second, regularizing term is often taken to be the norm of B, or a transformation Phi thereof: R(f) = Phi(|f|). This program has been successfully carried out for Hilbert spaces of functions, resulting in the celebrated Reproducing Kernel Hilbert Space methods in machine learning. Here, we will remove the Hilbert space restriction, i.e., the existence of an inner product, and show that the key ingredients of this framework (reproducing kernel, representer theorem, feature space) continue to hold for a Banach space that is uniformly convex and uniformly Frechet differentiable. Central to our development is the use of a semi-inner product operator and duality mapping for a uniform Banach space in place of the inner product of a Hilbert space. This opens up the possibility of unifying kernel-based methods (regularizing the L2-norm) and sparsity-based methods (regularizing the l1-norm), which have so far been investigated under different theoretical foundations.
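
To ground the framework in the familiar Hilbert-space case that the talk generalizes: with the squared loss for L(f) and R(f) = lambda |f|^2 in an RKHS, the representer theorem reduces the optimization to a finite linear system, as in the kernel ridge regression sketch below (the Gaussian kernel and toy data are illustrative choices).

    # Kernel ridge regression: the RKHS instance of regularized learning.
    import numpy as np

    def krr_fit(X, y, lam, gamma):
        """Solve (K + lam I) alpha = y for f(x) = sum_i alpha_i k(x_i, x)."""
        sq = np.sum((X[:, None] - X[None, :]) ** 2, axis=-1)
        K = np.exp(-gamma * sq)                # Gaussian kernel matrix
        alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)
        return alpha, K

    rng = np.random.default_rng(0)
    X = rng.uniform(-1, 1, (50, 1))
    y = np.sin(3 * X[:, 0]) + 0.1 * rng.normal(size=50)
    alpha, K = krr_fit(X, y, lam=1e-2, gamma=10.0)
    print("training MSE:", np.mean((K @ alpha - y) ** 2))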

Dr. Jun Zhang is a Professor of Psychology at the University of Michigan, Ann Arbor. He received a B.Sc. degree in Theoretical Physics from Fudan University in 1985 and a Ph.D. degree in Neurobiology from the University of California, Berkeley in 1992. He has also held visiting positions at the University of Melbourne, the University of Waterloo, and the RIKEN Brain Science Institute. During 2007-2010, he served as Program Manager for the U.S. Air Force Office of Scientific Research (AFOSR), in charge of the basic research portfolio for Cognition and Decision in the Directorate of Mathematics, Information, and Life Sciences. Dr. Zhang has served as President of the Society for Mathematical Psychology (SMP) and serves on the Federation of Associations in Brain and Behavioral Sciences (FABBS). He is an Associate Editor of the Journal of Mathematical Psychology and a Fellow of the Association for Psychological Science (APS). Dr. Zhang has published about 50 peer-reviewed journal papers in the fields of vision, mathematical psychology, cognitive psychology, cognitive neuroscience, game theory, and machine learning. His research has been funded by the National Science Foundation (NSF), the Air Force Office of Scientific Research (AFOSR), and the Army Research Office (ARO).

Seminar:

TALK: Causal Inference and Anticausal Learning.
Speaker: Bernhard Schölkopf
Speaker Affiliation: Max Planck Institute for Intelligent Systems, Tübingen, Germany
Host: Tomaso Poggio, Lorenzo Rosasco
Host Affiliation: Laboratory for Computational and Statistical Learning, MIT-IIT

Date: Friday, June 7th, 2013
Time: Starting 3:30 PM
Location: MIT Bldg 46 Singleton Auditorium

Causal inference is an intriguing field examining causal structures by testing their statistical footprints. The talk introduces the main ideas of causal inference from the point of view of machine learning, and discusses implications of underlying causal structures for popular machine learning scenarios such as covariate shift and semi-supervised learning. It argues that causal knowledge may facilitate some approaches for a given problem, and rule out others.
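
As a toy illustration of one idea from this literature (not necessarily the talk's own examples): under an additive noise model, regressing the effect on the cause leaves residuals independent of the input, while the reverse regression typically does not, and this asymmetry is one statistical footprint of causal direction. The dependence measure below is a deliberately crude stand-in for a proper independence test.

    # Additive-noise-model asymmetry: a crude causal-direction check.
    import numpy as np

    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, 2000)               # cause
    y = x ** 3 + 0.2 * rng.normal(size=2000)   # effect = f(cause) + noise

    def residual_dependence(a, b, deg=5):
        """Fit b ~ poly(a); measure (crudely) how residuals depend on a."""
        r = b - np.polyval(np.polyfit(a, b, deg), a)
        return abs(np.corrcoef(a ** 2, r ** 2)[0, 1])

    print("x -> y:", residual_dependence(x, y))   # near zero: independent noise
    print("y -> x:", residual_dependence(y, x))   # typically larger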