Unsupervised learning, one notion or many?

Unsupervised learning, as the name suggests, is the science of learning from unlabeled data. A look at the wikipedia page shows that this term has many interpretations: (Task A) Learning a distribution from samples. (Examples:... Continue

Generalization and Equilibrium in Generative Adversarial Networks (GANs)

The previous post described Generative Adversarial Networks (GANs), a technique for training generative models for image distributions (and other complicated distributions) via a 2-party game between a generator deep net and a discriminator deep net.... Continue

Generative Adversarial Networks (GANs), Some Open Questions

Since ability to generate “realistic-looking” data may be a step towards understanding its structure and exploiting it, generative models are an important component of unsupervised learning, which has been a frequent theme on this blog.... Continue

Back-propagation, an introduction

Given the sheer number of backpropagation tutorials on the internet, is there really need for another? One of us (Sanjeev) recently taught backpropagation in undergrad AI and couldn’t find any account he was happy with.... Continue

The search for biologically plausible neural computation: The conventional approach

Inventors of the original artificial neural networks (NNs) derived their inspiration from biology. However, as artificial NNs progressed, their design was less guided by neuroscience facts. Meanwhile, progress in neuroscience has altered our conceptual understanding... Continue

Gradient Descent Learns Linear Dynamical Systems

From text translation to video captioning, learning to map one sequence to another is an increasingly active research area in machine learning. Fueled by the success of recurrent neural networks in its many variants, the... Continue

Linear algebraic structure of word meanings

Word embeddings capture the meaning of a word using a low-dimensional vector and are ubiquitous in natural language processing (NLP). (See my earlier post 1 and post2.) It has always been unclear how to interpret... Continue

A Framework for analysing Non-Convex Optimization

Previously Rong’s post and Ben’s post show that (noisy) gradient descent can converge to local minimum of a non-convex function, and in (large) polynomial time (Ge et al.’15). This post describes a simple framework that... Continue

Markov Chains Through the Lens of Dynamical Systems: The Case of Evolution

In this post, we will see the main technical ideas in the analysis of the mixing time of evolutionary Markov chains introduced in a previous post. We start by introducing the notion of the expected... Continue

Saddles Again

Thanks to Rong for the very nice blog post describing critical points of nonconvex functions and how to avoid them. I’d like to follow up on his post to highlight a fact that is not... Continue