Samet Oymak

Email: name at ece dot ucr dot edu (name=oymak)


I am an Assistant Professor in the Department of Electrical and Computer Engineering at UC Riverside. Previously, I worked at The Voleon Group and Google and spent time at Berkeley as a postdoctoral scholar at the Simons Institute. I received my PhD degree in Electrical Engineering from Caltech.

I am broadly interested in optimization, statistics, and machine learning. This includes convex/nonconvex optimization problems, nonlinear models, time-series analysis, and large-scale distributed algorithms. I am particularly interested in developing principled algorithms with solid theoretical foundations that achieve good tradeoffs between speed, accuracy, scalability, and data efficiency.

Here is my CV. Some recent publications are listed below.

Recent papers

Non-asymptotic system ID: We consider the problem of learning a realization for a linear time-invariant (LTI) dynamical system from input/output data. Given a single input/output trajectory, we provide a finite-time analysis for learning the system's Markov parameters, from which a balanced realization is obtained using the classical Ho-Kalman algorithm.

Data Enrichment: We consider groups of observations arising from shared and per-group individual parameters, each with its own structure such as sparsity or group sparsity. We provide data- and computation-efficient estimators to tackle this problem while allowing an arbitrary number of groups.

Learning Compact Neural Nets: Proper regularization is critical for speeding up training, improving generalization performance, and learning cost-efficient models. In this work, we propose and analyze regularized gradient descent algorithms for learning shallow networks. We introduce the covering dimension and show that the problem becomes well conditioned and local linear convergence occurs once the amount of data exceeds the covering dimension. We also establish generalization bounds and consider convolutional and sparsity constraints as applications.

End-to-end Deep Learning: We propose data-efficient tensor decomposition algorithms for training convolutional neural nets (CNN). We rigorously establish a connection between low-rank tensors and CNNs and propose a multistage approach that can learn convolutional kernels of all layers simultaneously.

Sharp Time-Data Tradeoffs for Linear Inverse Problems: We present a unified convergence analysis of the gradient projection algorithm applied to inverse problems. We *very* accurately characterize the convergence rate associated with a wide variety of random measurement ensembles in terms of the amount of data and the structural complexity of the model.

Learning Feature Nonlinearities with Binned Regression: What to do when the dependence between feature and response is nonlinear? We propose a principled algorithm to learn additive models based on feature binning. Borrowing techniques from high-dimensional statistics, we show that such models can be learned with linear convergence and using a near-optimal amount of data. Our findings naturally highlight the role of model complexity.

Sketching any set with RIP matrices: We show that, for the purposes of dimensionality reduction, a certain class of structured random matrices behaves similarly to random Gaussian matrices. This class includes several matrices for which matrix-vector multiplication can be computed in log-linear time, providing efficient dimensionality reduction of general sets.

Universality Laws for Randomized Dimension Reduction: The Gaussianity assumption plays a central role in various data science problems. In this work, we show that a large class of random matrices behaves in an identical fashion to Gaussians for various optimization and learning problems.

Fast and Reliable Estimation from Nonlinear Observations: We show that it is possible to quickly learn nonlinear models even if the relation between input and output (i.e., the link function) is not known. The key idea is linear approximation of the nonlinearity.

Sample Optimal Bounds for Fast Binary Embedding: A common strategy for fast document and image retrieval is to create a binary signature of the data by embedding it into the Hamming cube. Ideally, we want a fast and space-efficient embedding. We show that the subsampled Fourier transform indeed provides an almost-optimal embedding strategy even after {+1, -1} quantization.

Simultaneously structured models: In several applications, the model of interest has several structures at the same time. Examples include quadratic compressed sensing, sparse PCA, low-rank tensor completion, and estimating sparse and smooth signals (fused lasso). We show a weakness result for multi-objective convex penalizations that aim to recover and/or estimate such signals. The recently updated paper has simpler failure conditions that apply to a wide range of measurement ensembles under far fewer assumptions.

Graph clustering with missing data: The graph clustering problem arises in community detection, social networking, and complex networks. Given access to noisy and incomplete connectivity information of the graph, can we identify the communities that matter (i.e., clusters)? We develop simple but precise performance bounds for convex-optimization-based clustering techniques. These bounds reveal intriguing phase transitions as a function of model parameters.

My thesis can be found here. Finally, you can click here for some personal facts.