Big Data Econometrics
ECON 4984 – This course covers the theoretical, computational, and statistical underpinnings of big data analysis. The focus is on econometric models and machine learning techniques for analyzing high-dimensional data sets (a.k.a. “Big Data”) and on their implications for research into economic questions that arise from rapid changes in data availability and computational technology. Big data econometric models provide a vehicle for modeling and analyzing complex phenomena and for incorporating rich sources of confounding information into economic models. The goal of this course is to give an applied, hands-on introduction to these methods.
- Python Crash Course (NumPy, SciPy, pandas, matplotlib, scikit-learn, PyTorch)
- Big Data and the Curse of Dimensionality in Economics and Finance
- Regression with Many Regressors: Standard Approaches to Model Selection
- Penalized Regression Methods: Lasso, Ridge, and Elastic Net
- Factor Models: Estimation and Inference
- Prediction with a Large Number of Covariates (“Big P”)
- Analysis with Large Sample Sizes (“Big N”)
- High-dimensional Methods and Inference on Structural and Treatment Effects
- A Brief Introduction to Bayesian Inference and Bayesian VARs
- Nonlinearity in Big Data Sets and Nonlinear Dimensionality Reduction
- Neural Networks and Deep Learning for Big Data Analysis
- Spark and Python for Big Data with PySpark
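As a rough illustration of the hands-on flavor of the penalized regression topic above, the following is a minimal sketch that fits Lasso, Ridge, and Elastic Net with scikit-learn on simulated data with more regressors than observations. The data-generating process, the penalty values (`alpha`, `l1_ratio`), and all variable names are assumptions chosen for illustration; they are not taken from the course materials.

```python
import numpy as np
from sklearn.linear_model import ElasticNet, Lasso, Ridge

# Simulated "Big P" design: more regressors (p) than observations (n).
rng = np.random.default_rng(0)
n, p = 100, 200
X = rng.standard_normal((n, p))

# Sparse truth (an assumption for this sketch): only 5 regressors matter.
beta = np.zeros(p)
beta[:5] = [3.0, -2.0, 1.5, 1.0, -1.0]
y = X @ beta + 0.1 * rng.standard_normal(n)

# Penalty strengths are illustrative, not tuned (cross-validation would be usual).
models = {
    "lasso": Lasso(alpha=0.1),
    "ridge": Ridge(alpha=1.0),
    "elastic_net": ElasticNet(alpha=0.1, l1_ratio=0.5),
}

# Count how many coefficients each method leaves away from zero.
nonzero = {}
for name, model in models.items():
    model.fit(X, y)
    nonzero[name] = int(np.sum(np.abs(model.coef_) > 1e-8))
    print(f"{name}: {nonzero[name]} nonzero coefficients out of {p}")
```

The L1 penalty in Lasso and Elastic Net sets many coefficients exactly to zero, which is why these methods double as model selection tools, while Ridge only shrinks coefficients toward zero.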
Machine Learning and Data Classifiers
C7: Module for Program in Business Analytics – Finding patterns and relationships in large volumes of data is very useful in marketing, fraud detection, and national security, among other applications. This module introduces artificial intelligence methods that lend themselves to finding patterns and relationships in data. Applications of classification and learning algorithms will be discussed. Integration of these algorithms into business analytics frameworks will be demonstrated using real-world examples. Different learning techniques, including supervised and unsupervised learning, deep learning, text analytics, and recommender systems, will be covered.
- Python Crash Course (NumPy, pandas, matplotlib, SciPy, scikit-learn, PyTorch)
- Brief Reminders of Linear Algebra, Probability Theory, and Convex Optimization
- Learning Theory
- Types of Learning (Supervised, Unsupervised, RL), PAC Learning
- Supervised Learning
- Review of Linear Regression, Least Squares Estimation, Logistic Regression
- Moving Beyond Linear Methods
- Polynomial Regression, Regression Splines, Generalized Additive Models, etc.
- Kernel Methods
- VC-Dimension, Support Vector Machines (SVM)
- Neural Networks and Deep Neural Networks
- Regularization and Model/Feature Selection
- Binary Classification with +/-1 Labels, Multi-Class Classification, Instance-based (e.g. kNN), Generative (e.g. Naive Bayes), Discriminative (e.g. Tree-based methods)
- Unsupervised Learning and Clustering Methods
- k-means Clustering, Hierarchical Clustering, Principal Component Analysis, Autoencoders, and Factor Analysis
- Reinforcement Learning and Control
- Practical Advice for ML Projects
- Examples of AI and Machine Learning in Practice
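To give a sense of the classification topics listed above, here is a minimal sketch comparing an instance-based classifier (kNN) with a generative one (Gaussian Naive Bayes) in scikit-learn. The synthetic three-class data, the chosen cluster centers, and the hyperparameters are all assumptions made for illustration, not part of the module itself.

```python
from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Synthetic three-class data with well-separated centers (illustrative only).
X, y = make_blobs(
    n_samples=300,
    centers=[[-5, -5], [0, 5], [5, -5]],
    cluster_std=1.0,
    random_state=0,
)

# Hold out 30% of the observations to estimate out-of-sample accuracy.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

classifiers = {
    "knn": KNeighborsClassifier(n_neighbors=5),  # instance-based
    "naive_bayes": GaussianNB(),                 # generative
}

# Fit each classifier and record its accuracy on the held-out set.
scores = {}
for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    scores[name] = clf.score(X_test, y_test)
    print(f"{name}: test accuracy = {scores[name]:.2f}")
```

On data this cleanly separated both classifiers should do well; the comparison becomes informative on real data sets, where class overlap and feature scaling affect the two approaches differently.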