Recent Publications

Identification and Inference in Bayesian Word Embeddings. In Proceedings of the Third Workshop on NLP and Computational Social Science, 2019.

Preprint PDF Code Dataset

Mapping Indonesia's Civil Service. Policy Report, World Bank Group, 2018.

PDF

Statistics and International Security. In The Oxford Handbook of International Security, Oxford University Press, 2018.

Preprint PDF Dataset

Recent Posts

Introduction: Recently, I came across the idea that you can get relevant keywords for word2vec by tokenizing a corpus on stopwords, in addition to standard punctuation. This seemed like a really cool unsupervised way of capturing (hopefully!) relevant phrases. I was intrigued. A brief note: “tokenizing” refers to splitting a document into words or phrases based on a pre-defined set of rules. The most common way to do this is by splitting on spaces and “end-of-sentence” punctuation (ex: “!
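To make the idea concrete, here is a minimal Python sketch of tokenizing on stopwords as well as punctuation. It is an illustration only, not the code from the post: the tiny stopword list, the function name `candidate_phrases`, and the example sentence are all assumptions made for this sketch.

```python
import re

# Hypothetical, deliberately tiny stopword list; a real analysis would use a fuller one.
STOPWORDS = {"a", "an", "and", "are", "by", "for", "in", "is", "of", "on", "the", "to", "with"}


def candidate_phrases(text, stopwords=STOPWORDS):
    """Split text on punctuation and stopwords, returning the remaining word runs."""
    phrases = []
    # Split on punctuation first so that no phrase crosses a punctuation boundary.
    for fragment in re.split(r'[.!?,;:()"]+', text.lower()):
        current = []
        for word in fragment.split():
            if word in stopwords:
                # A stopword closes whatever phrase was being built.
                if current:
                    phrases.append(" ".join(current))
                    current = []
            else:
                current.append(word)
        # Flush the phrase left over at the end of the fragment.
        if current:
            phrases.append(" ".join(current))
    return phrases


# Made-up example sentence.
example = "The United Nations is a forum for international security, and the World Bank is a lender."
print(candidate_phrases(example))
```

Multi-word runs such as "united nations" survive as single units, which is what makes this attractive as a pre-processing step: each phrase could then be treated as one token before training embeddings.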


Just a brief note. For my dissertation, I needed to create a monthly panel of militarized interstate dispute (MID) data, which proved to be more difficult than initially anticipated. To “be kind to my future self,” I wrote up what I did on a gist, and I’ll also copy the code below. If you find yourself in a similar situation, I hope this helps!

# Code to create a monthly, directed dyadic panel of
# Militarized Interstate Dispute data, with directional initiation
# uses cshapes to create monthly list of all dyads, then uses data.
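The original code is cut off in this excerpt, so as a rough illustration of the data structure involved, here is a small Python sketch of a monthly, directed dyadic panel with an initiation indicator. It is not the gist's code: the state labels, the dispute record, and the function names are hypothetical, and it assumes a fixed list of states, whereas the post uses cshapes to determine which dyads exist in each month.

```python
from itertools import permutations


def month_range(start, end):
    """Yield (year, month) tuples from start to end, inclusive."""
    year, month = start
    while (year, month) <= end:
        yield (year, month)
        month += 1
        if month > 12:
            year, month = year + 1, 1


def build_panel(states, disputes, start, end):
    """Return one row per directed dyad-month with an initiation indicator."""
    # Set of (initiator, target, year, month) keys for fast lookup.
    initiated = {(d["initiator"], d["target"], d["year"], d["month"]) for d in disputes}
    panel = []
    for year, month in month_range(start, end):
        # permutations yields both (A, B) and (B, A), so the dyads are directed.
        for side_a, side_b in permutations(states, 2):
            panel.append({
                "year": year,
                "month": month,
                "side_a": side_a,
                "side_b": side_b,
                "initiation": int((side_a, side_b, year, month) in initiated),
            })
    return panel


# Hypothetical inputs: three made-up state labels and one made-up dispute.
states = ["A", "B", "C"]
disputes = [{"initiator": "A", "target": "B", "year": 2000, "month": 12}]
panel = build_panel(states, disputes, start=(2000, 11), end=(2001, 2))
print(len(panel), sum(row["initiation"] for row in panel))
```

Because the dyads are directed, the row for A against B in December 2000 is coded as an initiation while the mirrored row for B against A is not.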


Generalized Propensity Score Weighting. Adam Lauretig, 2018-11-03. Introduction: I’m writing this gist to better understand the generalized propensity score and marginal structural models, and especially their intersection. The goal is to work out how the time-varying effect of a real-valued treatment can be estimated and understood. Here, I denote treatment A at time t as A_t, and covariates X for individual i at time t as X_{i,t}.
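As a rough illustration of where the generalized propensity score and marginal structural models meet, the Python sketch below computes stabilized inverse-probability weights for a continuous treatment. It rests on several simplifying assumptions that are not taken from the post: the treatment is modeled as normal, the denominator density comes from a single-covariate OLS fit pooled across time periods, the numerator is the marginal normal density of the treatment, and each individual's weight is the product of these ratios over time. A fuller marginal structural model would condition on treatment and covariate history. All names and data are made up.

```python
import math
from statistics import mean


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and sd sigma at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))


def ols_fit(x, y):
    """One-covariate OLS: returns intercept, slope, and residual standard deviation."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sigma = math.sqrt(sum(r ** 2 for r in resid) / (len(x) - 2))
    return intercept, slope, sigma


def stabilized_weights(records):
    """records: dicts with keys 'id', 't', 'A', 'X'. Returns {id: product of weights over t}."""
    a = [r["A"] for r in records]
    x = [r["X"] for r in records]
    # Denominator model: A_t given X_{i,t}, assumed normal around the OLS fit.
    intercept, slope, sigma_cond = ols_fit(x, a)
    # Numerator model: marginal distribution of A_t, assumed normal.
    mu_marg = mean(a)
    sigma_marg = math.sqrt(sum((ai - mu_marg) ** 2 for ai in a) / (len(a) - 1))
    weights = {}
    for r in records:
        numerator = normal_pdf(r["A"], mu_marg, sigma_marg)
        denominator = normal_pdf(r["A"], intercept + slope * r["X"], sigma_cond)
        # Multiply the ratio into the running product for this individual.
        weights[r["id"]] = weights.get(r["id"], 1.0) * numerator / denominator
    return weights


# Made-up data: three individuals observed at two time points.
records = [
    {"id": 1, "t": 1, "A": 0.5, "X": 1.0},
    {"id": 1, "t": 2, "A": 1.2, "X": 1.5},
    {"id": 2, "t": 1, "A": -0.3, "X": 0.2},
    {"id": 2, "t": 2, "A": 0.9, "X": 0.4},
    {"id": 3, "t": 1, "A": 1.8, "X": 2.0},
    {"id": 3, "t": 2, "A": 0.1, "X": 1.1},
]
weights = stabilized_weights(records)
print(weights)
```

The resulting weights would then be used in a weighted outcome regression; putting the marginal density in the numerator is a common way to keep the weights from becoming extreme.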


Teaching

I was the instructor of record for POL4782: Data Analysis in Political Science II in Spring 2018; the syllabus is available here. I taught students regression modeling using ordinary least squares (OLS) and generalized linear models (GLMs) estimated by maximum likelihood. The R Markdown template I had my students use for problem sets, along with materials for getting acquainted with R Markdown, is available here. I hope you find them helpful!

I was the graduate teaching assistant for the first course in the graduate quantitative methods sequence in Fall 2018. I was responsible for holding recitations to teach students the R software package; the materials I used are available here.