This paper is under review. The progress of the review can be tracked on GitHub.

vtreat is an R data.frame processor/conditioner that prepares messy real-world data for predictive modeling in a statistically sound manner. Common problems vtreat defends against: invalid values, missing values, too many categorical levels, rare categorical levels, and new categorical levels (levels seen during application, but not during training).
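To make the listed problems concrete, here is a minimal, dependency-free sketch of the kind of defenses described above. It is written in Python for illustration only: it is not vtreat's API and not its implementation. The function names, the `min_fraction` threshold, and the output column names are all hypothetical. The idea shown is that a treatment plan is learned from training data alone and then applied unchanged to new data, so that missing, invalid, rare, and previously unseen values all map to safe numeric encodings.

```python
import math
from collections import Counter


def fit_level_plan(train_values, min_fraction=0.2):
    """Hypothetical helper: choose which categorical levels to keep.

    Levels seen in fewer than min_fraction of training rows are treated as rare
    and get no indicator column of their own.
    """
    counts = Counter(v for v in train_values if v is not None)
    n = len(train_values)
    return sorted(lvl for lvl, c in counts.items() if c / n >= min_fraction)


def apply_level_plan(kept_levels, values):
    """Encode values using only levels learned in training.

    Rare and novel levels encode as all-zero indicators; missing values get
    their own flag instead of causing an error.
    """
    rows = []
    for v in values:
        row = {f"lev_{lvl}": 1.0 if v == lvl else 0.0 for lvl in kept_levels}
        row["is_missing"] = 1.0 if v is None else 0.0
        rows.append(row)
    return rows


def fit_numeric_plan(train_values):
    """Hypothetical helper: learn a fill value (mean of the valid training values)."""
    valid = [x for x in train_values if x is not None and math.isfinite(x)]
    return sum(valid) / len(valid) if valid else 0.0


def apply_numeric_plan(fill_value, values):
    """Replace missing or invalid (NaN, infinite) numbers and record that it happened."""
    rows = []
    for x in values:
        bad = x is None or not math.isfinite(x)
        rows.append({"clean": fill_value if bad else float(x),
                     "is_bad": 1.0 if bad else 0.0})
    return rows


# Training data: "c" is rare, and one value is missing.
train_cat = ["a", "a", "b", None, "c", "a", "b", "a", "b", "a"]
kept = fit_level_plan(train_cat)

# Application data: "z" was never seen in training, "c" was rare.
encoded = apply_level_plan(kept, ["a", "z", None, "c"])

fill = fit_numeric_plan([1.0, float("nan"), 3.0, None, float("inf")])
cleaned = apply_numeric_plan(fill, [5.0, float("nan")])
```

In this sketch a novel or rare level does not raise an error at application time; it simply carries no level-specific signal, while the missingness and invalid-value flags remain available to a downstream model.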

Archive DOI: pending