Notebooks


Extrapolation and Interpolation in Machine Learning Modeling with Fast Food and astartes

Jackson Burns, Kevin Spiekermann, and William Green

Machine learning is a groundbreaking tool for tackling high-dimensional datasets with complex correlations that humans struggle to comprehend. An important nuance of ML is the difference between using a model for interpolation and using it for extrapolation, meaning either inference or prediction. This work will demonstrate visually what interpolation and extrapolation mean in the context of machine learning using astartes, a Python package that makes it easy to tackle data splitting in ML modeling. astartes makes many different sampling approaches available, so using a very tangible dataset - a fast food menu - we can visualize how these approaches differ and then train and compare ML models.
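As a rough illustration of the distinction, the sketch below contrasts a random split, where test points fall inside the range covered by the training data, with a sorted split, where every test point lies beyond it. It uses only the Python standard library and does not use astartes; the function names and the calorie values are hypothetical and are not taken from the notebook's dataset.

```python
import random


def random_split(values, train_fraction=0.75, seed=0):
    """Interpolative split: test points are drawn from the same pool as training points."""
    rng = random.Random(seed)
    shuffled = list(values)
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def sorted_split(values, train_fraction=0.75):
    """Extrapolative split: every test point lies above the largest training point."""
    ordered = sorted(values)
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


# Hypothetical calorie counts for menu items (made-up numbers for illustration only).
calories = [150, 230, 250, 300, 340, 410, 480, 520, 590, 640, 720, 810]

interp_train, interp_test = random_split(calories)
extrap_train, extrap_test = sorted_split(calories)

print("random split test set:", sorted(interp_test))
print("sorted split test set:", extrap_test)
```

A model evaluated on the sorted split is asked about items unlike anything it was trained on, so its test error would typically be a more conservative estimate than the one from the random split.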


Stylo2gg: Visualizing reproducible stylometric analysis

James Clawson

For researchers using R to do work in stylometry, the Stylo package is indispensable, but it also has some limitations. The Stylo2gg package addresses some of these limitations by extending the usefulness of Stylo. Among other things, Stylo2gg adds logging and replication of analyses, keeping necessary files and introducing a systematic way to reproduce past work. With visualization as its initial purpose, Stylo2gg also makes exploring stylometric data easy, providing options for labeling, highlighting subgroups, and double coding data for added legibility in black and white or in color. Finally, as hinted by the name, the conversion of graphics from base R into ggplot2 changes the style of the output and introduces more options to extend analyses with many other packages and add-ons. The reproducible notebook shown here, 01-stylo2gg.qmd (rendered as 01-stylo2gg.html), walks through much of the package, including many features that were added in the past year.