Notebooks
Extrapolation and Interpolation in Machine Learning Modeling with Fast Food and astartes
Jackson Burns, Kevin Spiekermann, and William Green
Machine learning is a groundbreaking tool for tackling high-dimensional datasets with complex correlations that humans struggle to comprehend. An important nuance of ML is the difference between using a model for interpolation and for extrapolation, meaning either inference or prediction. This work will demonstrate visually what interpolation and extrapolation mean in the context of machine learning using astartes, a Python package that makes both easy to tackle in ML modeling. astartes makes many different sampling approaches available, so with a very tangible dataset (a fast food menu) we can visualize how these approaches differ and then train and compare ML models.
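As a rough illustration of the distinction drawn above, the sketch below contrasts a random split with a split that holds out the largest values of a single feature, so that every test point lies outside the training range. It uses only the Python standard library and does not use astartes or its API. The function names and the synthetic feature values are hypothetical and chosen for illustration only; they are not taken from the notebook or the fast food dataset.

```python
import random


def random_split(values, test_fraction=0.25, seed=0):
    """Random split: test points are drawn uniformly from the data."""
    rng = random.Random(seed)
    shuffled = values[:]
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def sorted_holdout_split(values, test_fraction=0.25):
    """Extrapolative split: the largest values are held out for testing."""
    ordered = sorted(values)
    n_test = max(1, int(len(ordered) * test_fraction))
    return ordered[:-n_test], ordered[-n_test:]


def fraction_outside_train_range(train, test):
    """Share of test points that fall outside [min(train), max(train)]."""
    lo, hi = min(train), max(train)
    outside = [v for v in test if v < lo or v > hi]
    return len(outside) / len(test)


# Synthetic one-dimensional feature (assumption: illustrative values only).
values = [float(v) for v in range(0, 200, 10)]

train_r, test_r = random_split(values)
train_e, test_e = sorted_holdout_split(values)

print("random split, fraction outside training range:",
      fraction_outside_train_range(train_r, test_r))
print("sorted holdout, fraction outside training range:",
      fraction_outside_train_range(train_e, test_e))
```

With the sorted holdout, a model is evaluated only on values larger than any it saw during training, which is the extrapolation setting. With the random split, test points typically fall among the training points, which is closer to interpolation.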
Stylo2gg: Visualizing reproducible stylometric analysis
James Clawson
For researchers using R to do work in stylometry, the Stylo package is
indispensable, but it also has some limitations. The Stylo2gg package addresses
some of these limitations by extending the usefulness of Stylo. Among other
things, Stylo2gg adds logging and replication of analyses, keeping necessary
files and introducing a systematic way to reproduce past work. With visualization
as its initial purpose, Stylo2gg also makes exploring stylometric data easy,
providing options for labeling, highlighting subgroups, and double coding data
for added legibility in black and white or in color. Finally, as hinted by the
name, the conversion of graphics from base R into ggplot2 changes the style of
the output and introduces more options to extend analyses with many other
packages and add-ons. The reproducible notebook shown here, 01-stylo2gg.qmd or
rendered in 01-stylo2gg.html, walks through much of the package, including many
features that were added in the past year.