Expand Your Data Science Toolkit with Our Latest Math and Stats Must-Reads

Published in Towards Data Science · 4 min read · Apr 25, 2024

Feeling inspired to write your first TDS post? We’re always open to contributions from new authors.

The fundamental principles of math that data scientists use in their day-to-day work may have been around for centuries, but that doesn’t mean we should approach the topic as if we only learn it once and then store away our knowledge in some dusty mental attic. Practical approaches, tools, and use cases evolve all the time, and with them comes the need to stay up to date.

This week, we’re thrilled to share a strong lineup of recent math and stats must-reads, covering a wide range of questions and applications. From leveraging (very) small datasets to presenting linear regressions in accessible, engaging ways, we’re sure you’ll find something new and useful to explore. Let’s dive in!

N-of-1 Trials and Analyzing Your Own Fitness Data
The idea behind N-of-1 studies is that you can draw meaningful insights even when the data you’re using is based on input from a single person. It has far-reaching potential for designing individualized healthcare strategies, or, in the case of Merete Lutz’s fascinating project, establishing meaningful connections between alcohol consumption and sleep quality.
How Reliable Are Your Time Series Forecasts, Really?
Making long-term predictions is easy; making accurate long-term predictions is, well, less so. Bradley Stephen Shaw recently shared a useful guide to help you determine the reliability horizon of your forecasts through the effective use of cross-validation, visualization, and statistical hypothesis testing.
Building a Math Application with LangChain Agents
Despite the major strides LLMs have made in the past couple of years, math remains an area they struggle with. In her latest hands-on tutorial, Tahreem Rasul unpacks the challenges we face when we try to make these models execute mathematical and statistical operations, and outlines a solution for building an LLM-based math app using LangChain agents, OpenAI, and Chainlit.

A Proof of the Central Limit Theorem
It’s always a joy to see an abstract concept take concrete shape and, along the way, become much more accessible and intuitive for learners. That’s precisely what Sachin Date accomplishes in his latest deep dive, which shows us the inner workings of the central limit theorem, “one of the most far-reaching and delightful theorems in statistical science,” through the example of… candy!
8 Plots for Explaining Linear Regression to a Layman
Even if you, a professional data scientist or ML engineer, fully grasp the implications of your statistical analyses, chances are many of your colleagues and other stakeholders won’t. This is where strong visualizations can make a major difference, as Conor O'Sullivan demonstrates with eight different residual, weight, effect, and SHAP plots that explain linear regression models effectively.
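
For readers who would like to see the central limit theorem at work before reading the full proof, here is a minimal simulation sketch. It is not taken from Sachin Date’s article: the skewed exponential "candies per bag" distribution, the sample sizes, and every name in the code are assumptions chosen purely for illustration. The idea is that averages of many independent draws should cluster around the population mean, with a spread close to the population standard deviation divided by the square root of the sample size.

```python
import random
import statistics

# Hypothetical population: candies per bag, modeled (as an assumption) by a
# skewed exponential distribution with mean 10 and standard deviation 10.
POPULATION_MEAN = 10.0
POPULATION_SD = 10.0


def draw_candy_count(rng):
    """Draw one value from the assumed skewed population."""
    return rng.expovariate(1 / POPULATION_MEAN)


def sample_means(sample_size, num_samples, rng):
    """Return the mean of each of `num_samples` independent samples."""
    return [
        statistics.fmean(draw_candy_count(rng) for _ in range(sample_size))
        for _ in range(num_samples)
    ]


rng = random.Random(0)  # fixed seed so the run is reproducible
sample_size = 50
means = sample_means(sample_size, num_samples=5000, rng=rng)

mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)

# Standard error suggested by the theorem: sigma / sqrt(n).
standard_error = POPULATION_SD / sample_size ** 0.5

# Share of sample means within 1.96 standard errors of the population mean;
# for an approximately normal distribution this should be close to 95%.
share_within = sum(
    abs(m - POPULATION_MEAN) <= 1.96 * standard_error for m in means
) / len(means)

print(f"mean of sample means: {mean_of_means:.3f}")
print(f"sd of sample means:   {sd_of_means:.3f} (theory: {standard_error:.3f})")
print(f"share within 1.96 SE: {share_within:.3f}")
```

Although the individual draws are heavily skewed, the sample means should look roughly bell-shaped; raising `sample_size` tightens the match with the theoretical standard error.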

Looking to branch out beyond math and stats this week? We hope so! Here are some of our best recent reads on other topics:

If you’re thinking of giving back to the community by contributing to an open-source project, don’t miss Mike Clayton’s terrific recap of his experience fixing bugs in the ever-popular pandas library.
Climate change might be the defining global challenge we face today; Thu Vu shares a helpful data-backed perspective on its magnitude, and reflects on AI’s potential to help us mitigate some of its consequences.
For anyone in the mood for some hands-on tinkering, we strongly recommend Alison Yuhan Yao’s new tutorial on semiautomatic image-segmentation labeling, based on a recent project that focused on runway show images.
Robust unit-testing practices are common among software developers; Jonathan Serrano advocates for their wider adoption in data science and machine learning workflows, too, and explains how this kind of upfront investment can pay off in the long run.
ML product managers are paying a lot of attention to the technical infrastructure powering their tools, but as Janna Lipenkova stresses, it’s equally crucial to ensure those tools offer a smooth user experience.
It’s no secret that the current job market is challenging for many data professionals. Erin Wilson’s visual recap of her recent journey offers a healthy dose of inspiration, along with pragmatic insights, to support you in your job search.
What will it take to push humanoid robots into the mainstream on assembly lines? Nikolaus Correll reports from the forefront of robotics innovation, and looks at how recent advances in AI might drive a major shift in the field.

Thank you for supporting the work of our authors! We love publishing articles from new authors, so if you’ve recently written an interesting project walkthrough, tutorial, or theoretical reflection on any of our core topics, don’t hesitate to share it with us.

Until the next Variable,

TDS Team

Written by TDS Editors