Ask HN: Anyone working in traditional ML/stats research instead of LLMs?
16 points by itsmekali321 16 hours ago | 4 comments
I am curious about those who are working in the machine learning or statistics domain but are focusing on traditional ML research rather than large language models (LLMs).

What specific areas or projects are you currently working on?

Thank You!






I work on machine learning applied to sensor data, for understanding physical phenomena: time-series models for condition monitoring of technical machinery (specifically HVAC in buildings) at https://soundsensing.no, and also running ML inference directly on microcontroller-based sensors via the open-source project emlearn - https://github.com/emlearn/emlearn
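This isn't the commenter's actual pipeline, but a minimal sketch of the condition-monitoring idea: compute rolling-window statistics over a sensor stream and flag readings that deviate sharply from recent behaviour. All names and thresholds here are illustrative assumptions, not part of soundsensing.no or emlearn:

```python
from collections import deque
from statistics import mean, stdev

def monitor(stream, window=32, z_limit=4.0):
    """Flag sensor readings whose z-score against a rolling window
    exceeds z_limit.

    stream: iterable of (timestamp, value) pairs.
    Yields the (timestamp, value) pairs judged anomalous.
    """
    buf = deque(maxlen=window)
    for t, x in stream:
        if len(buf) == window:
            mu, sd = mean(buf), stdev(buf)
            if sd > 0 and abs(x - mu) / sd > z_limit:
                yield (t, x)
        buf.append(x)

# Illustrative use: a steady signal with one injected spike.
readings = [(i, [1.0, 1.01, 1.02][i % 3]) for i in range(100)]
readings[50] = (50, 10.0)
alerts = list(monitor(readings))
```

A real deployment on a microcontroller would use fixed-point arithmetic and a trained model rather than a z-score rule, but the rolling-window structure is the same.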

Statistical modelling is largely unrelated to machine learning in its ideology. If you're a professional statistician, you're most likely working in a function that relies heavily on randomised experiment design or, less frequently, observational designs. This includes the hard sciences, actuarial science, finance (risk), manufacturing, and poll/census research.

The main open-source language for serious statisticians is R. You can Google for the sorts of jobs requiring R as a marker if you're interested in applications of statistics unrelated to LLMs.

To answer your question about classical ML: you can Google for jobs requiring the specific classical ML techniques you're interested in as a marker.


There’s still a lot of active research in traditional ML areas that LLMs haven’t solved. Causal inference, robustness to distribution shifts, and adversarial resilience remain open challenges. Continual and online learning, where models adapt without forgetting, are crucial for real-world deployment. Multi-modal learning beyond text, especially fusing vision, time series, and structured data, is another tough frontier. Interpretability, especially in high-stakes domains, still requires far more than attention maps. LLMs are impressive, but they haven’t made most of classical ML research obsolete.

I'm building a tool to make ML on tabular data (forecasting, imputation, etc.) easier and more accessible. The goal is to go from zero to a basic working model in minutes, even if the initial model is not perfect, and then improve it iteratively, evaluating each step with metrics and comparisons against the previous model. So it's less ML foundations research and more about packaging it in a user-friendly way with a nice workflow - if that sounds interesting, feel free to reach out (email in profile).
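The commenter doesn't name the tool or its API, but the workflow they describe can be sketched in a few lines: start from a trivial baseline forecaster, then accept a candidate model only if it improves a held-out error metric. The function names and models below are purely illustrative:

```python
from statistics import mean

def mae(y_true, y_pred):
    # Mean absolute error, the metric used to compare models here.
    return mean(abs(a - b) for a, b in zip(y_true, y_pred))

def naive_forecast(train, horizon):
    # Baseline: repeat the last observed value.
    return [train[-1]] * horizon

def mean_forecast(train, horizon):
    # Candidate: predict the training mean at every step.
    m = mean(train)
    return [m] * horizon

def improve(train, test, candidates):
    """Evaluate candidate models on a held-out split, keeping a
    candidate only when it lowers the error of the best model so far."""
    best_fn, best_err = None, float("inf")
    for fn in candidates:
        err = mae(test, fn(train, len(test)))
        if err < best_err:
            best_fn, best_err = fn, err
    return best_fn, best_err

# Illustrative use: on a trending series the last-value baseline
# beats the training-mean candidate.
series = list(range(20))
train, test = series[:15], series[15:]
best, err = improve(train, test, [naive_forecast, mean_forecast])
```

The point of the "evaluate every step" discipline is that each change to the model is only kept if the metric confirms it helped, which is the workflow the tool aims to automate.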


