Day 43 of 100 Days of AI

Logistic regression thresholds.

A few weeks back I built a number of logistic regression models without quite appreciating the impact of the classification threshold you set. Reading a few pages of a chapter on assessing ML model performance helped me close that gap today.

It turns out that if you lower the classification threshold, you increase the chances of correctly identifying positive cases (the True Positives). This comes at the cost of more False Positives. In some cases, though, that trade-off is worth making.

The book shares the example of Ebola cases. In that situation you would rather have a lower classification threshold and increase your True Positive Rate (i.e. recall) at the expense of more False Positives. This ensures you catch the maximum number of Ebola cases. A less morbid example is venture capital. If you had a startup success prediction model for early-stage companies, you would be better off with a lower threshold, since that limits the chances of missing out on a potential outlier success.
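To make that concrete, here is a minimal sketch (the labels and predicted probabilities are made up, not the book's) showing how lowering the threshold raises recall at the cost of extra False Positives:

```python
import numpy as np

# Made-up true labels and predicted probabilities from a logistic model
y_true = np.array([1, 1, 1, 0, 0, 0, 1, 0])
y_prob = np.array([0.9, 0.6, 0.4, 0.3, 0.2, 0.55, 0.35, 0.1])

def confusion_counts(y_true, y_prob, threshold):
    """Return (TP, FP, FN) at a given classification threshold."""
    y_pred = (y_prob >= threshold).astype(int)
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    return tp, fp, fn

for threshold in (0.5, 0.3):
    tp, fp, fn = confusion_counts(y_true, y_prob, threshold)
    print(f"threshold={threshold}: TP={tp}, FP={fp}, recall={tp / (tp + fn):.2f}")
```

Dropping the threshold from 0.5 to 0.3 here catches every positive case (recall goes to 1.0), but the False Positives double: exactly the Ebola-style trade-off.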

I’ll continue with this thought and more reading tomorrow.

Day 42 of 100 Days of AI

Naive Bayes. This is a supervised machine learning classification technique that uses Bayes’ theorem to make classification predictions. Today I worked through a simple example of the technique and also found this explainer of Bayes’ theorem especially useful, along with an explanation by ChatGPT4.

In machine learning, Naive Bayes comes in a variety of types. The simplest one—multinomial Naive Bayes—is especially useful when you have discrete data. The example in this book is a fantastic place to start, and it’s what I used to grasp the concept: a simple email spam filter that uses the probabilities of words in regular messages versus spam to predict whether a specific email is spam or not.
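As a sketch of that spam-filter idea (the word counts and priors below are invented, not the book's), a multinomial Naive Bayes classifier just compares each class's prior multiplied by the per-word probabilities, usually in log space to avoid numerical underflow. In practice you would reach for a library implementation such as scikit-learn's MultinomialNB:

```python
import math
from collections import Counter

# Made-up word counts observed in normal messages vs spam
normal = Counter({"dear": 8, "friend": 5, "lunch": 3, "money": 1})
spam = Counter({"dear": 2, "friend": 1, "money": 4, "viagra": 3})
vocab = set(normal) | set(spam)

def log_score(message, counts, prior, alpha=1.0):
    """log P(class) + sum of log P(word | class), with Laplace smoothing."""
    total = sum(counts.values())
    score = math.log(prior)
    for word in message:
        score += math.log((counts[word] + alpha) / (total + alpha * len(vocab)))
    return score

message = ["dear", "money", "money"]
s_normal = log_score(message, normal, prior=0.7)
s_spam = log_score(message, spam, prior=0.3)
verdict = "spam" if s_spam > s_normal else "normal"
print(verdict)  # the repeated "money" tips this message towards spam
```

The Laplace smoothing (`alpha`) matters: without it, a single word never seen in one class would zero out that class's entire probability.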

I’m currently 45% of the way through this book and will continue tackling a chapter at a time, while also doing python workouts to keep my coding knowledge tight.

Day 41 of 100 Days of AI

One of the great things about continuous learning in this age is the variety of educational material you can find. I’ve found that consuming YouTube videos, online courses, blogs, and books has helped me learn concepts more dynamically.

For example, today I went over logistic regression again, this time using a book, and realised I’d missed a crucial aspect: when you fit a curve to the data, you search for a curve or plane that maximises likelihood. This is different from probability. Likelihood here is used to find the curve or plane that best fits the distribution of the data—a tricky concept for a beginner like me, but an important one for appreciating how logistic models actually work. (The likelihood—much like errors in linear regression—tells us how well our model fits the observed data.)
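A tiny sketch of what "maximising likelihood" means (the data points are my own, purely illustrative): each candidate curve assigns a probability to every observed label, and the likelihood multiplies those probabilities together, so a curve that fits the data well scores higher. Working in log space:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(w, b, xs, ys):
    """Sum of log P(observed label | x) under the curve p = sigmoid(w*x + b)."""
    ll = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(w * x + b)
        ll += math.log(p) if y == 1 else math.log(1.0 - p)
    return ll

xs = [0.5, 1.5, 2.0, 3.0]
ys = [0, 0, 1, 1]

# A curve that separates the two classes well beats a nearly flat one
good = log_likelihood(2.0, -3.5, xs, ys)
flat = log_likelihood(0.1, 0.0, xs, ys)
print(good > flat)  # True
```

Fitting a logistic regression is then just a search for the `w` and `b` that make this number as large as possible.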

This is the benefit of circling back to concepts using different learning materials. You catch things you might have missed before.

Day 40 of 100 Days of AI

Gradient descent is one of those machine learning concepts that looked intimidating to me. But as is the case with learning anything new, starting with the simplest version of a concept can provide the building blocks required for more complex understanding.

Today I took such a step (pun intended, for those who already know gradient descent) by reading through an example that uses gradient descent to estimate the intercept and slope of a straight line. It’s a lengthy example and I only had a few minutes to review it, but with more time tomorrow I’ll continue taking iterative steps towards understanding gradient descent.
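The core idea fits in a few lines (the data points and learning rate below are my own, not the book's): compute the partial derivative of the sum of squared residuals with respect to each parameter, then nudge both parameters downhill and repeat:

```python
# Made-up data: fit y = intercept + slope * x by gradient descent
xs = [0.5, 2.3, 2.9]
ys = [1.4, 1.9, 3.2]

intercept, slope = 0.0, 1.0  # arbitrary starting guesses
learning_rate = 0.01

for _ in range(2000):
    # Partial derivatives of the sum of squared residuals
    d_intercept = sum(-2 * (y - (intercept + slope * x)) for x, y in zip(xs, ys))
    d_slope = sum(-2 * x * (y - (intercept + slope * x)) for x, y in zip(xs, ys))
    intercept -= learning_rate * d_intercept
    slope -= learning_rate * d_slope

print(round(intercept, 2), round(slope, 2))
```

After a couple of thousand small steps the parameters settle at the same line ordinary least squares would give directly (roughly intercept 0.95, slope 0.64 for these points); gradient descent earns its keep on problems where no closed-form answer exists.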

Day 39 of 100 Days of AI

Today I reviewed linear regressions and also flicked through this fascinating paper that demonstrated how you can fool machine vision models with a pair of special glasses.

In the example below, the man in the top left picture wore glasses that could fool neural networks into thinking he was actress Milla Jovovich.

The paper is now old by AI standards and these attacks have probably been mitigated in the latest models. However, it’s a telling example of how machine learning models can be “hacked” into making false predictions.

Day 38 of 100 Days of AI

Today I went through some concepts I’m already familiar with: sum of the squared residuals, mean squared error, and R-squared. These are important in assessing how good a machine learning model is. Tomorrow I’ll continue revising some key mathematical concepts before getting back into other applied ML basics.
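For reference, all three metrics can be computed in a few lines (the actuals and predictions below are invented to illustrate):

```python
def regression_metrics(y_true, y_pred):
    n = len(y_true)
    mean_y = sum(y_true) / n
    ssr = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # sum of squared residuals
    sst = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    mse = ssr / n                                            # mean squared error
    r2 = 1 - ssr / sst                                       # R-squared
    return ssr, mse, r2

# Made-up actual values and model predictions
y_true = [3.0, 5.0, 7.0, 9.0]
y_pred = [2.8, 5.1, 7.4, 8.7]
ssr, mse, r2 = regression_metrics(y_true, y_pred)
print(f"SSR={ssr:.2f}, MSE={mse:.4f}, R^2={r2:.3f}")
```

An R-squared close to 1 means the model explains almost all of the variation around the mean; here the residuals are tiny relative to the spread of the data.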

Day 37 of 100 Days of AI

I went through a primer on histograms today, and reviewed Binomial and Poisson distributions. To my surprise, even these simple school maths concepts underlie some of the most valuable machine learning classification algorithms we use today.
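Both distributions have short closed-form formulas, so they are easy to sketch from scratch (my own quick implementation, not from the primer):

```python
import math

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials, each succeeding with probability p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(exactly k events occurring when lam events happen on average)."""
    return lam**k * math.exp(-lam) / math.factorial(k)

# e.g. probability of exactly 2 heads in 4 fair coin flips
print(binomial_pmf(2, 4, 0.5))  # 0.375
```

The Binomial counts successes over a fixed number of trials; the Poisson counts events over a fixed interval when only the average rate is known.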

Tomorrow I’ll continue working through key statistics methods used in machine learning.

Day 36 of 100 Days of AI

I’m travelling for the next 6 days without my laptop so I’ll keep the posts very short given I’m on mobile.

No laptop means no code. However, I’ll work through this wonderful visual introduction to ML from the YouTube channel, StatQuest.

Key takeaways from the first two chapters:

  • Supervised machine learning is broadly just about two things: classification predictions (e.g. a binary prediction of whether a particular email is spam or not) and regressions (e.g. given some number of variables, can we predict a house price?)
  • Unsupervised machine learning goes beyond that (e.g. clustering algorithms, neural networks).
  • There are so many machine learning techniques that choosing which ones to use very much depends on the problem. We can also use techniques like cross-validation (k-fold, or leave-one-out etc) to measure which technique provides the best models.
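As a quick sketch of the k-fold idea mentioned above (my own minimal version; libraries like scikit-learn provide this ready-made as KFold): split the n examples into k folds, then let each fold take one turn as the test set while the remaining folds train the model. Leave-one-out is just the special case k = n.

```python
def k_fold_indices(n, k):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

for train, test in k_fold_indices(6, 3):
    print(f"train={train}, test={test}")
```

Averaging a model's score across the k held-out folds gives a fairer estimate of real-world performance than a single train/test split.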

Day 35 of 100 Days of AI

Precision & Recall (aka Sensitivity)

On classification models for ML, I still confuse precision for recall and vice versa. So it was good today that I came across this handy chart that illustrates the differences.

Chart from Zeya, 2021

A good precision score is really important in areas where the cost of a false positive is high. For example, in email spam detection it’s bad to falsely flag an email from a friend as spam. False positives should be minimised as much as possible in this case. The precision score — true positives as a share of all predicted positives (true positives + false positives) — is good to know in such cases.

A good recall (or sensitivity) score, on the other hand, is worth knowing particularly in areas where the cost of a false negative is very high. For example, if we built a model that could predict which companies were going to succeed, we wouldn’t want it to falsely tell us a company is going to fail, causing us to miss out on investing in it. The recall score — true positives as a share of all actual positives — is key in that example.
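Both scores fall straight out of the confusion-matrix counts; here is a tiny sketch with invented numbers for the spam example:

```python
def precision_recall(tp, fp, fn):
    precision = tp / (tp + fp)  # of everything we flagged positive, how much was right?
    recall = tp / (tp + fn)     # of all actual positives, how many did we catch?
    return precision, recall

# Made-up spam filter results: 40 spam emails correctly flagged,
# 5 legitimate emails wrongly flagged, 10 spam emails missed
precision, recall = precision_recall(tp=40, fp=5, fn=10)
print(f"precision={precision:.2f}, recall={recall:.2f}")
```

A spam filter cares most about the first number; a startup-picking model cares most about the second.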

I also came across this chart on Linkedin which explains the same concept even more visually.

Day 34 of 100 Days of AI

I’m sceptical about “prompt engineering” being a deep, specialised thing. It felt a bit like a gimmicky new job title riding the wave of generative AI hype. Why? Because writing LLM prompts isn’t an exact science or precise engineering (more on this here). However, I’m increasingly changing my view on how credible the skill really is.

Today, I read this article about innovation through prompting, which shares some cool ideas on how to use prompts and LLMs in education. What stood out to me though — and what I also think might happen — was this line in the article:

“…increasingly AIs will just prompt themselves to solve your problem based on your goals.”

So actually, will prompt engineering really be a thing or will AI agents just do it?

We’ll find out soon enough.