Gradient descent is one of those machine learning concepts that looked intimidating to me. But as is the case with learning anything new, starting with the simplest version of a concept can provide the building blocks required for more complex understanding.
Today I took such a step (pun intended for those who already know gradient descent) by reading through an example that uses gradient descent to estimate the intercept and slope of a straight line. It’s a lengthy example and I had just a few minutes to review it, but with more time tomorrow, I will continue to take iterative steps towards understanding gradient descent.
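For future reference, here’s a minimal sketch of that kind of example: gradient descent estimating the intercept and slope of a straight line. The data, starting values, and learning rate are all made up for illustration.

```python
import numpy as np

# Made-up data scattered around the line y = 2x + 1
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.1, 4.9, 7.2, 9.1, 10.8])

intercept, slope = 0.0, 0.0  # arbitrary starting guesses
learning_rate = 0.01

for _ in range(1000):
    residuals = y - (intercept + slope * x)
    # Gradients of the sum of squared residuals with respect to each parameter
    d_intercept = -2 * residuals.sum()
    d_slope = -2 * (x * residuals).sum()
    # Step each parameter a little way downhill
    intercept -= learning_rate * d_intercept
    slope -= learning_rate * d_slope

print(intercept, slope)  # converges towards roughly 1 and 2
```

Each iteration is one small “step” downhill on the error surface, which is exactly where the method gets its name.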
Today I reviewed linear regression and also flicked through this fascinating paper, which demonstrates how you can fool machine vision models with a pair of special glasses.
In the example below, the man in the top left picture wore glasses that could fool neural networks into thinking he was actress Milla Jovovich.
The paper is now old by AI standards and these attacks have probably been mitigated in the latest models. However, it’s a telling example of how machine learning models can be “hacked” into making false predictions.
Today I went through some concepts I’m already familiar with: sum of the squared residuals, mean squared error, and R-squared. These are important in assessing how good a machine learning model is. Tomorrow I’ll continue revising some key mathematical concepts before getting back into other applied ML basics.
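To jog my memory, here’s a quick sketch computing all three on made-up numbers (both the observations and the predictions are hypothetical):

```python
import numpy as np

# Hypothetical observations and model predictions
y = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.8, 5.3, 6.9, 9.2])

ssr = np.sum((y - y_pred) ** 2)         # sum of the squared residuals
mse = ssr / len(y)                      # mean squared error
ss_total = np.sum((y - y.mean()) ** 2)  # total variation around the mean
r_squared = 1 - ssr / ss_total          # share of variation explained by the model

print(ssr, mse, r_squared)
```

An R-squared close to 1 means the model explains most of the variation around the mean, which is the quickest sanity check of the three.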
I went through a primer on histograms today, and reviewed the Binomial and Poisson distributions. To my surprise, even these simple school maths concepts underlie some of the most valuable machine learning classification algorithms we use today.
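As a quick refresher, here’s a sketch of both distributions using scipy.stats (the scenarios are hypothetical):

```python
from scipy.stats import binom, poisson

# Binomial: probability of exactly 3 heads in 10 fair coin flips
print(binom.pmf(k=3, n=10, p=0.5))  # ~0.117

# Poisson: probability of exactly 2 events in an interval that averages 4 events
print(poisson.pmf(k=2, mu=4))       # ~0.147
```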
Tomorrow I’ll continue working through key statistics methods used in machine learning.
Supervised machine learning is broadly about two things: classification (e.g. a binary prediction of whether a particular email is spam or not) and regression (e.g. given some number of variables, can we predict a house price?).
There are so many machine learning techniques that choosing which one to use very much depends on the problem. We can also use techniques like cross-validation (k-fold, leave-one-out, etc.) to measure which technique produces the best model, as in the sketch below.
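Here’s a minimal sketch of 5-fold cross-validation with scikit-learn, using its built-in iris dataset purely as an illustration (not tied to any particular problem):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=1000)

# k-fold cross-validation: train on k-1 folds, score on the held-out fold, repeat
scores = cross_val_score(model, X, y, cv=5)
print(scores.mean())  # average accuracy across the 5 folds
```

Swapping in a different model and comparing the averaged scores is the basic way to let the data pick the technique.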
On classification models for ML, I still confuse precision with recall and vice versa. So it was good today that I came across this handy chart that illustrates the differences.
A good precision score is really important in areas where the cost of a false positive is high. For example, in email spam detection it’s bad to falsely flag an email from a friend as spam, so false positives should be minimised as much as possible. The precision score, the proportion of predicted positives that are actually positive (true positives / (true positives + false positives)), is good to know in such cases.
A good recall (or sensitivity) score, on the other hand, is worth knowing in areas where the cost of a false negative is very high. For example, if we built a model that could predict which companies were going to succeed, we wouldn’t want a model that falsely tells us a company is going to fail, causing us to miss out on investing in it. The recall score, the proportion of actual positives the model correctly identifies (true positives / (true positives + false negatives)), is key in that example.
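To keep the two straight, here’s a minimal sketch using scikit-learn with made-up labels (the numbers are purely illustrative):

```python
from sklearn.metrics import precision_score, recall_score

# Made-up labels: 1 is the positive class (e.g. "spam" or "company succeeds")
y_true = [1, 0, 1, 1, 0, 0, 1, 1]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

# Precision: of everything predicted positive, how much really was positive?
print(precision_score(y_true, y_pred))  # TP=3, FP=1 -> 3/4 = 0.75

# Recall: of all actual positives, how many did the model catch?
print(recall_score(y_true, y_pred))     # TP=3, FN=2 -> 3/5 = 0.6
```

With these labels the model only catches 3 of the 5 actual positives, so recall (0.6) lags precision (0.75).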
I also came across this chart on LinkedIn, which explains the same concept even more visually.
I’m sceptical about “prompt engineering” being a deep, specialised skill. It felt a bit like a gimmicky new job title riding the wave of generative AI hype. Why? Because writing LLM prompts isn’t an exact science or precise engineering (more on this here). However, I’m increasingly changing my view on how credible the skill really is.
Today, I read this article about innovation through prompting, which shares some cool ideas on how to use prompts and LLMs in education. What stood out to me though — and what I also think might happen — was this line in the article:
AI is going to displace lots of jobs. So should we all just retrain away from our non-AI work and specialise to work exclusively on AI?
I completed the “AI for Everyone” course today, and near the end, AI expert Andrew Ng shared a story that will be relevant to a variety of careers.
A radiologist just starting out asked Andrew for some career advice. Since AI can now read x-rays, should the radiologist just quit their profession and go into AI instead?
Andrew’s advice was pragmatic: No, you don’t have to.
Yes, you can switch careers if you wish. Some people do so successfully and there are lots of online resources that can help you retrain without spending lots of money.
However, it’s worth considering the alternative: stay in the radiology profession but also learn enough about AI to operate uniquely at the intersection of radiology and AI. This combination could be hugely more valuable.
Moreover, AI isn’t yet good enough to replace radiologists completely. In addition, we are facing a global radiologist shortage!
Similar advice can apply to other careers. If you’re training to be a lawyer, learn enough about AI to be a leader at the intersection of law and technology. If you’re a journalist, learn enough about AI to use it in your work to create more value for the public. In my case, I’ve been operating at the intersection of technology and finance for a while, but increasingly, I’m also exploring how AI can be leveraged in the venture capital profession.
P.S. Here’s a radiologist’s take. He offers another pragmatic view, posing questions such as, “who’s liable for the conclusions of the AI software?”, “how much does it really help?” (good radiologists can read x-rays fast anyway), and “do patients trust it?”, and he concludes that he doesn’t think AI will replace him in the near term.
P.P.S. This video below offers another take. The summary is: (1) AI will augment (not replace) doctors and radiologists, allowing them to focus on more complex situations. (2) Radiologists will have to adapt and use AI to do better work. (3) We need to use AI responsibly and ethically in such a life-critical field.
In today’s “AI for Everyone” course material, a key highlight was an insight from Andrew Ng. He’s worked on tons of AI projects at companies like Google and Baidu, and with clients in traditional industries.
Usually, people want to dive right into strategy before building things. But his view is that if you want to develop a great AI strategy, do it AFTER you’ve had a chance to experiment and build several pilot projects. Don’t start with strategy. Start with pilots.
It’s hard to appreciate what AI can and can’t do if you haven’t yet built a series of successful small experiments. Only then does it make sense to step back and consider a broader vision and strategy of how AI can add value to a business.
So start early. Start small. Get some quick wins. And only after that is it good to consider a broader AI strategy.
In today’s “AI for Everyone” video, I went through a section on a framework you can use to figure out what AI projects are worth pursuing in a business (rather than research) setting. The framework is simple. Before taking on a project, you work to answer two questions:
(1) Is this AI project feasible?
(2) Is this project valuable?
The first question involves technical due diligence and figuring out whether AI can actually accomplish a particular goal with sufficient performance.
The second question involves business diligence and ascertaining the value an AI project can create, whether that’s financial or otherwise.
Sometimes you know the answers to these questions right away. At other times, you have to build and test proofs-of-concept to get closer to a meaningful answer.