July 26, 2024
If I ask you to describe what Artificial Intelligence is, what would you say?
I bet many would answer with "ChatGPT".
Ok, maybe not everyone.
Some of you might know (or have heard) about machine learning (or deep learning, or neural networks).
Those answers would be correct.
But only partially. There's much more than that in AI.
In two posts we aim to cover the important ground required to bring you up to speed on every branch of AI, both mainstream and experimental.
We have built a unique AI ecosystem for investing at Pantar.ai, and it is not what you might expect. So we are here to share our explorations and discoveries (well, without giving away our secret sauce).
We bet that after reading the two posts you will be the coolest person around a dinner table this summer.
So what do we mean by AI, and how many computer activities are related to it?
AI branches
AI has four main branches (and several sub-branches), summarized below.
We split the essay in two parts. In this first one, we will explore machine learning and robotics.
In the second one we will focus on fuzzy logic and expert systems: they are lesser known, less mainstream, and very cool thanks to emerging technology coming from other parts of AI. We use some of these models here.
All branches in summary
-- Machine learning (ML) allows computers to make predictions on data.
-- Robotics studies the creation of intelligent robots or machines, powered by AI software, that perform autonomous tasks.
-- Fuzzy logic is a method of computer reasoning (one that resembles human reasoning) for making decisions based on imprecise, non-numerical information in ambiguous contexts.
-- Expert systems are applications that use AI techniques to simulate the judgment and behavior of human experts, solving problems that normally require narrow, deep knowledge and experience in specific fields without delegating everything to statistics (as some other branches of AI do).
Ready to take a deeper look at machine learning and robotics?
Sure.
But first, what is AI?
Artificial intelligence (= AI) is a set of technologies that enable computers to perform a variety of advanced functions resembling human activities, including the ability to see, understand and translate spoken and written language, analyze data, make recommendations, and more.
How can AI... happen, then? There is much less magic and much more advanced statistics than you may think.
1. Robotics
Robotics and artificial intelligence started as two very different disciplines. Robotics is a branch of engineering/technology focused on constructing and operating robots (= hardware).
Robots are programmable machines that can autonomously or semi-autonomously carry out a task by using sensors to interact with the physical world. They are capable of movement and must be programmed with software to perform a task.
AI has brought about a paradigm shift in robotic sensing and perception capabilities. Traditional robots often relied on pre-programmed instructions and limited sensor inputs. With AI, by contrast, robots can interpret and respond to their surroundings in real time.
Machine learning algorithms (we will talk about them in a second) enable robots to adapt to dynamic environments, recognize objects, and make informed decisions based on sensory data, fostering a more intuitive and responsive interaction with the world.
If you want to know more, check here; you can also see successful applications of AI and robotics here.
2. Machine learning
Machine learning (ML) is a field of computer science (and a branch of AI) that allows computers to learn and then predict... without being explicitly programmed.
It involves training algorithms (= algorithms are a set of rules to be followed in calculations or other problem-solving operations) on large amounts of data so they can identify patterns and make predictions on new and unseen data.
Here's a breakdown of how machine learning works:
Confused?
Let's recap: in ML we supply a dataset (say, pictures) that includes input and output information (input: pictures of different things; output: this is a cat, this is a dog, this is a table...) to a specific ML model (a coded piece of logic) so it can recognize patterns (= recognize whether a picture shows a dog or a cat).
After training the model (showing it as many pictures, with as many labels, as possible), the model will have "learned". At that point we give the trained model a new input (a new picture) without an output... asking it to predict whether that's a cat or a dog.
The ML model will easily answer.
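If code speaks to you more than prose, here is a minimal Python sketch of that train-then-predict loop using the open-source scikit-learn library (the features and labels below are made up purely for illustration):

```python
# A toy sketch of the train-then-predict loop described above.
# The "pictures" here are just tiny made-up feature vectors
# (e.g., weight in kg, ear length in cm), because real image
# handling would obscure the idea.
from sklearn.tree import DecisionTreeClassifier

# Input: features of animals we have already labeled.
X_train = [[4.0, 7.0], [5.0, 6.5], [30.0, 12.0], [28.0, 11.0]]
# Output: the labels we supply during training.
y_train = ["cat", "cat", "dog", "dog"]

model = DecisionTreeClassifier()
model.fit(X_train, y_train)       # training: the model "learns" patterns

# A new, unseen input with no label attached...
new_animal = [[29.0, 11.5]]
print(model.predict(new_animal))  # ...and the model predicts: ['dog']
```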
Hope the concept is now clear. And... congratulations on getting to this point. You know the basics of machine learning!
If you are keen to know more, you should also know that:
=> we can use different methods or paradigms to make predictions, depending on the context: supervised, unsupervised, and reinforcement learning;
=> given a chosen paradigm, different routes (or models) can be taken to make predictions.
For those who are curious, here's a full tour of ML predictive algos: what each one does and why.
Special ML models
At this point we know that machine learning is a way to predict new output from certain input data, after some training.
The relatively recent boom in AI (a field that has been around since the 1960s) is due to some specific models. We should talk about them in some detail, as they will unlock further pieces of knowledge that we need to continue this journey.
=> Neural networks and deep learning
Neural networks are a special kind of machine learning model, producing predictive results in an original way:
Neural networks differ from other ML models in that they make more accurate decisions with a high degree of autonomy, learning from experience and previous errors. They leverage their interconnected nodes to progressively discover relationships between input and output.
In other words, neural networks are "more powerful" ML models often (not always) leading to better predictions.
If you struggle to picture what different layers do in a computer model (which would be normal), think about this:
Cool, let's park the concept for a second and move on to deep learning.
Deep learning (or deep learning neural networks) is just a sub-category of neural networks with many hidden layers, where the input information is processed further and for longer, hopefully to achieve a better (predicted) output.
Yes!!
With neural networks, computer systems can predict things with more accuracy and draw more inferences, while requiring more processing power.
There is much more than the above intro to neural networks and deep learning. For example, the process of learning with networks requires activation functions (which tend to be non-linear; see the sketch below) that decide whether a neuron's input to the network is important or not in the process of prediction.
Don't panic: explaining activation functions in depth would be too much here, and you do not need those details unless you are an AI programmer. You can find more details here and here.
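If you are curious what this looks like in code, here is a toy Python sketch of a forward pass through a small network with ReLU activations (the weights are random, so this network is untrained and purely illustrative; real networks adjust these weights during learning):

```python
import numpy as np

# A tiny feed-forward network with two hidden layers, written by hand.
# ReLU is one common non-linear activation function: it passes positive
# values through and zeroes out negative ones, deciding which signals
# matter for the prediction.
def relu(x):
    return np.maximum(0, x)

rng = np.random.default_rng(0)
x = rng.normal(size=3)                          # input vector (3 features)

W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)   # layer 1: 3 -> 4
W2, b2 = rng.normal(size=(4, 4)), np.zeros(4)   # layer 2: 4 -> 4 (hidden)
W3, b3 = rng.normal(size=(1, 4)), np.zeros(1)   # output layer: 4 -> 1

h1 = relu(W1 @ x + b1)   # each layer transforms the previous one...
h2 = relu(W2 @ h1 + b2)  # ..."deep" just means many of these layers
y = W3 @ h2 + b3         # final (untrained) prediction
print(y)
```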
=> Special deep learning networks: RNNs and Transformers
Different deep learning networks achieve better predictions in different fields. Here's a primer on the most famous deep learning networks (RNN, CNN, ANN, transformers).
We know you might feel overwhelmed by the information given up to this point. Bear with us, as we are getting closer to the end of the ML journey here.
Let's talk for a second about RNNs (recurrent neural networks).
|| Recurrent neural networks
Unlike traditional neural networks, which process each data point independently, RNNs can take into account the relationships between elements in a sequence, and they outperform other networks in several fields. Sequential data is data where the order of elements matters, such as words, sentences, or time-series data (think daily stock market prices): sequential components interrelate based on complex semantics or rules ("Tom is my cat" and "the cat is owned by Tom" look very similar, but they have completely different meanings).
We need a special neural network to understand sequences. That's the RNN.
The way RNNs work is to save the output of an individual processing node and pass it to the next layer. We can think of each node in the RNN model as a memory cell, useful for improving the accuracy of the learning operations until we get a final output.
Thanks to such "memory", RNNs are behind speech recognition, machine translation, and text generation.
RNNs do not have infinite memory (look up the vanishing gradient problem if interested). A variant of RNNs, the RNN with Long Short-Term Memory (LSTM), is capable of expanding its memory to accommodate a longer timeline.
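Here is a toy Python sketch of the recurrence idea (random weights, no training; just to show how a hidden "memory" state is carried across a sequence):

```python
import numpy as np

# A toy recurrent step: the hidden state h acts as the "memory cell"
# described above, carried from one element of the sequence to the next.
rng = np.random.default_rng(1)
W_x = rng.normal(size=(5, 3))   # weights applied to the current input
W_h = rng.normal(size=(5, 5))   # weights applied to the previous memory

h = np.zeros(5)                       # memory starts empty
sequence = rng.normal(size=(4, 3))    # 4 time steps, 3 features each

for x_t in sequence:
    # The same weights are reused at every step; only the memory changes.
    h = np.tanh(W_x @ x_t + W_h @ h)

print(h)   # the final state summarizes the whole sequence, in order
```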
|| Transformers
They are the neural networks behind large language models, ChatGPT, and the like.
A transformer model is a neural network that learns context and thus meaning by tracking relationships in sequential data like the words in this sentence.
They look exactly like RNNs, right?
Well, transformer models apply an evolving set of mathematical techniques, called attention or self-attention, to detect subtle ways even distant data elements in a series influence and depend on each other.
For example, in the sentence:
She poured water from the bottle to the cup until it was full.
We know “it” refers to the cup, while in the sentence:
She poured water from the bottle to the cup until it was empty.
We know “it” refers to the bottle.
Transformers understand this nuance. RNNs don't.
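To make "attention" slightly more concrete, here is a toy Python sketch of the core computation (the word vectors are random placeholders, so the resulting weights are meaningless here; in a trained transformer they would encode that "it" refers to the cup):

```python
import numpy as np

# Toy self-attention: every word scores its relevance to every other word,
# so even distant elements can influence each other. The vectors below are
# random stand-ins for real learned word embeddings.
rng = np.random.default_rng(2)
words = ["she", "poured", "water", "until", "it", "was", "full"]
E = rng.normal(size=(len(words), 8))           # one 8-dim vector per word

scores = E @ E.T / np.sqrt(8)                  # pairwise relevance scores
weights = np.exp(scores)
weights /= weights.sum(axis=1, keepdims=True)  # softmax: rows sum to 1

# Each word's new representation is a weighted mix of all the others.
contextualized = weights @ E
print(weights[words.index("it")].round(2))     # how much "it" attends to each word
```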
But hey, while Transformers have revolutionized the understanding of human language by computers (see further below), RNNs with LSTM remain better at working with numerical time-series data (what we have in financial markets). More on this here.
Computer vision
Computer vision is a field of computer science that focuses on enabling computers to identify and understand objects and people in images and videos (even in real-life, real-time). Like other types of AI, computer vision seeks to perform and automate tasks that replicate human capabilities - both the way humans see, and the way humans make sense of what they see.
Computer vision is powered by deep learning too (and CNNs specifically). More on it here.
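For a feel of what "seeing" means to a CNN, here is a toy Python sketch of the convolution operation these networks are built on (the filter below is a hand-written vertical-edge detector; real CNNs learn their filters from data):

```python
import numpy as np

# Slide a small filter over an image and respond strongly wherever a
# pattern (here, a vertical edge) appears.
image = np.zeros((6, 6))
image[:, 3:] = 1.0                      # a dark/bright split: a vertical edge

kernel = np.array([[-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0],
                   [-1.0, 0.0, 1.0]])   # classic vertical-edge detector

out = np.zeros((4, 4))
for i in range(4):
    for j in range(4):
        out[i, j] = np.sum(image[i:i+3, j:j+3] * kernel)

print(out)   # large values mark where the edge sits
```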
Natural language processing (NLP)
Deep learning neural networks (RNNs and CNNs) are the pivotal technology behind NLP, powering how computers can interact with us using human language.
NLP teaches computers to understand, interpret, and generate language in a way that is both meaningful and useful. This includes tasks like:
NLP can be divided into two overlapping sub-fields:
-- natural language understanding (NLU), which focuses on semantic analysis or determining the intended meaning of text
-- natural language generation (NLG), which focuses on text generation by a machine.
You can find more insights here.
=> Large language models (LLM)
LLMs are another type of neural network that you might have heard of now that ChatGPT has become so popular (we know, we still need to explain what ChatGPT is exactly). More precisely, a large language model is a deep learning algorithm using Transformers that can perform a variety of natural language processing (NLP) tasks.
LLMs use models trained using massive datasets (hence, large) to recognize, translate, predict, or generate text or other content in many different situations.
LLMs use a particular neural network architecture called a transformer, designed to process and generate data in sequence, like text. Transformers, developed in 2017 by Google researchers, are based on the idea of "attention," whereby certain neurons are more strongly connected to (or "pay more attention to") other neurons in a sequence.
LLMs can predict which word should follow the previous ones so that the text makes sense to humans. What makes LLMs impressive is their ability to generate human-like text in almost any language (including coding languages): these models are a true innovation. Nothing like them has existed before.
More details here and here.
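If you want to try the "predict the next word" behavior yourself, here is a minimal Python sketch using the open-source Hugging Face transformers library with the small, dated GPT-2 model (an arbitrary choice for illustration, far weaker than the models behind ChatGPT; it requires installing the library and downloading the model weights):

```python
# Minimal sketch of next-word generation with an open-source LLM.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator("Artificial intelligence is", max_new_tokens=20)
print(result[0]["generated_text"])   # the model continues the text,
                                     # one predicted token at a time
```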
=> Generative AI
We are almost done with the wide branch of machine learning... but we are still missing a buzzword that you will have heard of: Gen AI.
Generative AI is a broad category of AI that refers to any artificial intelligence that can create original content. Generative AI tools are built on underlying AI models such as a LLM for generating text, or image generation models (like Midjourney and DALL-E).
Within the GenAI category sits ChatGPT.
Here we are. We had to get to this point to properly explain ChatGPT, a chatbot service offered by OpenAI and powered by an LLM (specifically called GPT, or Generative Pre-trained Transformer).
ChatGPT leverages NLP techniques to engage in conversations with users, providing information and completing tasks as instructed.
* * *
That's it on machine learning folks. We have made it.
We have explained machine learning, neural networks, deep learning, RNNs, Transformers, LLMs and finally Gen AI so far.
Congratulations to all of us.
Before closing on ML, you may be curious about the implications of using ML models in financial markets. While we will discuss this more in the second post on AI, here's a preview.
ML and financial markets
Our ride to this point lets us mention something that we believe is very important about investing and AI... something that, we suspect, not all market professionals see clearly behind the cloud of euphoria that is palpable in the financial markets industry.
ChatGPT and GenAI in general feel like magic, right?
By now we know that they are complex, smart systems based on advanced statistics. But they "feel" like something out of a sci-fi movie.
As a consequence of the incredible achievements of GenAI, an expectation has mounted to see magic results behind everything that AI does. So market professionals are looking at AI-powered investing as the holy grail to outperform markets: feed stock market history as a training dataset to a neural network, predict tomorrow's security prices, trade on those predictions, and make easy money.
Sorry to say that it does not work.
We tried to fine-tune several different models for a long time. The predictions are poor.
The main reason: there is too much white noise (randomness) in markets for advanced statistics to capture stable patterns.
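A toy illustration of the point, on synthetic noise rather than real market data (this is a sketch, not our actual research setup):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Try to predict the next "daily return" of pure random noise from the
# previous five returns. Synthetic i.i.d. noise mimics why stable
# patterns are so scarce in markets.
rng = np.random.default_rng(3)
returns = rng.normal(0, 0.01, size=2000)

X = np.column_stack([returns[i:i-5] for i in range(5)])  # past 5 returns
y = returns[5:]                                          # next return

model = LinearRegression().fit(X[:1500], y[:1500])
print(model.score(X[1500:], y[1500:]))   # R^2 near (or below) zero:
                                         # the past explains nothing
```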
This is why we work at Pantar.ai with a different branch of AI to properly invest.
Conclusion
We have described two branches of AI - machine learning and robotics. We have spent good time on the nuances of ML models, neural networks, and the most recent breakthroughs with LLMs and ChatGPT.
The hype is high.
The excitement that has built around these applications is notable and has propagated to other areas of interest, including investing. People, including investment professionals, believe that neural networks can predict anything. But as we have previewed, modern ML models tend to fail at predicting financial markets.
So what's next with AI and investing?
Here's the answer: in the second part.