How Artificial Intelligence (AI) can change your world

I’ll start with Aldi. What does Aldi have to do with AI, you might ask? Well, a lot. Aldi has an AI-generated voice for their new commercials. This voice has been trained on the voices of 10 Aldi employees. Currently, it is noticeable that it is an artificial voice, but in the coming years, it will become so good that you won’t be able to tell the difference. Check out the video where Aldi presents this: https://www.youtube.com/watch?v=sR9pacfjMQY

You’ve probably noticed on LinkedIn, Twitter, or in the news that a new technology has been developed. You might think that it doesn’t affect you much because you’re not a technology company or because you don’t understand it.

When you think of AI, you may immediately think of robots and science fiction, but it’s much more than that. In this blog, we’ll talk about generative AI. These are AI models that have learned from an enormous amount of text, images, or videos. You give such a model a task (a prompt), and it generates something new based on that task. This is a kind of prediction: based on the data it has learned from, the model produces what is most likely to fit.

“A wise man adapts himself to circumstances, as water shapes itself to the vessel that contains it.” (Chinese proverb)

What is generative AI?

Although the technology has been in development for years, the company OpenAI, which was initially a non-profit, has developed the technology to the point where it is usable for everyone.

When it comes to generating non-existent images, there has been a website for some time called https://thispersondoesnotexist.com/. It shows AI-generated people who don’t actually exist. That technology has been around for a while, although it was often very limited because the model was trained on just one type of data: in the case of that website, photos of human faces.

Generating Text with AI

Let’s start by looking at OpenAI’s most famous product: ChatGPT. ChatGPT is a language model that is capable of generating text in a natural and coherent way that resembles human-written text. The AI model has been trained on an enormous amount of text from the internet. It recognizes patterns and therefore seems to understand what you’re asking. This allows it to be used for a variety of tasks, such as writing texts (this blog was actually written by a human), giving advice, answering questions, and solving problems.

What’s so special about this model is that it seems as if it can think for itself. It also seems as if it understands your question, even though this is not really the case. It regularly makes mistakes, and it hallucinates from time to time. Hallucination occurs when an AI model confidently gives an answer that is made up or completely at odds with reality.
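To make “recognizing patterns” a bit more concrete, here is a minimal sketch in Python. It is not how ChatGPT works internally: ChatGPT uses a large neural network, while this toy only counts which word tends to follow which. The training text and the function names (`train`, `predict_next`, `generate`) are made up for illustration. The principle of predicting the most likely next word from learned data is the same, though.

```python
from collections import Counter, defaultdict

# Tiny, made-up training text (a real model learns from billions of words).
TRAINING_TEXT = (
    "the cat sat on the mat . "
    "the cat ate the fish . "
    "the dog sat on the rug ."
)


def train(text):
    """Count, for every word, which words follow it and how often."""
    follow_counts = defaultdict(Counter)
    words = text.split()
    for current_word, next_word in zip(words, words[1:]):
        follow_counts[current_word][next_word] += 1
    return follow_counts


def predict_next(follow_counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    if word not in follow_counts:
        return None
    return follow_counts[word].most_common(1)[0][0]


def generate(follow_counts, prompt_word, max_words=4):
    """Extend the prompt one predicted word at a time."""
    output = [prompt_word]
    for _ in range(max_words):
        next_word = predict_next(follow_counts, output[-1])
        if next_word is None or next_word == ".":
            break
        output.append(next_word)
    return " ".join(output)


model = train(TRAINING_TEXT)
print(generate(model, "dog"))  # prints: dog sat on the cat
```

Note that “dog sat on the cat” never appears in the training text. The toy model simply strings together the most likely words, which is a small-scale picture of why a fluent answer is not automatically a true one.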

However, ChatGPT is no longer the only text-generating AI. There’s also Gemini from Google, Copilot from Microsoft, Claude from Anthropic, and many more. Some are better than others, but each tends to excel in a different area. For example, Claude has a version that is very good at solving programming problems.

GitHub Copilot (based on OpenAI’s Codex) is also widely used in programming (there are others like CodeWhisperer and Kite). It’s built into your IDE (the program where you write code) and provides suggestions. Here’s an example. This doesn’t mean that AI writes your code for you; it’s just a tool for quickly writing certain repetitive functions. Copilot has also been trained on an enormous amount of code, so it roughly knows what the logical next step in your code is.

Code generated by GitHub Copilot

Generating Images with AI

In addition to generating text, you can also generate images with AI. In the early 2020s, text-to-image models became popular. These AI models generate an image based on a prompt (a text description or command). The images are generated using a technique called diffusion.

Diffusion models are also trained on a massive number of images with their corresponding descriptions. During training, noise is gradually added to these images, and the model learns to reverse that process. This allows it to connect descriptions to visual patterns. When you enter a prompt, the model starts with an image of pure random noise. Step by step, the noise is removed until an image that matches your prompt emerges.
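The step-by-step removal of noise can be pictured with a toy sketch in Python. To be clear about the assumptions: this is not a real diffusion model. A real model uses a trained neural network to predict the clean image from the noisy one and the prompt. Here, the function `lookup_pattern` and the tiny `PROMPT_TO_PATTERN` table are hypothetical stand-ins that simply hand over the answer, and the “image” is just six numbers. Only the structure of the loop (start from random noise, remove a little of it at every step) mirrors the idea.

```python
import random

# Hypothetical stand-in for what a trained model has "learned":
# one tiny six-pixel pattern per prompt.
PROMPT_TO_PATTERN = {"stripes": [0.0, 1.0, 0.0, 1.0, 0.0, 1.0]}


def lookup_pattern(prompt):
    """Stand-in for the neural network's prediction of the clean image."""
    return PROMPT_TO_PATTERN[prompt]


def distance(image_a, image_b):
    """Average absolute difference per pixel between two images."""
    return sum(abs(a - b) for a, b in zip(image_a, image_b)) / len(image_a)


def generate(prompt, steps=50, step_size=0.1, seed=0):
    """Start from random noise and remove a little of it at every step."""
    rng = random.Random(seed)
    predicted_clean = lookup_pattern(prompt)
    # Pure random noise, one value per "pixel".
    image = [rng.gauss(0.0, 1.0) for _ in predicted_clean]
    start = list(image)
    for _ in range(steps):
        # Move every pixel a fraction (step_size) of the way toward
        # the predicted clean image.
        image = [
            pixel + step_size * (clean - pixel)
            for pixel, clean in zip(image, predicted_clean)
        ]
    return start, image


start, result = generate("stripes")
target = lookup_pattern("stripes")
print("distance before:", round(distance(start, target), 3))
print("distance after: ", round(distance(result, target), 3))
```

The second number printed is much smaller than the first: after 50 small steps, the random noise has turned into (almost exactly) the striped pattern.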

The most well-known image generator is Midjourney, which produces super-realistic images. Try it out and let me know what you think. Additionally, you have Adobe Firefly, which is built into Photoshop, DALL-E 2 from OpenAI, and a few smaller ones. There are also open-source alternatives like Stable Diffusion.

You can try Adobe Firefly for free, and I really recommend that you try it out if you’ve never tried it before. You can create super-realistic street photography without the street actually existing; you can imagine anything you want. It’s always a bit strange, but it can be super useful, especially if you don’t have much content of your own. However, you need to be careful not to make everything AI-generated, because even though it’s good, you can still see that it’s AI.

Music Generation with AI

If text and images can be generated, it is not surprising that music can also be generated. These AI models are also trained on vast amounts of data. They learn to recognize patterns and structures of music and then generate new, unique musical pieces.

This can be done on websites like Suno, Beatoven, or Soundful. You can hear that it is generated by AI, but it still sounds as if it was made by humans. This, of course, raises questions about copyrights and intellectual property. The models are, after all, trained on music made by people, and that music forms the basis of everything they generate.

Video Generation with AI

This may be the one that will have the most impact in the future. Currently, these videos are not very good, but you should think about the consequences if these videos are no longer distinguishable from reality.

OpenAI’s Sora is the most well-known, but there are many more companies developing this, such as Runway Gen 2 or Pika. Below, you can see an example of such a video. These videos are currently still short, and you can clearly see that they are generated by AI. With images, this was also the case in the beginning, and if you look at where that is now, you can understand that this will also happen with videos.

OpenAI Sora generated video about an astronaut

A video AI is often also called a multimodal AI. It does not work with a single type of data, such as only text or only images, but combines several types in one system. For video, that means images and sometimes audio as well.

Speech to/from text AI

Speech to text and text to speech are two technologies that existed for quite some time before 2020. The first model “recognizes” spoken language and can transcribe it into text. The other model can convert text into spoken language. These models are truly surprisingly good. For example, you have Whisper, which converts spoken language into text, and ElevenLabs for text to speech. You sometimes hear that it doesn’t sound completely natural, but it still sounds quite realistic.

Examples of speech to text from before 2020 are Siri, Amazon Alexa, and Google Home. The models we have now are far better. They can still transcribe even when there is a lot of noise in the background. Text to speech is also an “old” concept. When you navigate with Google Maps or use a translation app, a voice speaks to you. The same goes for Siri, Alexa, and Google Home. But those older voices sound really robotic. Today it’s different; just try it yourself and hear the difference.

Understanding Images

In addition to generating images or videos, there are also AI models that can analyze images. With Claude and ChatGPT-4, you can upload images that the AI reads and “understands”. There are also dedicated tools for this, such as Google Vision, Amazon Rekognition, and OpenCV (open-source). With these, you can analyze images or videos. Here are some examples of what you can do with such an AI model:

Object recognition: Identifying objects, people, animals, or buildings.
Face recognition: Detecting and identifying faces.
Text recognition: Extracting text from images or videos.
Activity recognition: Detecting specific actions or events.
Anomaly detection: Identifying unusual or deviant elements/patterns.
Classification: Categorizing images or videos into predefined categories.

This is often used in security, retail, media, healthcare, and other sectors where images and/or videos are used. AI models can analyze videos or images much faster than humans and can continue to do so 24/7.

How You Can Use AI

You might be thinking: “All these different AI models are great, but I really don’t know anything about this, and my business doesn’t have any use for it.” Even companies that are heavily focused on the physical world can make use of AI. For example, from a photo of a room, AI could estimate how many square meters the space is, how much paint is needed, etc.
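Once an AI model has estimated the dimensions of a room from a photo, the rest is simple arithmetic. Here is a minimal sketch in Python of that second step. The room measurements are hypothetical (in practice they would come from the image analysis), and the coverage of 10 m² per liter and the two coats are assumptions you should replace with the numbers on your own paint can.

```python
import math


def wall_area_m2(length_m, width_m, height_m, openings_m2=0.0):
    """Total wall area of a rectangular room, minus doors and windows."""
    perimeter = 2 * (length_m + width_m)
    return perimeter * height_m - openings_m2


def paint_needed_liters(area_m2, coverage_m2_per_liter=10.0, coats=2):
    """Liters of paint needed, rounded up to whole liters."""
    return math.ceil(area_m2 * coats / coverage_m2_per_liter)


# Hypothetical measurements, as an image-analysis model might estimate them.
room = {"length_m": 5.0, "width_m": 4.0, "height_m": 2.5, "openings_m2": 5.0}

floor_area = room["length_m"] * room["width_m"]
walls = wall_area_m2(**room)
liters = paint_needed_liters(walls)

print(f"Floor: {floor_area} m2, walls: {walls} m2, paint: {liters} liters")
# prints: Floor: 20.0 m2, walls: 40.0 m2, paint: 8 liters
```

The hard part, getting reliable measurements out of a photo, is where the AI comes in. The calculation afterwards is something you can fully check yourself.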

Increasing Productivity

Automating repetitive tasks can be done without AI as well. However, AI can greatly amplify this effect. AI can turn unstructured data into structured data. AI can also ensure that a process that currently takes 10 steps will only take 2 or 3, for example by automatically filling in information based on past entries. This way, you (or your employees) can save more time.

Customer-Centric Solutions

Every customer is different and has different needs. This often requires a lot of personalized quotes. These quotes are often made by a human who knows the customer’s requirements. But AI can do this as well. By having AI learn from your previous customers, it can create custom-made quotes for each customer.

You can also deploy chatbots to handle, for example, your customer service. I do advise against converting your entire customer service to AI chatbots. But AI can often help identify a problem that a customer cannot explain well themselves. Chatbots can also point customers to the right resources.

Competitive Advantage

Companies that use AI have an advantage over their competitors. AI offers endless possibilities to optimize business processes. As the technology continues to develop, the gap between companies that use AI and those that don’t will only grow wider.

Speed is important in every business. With AI, companies can respond even faster to changing market conditions. By using automation, companies can react quicker than competitors who still rely on traditional, manual methods. This allows you to seize opportunities quickly, giving your business an edge.

Future-Proofing

This ties in a bit with the previous point. If you keep up with the times, you’ll last longer than a company that doesn’t. A business should be flexible, keep up with (technology) trends, and adapt to what’s relevant today. As I said, AI is here to stay and will only get better. The sooner you immerse yourself in the possibilities of AI, the better your business will perform.

Cost savings are also an important point in entrepreneurship, and AI can lead to significant savings. By streamlining your processes, automating tasks, and improving customer insights, you can reduce your costs and increase your revenue. This will allow you to invest more money in other things that will yield even greater returns.

Future of AI

As I have mentioned a few times, it doesn’t stop here. As GPUs (the processors used to train AI models) become cheaper, AI models can be trained faster and faster. This, in turn, will allow them to be applied to even more areas. The AI models themselves will also continue to improve, and AI is already being used to help with that.

It is therefore important as a company that you do not fall behind. We are still relatively early, and you should take advantage of this by immersing yourself in it and by implementing AI in your business. It often seems like magic, but if you understand it well, it’s not that difficult. You don’t need to understand how an AI model works at a technical level, but you do need to understand it at a functional level. Text generation can already be used to write emails, create content, and come up with ideas. You will certainly need to edit the result, though, so my tip is to use it primarily as a starting point.

This year

I expect video generation to improve tenfold. These videos will become just as realistic as the photos are now. Of course, if you look closely, you will still be able to spot that it is AI, but as I said, this will only get better. ChatGPT-5 will see the light of day, and this will further improve text generation. I also expect that other companies like Google, Amazon, Facebook, and Anthropic will not fall behind and will also release improved AI models.

We haven’t really talked about legal issues yet, but I expect that many more laws will be made (we already have some in the EU). I also expect that more attention will be paid to the potential dangers of A(G)I. Although I personally don’t believe that we will have AGI anytime soon, it is possible. AGI stands for Artificial General Intelligence, which is the hypothetical intelligence of a machine that has the capacity to understand or learn any intellectual task that a human can, according to Wikipedia.

The Coming Years

In the coming years, there will be a strong focus on creating AGI (Artificial General Intelligence). Companies like Boston Dynamics and Tesla will also focus more on creating robots with integrated AI. This will allow dangerous jobs to be taken over by robots. However, we must also ensure that these robots do not become dangerous themselves, and therefore they must remain under constant supervision.

Afterword

We also need to remain realistic. Use the resources you have and gradually familiarize yourself with new technologies. I am convinced that we must adapt. AI is not just a new technology; it is a major technological change. In 50 years, history books will state that the AI revolution began in 2020.

To show how good/bad AI-generated text is, I’ll let Claude from Anthropic summarize this blog in 2 or 3 paragraphs. You can judge for yourself whether you think it’s good or bad. I must say, however, that the prompt (command) is very simple. There are many longer prompts that go into the style the text should be written in, etc.

Prompt: Summarize the text below in 2/3 paragraphs.

Claude: Generative AI is an emerging technology where AI models are trained on vast amounts of data, such as text, images, and videos. These models can then generate new, unique content based on a prompt or instruction. Examples include chatbots like ChatGPT, which can generate text in a natural way, and text-to-image models like Midjourney, which can create realistic images. These technologies are already quite advanced and will only continue to improve in the future.

Generative AI offers many opportunities for businesses, such as increasing productivity by automating repetitive tasks, providing customer-centric solutions through personalized quotes and chatbots, and gaining a competitive advantage by being able to respond to changing market conditions more quickly. Companies that embrace AI will have an edge over their competitors. Additionally, AI can also lead to cost savings by streamlining processes and improving insights. The future of AI looks promising, with further advancements in video generation, text generation, and possibly even the development of Artificial General Intelligence (AGI).

Arjen: I hope this blog helps clarify what AI is and how it can be used. As always, if you don’t understand something or want to know more, I’m always open to a conversation where I can answer your questions.