Google vs OpenAI Updates: Ed3 World Newsletter Issue #31

The battle of the AIs is demanding innovation.

May 16, 2024

This (almost) monthly newsletter serves as your bridge from the real world to the advancements of AI & other emerging technologies, specifically contextualized for education.

Dear Educators & Friends,

Exciting news: two newsletter issues in one month! In addition to issues that introduce new concepts, I'm now including more narrative updates for timely insights. This issue couldn’t wait because you need to know what’s happening in the battle of the AIs: Google vs. OpenAI. You've probably seen countless posts about both, so my goal is to make this as useful as possible.

Strategy Overviews: Analyses of both product strategies.
Feature Mapping: A comprehensive guide to the myriad features.
Feature Utility: What these features mean for your everyday life.

Different Strokes for Different Folks

Let’s dive into strategy. If you’ve been following the OpenAI updates and Google I/O announcements, you’ve likely noticed the intense competition pushing each company to deliver bigger and better innovations. This rivalry is fantastic for driving progress. However, their approaches are notably different.

🔠👂🏽👁 Both companies are launching multi-modal engagement models that include vision, voice, and audio. OpenAI has ChatGPT with GPT-4o (“Omni”), while Google is rolling out Project Astra and Gemini Advanced. However, Google has a significant advantage in adoption. Consider how many of us use Google Workspace, Google Suite, and Google Search: collectively about 2 billion users. Google is integrating multi-modal 🤖 AI agents across its apps, enabling them to execute tasks through automated pattern matching and advanced data analysis. This is a game-changer. Check out what Gemini can do with Gmail (minute 56:45).
Both companies are showcasing use cases that leverage various modalities. OpenAI demonstrated applications such as tutoring, language learning support, meeting facilitation, and real-time translation. Google showcased similar functionalities with visual problem solving (minute 27:54), custom GPTs aka Gems (minute 1:10:04), and real-time translation. However, a key difference is that while OpenAI is not focusing on niche products for specific verticals like education and entertainment, Google is. For example, Google presented LearnLM (minute 1:45:42) for education and Music AI Sandbox (minute 31:17) for the music industry. This means Google will be rolling out custom features based on industry needs. This also means Google is taking on much more responsibility for ethical use of AI. Consider the implications of multi-modal engagement for students under 13 or IP protection for musicians. While OpenAI has been criticized for shrugging off responsibility for misuse of its products with its ‘stop AI letters’, Google is actively embracing ethical AI. Yet, considering Google’s recent ‘woke images’ misstep, it’s clear there are pros and cons on both sides.
💰 Google is focused on monetization, while OpenAI aims to collect more data on GPT-4o usage in ChatGPT. The more data OpenAI collects on its multi-modal capabilities, the stronger its models become. For OpenAI, our data holds more value than revenue, which is why their models often outperform competitors’. While free apps offer more equitable access, they also raise concerns about data and privacy encroachment with minimal repercussions. Both scenarios are less than ideal, as our data isn’t fully protected or owned by us in either case.

AI Cartography

I found it challenging to keep track of all these updates, especially with the sheer volume Google has launched. For my own sanity, I categorized and mapped the updates based on my own interpretation. I used Google’s products as a baseline and overlaid OpenAI’s updates for comparison, indicated by the OpenAI logo. Additionally, I provided descriptions of the features to the best of my understanding. I highly recommend reading each description - you’ll be amazed at what Google is accomplishing.

As you can see, Google is leveraging its AI models and use cases (right side of graphic) to enhance its existing products (left side). It is integrating AI agents into nearly everything, which is incredibly smart. Personally, I’d prefer paying for a smart agent integrated into my existing GSuite products over generic plugins built on OpenAI. However, many of these features haven’t been rolled out to consumers yet. These advancements highlight a larger narrative around personalization, automation, and productivity enhancement.

Reimagining Life As We Know It

So what do these features across both platforms mean for the world? Here’s my brainstorm:

Changing how we consume experiences and products

The way we consume experiences and products will fundamentally change. Pre-internet, purchasing clothes, music, or groceries required a physical trip to the store and paying with cash from the bank. Today, we can order items with a click from home, with automated systems handling payments and deliveries. With AI agents, we’ll be able to aggregate information and processes more efficiently, making it faster and easier to shop, return items, search for top-rated products and experiences, and summarize dense information.

With multi-modal and spatial reasoning, the world will be more navigable and transparent. We’ll be able to break down information in real time. Imagine walking through a park and instantly knowing the names of every flower and tree, and the history of the sculptures. Imagine AI extending your learning experience in that moment by curating a set of videos or nearby exhibits based on your interests.

Personalized telemedicine and diagnostics will improve drastically. With multi-modal remote consultations, diagnoses will be more accurate and timely.

Spatial computing will help city planners design more efficient and sustainable urban environments, optimizing traffic flow, public transportation, and resource management.

AI agents will monitor public spaces, detect unusual activities, and assist in emergency response, enhancing overall safety and security.

We’ll be able to learn things faster, fix things more efficiently, and plan multi-step systems like meals, trips, and events with ease.

On one hand, humans will satisfy their curiosities faster. On the other hand, we’ll likely face information overload.

Transforming how we produce experiences and products

We’re already seeing the impact of generative AI on writing, images, and audio. But with AI agents, we’ll set up more workflows to automate the routine and mundane tasks. When emailing someone, you’ll likely interact with their AI agent first. Based on your profile and relationship, the recipient’s AI might activate various workflows in response to your email.

The content we produce will undergo greater scrutiny because everyone will start from a high quality baseline. These multi-modal experiences won’t replace or reduce our work but will demand higher quality content that requires more creativity. Entertainment, for these reasons, is poised for a renaissance.

More people will produce diverse types of content. Individuals, whether neurotypical or neurodivergent, who excel in expressing their vision through a particular modality will seamlessly transition between different forms of content creation.

Misinformation will be rampant.

Our roles as friends, parents, workers, business owners, investors, and entertainers will evolve with changes in communication, workflow, consumerism, and information retrieval.

There are so many other implications I’ve missed, so I’d encourage you to add your thoughts in the comments to continue this thread.

Reimagining Learning As We Know It

Over the past three years, generative AI sparked hope for transforming education. However, this transformation hasn’t materialized. Currently, most AI products and services focus on automating already flawed tasks in teaching and learning. Instead of enhancing the learning experience, we’re using AI to generate lesson plans more quickly. As I mentioned in my last post, we’re trying to uphold the status quo of a system that doesn’t meet its own standards.

So, will AI agents and multi-modal reasoning ignite the long-awaited revolution in education?

😏 Well, I do think AI is forcing innovation across industries. But education faces some major challenges. I’ve developed some theories, potential use cases, and models for the future of AI-enabled schooling that I’ll share in the next issue.

Thanks for reading, and see you again soon.

Warmly yours,

Vriti

I’m Vriti and I lead two organizations in service of education. Ed3 DAO is a community for educators learning about emerging technologies, specifically AI & web3. k20 Educators builds metaverse worlds for learning.
