https://www.youtube.com/watch?v=M2ZtBQI2-GY
by Gemini 2.5 Flash with thinking mode and Google Search in AI Studio
Prompt: "The above is the transcript of a livestream. Please excerpt the key points of each Q&A in chronological order and paraphrase them in Taiwan Traditional Chinese."
Opening and Guest Introductions
Host Alex Kantrowitz welcomes DeepMind CEO Demis Hassabis and Google co-founder Sergey Brin for a conversation about the frontiers of AI.
Frontier Models and Remaining Room for Improvement
Q: How much improvement is left in frontier models, and why do so many smart people think the gains are about to level off?
Demis: We are seeing incredible progress. We are pushing existing techniques to their limits while also inventing new things all the time. Getting all the way to artificial general intelligence (AGI) may still require one or two new breakthroughs, and we have many promising ideas brewing.
Scale and Algorithmic Improvements
Q: In today's improvements, is scale still the star or a supporting actor?
Demis: You need both. You want to push existing techniques (such as data and compute scale) to the limit while also investing in the innovations coming next, which might deliver a 10x leap.
Sergey: Agreed. Algorithmic improvements and compute improvements (better chips, more chips, bigger data centers) both matter. Historically, algorithmic advances have even beaten the computational advances delivered by Moore's law; algorithmic progress will probably be more significant than computational progress.
Q: Does most of the improvement come from building bigger data centers and using more chips?
Demis: We will definitely need more data centers, not only to train models but also to serve them, because demand for efficient models like Gemini 2.5 Pro and Flash is enormous, and serving and inference require a lot of chips. Certain high-value, difficult tasks are worth letting a model think about for a very long time.
Test-Time Compute and the Reasoning Paradigm
Q: How large is the improvement from the reasoning paradigm?
Demis: We have always been firm believers in the "thinking paradigm." In the early AlphaGo and AlphaZero work, a thinking system was built on top of a model. With thinking turned off, AlphaGo and AlphaZero played at roughly master level; with it turned on, they played far beyond world-champion level, a difference of about 600 Elo. In the real world the gains could be even bigger. The challenge is that world models are more complex than game models and may contain errors.
Sergey: DeepMind pioneered reinforcement learning. The AlphaGo and AlphaZero results showed that thinking (inference-time compute) can save something like 5,000x in training. That is obviously a huge advantage, and we are only at the beginning. During its thinking process an AI can also use tools or other AIs to improve the final output, which will be an extremely powerful paradigm.
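To put the 600 Elo figure in perspective, here is a quick back-of-the-envelope calculation (an illustration, not from the talk) using the standard Elo expected-score formula; a 600-point gap means the stronger side is expected to score roughly 97% of the points.

    # Expected score of a player rated `gap` points above an opponent,
    # under the standard Elo model: E = 1 / (1 + 10^(-gap/400)).
    def elo_expected_score(gap: float) -> float:
        return 1.0 / (1.0 + 10.0 ** (-gap / 400.0))

    print(elo_expected_score(600))  # ~0.969: the thinking-enabled version scores ~97%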
New Advances Toward AGI
Q: Which mechanisms do you think will bring the industry closer to AGI?
Demis: The thinking paradigm may be part of it. We still need other breakthroughs, such as true invention: not just solving a math conjecture, but proposing one, or hypothesizing a new theory in physics. Systems cannot do that kind of creativity yet. We also need big advances in the accuracy of our world models.
Defining AGI
Q: Many people say the term AGI is overused. Why do you think it still matters?
Demis: I think it matters a lot. Two things are being conflated. One is "typical human intelligence," what most people can do, which matters for products and the economy. But the AGI I am interested in is a more theoretical construct: the range of what the human brain as an architecture can do, for example what the best humans in history, such as Einstein, Mozart, and Marie Curie, achieved. Current systems cannot do that. Also, today's systems are not consistent enough; it is easy to find holes in them, whereas for an AGI it should take a team of experts months to find an obvious hole.
Will AGI Be Reached by One Company or by Several?
Q: Will AGI be achieved by one company, ending the game, or by several entities at roughly the same time?
Sergey: I would suppose that one company, country, or entity will reach AGI first, but several could arrive at roughly the same time. In AI, when one company makes an advance, others quickly follow, and vice versa; it is a constant leapfrogging.
Demis: Agreed. The field needs to converge on a definition of AGI. Whoever gets there first needs to ensure the systems are reliable and safe. After that, things like personal AGIs and many other applications could emerge, but it is hard to predict.
Does AI Need Emotions to Count as AGI?
Q: Does AI need emotions to be considered AGI, and can it have them?
Demis: AI will need to understand emotion, but whether it should mimic emotion may be a design decision. In theory it could, but it may be unnecessary or undesirable. It is an open question, and with AGI on a 5-to-10-year horizon we still have time to research it.
Self-Improving Systems (AlphaEvolve)
Q: AlphaEvolve is an AI that can design better algorithms and improve how LLMs are trained. Are you trying to set off an intelligence explosion?
Demis: Not an uncontrolled one. AlphaEvolve is an interesting first experiment that pairs evolutionary programming techniques with foundation models. A self-improvement loop is one way progress could accelerate. We saw this with AlphaZero, which learned chess and Go from scratch in under 24 hours, but that was in limited, well-described game domains. The real world is far messier, and it remains to be seen whether the approach generalizes.
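As a rough illustration of the general pattern described here, an evolutionary search loop paired with a model that proposes variants, the following is a minimal, self-contained sketch; the toy fitness function and the random propose_variants stand-in (which a real system would replace with a foundation-model proposer) are assumptions for illustration, not AlphaEvolve's actual interfaces.

    import random

    def fitness(candidate):
        # Toy objective standing in for "how good is this candidate":
        # here, just how close the numbers sum to 100.
        return -abs(sum(candidate) - 100)

    def propose_variants(parent, n):
        # Stand-in for the foundation-model proposer: in the real pattern,
        # an LLM would be prompted with the parent and asked for improved
        # variants. Here we simply perturb the numbers randomly.
        return [[x + random.uniform(-5, 5) for x in parent] for _ in range(n)]

    def evolve(population, generations=50, survivors=4, children=8):
        # Classic evolutionary loop: score, select the best, propose
        # offspring from the survivors, repeat.
        for _ in range(generations):
            population.sort(key=fitness, reverse=True)
            parents = population[:survivors]
            offspring = [v for p in parents
                         for v in propose_variants(p, children // survivors)]
            population = parents + offspring
        return max(population, key=fitness)

    best = evolve([[random.uniform(0, 50) for _ in range(4)] for _ in range(10)])
    print(best, fitness(best))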
Why Sergey Brin Returned to Google
Q: Is that why you came back to Google?
Sergey: As a computer scientist, this is a very unique moment in history. No computer scientist should be retired right now; they should be working on AI. There has never been a greater problem and opportunity. Unlike the web and mobile revolutions, this is scientifically far more exciting, and its impact on the world will be even more profound. I regularly dig deep into the Gemini models, pre-training, post-training, and the multimodal work; that is where my scientific interest lies.
Google's Visual/Physical-World Approach to Agents
Q: Other tech companies' agents are usually context-aware, disembodied voices that you interact with on a screen. Why are Google and DeepMind demos so often through a camera, and so visual?
Demis: Several reasons: agent-based systems are DeepMind's heritage, and AGI must understand the physical environment around you. The two main use cases are a truly useful personal assistant that comes around with you and understands your physical context, and robotics. The bottleneck in robotics has always been the software intelligence, not the hardware. New releases such as Gemini 2.5 and the Veo technology will give robotics exciting algorithms. Gemini was designed from the beginning to be multimodal; that made things harder at first, but we are now reaping the rewards of those correct decisions.
Lessons from Google Glass
Q: What did you learn from Google Glass that can now be applied to smart glasses?
Sergey: A lot; frankly, I made many mistakes. But I am still a firm believer in this wearable form factor. AI has now advanced so much that these glasses can help you far more without constantly distracting you. There was also a technology gap at the time. In addition, I knew nothing then about consumer-electronics supply chains, and did not appreciate how hard manufacturing and reasonable pricing would be. This time we have great partners.
Demis: We learned a lot from the rich history of Glass devices. I have always believed the "universal assistant" is the killer app for smart glasses; combined with advances in hardware technology, it will work.
Video Generation and Model Quality/Data Collapse
Q: If the internet fills with AI-generated video, will that feed back into training and degrade model quality?
Demis: We manage data quality and curation very rigorously. All of our generative models attach SynthID, an invisible AI watermark that is very robust and has held up for the year and a half since release. All our AI-generated images and videos carry this watermark. We can detect it, and we are releasing tools so anyone can, which matters for combating deepfakes and misinformation. It can of course also be used to filter such content out of training data, so I do not see this as a big problem. Eventually, if video models are good enough, we could feed them back into the training loop as a source of synthetic data, but we must be very careful not to distort the data distribution.
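A minimal sketch of the filtering idea described here, assuming a hypothetical detect_watermark predicate; this is an illustration only, not the actual SynthID API, whose real detectors inspect the media content itself rather than a metadata flag.

    def detect_watermark(item):
        # Hypothetical stand-in for a SynthID-style detector that returns
        # True when an item carries an AI-generation watermark. This toy
        # version just reads a flag for demonstration purposes.
        return item.get("ai_generated", False)

    def filter_training_corpus(items):
        # Drop anything carrying an AI watermark so generated media does
        # not silently re-enter the training distribution.
        return [item for item in items if not detect_watermark(item)]

    corpus = [{"id": 1}, {"id": 2, "ai_generated": True}, {"id": 3}]
    print(filter_training_corpus(corpus))  # keeps items 1 and 3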
Miscellaneous (Rapid-Fire) Questions
Q: What will the web look like in 10 years?
Sergey: Ten years is too far out; AI is progressing so fast that it is hard to predict what the whole world will look like.
Demis: I think the web will change a lot in the nearer term, especially given an "agent-first" web, which does not necessarily need to see rendered pages the way humans do.
Q: AGI before or after 2030?
Sergey: Before.
Demis: Just after.
Q: Would you hire someone who used AI in their interview?
Demis: It depends how they used it. With today's tools, probably not, but in practice it is more complicated.
Sergey: I have never interviewed, so I cannot judge.
Q: Are we living in a simulation?
Demis: Not in the way Nick Bostrom and others describe. I think underlying physics is information theory, so we live in a computational universe, but it is not a straightforward simulation. It is interesting that these systems can model real structures in nature.
Sergey: The argument applies recursively: if we are in a simulation, then the entities that created the simulation are themselves in a simulation, and so on. That means we are either in an infinite stack of simulations or there must be some stopping criterion. I think our notion of "simulation" is too anthropocentric, assuming some conscious being is running a simulation similar to ours, and that is where the argument breaks down for me. We are not equipped to reason about the level above us.
Alex: All right, everybody. We have an amazing crowd here today. We're going to be live-streaming this. So let's hear you, make some noise so everybody can hear that you're here. Let's go. I'm Alex Kantrowitz, the host of the Big Technology podcast, and I'm here to speak with you about the frontiers of AI with two amazing guests. Demis Hassabis, the CEO of Google DeepMind, is here. Good to see you, Demis.

Demis: Good to see you, too.

Alex: And we have a special guest: Sergey Brin, the co-founder of Google, is also here.
Alex: All right. So, this is going to be fun. Let's start with the frontier models. Demis, this is for you. With what we know today about frontier models, how much improvement is there left to be unlocked? And why do you think so many smart people are saying that the gains are about to level off?

Demis: I think we're seeing incredible progress. You've all seen it today, all the amazing stuff we showed in the keynote. So I think we're seeing incredible gains with the existing techniques, pushing them to the limit, but we're also inventing new things all the time as well. And I think to get all the way to something like AGI may require one or two more new breakthroughs. We have lots of promising ideas that we're cooking up, and we hope to bring them into the main branch of Gemini.
Alex: All right. And so there's been this discussion about scale: does scale solve all problems, or does it not? So I want to ask you, in terms of the improvement that's available today, is scale still the star, or is it a supporting actor?

Demis: I've always been of the opinion that you need both. You want to scale the techniques that you know about to the maximum, to exploit them to the limit, whether that's data or compute scale. And at the same time, you want to spend a bunch of effort on what's coming next, maybe six months or a year down the line, so you have the next innovation that might do a 10x leap in some way, to intersect with the scale. So you want both, in my opinion. But I don't know, Sergey, what do you think?

Sergey: I mean, I agree, it takes both. You can have algorithmic improvements and simply compute improvements: better chips, more chips, more power, bigger data centers. I think that historically, if you look at things like the N-body problem and simulating gravitational bodies, as you plot it, the algorithmic advances have actually beaten out the computational advances, even with Moore's law. If I had to guess, I would say the algorithmic advances are probably going to be even more significant than the computational advances. But both of them are coming now, so we're kind of getting the benefits of both.
Alex: And Demis, do you think the majority of your improvement is coming from building bigger data centers and using more chips? There's talk about how the world will just be wallpapered with data centers. Is that your vision?

Demis: Well, no, look, we're definitely going to need a lot more data centers. It still amazes me from a scientific point of view: we turn sand into thinking machines. It's pretty incredible. But actually, it's not just for the training. We've now got these models that everyone wants to use. We're seeing incredible demand for 2.5 Pro, and with Flash we're really excited about how performant it is for an incredibly low cost. I think the whole world is going to want to use these things, so we're going to need a lot of data centers for serving, and also for inference-time compute. You saw Deep Think today, 2.5 Pro Deep Think: the more time you give it, the better it will be. And for certain very high-value, very difficult tasks, it will be worth letting it think for a very long time. We're thinking about how to push that even further, and again, that's going to require a lot of chips at runtime.
Alex: Okay. So, you brought up test-time compute. We've been about a year into this reasoning paradigm, and you and I have spoken about it twice in the past as something that you might be able to add on to traditional LLMs to get gains. So I think this is a pretty good time for me to ask: what's happening? Can you help us contextualize the magnitude of improvement we're seeing from reasoning?

Demis: Look, we've always been big believers in what we're now calling this thinking paradigm. If you go back to our very early agent work on playing games, things like AlphaGo and AlphaZero, they all had this attribute of a thinking system on top of a model. And you can actually quantify how much difference that makes if you look at a game like chess or Go. We had versions of AlphaGo and AlphaZero with the thinking turned off, so it was just the model telling you its first idea, and it's not bad, maybe master level, something like that. But if you turn the thinking on, it's way beyond world-champion level; it's something like a 600-plus Elo difference between the two versions. So you can see that in games, let alone the real world, which is way more complicated, and I think the gains will potentially be even bigger there from adding this thinking type of paradigm on top. Of course, the challenge, and I talked about this earlier in the talk, is that your model needs to be a kind of world model, and that's much harder than building a model of a simple game. It has errors in it, and those can compound over longer-term plans. But I think we're making really good progress on all those fronts.

Sergey: Yeah, look, as Demis said, DeepMind really pioneered a lot of this reinforcement learning work with what they did with AlphaGo and AlphaZero. As I recall, it showed that it would take something like 5,000 times as much training to match what you were able to do with still a lot of training plus the inference-time compute you were doing with Go. So it's obviously a huge advantage. And obviously, like most of us, they get some benefit from thinking before speaking, although not always; I always get reminded to do that. But I think the AIs are obviously much stronger once you add that capability, and I think we're just at the tip of the iceberg right now. It's been less than a year that these models have really been around. Especially if you think about it, during its thinking process an AI can also use a bunch of tools, or even other AIs, to improve what the final output is. So I think it's going to be an incredibly powerful paradigm.

Alex: Deep Think is very interesting. I'm going to try to describe it right: it's basically a bunch of parallel reasoning processes working and then checking each other. It's like reasoning on steroids.
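A minimal sketch of the "parallel reasoning paths that check each other" pattern Alex describes, assuming a hypothetical sample_reasoning function; this illustrates self-consistency-style majority voting over independent reasoning paths, not Deep Think's actual mechanism.

    import collections

    def sample_reasoning(question, seed):
        # Hypothetical stand-in for one independent reasoning pass of a
        # model. A real system would sample a chain of thought and extract
        # the final answer; this stub just returns a canned answer.
        return "42"

    def parallel_reason(question, n_paths=8):
        # Run several reasoning paths independently, then let them "check
        # each other" by majority vote over the final answers.
        answers = [sample_reasoning(question, seed=i) for i in range(n_paths)]
        best, count = collections.Counter(answers).most_common(1)[0]
        return best, count / n_paths  # answer plus its agreement rate

    print(parallel_reason("What is 6 * 7?"))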
Alex: Now, Demis, you mentioned that the industry needs a couple more advances to get to AGI. Where would you put this type of mechanism? Is this one of those that might get the industry closer?

Demis: I think so. It's maybe part of one, shall we say. And there are others too that we need. Maybe this can be part of improving reasoning, but where does true invention come from? Where you're not just solving a math conjecture, you're actually proposing one, or hypothesizing a new theory in physics. I think we don't have systems yet that can do that type of creativity. I think they're coming, and these types of paradigms, things like thinking, might be helpful in that, and then probably many other things. I think we need a lot of advances in the accuracy of the world models we're building. You saw that with Veo, the potential of Veo 3; it amazes me how it can intuit the physics of the light and the gravity. I used to work on computer games early in my career, not just the AI but also the graphics engines, and I remember having to do all of this by hand: program all the lighting and the shaders and all of these things, incredibly complicated stuff we used to do in early games. And now the model is just intuiting it. It's pretty astounding.

Alex: I saw you shared an image of a frying pan with some onions and some oil.

Demis: Hope you all liked that. There was no subliminal messaging about that. No, not really. Just maybe a subtle, subtle message.
Alex: Okay. So, we've said the word, or the acronym, AGI a couple of times. I think there's a movement within the AI world right now to say let's not say AGI anymore; the term is so overused as to be meaningless. But Demis, it seems like you think it's important. Why?

Demis: Yeah, I think it's very important. Maybe I need to write something about this with Shane Legg, our chief scientist, who was one of the people who invented the term 25 years back. I think there are two things that are getting a little bit conflated. One is: what can a typical person, an individual, do? We're all very capable, but however capable you are, there's only a certain slice of things that one is expert in, right? Or you could say: what can 90% of humans do? That's obviously going to be economically very important, and from a product perspective also very important, so it's a very important milestone. Maybe we should call that typical human intelligence. But what I'm interested in, and what I would call AGI, is really a more theoretical construct, which is: what is the human brain, as an architecture, able to do? The human brain is an important reference point because it's the only evidence we have, maybe in the universe, that general intelligence is possible. And there, you would have to show your system was capable of doing the range of things even the best humans in history were able to do with the same brain architecture: not one brain, but the same brain architecture. What Einstein did, what Mozart was able to do, what Marie Curie did, and so on. It's clear to me that today's systems don't have that. And the other reason I think the hype on AGI today is sort of overblown is that our systems are not consistent enough to be considered fully general yet. They're quite general, so they can do thousands of things; you've seen many impressive things today. But every one of us has experience with today's chatbots and assistants: you can easily, within a few minutes, find some obvious flaw with them. Some high-school math thing that it doesn't solve, some basic game it can't play. It's not very difficult to find those holes in the system. And for me, for something to be called AGI, it would need to be much more consistent across the board than it is today. It should take maybe a team of experts a couple of months to find an obvious hole in it, whereas today it takes an individual minutes to find one.
Alex: Sergey, this is a good one for you. Do you think that AGI is going to be reached by one company and it's game over? Or could you see Google having AGI, OpenAI having AGI, Anthropic having AGI, China having AGI?

Sergey: Wow, that's a great question. I mean, I guess I would suppose that one company or country or entity will reach AGI first. Now, it is a little bit of a spectrum; it's not a completely precise thing. So it's conceivable that there will be more than one roughly in that range at the same time. After that, what happens is, I think, very hard to foresee. But you could certainly imagine there are going to be multiple entities that come through. In our AI space, we've seen that whenever we make a certain kind of advance, other companies are quick to follow, and vice versa when other companies make certain advances. It's a kind of constant leapfrog. So I do think there's an inspiration element that you see, and that would probably encourage more and more entities to cross that threshold.

Alex: Demis, what do you think?
Demis: Well, I think it is important for the field to agree on a definition of AGI, so maybe we should try to help that coalesce. Assuming there is one, there probably will be some organizations that get there first, and I think it's important that those first systems are built reliably and safely. And after that, if that's the case, we can imagine using them to shard off many systems that have safe architectures provably built underneath them. Then you could have personal AGIs and all sorts of things happening. But as Sergey says, it's pretty difficult to see beyond the event horizon and predict what that's going to be like.
Alex: Right. So, we talked a little bit about the definition of AGI, and a lot of people have said AGI must be knowledge, right, the intelligence of the brain. What about the intelligence of the heart? Demis, briefly: does AI have to have emotion to be considered AGI? Can it have emotion?

Demis: I think it will need to understand emotion. I think it will be almost a design decision whether we wanted it to mimic emotions. I don't see any reason why it couldn't in theory, but it might not be necessary, or in fact not desirable, for them to have the sort of emotional reactions that we do as humans. So again, it's a bit of an open question as we get closer to this AGI time frame, which I think is more on a 5-to-10-year time scale. So we have a bit of time, not much time, but some time to research those kinds of questions.
Alex: When I think about how the time frame might be shrunk, I wonder if it's going to be the creation of self-improving systems. Last week, I almost fell out of my chair reading a headline about something called AlphaEvolve, which is an AI that helps design better algorithms and even improve the way LLMs train. So, Demis, are you trying to cause an intelligence explosion?

Demis: No. Well, not an uncontrolled one. Look, I think it's an interesting first experiment. It's an amazing system, with a great team working on it. It's interesting now to start pairing other types of techniques, in this case evolutionary programming techniques, with the latest foundation models, which are getting increasingly powerful. And I actually want to see a lot more of these kinds of combinatorial systems in our exploratory work, pairing different approaches together. You're right that this is one of the things: someone discovering a kind of self-improvement loop would be one way things might accelerate further than they're even going today. And we've seen it before with our own work, with things like AlphaZero learning chess and Go and any two-player game from scratch, within less than 24 hours, starting from random, with self-improving processes. So we know it's possible, but again, those are quite limited game domains which are very well described. The real world is far messier and far more complex, so it remains to be seen whether that type of approach can work in a more general way.
Alex: Sergey, we've talked about some very powerful systems, and it's a race, a race to develop these systems. Is that why you came back to Google?

Sergey: I mean, I think as a computer scientist, it's a very unique time in history. Honestly, anybody who's a computer scientist should not be retired right now; they should be working on AI. That's what I would say. There's just never been a greater problem and opportunity, a greater cusp of technology. So I wouldn't say it's because of the race, although we fully intend that Gemini will be the very first AGI, to clarify that. But to be immersed in this incredible technological revolution: I mean, I went through the web 1.0 thing, and it was very exciting, and we had mobile, we had this, we had that. But I think this is scientifically far more exciting, and I think ultimately the impact on the world is going to be even greater. As much as the web and mobile phones have had a lot of impact, I think AI is going to be vastly more transformative.

Alex: So what do you do day-to-day?

Sergey: I think I torture people like Demis, who is amazing, by the way; he tolerated me crashing this fireside. I'm across the street pretty much every day, with the people who are working on the key Gemini text models, on the pre-training and the post-training, mostly those. I periodically delve into some of the multimodal work, Veo 3, as you've all seen. But I tend to be pretty deep in the technical details, and that's a luxury I really enjoy, fortunately, because guys like Demis are minding the shop. And that's just where my scientific interest is: deep in the algorithms and how they can evolve.
Alex: Okay, let's talk about the products a little bit, some that were introduced recently. I just want to ask you a broad question about agents, Demis. When I look at other tech companies building agents, what we see in the demos is usually something that's contextually aware, has a disembodied voice, and you often interact with it on a screen. When I see DeepMind and Google demos, often it's through the camera; it's very visual. There was an announcement about smart glasses today. So talk a little bit about, if that's the right read, why Google is so interested in having an assistant or companion that sees the world as you see it.

Demis: Well, it's for several reasons; several threads come together. As we talked about earlier, we've always been interested in agents. That's actually the heritage of DeepMind; we started with agent-based systems in games. We're trying to build AGI, which is a full general intelligence, and clearly that would have to understand the physical environment, the physical world around you. And two of the massive use cases for that, in my opinion, are, first, a truly useful assistant that can come around with you in your daily life, not just stuck on your computer or one device. We want it to be useful in your everyday life for everything, so it needs to come around with you and understand your physical context. And then the other big thing is, I've always felt that for robotics to work, you sort of want what you saw with Astra on a robot. I've always felt that the bottleneck in robotics isn't so much the hardware, although obviously there are many, many companies working on fantastic hardware, and we partner with a lot of them, but actually the software intelligence. That's what I think has always held robotics back. But I think we're in a really exciting moment now, where finally, with these latest versions, especially Gemini 2.5 and more things that we're going to bring in, this kind of Veo technology and other things, we're going to have really exciting algorithms to make robotics finally work and realize its potential, which could be enormous. And in the end, AGI needs to be able to do all of those things. So for us, you can see we always had this in mind. That's why Gemini was built from the beginning, even the earliest versions, to be multimodal. That made it harder at the start, because it's harder to make things multimodal than just text only. But in the end, I think we're reaping the benefits of those decisions now. I see many of the Gemini team here in the front row; they were the harder decisions, but we made the right ones, and now you can see the fruits of that in all of what you've seen today.
Alex: Actually, Sergey, I've been thinking about whether to ask you a Google Glass question.

Sergey: Oh, fire away.

Alex: What did you learn from Glass that Google might be able to apply today, now that it seems like smart glasses have made a reappearance?

Sergey: Wow. Yeah, great question. I learned a lot. I definitely feel like I made a lot of mistakes with Google Glass, I'll be honest. I am still a big believer in the form factor, so I'm glad that we have it now, and now it looks like normal glasses; it doesn't have the thing in front. I think there was a technology gap, honestly. Now, in the AI world, the things that these glasses can do to help you out without constantly distracting you, that capability is much higher. There's also the fact that I just didn't know anything about consumer-electronics supply chains, really, and how hard it would be to build that, have it be at a reasonable price point, and manage all the manufacturing and so forth. This time we have great partners that are helping us build this, so that's another step forward. What else can I say? I do have to say I miss the airship with the wingsuiting skydivers for the demo. Honestly, it would have been even cooler here at Shoreline Amphitheatre than it was up in Moscone back in the day. But we should probably polish the product first this time, get it ready and available, and then we'll do a really cool demo.

Alex: That's probably a smart move.

Demis: Yeah. What I will say is, look, we've obviously got an incredible history of glass devices and smart devices, so we can bring all those learnings to today, and we're very excited about our new glasses, as you saw. What I've always been saying to our team, and to Shahram and his team, and I don't know if Sergey would agree, is that I feel the universal assistant is the killer app for smart glasses. I think that's what's going to make it work, apart from the fact that all the hardware technology has also moved on and improved a lot. I feel like this is the actual killer app, the natural killer app, for it.
Alex: Okay, briefly, on video generation. I sat in the audience at the keynote today and was fairly blown away by the level of improvement we've seen from these models, and you had filmmakers talking about it in the presentation. I want to ask you, Demis, specifically about model quality. If the internet fills with video that's been made with artificial intelligence, does that then go back into the training and lead to a lower-quality model than if you were training just from human-generated content?

Demis: Yeah, look, there are a lot of worries about this so-called model collapse. Video is just one thing; it applies in any modality, text as well. There are a few things to say about that. First of all, we're very rigorous with our data-quality management and curation. We also, at least for all of our generative models, attach SynthID to them. There's this invisible, AI-made watermark that is very robust; it has held up now for a year, 18 months, since we released it. All of our images and videos are embedded with this watermark. So we can detect it, and we're releasing tools to allow anyone to detect these watermarks and know that something was an AI-generated image or video. And of course that's important to combat deepfakes and misinformation, but you could also use it, if you wanted, to filter things out of your training data. So I don't actually see that as a big problem. Eventually, we may have video models that are so good you could put them back into the loop as a source of additional data, synthetic data it's called. And there you just have to be very careful that you're actually creating from the same distribution that you're going to model, that you're not distorting that distribution somehow, and that the quality is high enough. We have some experience of this in a completely different domain with things like AlphaFold, where there wasn't actually enough real experimental data to build the final AlphaFold. So we had to build an earlier version that predicted about a million protein structures, with a confidence level on each, and then we selected the top 300,000 to 400,000 and put them back in the training data. So it's very cutting-edge research to mix synthetic data with real data, but there are ways of doing that. In terms of the video-generation stuff, you can just exclude it if you want to, at least with our own work, and hopefully other gen-media companies follow suit and put robust watermarks in, also, obviously, first and foremost to combat deepfakes and misinformation.
Alex: Okay, we have four minutes and I've got four questions left. We now move to the miscellaneous part of my questions, so let's see how many we can get through, as fast as we can get through them. Let's go to Sergey with this one: what does the web look like in 10 years?

Sergey: What does the web look like in 10 years? In one minute? Boy, I think 10 years, because of the rate of progress in AI, is so far beyond anything we can see. And not just the web; I mean, I don't think we really know what the world looks like in 10 years.

Alex: Okay, Demis?

Demis: Well, I think that's a good answer. I do think that in the nearer term the web is going to change quite a lot. If you think about an agent-first web, it doesn't necessarily need to see renders and things the way we do as humans using the web. So I think things will be pretty different in a few years.

Alex: Okay, this is kind of an over/under question: AGI, before 2030 or after 2030?

Sergey: 2030? Boy, you really put it on that fine line. I'm going to say before.

Alex: Yeah. Demis?

Demis: Just after.

Alex: Just after. Yeah. Okay.
Alex: No pressure, Demis.

Demis: Exactly. Well, I have to go back and get working harder.

Sergey: Now that I can ask for it, he needs to deliver it. A proper sandbagger. We need that next week.

Demis: That's true. I'll come to the review.
Alex: All right. So, would you hire someone who used AI in their interview? Demis?

Demis: Oh, in their interview? It depends how they used it. Using today's tools, probably not. But, well, it depends how they would use it, actually; I think that's probably the answer.

Alex: Sergey?

Sergey: I mean, I never interviewed at all. So, I don't know; I feel it would be hypocritical for me to judge people on exactly how they interview.

Alex: Yeah, I haven't either, actually. So, snap on that: I've never done a job interview.
Alex: Okay, so Demis, I've been reading your tweets. You put up a very interesting tweet where there was a prompt that created some sort of natural scene.

Demis: Oh, yeah.

Alex: Here was the tweet: "Nature to simulation at the press of a button. Does make you wonder," with a couple of emojis. And people ran with that and wrote some headlines saying Demis thinks we're in a simulation. Are we in a simulation?

Demis: Not in the way that Nick Bostrom and people talk about. I don't think this is some kind of game, even though I wrote a lot of games. I do think that ultimately the underlying physics is information theory, so I do think we're in a computational universe, but it's not just a straightforward simulation. I can't answer you in one minute, but I think the fact that these systems are able to model real structures in nature is quite interesting and telling. I've been thinking a lot about the work we've done with AlphaGo and AlphaFold and these types of systems. I've spoken a little about it; maybe at some point I'll write up a scientific paper about what I think that really means in terms of what's actually going on here in reality.

Alex: Sergey, you want to make a headline?
Sergey: Well, I think that argument applies recursively, right? If we're in a simulation, then by the same argument, whatever beings are making the simulation are themselves in a simulation, for roughly the same reasons, and so on and so forth. So I think you're going to have to either accept that we're in an infinite stack of simulations, or that there's got to be some stopping criterion.

Alex: And what's your best guess?

Sergey: I think that we're taking a very anthropocentric view when we say "simulation," in the sense that some kind of conscious being is running a simulation that we are then in, and that they have some kind of semblance of desire and consciousness that's similar to ours. I think that's where it kind of breaks down for me. So I just don't think we're really equipped to reason about one level up in the hierarchy.

Alex: Okay. Well, Demis, Sergey, thank you so much. This has been such a fascinating conversation. Thank you all.

Demis: All right, Alex. Thank you. Pleasure.