https://www.youtube.com/watch?v=M2ZtBQI2-GY
by Gemini 2.5 Flash with thinking mode and Google Search in AI Studio
Prompt: "The above is the transcript of a livestream. Please excerpt the key points of each Q&A in chronological order and paraphrase them in Taiwan Traditional Chinese."
Opening and Guest Introductions
Host Alex Kantrowitz welcomes DeepMind CEO Demis Hassabis and Google co-founder Sergey Brin for a conversation about the frontiers of AI.
Frontier Models and Remaining Room for Improvement
Q: How much improvement is left in frontier models, and why do so many smart people think the gains are about to level off?
Demis: We are seeing incredible progress. We are pushing existing techniques to their limits while also inventing new things all the time. Getting all the way to artificial general intelligence (AGI) may still require one or two new breakthroughs, and we have many promising ideas brewing.
Scale and Algorithmic Improvements
Q: In today's improvements, is scale still the star or a supporting actor?
Demis: You need both. You want to push existing techniques (such as data and compute scale) to the limit while also investing in the innovations coming next, which might deliver a 10x leap.
Sergey: Agreed. Algorithmic improvements and compute improvements (better chips, more chips, bigger data centers) both matter. Historically, algorithmic advances have even beaten the computational advances delivered by Moore's law; algorithmic progress will probably be more significant than computational progress.
Q: Does most of the improvement come from building bigger data centers and using more chips?
Demis: We will definitely need more data centers, not only to train models but also to serve them, because demand for efficient models like Gemini 2.5 Pro and Flash is enormous, and serving and inference require a lot of chips. Certain high-value, difficult tasks are worth letting a model think about for a very long time.
Test-Time Compute and the Reasoning Paradigm
Q: How large is the improvement from the reasoning paradigm?
Demis: We have always been firm believers in the "thinking paradigm." In the early AlphaGo and AlphaZero work, a thinking system was built on top of a model. With thinking turned off, AlphaGo and AlphaZero played at roughly master level; with it turned on, they played far beyond world-champion level, a difference of about 600 Elo. In the real world the gains could be even bigger. The challenge is that world models are more complex than game models and may contain errors.
Sergey: DeepMind pioneered reinforcement learning. The AlphaGo and AlphaZero results showed that thinking (inference-time compute) can save something like 5,000x in training. That is obviously a huge advantage, and we are only at the beginning. During its thinking process an AI can also use tools or other AIs to improve the final output, which will be an extremely powerful paradigm.
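To put the 600 Elo figure in perspective, here is a quick back-of-the-envelope calculation (an illustration, not from the talk) using the standard Elo expected-score formula; a 600-point gap means the stronger side is expected to score roughly 97% of the points.

    # Expected score of a player rated `gap` points above an opponent,
    # under the standard Elo model: E = 1 / (1 + 10^(-gap/400)).
    def elo_expected_score(gap: float) -> float:
        return 1.0 / (1.0 + 10.0 ** (-gap / 400.0))

    print(elo_expected_score(600))  # ~0.969: the thinking-enabled version scores ~97%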
New Advances Toward AGI
Q: Which mechanisms do you think will bring the industry closer to AGI?
Demis: The thinking paradigm may be part of it. We still need other breakthroughs, such as true invention: not just solving a math conjecture, but proposing one, or hypothesizing a new theory in physics. Systems cannot do that kind of creativity yet. We also need big advances in the accuracy of our world models.
Defining AGI
Q: Many people say the term AGI is overused. Why do you think it still matters?
Demis: I think it matters a lot. Two things are being conflated. One is "typical human intelligence," what most people can do, which matters for products and the economy. But the AGI I am interested in is a more theoretical construct: the range of what the human brain as an architecture can do, for example what the best humans in history, such as Einstein, Mozart, and Marie Curie, achieved. Current systems cannot do that. Also, today's systems are not consistent enough; it is easy to find holes in them, whereas for an AGI it should take a team of experts months to find an obvious hole.
Will AGI Be Reached by One Company or by Several?
Q: Will AGI be achieved by one company, ending the game, or by several entities at roughly the same time?
Sergey: I would suppose that one company, country, or entity will reach AGI first, but several could arrive at roughly the same time. In AI, when one company makes an advance, others quickly follow, and vice versa; it is a constant leapfrogging.
Demis: Agreed. The field needs to converge on a definition of AGI. Whoever gets there first needs to ensure the systems are reliable and safe. After that, things like personal AGIs and many other applications could emerge, but it is hard to predict.
Does AI Need Emotions to Count as AGI?
Q: Does AI need emotions to be considered AGI, and can it have them?
Demis: AI will need to understand emotion, but whether it should mimic emotion may be a design decision. In theory it could, but it may be unnecessary or undesirable. It is an open question, and with AGI on a 5-to-10-year horizon we still have time to research it.
Self-Improving Systems (AlphaEvolve)
Q: AlphaEvolve is an AI that can design better algorithms and improve how LLMs are trained. Are you trying to set off an intelligence explosion?
Demis: Not an uncontrolled one. AlphaEvolve is an interesting first experiment that pairs evolutionary programming techniques with foundation models. A self-improvement loop is one way progress could accelerate. We saw this with AlphaZero, which learned chess and Go from scratch in under 24 hours, but that was in limited, well-described game domains. The real world is far messier, and it remains to be seen whether the approach generalizes.
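As a rough illustration of the general pattern described here, an evolutionary search loop paired with a model that proposes variants, the following is a minimal, self-contained sketch; the toy fitness function and the random propose_variants stand-in (which a real system would replace with a foundation-model proposer) are assumptions for illustration, not AlphaEvolve's actual interfaces.

    import random

    def fitness(candidate):
        # Toy objective standing in for "how good is this candidate":
        # here, just how close the numbers sum to 100.
        return -abs(sum(candidate) - 100)

    def propose_variants(parent, n):
        # Stand-in for the foundation-model proposer: in the real pattern,
        # an LLM would be prompted with the parent and asked for improved
        # variants. Here we simply perturb the numbers randomly.
        return [[x + random.uniform(-5, 5) for x in parent] for _ in range(n)]

    def evolve(population, generations=50, survivors=4, children=8):
        # Classic evolutionary loop: score, select the best, propose
        # offspring from the survivors, repeat.
        for _ in range(generations):
            population.sort(key=fitness, reverse=True)
            parents = population[:survivors]
            offspring = [v for p in parents
                         for v in propose_variants(p, children // survivors)]
            population = parents + offspring
        return max(population, key=fitness)

    best = evolve([[random.uniform(0, 50) for _ in range(4)] for _ in range(10)])
    print(best, fitness(best))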
Why Sergey Brin Returned to Google
Q: Is that why you came back to Google?
Sergey: As a computer scientist, this is a very unique moment in history. No computer scientist should be retired right now; they should be working on AI. There has never been a greater problem and opportunity. Unlike the web and mobile revolutions, this is scientifically far more exciting, and its impact on the world will be even more profound. I regularly dig deep into the Gemini models, pre-training, post-training, and the multimodal work; that is where my scientific interest lies.
Google's Visual/Physical-World Approach to Agents
Q: Other tech companies' agents are usually context-aware, disembodied voices that you interact with on a screen. Why are Google and DeepMind demos so often through a camera, and so visual?
Demis: Several reasons: agent-based systems are DeepMind's heritage, and AGI must understand the physical environment around you. The two main use cases are a truly useful personal assistant that comes around with you and understands your physical context, and robotics. The bottleneck in robotics has always been the software intelligence, not the hardware. New releases such as Gemini 2.5 and the Veo technology will give robotics exciting algorithms. Gemini was designed from the beginning to be multimodal; that made things harder at first, but we are now reaping the rewards of those correct decisions.
Lessons from Google Glass
Q: What did you learn from Google Glass that can now be applied to smart glasses?
Sergey: A lot; frankly, I made many mistakes. But I am still a firm believer in this wearable form factor. AI has now advanced so much that these glasses can help you far more without constantly distracting you. There was also a technology gap at the time. In addition, I knew nothing then about consumer-electronics supply chains, and did not appreciate how hard manufacturing and reasonable pricing would be. This time we have great partners.
Demis: We learned a lot from the rich history of Glass devices. I have always believed the "universal assistant" is the killer app for smart glasses; combined with advances in hardware technology, it will work.
Video Generation and Model Quality/Data Collapse
Q: If the internet fills with AI-generated video, will that feed back into training and degrade model quality?
Demis: We manage data quality and curation very rigorously. All of our generative models attach SynthID, an invisible AI watermark that is very robust and has held up for the year and a half since release. All our AI-generated images and videos carry this watermark. We can detect it, and we are releasing tools so anyone can, which matters for combating deepfakes and misinformation. It can of course also be used to filter such content out of training data, so I do not see this as a big problem. Eventually, if video models are good enough, we could feed them back into the training loop as a source of synthetic data, but we must be very careful not to distort the data distribution.
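A minimal sketch of the filtering idea described here, assuming a hypothetical detect_watermark predicate; this is an illustration only, not the actual SynthID API, whose real detectors inspect the media content itself rather than a metadata flag.

    def detect_watermark(item):
        # Hypothetical stand-in for a SynthID-style detector that returns
        # True when an item carries an AI-generation watermark. This toy
        # version just reads a flag for demonstration purposes.
        return item.get("ai_generated", False)

    def filter_training_corpus(items):
        # Drop anything carrying an AI watermark so generated media does
        # not silently re-enter the training distribution.
        return [item for item in items if not detect_watermark(item)]

    corpus = [{"id": 1}, {"id": 2, "ai_generated": True}, {"id": 3}]
    print(filter_training_corpus(corpus))  # keeps items 1 and 3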
Miscellaneous (Rapid-Fire) Questions
Q: What will the web look like in 10 years?
Sergey: Ten years is too far out; AI is progressing so fast that it is hard to predict what the whole world will look like.
Demis: I think the web will change a lot in the nearer term, especially given an "agent-first" web, which does not necessarily need to see rendered pages the way humans do.
Q: AGI before or after 2030?
Sergey: Before.
Demis: Just after.
Q: Would you hire someone who used AI in their interview?
Demis: It depends how they used it. With today's tools, probably not, but in practice it is more complicated.
Sergey: I have never interviewed, so I cannot judge.
Q: Are we living in a simulation?
Demis: Not in the way Nick Bostrom and others describe. I think underlying physics is information theory, so we live in a computational universe, but it is not a straightforward simulation. It is interesting that these systems can model real structures in nature.
Sergey: The argument applies recursively: if we are in a simulation, then the entities that created the simulation are themselves in a simulation, and so on. That means we are either in an infinite stack of simulations or there must be some stopping criterion. I think our notion of "simulation" is too anthropocentric, assuming some conscious being is running a simulation similar to ours, and that is where the argument breaks down for me. We are not equipped to reason about the level above us.
Alex: All right, everybody. We have an amazing crowd here today. We're going to be live-streaming this. So let's hear you, make some noise so everybody can hear that you're here. Let's go. I'm Alex Kantrowitz, the host of the Big Technology podcast, and I'm here to speak with you about the frontiers of AI with two amazing guests. Demis Hassabis, the CEO of Google DeepMind, is here. Good to see you, Demis.

Demis: Good to see you, too.

Alex: And we have a special guest: Sergey Brin, the co-founder of Google, is also here.
Alex: All right. So, this is going to be fun. Let's start with the frontier models. Demis, this is for you. With what we know today about frontier models, how much improvement is there left to be unlocked? And why do you think so many smart people are saying that the gains are about to level off?

Demis: I think we're seeing incredible progress. You've all seen it today, all the amazing stuff we showed in the keynote. So I think we're seeing incredible gains with the existing techniques, pushing them to the limit, but we're also inventing new things all the time as well. And I think to get all the way to something like AGI may require one or two more new breakthroughs. We have lots of promising ideas that we're cooking up, and we hope to bring them into the main branch of Gemini.
Alex: All right. And so there's been this discussion about scale: does scale solve all problems, or does it not? So I want to ask you, in terms of the improvement that's available today, is scale still the star, or is it a supporting actor?

Demis: I've always been of the opinion that you need both. You want to scale the techniques that you know about to the maximum, to exploit them to the limit, whether that's data or compute scale. And at the same time, you want to spend a bunch of effort on what's coming next, maybe six months or a year down the line, so you have the next innovation that might do a 10x leap in some way, to intersect with the scale. So you want both, in my opinion. But I don't know, Sergey, what do you think?

Sergey: I mean, I agree, it takes both. You can have algorithmic improvements and simply compute improvements: better chips, more chips, more power, bigger data centers. I think that historically, if you look at things like the N-body problem and simulating gravitational bodies, as you plot it, the algorithmic advances have actually beaten out the computational advances, even with Moore's law. If I had to guess, I would say the algorithmic advances are probably going to be even more significant than the computational advances. But both of them are coming now, so we're kind of getting the benefits of both.
Alex: And Demis, do you think the majority of your improvement is coming from building bigger data centers and using more chips? There's talk about how the world will just be wallpapered with data centers. Is that your vision?

Demis: Well, no, look, we're definitely going to need a lot more data centers. It still amazes me from a scientific point of view: we turn sand into thinking machines. It's pretty incredible. But actually, it's not just for the training. We've now got these models that everyone wants to use. We're seeing incredible demand for 2.5 Pro, and with Flash we're really excited about how performant it is for an incredibly low cost. I think the whole world is going to want to use these things, so we're going to need a lot of data centers for serving, and also for inference-time compute. You saw Deep Think today, 2.5 Pro Deep Think: the more time you give it, the better it will be. And for certain very high-value, very difficult tasks, it will be worth letting it think for a very long time. We're thinking about how to push that even further, and again, that's going to require a lot of chips at runtime.
Alex: Okay. So, you brought up test-time compute. We've been about a year into this reasoning paradigm, and you and I have spoken about it twice in the past as something that you might be able to add on to traditional LLMs to get gains. So I think this is a pretty good time for me to ask: what's happening? Can you help us contextualize the magnitude of improvement we're seeing from reasoning?

Demis: Look, we've always been big believers in what we're now calling this thinking paradigm. If you go back to our very early agent work on playing games, things like AlphaGo and AlphaZero, they all had this attribute of a thinking system on top of a model. And you can actually quantify how much difference that makes if you look at a game like chess or Go. We had versions of AlphaGo and AlphaZero with the thinking turned off, so it was just the model telling you its first idea, and it's not bad, maybe master level, something like that. But if you turn the thinking on, it's way beyond world-champion level; it's something like a 600-plus Elo difference between the two versions. So you can see that in games, let alone the real world, which is way more complicated, and I think the gains will potentially be even bigger there from adding this thinking type of paradigm on top. Of course, the challenge, and I talked about this earlier in the talk, is that your model needs to be a kind of world model, and that's much harder than building a model of a simple game. It has errors in it, and those can compound over longer-term plans. But I think we're making really good progress on all those fronts.

Sergey: Yeah, look, as Demis said, DeepMind really pioneered a lot of this reinforcement learning work with what they did with AlphaGo and AlphaZero. As I recall, it showed that it would take something like 5,000 times as much training to match what you were able to do with still a lot of training plus the inference-time compute you were doing with Go. So it's obviously a huge advantage. And obviously, like most of us, they get some benefit from thinking before speaking, although not always; I always get reminded to do that. But I think the AIs are obviously much stronger once you add that capability, and I think we're just at the tip of the iceberg right now. It's been less than a year that these models have really been around. Especially if you think about it, during its thinking process an AI can also use a bunch of tools, or even other AIs, to improve what the final output is. So I think it's going to be an incredibly powerful paradigm.

Alex: Deep Think is very interesting. I'm going to try to describe it right: it's basically a bunch of parallel reasoning processes working and then checking each other. It's like reasoning on steroids.
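A minimal sketch of the "parallel reasoning paths that check each other" pattern Alex describes, assuming a hypothetical sample_reasoning function; this illustrates self-consistency-style majority voting over independent reasoning paths, not Deep Think's actual mechanism.

    import collections

    def sample_reasoning(question, seed):
        # Hypothetical stand-in for one independent reasoning pass of a
        # model. A real system would sample a chain of thought and extract
        # the final answer; this stub just returns a canned answer.
        return "42"

    def parallel_reason(question, n_paths=8):
        # Run several reasoning paths independently, then let them "check
        # each other" by majority vote over the final answers.
        answers = [sample_reasoning(question, seed=i) for i in range(n_paths)]
        best, count = collections.Counter(answers).most_common(1)[0]
        return best, count / n_paths  # answer plus its agreement rate

    print(parallel_reason("What is 6 * 7?"))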
Alex: Now, Demis, you mentioned that the industry needs a couple more advances to get to AGI. Where would you put this type of mechanism? Is this one of those that might get the industry closer?

Demis: I think so. It's maybe part of one, shall we say. And there are others too that we need. Maybe this can be part of improving reasoning, but where does true invention come from? Where you're not just solving a math conjecture, you're actually proposing one, or hypothesizing a new theory in physics. I think we don't have systems yet that can do that type of creativity. I think they're coming, and these types of paradigms, things like thinking, might be helpful in that, and then probably many other things. I think we need a lot of advances in the accuracy of the world models we're building. You saw that with Veo, the potential of Veo 3; it amazes me how it can intuit the physics of the light and the gravity. I used to work on computer games early in my career, not just the AI but also the graphics engines, and I remember having to do all of this by hand: program all the lighting and the shaders and all of these things, incredibly complicated stuff we used to do in early games. And now the model is just intuiting it. It's pretty astounding.

Alex: I saw you shared an image of a frying pan with some onions and some oil.

Demis: Hope you all liked that. There was no subliminal messaging about that. No, not really. Just maybe a subtle, subtle message.
Alex: Okay. So, we've said the word, or the acronym, AGI a couple of times. I think there's a movement within the AI world right now to say let's not say AGI anymore; the term is so overused as to be meaningless. But Demis, it seems like you think it's important. Why?

Demis: Yeah, I think it's very important. Maybe I need to write something about this with Shane Legg, our chief scientist, who was one of the people who invented the term 25 years back. I think there are two things that are getting a little bit conflated. One is: what can a typical person, an individual, do? We're all very capable, but however capable you are, there's only a certain slice of things that one is expert in, right? Or you could say: what can 90% of humans do? That's obviously going to be economically very important, and from a product perspective also very important, so it's a very important milestone. Maybe we should call that typical human intelligence. But what I'm interested in, and what I would call AGI, is really a more theoretical construct, which is: what is the human brain, as an architecture, able to do? The human brain is an important reference point because it's the only evidence we have, maybe in the universe, that general intelligence is possible. And there, you would have to show your system was capable of doing the range of things even the best humans in history were able to do with the same brain architecture: not one brain, but the same brain architecture. What Einstein did, what Mozart was able to do, what Marie Curie did, and so on. It's clear to me that today's systems don't have that. And the other reason I think the hype on AGI today is sort of overblown is that our systems are not consistent enough to be considered fully general yet. They're quite general, so they can do thousands of things; you've seen many impressive things today. But every one of us has experience with today's chatbots and assistants: you can easily, within a few minutes, find some obvious flaw with them. Some high-school math thing that it doesn't solve, some basic game it can't play. It's not very difficult to find those holes in the system. And for me, for something to be called AGI, it would need to be much more consistent across the board than it is today. It should take maybe a team of experts a couple of months to find an obvious hole in it, whereas today it takes an individual minutes to find one.
Alex: Sergey, this is a good one for you. Do you think that AGI is going to be reached by one company and it's game over? Or could you see Google having AGI, OpenAI having AGI, Anthropic having AGI, China having AGI?

Sergey: Wow, that's a great question. I mean, I guess I would suppose that one company or country or entity will reach AGI first. Now, it is a little bit of a spectrum; it's not a completely precise thing. So it's conceivable that there will be more than one roughly in that range at the same time. After that, what happens is, I think, very hard to foresee. But you could certainly imagine there are going to be multiple entities that come through. In our AI space, we've seen that whenever we make a certain kind of advance, other companies are quick to follow, and vice versa when other companies make certain advances. It's a kind of constant leapfrog. So I do think there's an inspiration element that you see, and that would probably encourage more and more entities to cross that threshold.

Alex: Demis, what do you think?
Demis: Well, I think it is important for the field to agree on a definition of AGI, so maybe we should try to help that coalesce. Assuming there is one, there probably will be some organizations that get there first, and I think it's important that those first systems are built reliably and safely. And after that, if that's the case, we can imagine using them to shard off many systems that have safe architectures provably built underneath them. Then you could have personal AGIs and all sorts of things happening. But as Sergey says, it's pretty difficult to see beyond the event horizon and predict what that's going to be like.
Alex: Right. So, we talked a little bit about the definition of AGI, and a lot of people have said AGI must be knowledge, right, the intelligence of the brain. What about the intelligence of the heart? Demis, briefly: does AI have to have emotion to be considered AGI? Can it have emotion?

Demis: I think it will need to understand emotion. I think it will be almost a design decision whether we wanted it to mimic emotions. I don't see any reason why it couldn't in theory, but it might not be necessary, or in fact not desirable, for them to have the sort of emotional reactions that we do as humans. So again, it's a bit of an open question as we get closer to this AGI time frame, which I think is more on a 5-to-10-year time scale. So we have a bit of time, not much time, but some time to research those kinds of questions.
Alex: When I think about how the time frame might be shrunk, I wonder if it's going to be the creation of self-improving systems. Last week, I almost fell out of my chair reading a headline about something called AlphaEvolve, which is an AI that helps design better algorithms and even improve the way LLMs train. So, Demis, are you trying to cause an intelligence explosion?

Demis: No. Well, not an uncontrolled one. Look, I think it's an interesting first experiment. It's an amazing system, with a great team working on it. It's interesting now to start pairing other types of techniques, in this case evolutionary programming techniques, with the latest foundation models, which are getting increasingly powerful. And I actually want to see a lot more of these kinds of combinatorial systems in our exploratory work, pairing different approaches together. You're right that this is one of the things: someone discovering a kind of self-improvement loop would be one way things might accelerate further than they're even going today. And we've seen it before with our own work, with things like AlphaZero learning chess and Go and any two-player game from scratch, within less than 24 hours, starting from random, with self-improving processes. So we know it's possible, but again, those are quite limited game domains which are very well described. The real world is far messier and far more complex, so it remains to be seen whether that type of approach can work in a more general way.
Alex: Sergey, we've talked about some very powerful systems, and it's a race, a race to develop these systems. Is that why you came back to Google?

Sergey: I mean, I think as a computer scientist, it's a very unique time in history. Honestly, anybody who's a computer scientist should not be retired right now; they should be working on AI. That's what I would say. There's just never been a greater problem and opportunity, a greater cusp of technology. So I wouldn't say it's because of the race, although we fully intend that Gemini will be the very first AGI, to clarify that. But to be immersed in this incredible technological revolution: I mean, I went through the web 1.0 thing, and it was very exciting, and we had mobile, we had this, we had that. But I think this is scientifically far more exciting, and I think ultimately the impact on the world is going to be even greater. As much as the web and mobile phones have had a lot of impact, I think AI is going to be vastly more transformative.

Alex: So what do you do day-to-day?

Sergey: I think I torture people like Demis, who is amazing, by the way; he tolerated me crashing this fireside. I'm across the street pretty much every day, with the people who are working on the key Gemini text models, on the pre-training and the post-training, mostly those. I periodically delve into some of the multimodal work, Veo 3, as you've all seen. But I tend to be pretty deep in the technical details, and that's a luxury I really enjoy, fortunately, because guys like Demis are minding the shop. And that's just where my scientific interest is: deep in the algorithms and how they can evolve.
Alex: Okay, let's talk about the products a little bit, some that were introduced recently. I just want to ask you a broad question about agents, Demis. When I look at other tech companies building agents, what we see in the demos is usually something that's contextually aware, has a disembodied voice, and you often interact with it on a screen. When I see DeepMind and Google demos, often it's through the camera; it's very visual. There was an announcement about smart glasses today. So talk a little bit about, if that's the right read, why Google is so interested in having an assistant or companion that sees the world as you see it.

Demis: Well, it's for several reasons; several threads come together. As we talked about earlier, we've always been interested in agents. That's actually the heritage of DeepMind; we started with agent-based systems in games. We're trying to build AGI, which is a full general intelligence, and clearly that would have to understand the physical environment, the physical world around you. And two of the massive use cases for that, in my opinion, are, first, a truly useful assistant that can come around with you in your daily life, not just stuck on your computer or one device. We want it to be useful in your everyday life for everything, so it needs to come around with you and understand your physical context. And then the other big thing is, I've always felt that for robotics to work, you sort of want what you saw with Astra on a robot. I've always felt that the bottleneck in robotics isn't so much the hardware, although obviously there are many, many companies working on fantastic hardware, and we partner with a lot of them, but actually the software intelligence. That's what I think has always held robotics back. But I think we're in a really exciting moment now, where finally, with these latest versions, especially Gemini 2.5 and more things that we're going to bring in, this kind of Veo technology and other things, we're going to have really exciting algorithms to make robotics finally work and realize its potential, which could be enormous. And in the end, AGI needs to be able to do all of those things. So for us, you can see we always had this in mind. That's why Gemini was built from the beginning, even the earliest versions, to be multimodal. That made it harder at the start, because it's harder to make things multimodal than just text only. But in the end, I think we're reaping the benefits of those decisions now. I see many of the Gemini team here in the front row; they were the harder decisions, but we made the right ones, and now you can see the fruits of that in all of what you've seen today.
Alex: Actually, Sergey, I've been thinking about whether to ask you a Google Glass question.

Sergey: Oh, fire away.

Alex: What did you learn from Glass that Google might be able to apply today, now that it seems like smart glasses have made a reappearance?

Sergey: Wow. Yeah, great question. I learned a lot. I definitely feel like I made a lot of mistakes with Google Glass, I'll be honest. I am still a big believer in the form factor, so I'm glad that we have it now, and now it looks like normal glasses; it doesn't have the thing in front. I think there was a technology gap, honestly. Now, in the AI world, the things that these glasses can do to help you out without constantly distracting you, that capability is much higher. There's also the fact that I just didn't know anything about consumer-electronics supply chains, really, and how hard it would be to build that, have it be at a reasonable price point, and manage all the manufacturing and so forth. This time we have great partners that are helping us build this, so that's another step forward. What else can I say? I do have to say I miss the airship with the wingsuiting skydivers for the demo. Honestly, it would have been even cooler here at Shoreline Amphitheatre than it was up in Moscone back in the day. But we should probably polish the product first this time, get it ready and available, and then we'll do a really cool demo.

Alex: That's probably a smart move.

Demis: Yeah. What I will say is, look, we've obviously got an incredible history of glass devices and smart devices, so we can bring all those learnings to today, and we're very excited about our new glasses, as you saw. What I've always been saying to our team, and to Shahram and his team, and I don't know if Sergey would agree, is that I feel the universal assistant is the killer app for smart glasses. I think that's what's going to make it work, apart from the fact that all the hardware technology has also moved on and improved a lot. I feel like this is the actual killer app, the natural killer app, for it.
Alex: Okay, briefly, on video generation. I sat in the audience at the keynote today and was fairly blown away by the level of improvement we've seen from these models, and you had filmmakers talking about it in the presentation. I want to ask you, Demis, specifically about model quality. If the internet fills with video that's been made with artificial intelligence, does that then go back into the training and lead to a lower-quality model than if you were training just from human-generated content?

Demis: Yeah, look, there are a lot of worries about this so-called model collapse. Video is just one thing; it applies in any modality, text as well. There are a few things to say about that. First of all, we're very rigorous with our data-quality management and curation. We also, at least for all of our generative models, attach SynthID to them. There's this invisible, AI-made watermark that is very robust; it has held up now for a year, 18 months, since we released it. All of our images and videos are embedded with this watermark. So we can detect it, and we're releasing tools to allow anyone to detect these watermarks and know that something was an AI-generated image or video. And of course that's important to combat deepfakes and misinformation, but you could also use it, if you wanted, to filter things out of your training data. So I don't actually see that as a big problem. Eventually, we may have video models that are so good you could put them back into the loop as a source of additional data, synthetic data it's called. And there you just have to be very careful that you're actually creating from the same distribution that you're going to model, that you're not distorting that distribution somehow, and that the quality is high enough. We have some experience of this in a completely different domain with things like AlphaFold, where there wasn't actually enough real experimental data to build the final AlphaFold. So we had to build an earlier version that predicted about a million protein structures, with a confidence level on each, and then we selected the top 300,000 to 400,000 and put them back in the training data. So it's very cutting-edge research to mix synthetic data with real data, but there are ways of doing that. In terms of the video-generation stuff, you can just exclude it if you want to, at least with our own work, and hopefully other gen-media companies follow suit and put robust watermarks in, also, obviously, first and foremost to combat deepfakes and misinformation.
Alex: Okay, we have four minutes and I've got four questions left. We now move to the miscellaneous part of my questions, so let's see how many we can get through, as fast as we can get through them. Let's go to Sergey with this one: what does the web look like in 10 years?

Sergey: What does the web look like in 10 years? In one minute? Boy, I think 10 years, because of the rate of progress in AI, is so far beyond anything we can see. And not just the web; I mean, I don't think we really know what the world looks like in 10 years.

Alex: Okay, Demis?

Demis: Well, I think that's a good answer. I do think that in the nearer term the web is going to change quite a lot. If you think about an agent-first web, it doesn't necessarily need to see renders and things the way we do as humans using the web. So I think things will be pretty different in a few years.

Alex: Okay, this is kind of an over/under question: AGI, before 2030 or after 2030?

Sergey: 2030? Boy, you really put it on that fine line. I'm going to say before.

Alex: Yeah. Demis?

Demis: Just after.

Alex: Just after. Yeah. Okay.
Alex: No pressure, Demis.

Demis: Exactly. Well, I have to go back and get working harder.

Sergey: Now that I can ask for it, he needs to deliver it. A proper sandbagger. We need that next week.

Demis: That's true. I'll come to the review.
Alex: All right. So, would you hire someone who used AI in their interview? Demis?

Demis: Oh, in their interview? It depends how they used it. Using today's tools, probably not. But, well, it depends how they would use it, actually; I think that's probably the answer.

Alex: Sergey?

Sergey: I mean, I never interviewed at all. So, I don't know; I feel it would be hypocritical for me to judge people on exactly how they interview.

Alex: Yeah, I haven't either, actually. So, snap on that: I've never done a job interview.
Alex: Okay, so Demis, I've been reading your tweets. You put up a very interesting tweet where there was a prompt that created some sort of natural scene.

Demis: Oh, yeah.

Alex: Here was the tweet: "Nature to simulation at the press of a button. Does make you wonder," with a couple of emojis. And people ran with that and wrote some headlines saying Demis thinks we're in a simulation. Are we in a simulation?

Demis: Not in the way that Nick Bostrom and people talk about. I don't think this is some kind of game, even though I wrote a lot of games. I do think that ultimately the underlying physics is information theory, so I do think we're in a computational universe, but it's not just a straightforward simulation. I can't answer you in one minute, but I think the fact that these systems are able to model real structures in nature is quite interesting and telling. I've been thinking a lot about the work we've done with AlphaGo and AlphaFold and these types of systems. I've spoken a little about it; maybe at some point I'll write up a scientific paper about what I think that really means in terms of what's actually going on here in reality.

Alex: Sergey, you want to make a headline?
Sergey: Well, I think that argument applies recursively, right? If we're in a simulation, then by the same argument, whatever beings are making the simulation are themselves in a simulation, for roughly the same reasons, and so on and so forth. So I think you're going to have to either accept that we're in an infinite stack of simulations, or that there's got to be some stopping criterion.

Alex: And what's your best guess?

Sergey: I think that we're taking a very anthropocentric view when we say "simulation," in the sense that some kind of conscious being is running a simulation that we are then in, and that they have some kind of semblance of desire and consciousness that's similar to ours. I think that's where it kind of breaks down for me. So I just don't think we're really equipped to reason about one level up in the hierarchy.

Alex: Okay. Well, Demis, Sergey, thank you so much. This has been such a fascinating conversation. Thank you all.

Demis: All right, Alex. Thank you. Pleasure.