Meta CTO Andrew Bosworth on the generative AI craze


Q: You met Mark Zuckerberg when you were teaching at Harvard. Does it feel like everything is coming full circle now, as generative AI reshapes Meta?

A: [Meta Chief Product Officer] Chris Cox and I both studied AI in unusual ways. He was part of the symbolic systems program at Stanford. I was part of this mind-brain and behavior program at Harvard — computational neurobiology. So Chris and I both had this interesting, very interdisciplinary approach to AI as undergraduates. Having said that, when I was teaching AI as an undergraduate [teaching fellow], we taught that neural networks had once been promising but were by then considered a useless technology. That was 2004. And, of course, neural networks run the world now.

So let’s take it with a grain of salt. I love an origin story as much as the next person, but I don’t want to get too ahead of myself. When I came to Facebook, Mark told his recruiter, ‘I need someone who knows AI,’ and she said, ‘I know this one guy.’ So I came in and built probably the first artificial intelligence that we shipped here: a set of heuristic-based systems for ranking News Feed. Chris Cox was my daily partner. I did the back end and ranking and he did the front-end design work.

So, AI has always been a major component of our thinking. But like all technologies, you’re in these S curves. Before generative AI really hit the scene, we were on the big flat part of an S curve, and had been for a long time, just grinding out gains of a quarter percent or half a percent year over year. We’ve known for three or four years that these generative AI large language models were a big deal, but they still felt like infrastructure. That’s why I think OpenAI’s introduction of ChatGPT is so interesting.

I’m not trying to say we were prescient. We had been investing and we believed in this technology. For all of that belief, it still snuck up on us. It was like, ‘Oh my god, it’s here. We’ve got to do things differently. We’ve got to change it up.’

Q: I appreciate the honesty. It does seem like when you take mixed-reality hardware and combine it with generative AI, you get something bigger than the sum of its parts. Do you see it that way?

A: If you go back to Michael Abrash’s talks at Connect — these big ‘what does the future of augmented reality look like’ talks — artificial intelligence was always a part of the vision, but we probably got the sequencing wrong. We thought we would need a lot more very narrow data. So we all had this idea that we were going to have to have a lot of glasses out in the market doing some good stuff for a long time, and then you could get to the AI that made them even more useful.

If you think about machine learning before large language models, it was always that way. There was some kind of use case that created value. For instance, Facebook existed before News Feed was ranked. Then you ranked News Feed badly. Over time, you learned how to do it better. All AI systems to date started with some non-AI thing to get a dataset that could allow you to power an AI thing that got better over time. Search worked that way. Initially, it was like Yahoo — just a list of pages.
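To make that bootstrap pattern concrete, here is a minimal sketch in Python (all names and weights are hypothetical, not anything Meta shipped): a hand-tuned heuristic ranks posts while logging the features that, once joined with engagement outcomes, become training data for a learned ranker.

```python
from dataclasses import dataclass
import csv

@dataclass
class Post:
    post_id: str
    author_is_friend: bool
    age_hours: float
    like_count: int

def heuristic_score(post: Post) -> float:
    """Hand-tuned rules: the 'non-AI thing' that ships first."""
    score = float(post.like_count)
    if post.author_is_friend:
        score += 10.0                  # boost close connections
    score -= 0.5 * post.age_hours      # decay older posts
    return score

def rank_and_log(posts: list[Post], log_path: str) -> list[Post]:
    """Rank with heuristics, logging features so a model can learn later."""
    ranked = sorted(posts, key=heuristic_score, reverse=True)
    with open(log_path, "a", newline="") as f:
        writer = csv.writer(f)
        for position, p in enumerate(ranked):
            # Engagement outcomes (clicks, likes) would be joined to these
            # rows later, turning heuristic traffic into supervised data.
            writer.writerow([p.post_id, position, p.author_is_friend,
                             p.age_hours, p.like_count])
    return ranked
```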

These large language models are of a very different character. There’s a great deal more generality to them. I don’t have to get the exact, perfect training data anymore. We believe now that we have an AI that’s going to be valuable as the first cornerstone of these [Meta] devices.

A: Look at the Ray-Ban Meta glasses. We believed this was going to be a great product with just the camera and video, live streaming, great music, and good calls. That’s it. We were pumped. Six months ago, we’re like, ‘We’ve got to get the assistant on it.’ Now, it’s the best feature of the glasses. Do you know how crazy it is to have a piece of hardware whose killer feature changes six months out? That does not happen.

For us, how we’re looking at the entire portfolio now is like, ‘Ok, the sequencing is backwards.’

Q: But is the data gathered from the glasses and the Quest headset still valuable?

A: The most valuable data coming from the glasses is how people are using them. That’s just a classic hotspot question: where are the opportunities in the product? The truth is, we’ve designed these devices so that the content isn’t coming to us. It stays on your phone. So unless you submit a bug report, even if you opt in to all the different ways you can share data with us, what we get is mostly telemetry.

Q: How do you think AI is going to come into play with the Quest?

A: The irony is some of the things that you really want to do on Quest actually don’t have great training sets. If you think about text, you have the entire internet. You think about photos, you have these huge libraries on Facebook, on Instagram. But there’s not a great big canonical library of 3D objects, let alone 3D objects that animate in four-dimensional space. That’s what you really want. We’re doing that work to try to make sure that we improve the modalities to include being able to export 3D things. So in some ways, mixed reality and virtual reality are harder, because you actually have this additional dimension that you’re responsible for.

On the flip side, there are obviously huge advantages in mixed reality and VR, where the devices have sensors that are always on. They’re always scanning and sensing the room. So there is certainly high potential there. But some of the most obvious use cases actually take a lot more work for us. We’re doing the research, and we’re seeing some early promising results on 3D and 4D spaces.
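To see why that extra dimension is costly, here is a toy illustration (the shapes and numbers are invented for this example and are not Meta’s representation): a static mesh is a (vertices, 3) array, while an animated one adds a time axis, multiplying what a generative model has to produce.

```python
import numpy as np

# A static 3D asset: V vertices, each with (x, y, z) coordinates.
V = 1_000
static_mesh = np.zeros((V, 3))       # shape: (vertices, xyz)

# An animated asset adds time: T frames of the same mesh deforming.
T = 120                              # e.g. two seconds at 60 fps
animated_mesh = np.zeros((T, V, 3))  # shape: (time, vertices, xyz)

print(static_mesh.size)              # 3,000 values per asset
print(animated_mesh.size)            # 360,000 values per asset, 120x more
```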

Q: What’s the killer use case that you’re trying to get to?

A: There are near-term and far-term answers. In the near term, they’re much more creator-oriented: styling your avatar, creating a world, creating a game, that kind of thing. In the distant future, the North Star is a much more efficient interface with the machine itself, so you can work not just at the content level but also at the system level. I can say, ‘Open this game, and invite my friends to the game.’ It’s very complex: I’m going to take your system and move you here. Meanwhile, I’m going to create an invite, engage the messaging systems, get these people engaged, and get them pointed in the same direction.
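Read literally, that one utterance fans out into several subsystems. Here is a minimal sketch of such a dispatcher; the intent schema and every function name are hypothetical, invented for illustration rather than taken from any Meta API.

```python
from dataclasses import dataclass, field

@dataclass
class Intent:
    """A parsed utterance like 'Open this game and invite my friends.'"""
    action: str
    app: str
    invitees: list[str] = field(default_factory=list)

def launch_app(app: str) -> None:
    print(f"launcher: starting {app}")

def create_invite(app: str, user: str) -> str:
    print(f"invites: created invite to {app} for {user}")
    return f"invite-{app}-{user}"

def send_message(user: str, payload: str) -> None:
    print(f"messaging: sent {payload} to {user}")

def dispatch(intent: Intent) -> None:
    """Fan one high-level intent out to several system services."""
    if intent.action == "open_and_invite":
        launch_app(intent.app)                  # move the user there
        for friend in intent.invitees:
            token = create_invite(intent.app, friend)
            send_message(friend, token)         # point them the same way

dispatch(Intent("open_and_invite", "space-golf", ["ana", "raj"]))
```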

Q: One thing you could do with these glasses is have them tell you who you’re looking at, which could be especially helpful when you don’t remember somebody’s name. But that requires facial recognition and that’s controversial. Is that a feature Meta would ever make?

A: I would love for you to lead a national discussion, or an international one, on this topic. There are people who have prosopagnosia; they’re clinically unable to recognize faces. But all of us have this problem. It’s a very human problem to have. An easy solution might be, if we’re looking at each other’s faces, I say to you, ‘Hey, do you mind if I just remember your face?’ And you say, ‘Yeah, that’s fine.’ And I say, ‘Hey, glasses, remember this face.’ Now, that’s a very high-friction consent flow. But I still can’t do it in Texas or Illinois. Texas has a law called CUBI and Illinois has BIPA. They prevent me from even doing the scan on somebody to determine if they’re in the set. I think these are bad laws. They prevent a very useful, reasonable, completely sensible thing.

A: The thing that people are worried about, I don’t think anyone is building. No one has been trying to build this ‘who are these random people?’ capability. What we really want is, ‘I’ve met this person. I know this person. This person might be a friend of mine that I haven’t seen in 10 years.’ And I don’t want to start out on the wrong foot socially. It’s the kind of thing that I think is completely reasonable by even the most stringent standards of privacy advocates, but you can’t do it right now.

Q: Are there other things you’d like to do but can’t because of laws and regulations?

A: The laws are very fluid right now. We came up in a period of incredible regulatory stability, both domestically and internationally. These things take years; you see them coming, and there’s a regulatory process. Now you’ve got states like Texas and Florida passing arbitrary laws on the legality of content moderation. There’s a bevy of states that want to do different types of moderation: for content, for kids, for different things. And Europe continues to trudge ahead, sometimes with regulation, sometimes just by empowering non-regulatory bodies. Sometimes you don’t even have clear legal recourse for appealing judgments there. So it’s really dynamic.

More often than not, it’s not limitations. It’s open questions. It hasn’t been tested in the courts. The regulators don’t even know what it means. Sometimes, there are definitions being passed into law that are unclear. There’s a lot of uncertainty on a regular basis for our work.

Q: Shifting gears to Llama, your family of AI models. What’s the strategy behind open sourcing Llama?

A: We’ve been open-source leaders for a long time, from the day I joined Facebook in 2006. PHP was open source. We would have fallen down without Memcached, which is an open source project that we took over. MySQL was open source. It’s one of the reasons we were able to be such a small lean company and do so much with the small staff that we had. We’ve been pushing not just those projects, but PyTorch, the number one system that machine learning engineering is done with. We built that and open sourced it.

When you open source something, a couple of things happen. One is that you create a community around it, so it is self-reinforcing. Because the rest of the industry uses PyTorch, a bunch of tools that we want are built not by us but by other people, and we get to use them. And the things that we build are made better, more performant, and more efficient by the community.

And in the case of Llama, a lot of people are obviously benefiting from it. We’ve seen 100 million downloads, we’ve seen all the different projects that have come out of it, and commercial programs are succeeding on the basis of it. If Llama continues to be what it is today — the center of gravity for a large community of builders — that’s something we benefit from immensely. It’s very self-reinforcing.

Separately, if you look at the FAIR methodology and philosophy that we’ve used for the 10 years we’ve had that research lab, they’ve just found when they open source their software, they immediately get independent, third-party validation of the results. And then somebody goes and builds on that. And then we get to use that result to build our next paper, our next program.

A: We don’t open source everything. There are things that, because of the dataset we’re using or because of safety concerns, we just don’t feel we can open source. But we do try, all else being equal, to open source things, with the confidence that it’s going to help everybody, including us. It’s a win-win.

Q: How does the open-source strategy play into recruiting? I know some engineers love open source, while there are some AI researchers who believe it’s irresponsible to open source such powerful technology.

A: I’m not sure if it helps with recruiting in the sense of, ‘I want to work on this thing.’ Maybe it does. I think it certainly helps you have a larger body to recruit from, because more people are familiar with your models and with how you’re doing training. There are a lot more engineers we can hire who know how to use PyTorch. So I think it’s really good for recruiting from a growing pie of people who are one step from being great at your company.

And then on the second thing, I think a lot of that anxiety about these large language models is pretty misplaced. I think [Meta Chief AI Scientist] Yann LeCun has said it really well. They are a super interesting and important tool. I think of it as an associative database. They don’t have agency, they don’t have reasoning, or planning skills. They don’t have a theory of the world or a theory of mind. And they’re not on a path to having those things. And we’re already seeing diminishing returns on model size and model scaling.

There’s going to be breakthrough after breakthrough. It’s a very exciting time. It’s a great and important technology. And I can’t wait for it to continue to become cheaper to build, cheaper to run, lower latency, more efficient, able to run really powerful models on low-power, small devices. These are all really important things, and they are going to take a lot of time and energy to get done.

And also, this is not AGI. It’s not like the thing itself, taken to the absolute limit, is AGI. What was that old quote that von Neumann said to Oppenheimer? ‘Sometimes someone confesses a sin in order to take credit for it.’ I think there’s a lot of that going around in the Valley sometimes. I’m not saying people don’t believe it. Maybe they do. But I don’t.

Q: What is the pitch when you’re recruiting the next brilliant AI researchers? How do you get them to join Meta as opposed to OpenAI or Google DeepMind?

A: They’ve got so many choices right now. It’s a great time to be in that field, and have that expertise. The companies are pretty different. If people want to join Google and the 1000-person effort that’s happening there, they can and they know that. They’re getting something different here at Meta in terms of the size, scale, and the individual impact that they can have. OpenAI is a third, different thing. Kind of an unusual governance structure. And there are some weird equity units there.

Q: So how’s the recruiting effort going?

A: It’s going well. Great talent is leaving all these companies. Some of it is going to other companies in this set. Some of it’s going to completely new places. Mistral in France is a team of people who built Llama.

I’ve had, by Valley standards, a long career, and this is the natural way of things. When new technology happens, there’s a big explosion of enthusiasm. Some of that goes towards startups, some of it goes towards established companies. Mostly, what happens is the number of people that you are bringing into this space grows.

That’s the real thing that’s happening right now. We’re not competing for a fixed sum. There are like 10 times more of them now than there were a year ago, because these people see the land grab, and they have to get good at these skills.

Q: You mentioned Mistral. That Mixtral model that came out last week, where they use a mixture of experts, seems like competition to what you are doing at Meta. Do you see it that way?

A: No. This is one of the funny things that happens if the entire community rallies around any open-source model and it becomes the de facto standard: we’re great. We can use that model. It didn’t have to come from us, just as Memcached didn’t come from us and MySQL didn’t come from us. Those are great things, and we’re happy to use them. So once you’re not trying to compete with open source, open source is good.

Let’s have models that compete, have different strategies, and try to outperform one another in different places. You’ve already aligned yourself with, ‘My business model isn’t keeping other people from this technology.’ Once you’re in that space, the term competition takes on very different meanings.

To be clear, the Llama team definitely wants to beat every other model. That’s good, that’s healthy. But that’s a competition among people of goodwill and good spirit. It’s not an existential competition. If you lose that competition, your punishment is that you get to use a better model.
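Since the question above mentions the technique, here is what ‘mixture of experts’ means mechanically, as a toy sketch of the top-2 routing style Mixtral popularized (tiny invented shapes, not Mixtral’s actual architecture): a gate scores eight small expert layers per token, and only the two highest-scoring experts run.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_experts, top_k = 16, 8, 2

# Each expert here is a tiny linear map; real MoE experts are full FFN blocks.
experts = [rng.standard_normal((d, d)) * 0.1 for _ in range(n_experts)]
gate_w = rng.standard_normal((d, n_experts)) * 0.1

def moe_forward(x: np.ndarray) -> np.ndarray:
    """x: (d,) vector for one token. Only top_k experts actually run."""
    logits = x @ gate_w                   # gate scores, shape (n_experts,)
    top = np.argsort(logits)[-top_k:]     # indices of the two best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()              # softmax over the chosen experts
    # Weighted sum of the selected experts' outputs; the other six are skipped.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(d)
print(moe_forward(token).shape)           # (16,): same output, 2/8 the compute
```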

Q: Doesn’t Llama 2 have a limit on the size of the organizations that are allowed to use it?

A: That is true. You always do worry about the free-rider problem. We’re pretty happy empowering the developer community, the open-source community. Once you’re one of a half dozen or dozen companies that qualify for the exemptions, then you have to have a conversation. And I think that keeps the incentives aligned.


