[music]
-Hello, and welcome back to the ILO’s Future of Work podcast.
I’m Sophy Fisher.
Advances in generative Artificial Intelligence, or AI,
will have widespread implications for labor markets and production.
Whereas in the past automation
has been seen as primarily affecting low-skilled jobs,
such as those in agriculture or in factories,
AI is different.
Many say that the effects of tools
such as ChatGPT will be on more highly skilled jobs,
the so-called white-collar workers.
I have with me today two people
who are well qualified to throw light on this question.
Shakked Noy and Whitney Zhang
are both with the Massachusetts Institute of Technology,
better known as MIT, which is in the United States.
They’ve published a new paper,
Experimental Evidence
on the Productivity Effects of Generative Artificial Intelligence.
Shakked and Whitney, welcome.
Thank you very much for joining us on the podcast.
-Thanks for having us.
-Thanks for having us.
-First of all, why don’t you run us through the key findings of your paper?
-Yes, so the way we typically summarize it,
our paper has four main findings.
The first is that access to generative AI,
in this case, ChatGPT-3.5,
increased the productivity of medium-level white-collar professionals
on the sorts of writing tasks
they might typically encounter in their jobs.
It raised their productivity both by
allowing them to complete those tasks faster
and by slightly raising the quality of the resulting output.
The second finding is that
the productivity gains were larger for workers
whose initial ability levels were lower.
They gained about the same in terms of increased speed
as higher-ability workers but gained much more in terms of quality
of their output.
The third finding is that workers’ initial reactions
to encountering these technologies seemed to be pretty positive.
People were more satisfied
with the work that they’d done and with the task
when given access to ChatGPT
compared to when they were asked to complete the task
without access to that technology.
Then finally, when we followed up with the workers
who’d completed the tasks in our experiment
and asked them about their activities in their real jobs
two weeks and two months after the experiment,
a substantial fraction of them,
30-40%, were using ChatGPT in their real jobs.
The percentage using it
was substantially higher in the treatment group
whom we’d introduced to ChatGPT compared to the control group
whom we had not introduced to ChatGPT,
suggesting that there are still frictions or barriers
to diffusion of the technology in real workplaces.
-Right. To sum up your first two findings,
essentially it makes higher-skilled workers quicker
and lower-skilled workers both better and quicker.
-Exactly, yes.
-The consequences for this are
potentially greater productivity.
What do you think the consequences are for employment and wages?
-This, I would say, depends on a number of factors
that we can’t directly address in the experiment itself.
In particular,
if we think about things on both the demand and the supply sides,
if there’s this increased productivity
and workers are able to produce more, then on the demand side,
there’s this question of,
are people going to demand these types of outputs more?
For example, if we think that the demand for writing,
or advertisements, or coding would substantially increase,
then this could suggest an increase in employment and wages.
Whereas if that demand is relatively inelastic, then perhaps
this would lead to a decrease in jobs overall.
Then just thinking about the supply as well.
One thing is that our tasks are around this very specific part of
what most people’s jobs are: doing this kind of writing.
Whereas a lot of tasks in the real world
are much more loose and require a lot more judgment and planning,
and these other factors.
Then just also thinking about if this is 5% of someone’s job,
for example,
maybe I can’t take that 5% out and then replace that person entirely.
This just means that they have 5% more time
or they’re going to fill that time with other work.
It also depends to what extent these tasks
are the main part of someone’s job
or just some small part that’s then complemented by other pieces.
-The subjects in your experiment
were basically doing writing tasks, weren’t they?
Which required them to synthesize information
and then produce a piece of text.
That’s correct, isn’t it?
-That’s right.
-What other skills do you see
this applying to?
Would it simply be that?
-I think it’s a potentially very flexible technology.
We’re only beginning to see the range of applications
that the generative AI systems can be applied to.
I think definitely writing tasks like the ones in our experiment,
in the sense that they’re relatively generic,
they’re not extremely technical,
are one main application for generative AI.
There’s also lots of other things.
It’s very good at writing code, for example.
It can be good at other things like summarizing documents,
synthesizing information,
maybe even as a cooperative tool for brainstorming or coming up with ideas.
It’s potentially a technology with a lot of applications.
-I would think your findings might be of concern to high-skilled workers
who are hearing what you say because essentially,
someone could employ a lower-skilled worker
who would presumably be cheaper
and get almost exactly the same results as a worker
who might potentially be more expensive.
-I think limited to the specific set of tasks,
I would say yes.
I think, in some sense, that’s what you would find here.
At the same time,
again,
I think a lot of the time these tasks are bundled together.
Potentially, when you’re paying for these high-skilled workers,
you’re also asking in some sense for their skills and judgment
or interpersonal communication
or their ability to work with others on a team.
These aren’t things that I think ChatGPT right now would be great at assisting with,
and I think they’re still important skills in the workplace.
Depending on what that bundle of skills looks like in the job,
I think that would ultimately predict what is going to happen.
Then I think the other thing, this relates back to
complementarity versus substitution more broadly,
is that here, these initial low performers are doing much better,
but they’re doing much better because they’re just taking this prompt,
putting it into GPT,
taking this output, and then submitting it, which is what we generally seem to be finding.
Then there’s also a question of,
well, do you even need these low-skilled workers in the first place?
Or can you just hire someone to do this prompting?
That would ultimately be more efficient.
In that case, I think it becomes even more unclear
what the impacts on inequality might look like.
-Did you look at all at the impact on inequality,
or was that something that still requires more research depending on
how the technology is rolled out?
-We looked at an important input into the ultimate effects on inequality,
which is by how much it raises the productivity of lower-ability
compared to higher-ability workers.
There’s a common argument
that inequality has increased a lot over the past several decades,
at least in part because technologies like computers
enhanced the productivity of high-ability workers
more than they enhanced the productivity of low-ability workers.
Which gave rise to divergence in wages and earnings.
In this case, what we found, again, limited to a very specific set of tasks
in a very specific context
was the opposite, which is that
the productivity impacts were greater for lower-ability workers.
A bunch of other factors will determine
what the ultimate equilibrium effects of
these technologies on unemployment will be,
like some of the demand side factors and supply side factors
that Whitney has talked about,
how training and human capital acquisition will adapt,
how companies will restructure their workforces
and their models of production and things like that.
It’s really impossible to tell from current information
what the ultimate impacts on inequality will be.
-Higher productivity is often associated
with reductions in poverty and increasing wages.
In that sense, it might actually reduce inequality.
-Absolutely.
Productivity has been stagnant for a long time,
even as we’ve gotten fantastic new things like the Internet.
I don’t think there’s a consensus on why, for example, the Internet
hasn’t resulted in higher productivity growth than we’re currently seeing.
It suggests that things like
bottlenecks from other parts of the production system
or other factors might be important.
I wouldn’t be super confident from the fact
that generative AI raises the productivity of white-collar workers specifically,
that this will translate into the kinds of
broad-based productivity gains for the economy that would translate into
increases in living standards and reductions in poverty.
I think it’s too early to tell.
-I wanted to pick you up on one particular sentence in your paper,
which said that the experiment’s evidence suggests
that it largely substitutes for worker effort
rather than complements worker skills,
potentially causing a decrease
in demand for workers with adverse distributional effects
as capital owners gain at the expense of workers.
By which I think you mean that the people who have the money and the capital
will gain a higher proportion of the benefits.
Would that be correct?
-That’s right.
-Should action be taken to shape the impact,
be that through legislation or taxation or something like that?
-Right. I think this goes back to
what we were just talking about in terms of
what are these workers actually doing in our experiment,
and it’s that they’re taking these prompts,
they’re putting them into ChatGPT, and then they’re submitting these outputs.
It seems plausible that,
if that’s all the value that they’re adding,
then you could have someone where that’s their job.
You don’t need all of these workers to be doing that kind of task.
Who then takes all of the profits
that come out of doing this kind of output generation?
Then it’s the people who own this type of capital in the first place.
-Do you think it is also possible to shape the impact
and the way that this is applied through other kinds of institutions,
for example,
stronger worker representation, greater voice in the shaping of processes,
unions, things like that?
-Absolutely.
I think this is an area where there’s been a lot of speculation
because there’s a very large amount of anecdotal
or qualitative evidence about the ways
that unions or other forms of worker representation
are involved in shaping the way that technologies affect firms
and shaping how firms adjust their employment structures
in response to new technologies.
I would say there’s still very little solid evidence
on the interaction
between systems of worker representation and technology adoption.
You can tell optimistic stories
where worker representatives protect workers
and encourage firms to upskill them,
thereby ensuring that technology gets rolled out in a way
that really benefits everyone,
benefits the consumers through productivity gains,
but also benefits workers
through internal productivity gains and wage gains.
There’s also potentially stories
where worker representatives block the deployment
and development of new technologies
in a way that’s ultimately harmful for society
because they’re representing relatively narrow interests.
I would say there’s, as of yet, no good empirical evidence
that would help us adjudicate between those two possibilities.
That definitely seems like a really important area of research going forward.
-Also, talking about those who might benefit,
I was struck by a point you made in the paper
that it could increase options for people
who potentially are very creative or who have a lot of good ideas,
but simply have a problem expressing themselves
in ways that others can understand and digest.
For example, people who have problems with writing skills,
either because they’re very technical
or simply because perhaps they’re writing in a second or third language.
ChatGPT could help them literally get their message across.
-That’s right.
I think anecdotally, I’ve certainly heard from people
for whom English is not their first language that,
for example, ChatGPT has been really helpful
in helping them with academic writing or otherwise,
where before this was a bottleneck for them.
Now they’re much more able to communicate their research or other areas.
I think this is another way in which
GPT technologies could be a sort of leveling
of the playing field in that
people who might be really great with creativity or thinking of ideas,
but just have difficulty writing them in a way
that sounds professional
or academic in style can now use the technology
to make their writing sound more like that style.
I think this could be potentially great
for both these workers
as well as just for innovation as a whole, potentially.
-I was also struck by one of the other main conclusions
that you mentioned,
which was about the more subjective outcomes,
for example job satisfaction:
a lot of your subjects in the experiment
found using ChatGPT gave them equal or greater job satisfaction.
Whereas traditionally in the past, people have felt that
forms of automation have reduced job satisfaction.
Do you know why they felt that the job was more satisfying
and whether that feeling lasted
or whether it was simply a hangover from the experiment?
-Right. To some extent, we definitely think that
this reflects people’s first reactions to an exciting new technology
rather than necessarily something fundamental and permanent about the way
that the technology affects their enjoyment of their work.
When we followed up with people two weeks and two months after the experiment,
the people who were in the treatment group
were much more likely to be using ChatGPT in their real jobs,
but they weren’t more satisfied
with their real jobs than people in the control group.
Now, that doesn’t necessarily mean
that ChatGPT won’t raise job satisfaction in the long run.
It could be that it’s just too small of an impact on a two-week
or two-month time horizon
relative to your base prior level of satisfaction in your job.
We don’t find evidence
that it’s permanently improving job satisfaction.
As to why it raised job satisfaction in our experiment,
one possibility,
again, is that people are just really excited
to discover a powerful new technology.
Another possibility is just that
ChatGPT helps with many elements of tasks that might be the least enjoyable.
Editing,
translating your ideas into a rough draft.
The tasks that we gave people were not terribly exciting or stimulating.
People may in general have found
writing an e-mail
or writing a short report to be a boring task.
In that case, anything that helps you finish that task more quickly
might raise your job satisfaction.
You could tell a story where basically,
ChatGPT is speeding people up
or helping them expend less effort precisely in the areas
that they least enjoy working and in the types of tasks
that they least enjoy doing.
Again, I would say there’s no conclusive evidence
on that from our experiment.
-In that case, research always raises as many questions
as it answers.
Where do you want to go next with this?
What do you see as the key questions that have been raised by
what you’ve found so far?
-I think the next big thing is really to try to move into the field
into a setting where the tasks are much less well defined,
where we’re requiring much more of workers
to use a broad bundle of skills,
where there’s many of these organizational-type concerns
that we aren’t able to address in an individual-level experiment,
to see what the productivity impacts on workers look like there.
I think there’s just also interesting work to be done in looking at
how organizations are figuring out how to adapt to these workflows.
Shakked and I have been looking into some more qualitative stuff
where it seems like companies are really
seeing this as an exciting tool, but not quite sure of
how do I actually get my workforce to start using this?
What are some areas
where this might really improve productivity in my organization?
What are good use cases and what are bad use cases?
Of course, there’s all sorts of regulatory
and legal things as well that further complicate things.
I think just having a broader look at the ecosystem as a whole
is the exciting next direction.
-Yes. In other words, the difference between what happens in the lab
and what happens in real life.
Shakked and Whitney, thank you very much for joining us.
That was extremely interesting.
Another look at the effects of generative AI on the world of work.
Unfortunately, that is all we have time for
in this edition of the podcast.
Once again, thanks to Shakked Noy
and Whitney Zhang for joining us and talking about their paper.
If you want to know more about it,
you can find a link to it on the webpage for this podcast on the ILO’s website.
For now, thank you very much
for listening and please join us again soon.
For now, goodbye.
[music]