Why You Shouldn’t Start an AI PhD Now


1. Research direction is largely decided by industry

I pursued three different research directions during my PhD, and two of them were heavily inspired by papers from industry — my Prompting Whisper work was a follow-up to OpenAI’s Whisper, and my VoiceCraft model was inspired by Microsoft’s VALL-E.

In fact, industry is the one that decides how the game is played, because they have far more compute and data. I’ve heard it more than once: someone at school works on a task for three months, finally gets it to work, and then a big tech company releases a much larger model trained on vastly more data that achieves state-of-the-art results on ten tasks — including that very task the poor PhD student has been working on. It’s tough to accept that something you’ve spent months improving by 5% is suddenly overtaken by an industrial model that improves it by 30%.

Academia barely stands a chance when competing with industry in building state-of-the-art AI models. If you’re only interested in chasing leaderboards, very few academic labs will have the resources to support your work.

Perhaps more crushingly, some of the most impactful work in modern AI wasn’t done by people with PhDs. For example, Alec Radford — the creator of GPT, CLIP, and Whisper — does not have a PhD.

2. Poverty

Doing a PhD never makes financial sense, even in the hottest field — AI.

Right now, entry-level AI research scientists at big tech companies typically make between $300k and $700k a year. Let’s take the midpoint, $500k. It sounds like a lot for a new grad, but it takes at least five years to complete a PhD before you can even interview for such a job. During those five years, you make $20–40k a year, which is barely enough to get by.

In another world, if you hadn’t gone for a PhD five years ago and had instead started as a software engineer, working just as hard as you would in a PhD, you would be a senior or staff engineer at a big tech company by now, with median salaries around $500k and $770k, respectively. Plus, during those five years, you’d have earned far more as a junior and mid-level software engineer than as a PhD student.
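To make the arithmetic concrete, here is a rough sketch that uses only the figures quoted above: a five-year PhD, a $20–40k yearly stipend, and the $300–700k research scientist range. The function names are made up for illustration, and it ignores taxes, raises, and equity, so treat it as a back-of-the-envelope check rather than a real financial model.

```python
# Back-of-the-envelope numbers taken from the text; no other data is assumed.
PHD_YEARS = 5
STIPEND_RANGE = (20_000, 40_000)       # PhD stipend per year (low, high)
SCIENTIST_RANGE = (300_000, 700_000)   # entry-level research scientist pay per year


def midpoint(low, high):
    """Midpoint of a salary range."""
    return (low + high) / 2


def total_stipend(years, stipend_range):
    """Total stipend over the whole PhD, as a (low, high) pair."""
    low, high = stipend_range
    return years * low, years * high


def months_to_match(amount, annual_salary):
    """Months of work at annual_salary needed to earn amount (pre-tax)."""
    return 12 * amount / annual_salary


low_total, high_total = total_stipend(PHD_YEARS, STIPEND_RANGE)
scientist_mid = midpoint(*SCIENTIST_RANGE)

print(f"Five-year stipend total: ${low_total:,} to ${high_total:,}")
print(f"Research scientist midpoint: ${scientist_mid:,.0f} per year")
print(
    "Months at the midpoint salary to match the best-case stipend total: "
    f"{months_to_match(high_total, scientist_mid):.1f}"
)
```

In other words, even the best-case stipend over the entire PhD adds up to less than half a year of pay at the midpoint salary.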

I still remember how it felt to arrive at UT Austin for my PhD in September 2021 with only $2,000 in my bank account. My rent was $1,100 a month, and my PhD stipend after tax was $2,000 a month. It wasn’t fun at all knowing that every time I got a direct deposit, more than half of it went straight to rent.

3. Struggling

Almost every PhD student goes through periods of depression. The situation is especially tough for AI PhDs right now because the field is so crowded — almost any idea you can think of is either already published or will be published next month by someone else.

Doing impactful research requires a mindset completely opposite to the state of AI right now. You need to spend endless hours surveying the literature, finding a unique angle to tackle a problem, adapting or building codebases, running experiments, and analyzing why things don’t work. During this time (often 3–8 months), there will be weeks or even months when you feel you’re not making any progress — and in the frantic pace of AI, where new models are released daily, you’ll feel even worse.

Unlike in industry, where you usually work in a team, most PhD projects are done alone. Yes, you’ll have collaborators who provide feedback during meetings, but you’re the one actually writing the code and running the experiments. Often, something fails not because the idea is wrong but because the implementation is slightly off, and in those cases you’re the only one who can fix it.

If you have read this far, please check out my article on why this is the best time to do a PhD in AI.


Puyuan Peng

