Why You Should Start an AI PhD Now
1. You are pushing the boundary of this industrial revolution
You are pushing the boundary of this industrial revolution, and you are the one deciding which direction to push it. If you have big ambitions, there is nothing bigger than this. Yes, with just two GPUs in your school lab you won’t build the next LLM that tops every leaderboard. But you could design a new architecture that runs faster and uses fewer resources, analyze why those “PhD-level” LLMs still fail at basic reasoning tasks like counting, or even invent an entirely new paradigm.
When Hinton worked on neural nets in the 1980s, most AI researchers were focused on symbolic reasoning and expert systems, and neural nets were widely dismissed as impractical. When LeCun developed convolutional neural nets in the 1990s, the community was focused on support vector machines and other statistical methods; people even called convolutional nets “convoluted” neural nets to mock their complexity. When GPT-3 came out in 2020, I had a job interview where the interviewer told me that OpenAI was only pushing GPT for branding purposes, because everyone else was using bidirectional Transformers like BERT, which performed much better than causal Transformers like GPT.
You might not be the author of a revolutionary paper like GPT-3, but that paper cited 146 others — you could be the author of one of them. Each of those 146 papers also cited hundreds more, and your name might appear somewhere down the chain. Most scientific progress is incremental: even if your work isn’t the one that changes everything, it could be the building block that makes the final breakthrough possible.
With all these tech companies dropping new models every week, it’s easy to get the impression that research only happens in industry. But that’s completely wrong: we in industry constantly read academic papers, and sometimes realize that people in academia have already found smarter ways to solve the same problems. Academia isn’t falling behind; in fact, many of the best ideas originate there and are later scaled up and turned into products by industry.
2. Honing your all-round skills in building AI
During your PhD, you own your project end-to-end, which means you touch almost every part of an AI system — data, infrastructure, modeling, evaluation, and often even building demos and promoting your work. You become deeply familiar with every component that makes AI systems work.
Owning a project end-to-end might not sound like a big deal, but it’s actually rare in industry. In most companies, you work in a team where each member focuses on a small piece of the system. Because of this, someone who has hands-on experience with the full pipeline becomes much more valuable than someone who has only specialized in one area.
When I started my job search, I watched videos about how to get a job in big tech as a new grad, and was frustrated that the most important thing seemed to be solving LeetCode problems. My advisor pointed out that job hunting for CS undergrads and AI PhDs is very different. As a PhD, you already have a multi-year track record of publications and open-source contributions. If your record shows that your skills align with what a hiring team needs, LeetCode shouldn’t matter as much. During my own job search, I failed many coding interviews because I didn’t prepare for LeetCode, yet half of the companies I interviewed with still made me offers because my research stood out.
3. Suffering builds mental strength
To outsiders, AI might look like a glamorous field — every day there are new models, new products, new funding rounds. But doing a PhD in AI often means long periods of quiet thinking, working alone, and struggling with problems that don’t seem to move forward.
The nature of scientific research is that 80% of your ideas won’t work. Of the remaining 20%, half will fail due to implementation errors. When things don’t work, you’re usually the only one who can fix them, because you’re the only person who knows every detail of your project. You might wish everyone else were struggling too, but in such a fast-moving field, you’ll constantly see people posting shiny new results on social media — which makes the struggle even harder.
The life of an AI PhD often looks like this: wake up, figure out why yesterday’s idea didn’t work, modify it, code, debug, and think — all on your own. What keeps you going? You need to genuinely love your project, enjoy coding, and be addicted to problem-solving.
When you embark on a journey knowing that nine out of ten attempts will fail, and do it anyway, you are a true adventurer who values the process more than the outcome. And if you can persist like that for five years and succeed in the end, the rest of your career will feel easier by comparison.
The End
You’re not just building models — you’re building understanding. A PhD might turn you into a celebrated researcher whose every paper shakes the field, but more often, it makes you a quiet thinker — someone who works in silence, guided by the belief that what you’re doing matters, and patiently trying out every idea that might bring it to life.
Whether or not anyone notices, the world moves a little further because you tried.