Paper link: https://arxiv.org/abs/2501.12948

By: https://x.com/vinodcodes

<aside> 🔖 We’ll try to understand the DeepSeek paper at a high level, learning some terminology along the way. PS: This is my first blog, so if you spot any misinterpretations, please let me know; my Twitter profile is linked above. I hope to improve as I write more blogs like this. So yeah, let’s get into it. 💪🏻

</aside>

This aged well, didn't it?

image.png

Let’s set the context first:

Overview of DeepSeek-R1-Zero:

image.png

We’ll now look at what the abstract says.

Introduction: