Discussion about this post

User's avatar
Daniel Popescu / ⧉ Pluralisk's avatar

This piece really made me think. The sheer data scale for pre-training is wild like learning complex new cycling routes. Brilliant insight into GPT!

Expand full comment

No posts

Ready for more?