One small bit of nit-picking on your use of this analogy:
"The process is somewhat akin to an accelerated form of biological evolution"
IMO, the process of training a model should be compared to raising a child from embryo to young adult - the point at which formal training typically ends. That said, there is evolution at the level of the techniques we use at training & inference time (transformer architectures, expanded context, reasoning models, etc.).
Hey Jason - I really appreciate you taking the time to leave this feedback.
You’re absolutely right that the “accelerated biological evolution” analogy wasn’t the right fit here. Evolution is partly driven by random mutations, whereas backprop + gradient descent make very targeted, intentional updates.
I played around with a few alternative analogies, but none struck the balance of accuracy and conciseness I’m aiming for. In the end, I decided it was cleaner to remove the analogy entirely and rephrase that section a bit.
I hope you enjoyed the post overall and would welcome any future feedback you may have!
100%. In my experience, the people dismissing AI with this reasoning are either (1) too personally incentivized to see it fail (e.g. developers), and/or (2) basing their assessment on anecdotal experiences with shitty chat bots on websites and assuming those are 1:1 with frontier models. The other thing that surprises me about this line of rejection is the arrogance it requires in assuming we know how our own brains actually work. It also suggests they have a clear definition of "what is intelligence and how do we measure it," which would be quite impressive if true :)