Autoregressive foundation models in machine learning.

Have “capabilities” to understand natural language.

Exhibit emergent behaviour that resembles intelligence, but this is probably not AGI: the observer-expectancy effect leads us to read more intelligence into the output than is there.

One way or another, this is a form of behaviourism, through reinforcement learning. The model is “told” what is good or bad, and thus acts accordingly towards its users. However, this induces confirmation bias, where one's own prejudices towards the problem get aligned into, and contained in, the model.


Incredibly hard to scale, mainly due to their large memory footprint and the memory that has to be allocated per token (the KV cache).
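
To make the memory footprint concrete, here is a rough back-of-the-envelope sketch. The formula is the usual one for a standard transformer decoder (one key and one value tensor per layer); the model shape below (32 layers, 32 KV heads, head dimension 128, 7B parameters, fp16) is an illustrative assumption, not a measurement of any particular model.

```python
def weight_bytes(n_params, bytes_per_param=2):
    """Memory needed just to hold the weights (fp16 = 2 bytes per parameter)."""
    return n_params * bytes_per_param


def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, batch_size,
                   bytes_per_elem=2):
    """Memory for the KV cache: 2 tensors (key and value) per layer,
    each holding n_kv_heads * head_dim elements per token per sequence."""
    return (2 * n_layers * n_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_elem)


GIB = 1024 ** 3

# Hypothetical 7B-class configuration, chosen only for illustration.
weights = weight_bytes(7_000_000_000)
cache = kv_cache_bytes(n_layers=32, n_kv_heads=32, head_dim=128,
                       seq_len=4096, batch_size=8)

print(f"weights:  {weights / GIB:.1f} GiB")
print(f"kv cache: {cache / GIB:.1f} GiB")
```

Under these assumptions the cache for a handful of long sequences is already in the same range as the weights themselves, which is why per-token allocation matters so much when serving many users.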


See also: this talk

  • Quantization: reduce the computational and memory costs of running inference by representing the weights and activations with low-precision data types
  • Continuous batching: implementing Paged Attention with a custom scheduler to manage swapping the KV cache for better resource utilisation
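
The quantization idea above can be sketched in a few lines. This is a minimal symmetric, per-tensor "absmax" int8 scheme in plain Python, assumed here for illustration; real inference stacks typically use per-channel or per-group scales and fused kernels. The function names are hypothetical.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # Avoid division by zero for an all-zero (or empty) tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate floats from the int8 values and the scale."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.0, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
print(q, scale)
print(restored)
```

Each weight now needs one byte plus a shared scale instead of two or four bytes, at the cost of a rounding error of at most half a scale step per value.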

on how we are being taught.

How would we assess thinking?

Similar to a calculator, it simplifies the task and increases accessibility for the masses, but in doing so loses the value in the act of doing math.

We do math to internalize the concepts, and to practice thinking coherently. Similarly, we write to help crystallise our ideas, and in the process improve them through the act of putting them down.

The process of rephrasing and arranging sentences poses a challenge for the writer, and in doing so teaches you how to think coherently. Writing essays is an exercise for students to articulate their thoughts, rather than a test of their understanding of the materials.

on ethics

See also Alignment.

There are ethical concerns with the act of “hallucinating” content; alignment research is therefore crucial to ensure that the model is not producing harmful content.

as philosophical tool.

By creating better representations of the world for both humans and machines to understand, we can truly have assistive tools that enhance our understanding of the world around us.

Imagine Nietzsche, Kant, and Camus coexisting in the same room.

AI generated content

Don’t shit where you eat. Garbage in, garbage out. The quality of the content is highly dependent on the quality of the data the model was trained on; in other words, models are incredibly sensitive to data variances and biases.

Bland doublespeak

See also: All the better to see you with and this tweet


See this and this. This only occurs if you only need a “good-enough” item, where the value of the result outweighs the process.

However, one should always consider putting in the work, rather than being “ok” with good enough. In the process of working through a problem, one learns about the bottlenecks and the problems to be solved, and in turn gains invaluable experience that would not be achieved by relying fully on interaction with the models alone.

These models are incredibly useful for summarization and information gathering. With RAG or any other CoT tooling, you can augment the model with your own sources and improve search efficiency by quite a lot.
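
As a rough sketch of the retrieval half of RAG: rank a few notes against a question and paste the best match into the prompt. Everything here is an illustrative assumption, including the toy bag-of-words cosine similarity (real setups usually use learned embeddings and a vector store), the `NOTES` list, and the function names. The call to the model itself is left out.

```python
import math
import re
from collections import Counter

# Hypothetical mini-corpus standing in for your own documents.
NOTES = [
    "Quantization reduces memory by using low-precision weights.",
    "Paged attention manages the KV cache in blocks.",
    "Essays help students articulate their thoughts.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query, documents, k=1):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(documents,
                    key=lambda d: cosine(q, Counter(tokenize(d))),
                    reverse=True)
    return ranked[:k]


def build_prompt(query, documents, k=1):
    """Put the retrieved context in front of the question."""
    context = "\n".join(f"- {d}" for d in retrieve(query, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


print(build_prompt("How does paged attention manage the KV cache?", NOTES))
```

The point of the design is that the model only has to read and summarize the retrieved text, rather than recall it from its weights.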

notable mentions:


Overall should be a net positive, but it’s a double-edged sword.

as end-users


I think it’s likely that soon all computer users will have the ability to develop small software tools from scratch, and to describe modifications they’d like made to software they’re already using

as developers

Tools that lower the barrier to entry are always a good thing, but they will probably lead to even greater discrepancies in the quality of software.

Increased productivity, but also increased technical debt, as generated code is mostly “bad” code, and we often have to nudge the model and do a lot of prompt engineering.