See also: essays on ChatGPT

introduction.

To understand how AI is fundamentally political, we need to go beyond neural nets and statistical pattern recognition to instead ask what is being optimized, and for whom, and who gets to decide. Then we can trace the implications of those choices. -- Kate Crawford, The Atlas of AI

1979’s “Star Trek: The Motion Picture” centered on its antagonist, V’Ger, an artificial entity that had outgrown its original programming and sought to annihilate Earth. The movie is, at its core, fiction, yet its relevance to our current state of affairs is uncanny. Much in artificial intelligence (AI) has changed since the 1960s, including a shift from symbolic systems to the more recent hype around deep connectionist networks. AI has expanded rapidly as an academic field and as an industry1. Yet the belief that human intelligence can be formalised and reproduced by machines has always been at the core of disputes in the history of AI. There have always been two narratives among academics and industry practitioners on how we should approach such systems: the likes of Marvin Minsky claiming that “machines can think” (Crawford, 2021, pp. 5–9), while Dreyfus (Dreyfus, 2008) believed that a Heideggerian AI system would dissolve the frame problem2. Nowadays, this narrative has morphed into two verticals: entities that seek to build systems that outperform humans at given tasks with greater accuracy and efficiency (OpenAI, Anthropic, SSI, and many other AI labs3), and companies that build AI systems to amplify our ability to create and to improve the efficiency of our work (Runway, Cohere, etc.).

This literature review aims to provide a comprehensive overview of the current state of AI through its history and current adoption. It also investigates concerns around diversity, equity, and inclusion (DEI) within the field, as well as the ethical implications of AI systems, before concluding with questions about where we go from here.

growth.

Mathematicians wish to treat matters of perception mathematically, and make themselves ridiculous [...] the mind [...] does it tacitly, naturally, and without technical rules. -- Pascal, Pensées

The inception of AI might well be traced to the belief that a total formalisation of knowledge must be possible4. The lineage runs from Plato’s separation of the rational soul from the body, with its skills and intuition5, through Leibniz’s conception of binary systems as a “universal characteristic” (Leibniz, 1951, pp. 15, 25, 38), to Babbage’s design of the Analytical Engine, recognized as the first design for a digital computer. Alan Turing then posited that a high-speed digital computer, programmed with rules, might exhibit emergent intelligent behaviour (Turing, 1950). Thus a paradigm was born among researchers focused on symbolic reasoning, referred to as Good Old-Fashioned AI (GOFAI) (Haugeland, 1997). GOFAI was built on high-level symbolic representations of the world, popularized through expert systems (Jackson, 1998) that tried to mimic human experts at specialized tasks6. Yet there followed a period of “AI winter” in which most symbolic AI research either reached dead ends or saw its funding dry up (Hendler, 2008), largely because GOFAI’s semantic representations were implausible to scale to generalized tasks.

Concurrently, the Parallel Distributed Processing (PDP) group led by Rumelhart and McClelland (Rumelhart et al., 1986) investigated variations of Rosenblatt’s perceptron (Rosenblatt, 1958), proposing intermediate processors within the network (often known as “hidden layers”) alongside the inputs and outputs, so that the network could extrapolate appropriate responses from what it had learned during training. These systems, built on statistical methods7 and connectionist networks, are what Haugeland referred to as New-Fangled AI (NFAI) (Haugeland, 1997).
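
To make the “hidden layer” idea concrete, here is a minimal sketch of a forward pass through such a network in NumPy; the layer sizes and random weights are made up for illustration, not drawn from the PDP work:

```python
import numpy as np

def sigmoid(z):
    # Squashing non-linearity of the kind used by early connectionist models.
    return 1.0 / (1.0 + np.exp(-z))

# Arbitrary sizes for illustration: 3 inputs, 4 hidden units, 2 outputs.
rng = np.random.default_rng(0)
W_hidden = rng.normal(size=(3, 4))  # input -> hidden weights
W_output = rng.normal(size=(4, 2))  # hidden -> output weights

x = np.array([0.5, -1.0, 2.0])      # a single input vector

# The "intermediate processors": a hidden representation whose meaning
# lives in the whole pattern of activation, not in any single unit.
h = sigmoid(x @ W_hidden)
y = sigmoid(h @ W_output)           # the network's response
print(y)
```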

In retrospect, GOFAI systems are deterministic in the sense that intentionality is injected into symbolic tokens through explicit programming. Connectionist networks, on the other hand, are often considered black-box models, given the hidden nature of their intermediate representations: unlike GOFAI, a network’s internal representation is determined by the state of the entire network rather than any single unit. With Moore’s Law and the exponentially growing amounts of compute and data available, we are currently witnessing the dominance of connectionist networks, especially with the injection of LLMs into the mainstream (Kaplan et al., 2020), where the majority of research focuses on developing artificial neural networks that optimize a loss function (Vaswani et al., 2023; Srivastava et al., 2014). One notable example that combines GOFAI and NFAI is AlphaZero, a connectionist game-playing system that uses a deep neural network to assess new positions and Monte Carlo Tree Search (a GOFAI algorithm) to determine its next move (Silver et al., 2017).
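
As a rough illustration of what “optimizing a loss function” means in practice, the toy sketch below fits a single linear unit by gradient descent on mean squared error; the data and learning rate are invented for the example, and real systems differ in scale rather than in kind:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))                 # toy inputs
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=100)   # noisy targets

w = np.zeros(3)   # weights to be learned
lr = 0.1          # learning rate

for _ in range(200):
    y_hat = X @ w
    loss = np.mean((y_hat - y) ** 2)          # mean squared error loss
    grad = 2.0 * X.T @ (y_hat - y) / len(y)   # gradient of the loss w.r.t. w
    w -= lr * grad                            # step against the gradient

print(w)  # approaches true_w as the loss shrinks
```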

adoption.

For context, we produce a lot of data: social media consumption, email transactions, searches, online shopping, owing largely to the rise of the internet and Web 2.0 post-9/11. While capitalism has always been a fraught system, there are now strong incentives to harvest our attention and predict our future behaviour, what Zuboff refers to as “surveillance capitalism” (Carr, 2019). In a sense, surveillance capitalism is built on an extraction imperative: the Googles and Facebooks of the world must mine as much information as possible8. Machine learning benefits from this phenomenon, since statistical methods learn patterns from given data and yield predictions and decisions. ML is commonly divided into two sub-fields: supervised learning (where algorithms are trained on labelled data to make predictions against those labels) and unsupervised learning (where algorithms must discover structure in unlabelled data)9.
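
The distinction can be sketched in a few lines; the scikit-learn models and toy data below are placeholders chosen for familiarity, not methods endorsed by any cited work:

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(2)
X = rng.normal(size=(200, 2))             # toy feature vectors
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # toy labels

# Supervised: learn a mapping from features to the given labels.
clf = LogisticRegression().fit(X, y)
print(clf.predict(X[:5]))

# Unsupervised: no labels at all; discover structure in the data.
km = KMeans(n_clusters=2, n_init=10).fit(X)
print(km.labels_[:5])
```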

Supervised learning methods, including naive Bayes, decision trees, and related statistical models, have been well integrated into industry to solve forecasting and classification problems (Wu et al., 2020).
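
A minimal sketch of that workflow, using synthetic data in place of a labelled dataset (e.g. windows of a price series tagged by trend, loosely after Wu et al.’s setting):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for labelled data: two classes, eight features.
X, y = make_classification(n_samples=500, n_features=8, random_state=3)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=3)

# Fit two classical supervised models and report held-out accuracy.
for model in (GaussianNB(), DecisionTreeClassifier(max_depth=4)):
    model.fit(X_train, y_train)
    print(type(model).__name__, model.score(X_test, y_test))
```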

fairness.

See also: MIT Technology Review’s courtroom algorithm game (Hao et al., 2019), and the Dartmouth investigation of the COMPAS system (Dressel & Farid, 2018)

DEI has become a key aspect of technological progress in this century. This applies to AI, where the black-box nature of models has made it difficult for researchers to root out bias bugs. Two main approaches have emerged for addressing these problems: improving data diversity and ensuring fairness during the training procedure.

The primary method for fighting bias bugs in contemporary AI systems is increasing data diversity. There is a timeless saying in computer science, “garbage in, garbage out”: bad data will produce outputs of equally bad quality. This is especially pressing in AI, given the black-box nature of these networks. One case was an early iteration of Google Photos’ image recognition that identified people with darker skin as “gorillas” (BBC News, 2015). Alliances such as The Data & Trust Alliance, whose members include Meta, Nike, and CVS Health, have been formed to combat algorithmic bias; the alliance aims to confront the dangers of powerful algorithms in the workforce before they can cause harm, rather than simply reacting after the damage is done (Lohr, 2021). Clarke (2021) proposed that these models be closely inspected and regulated to mitigate misrepresentation of marginalized groups (Khan, 2022).

The truth is, data lacks context. A prime example is COMPAS, used by US courts to assess the likelihood that a defendant will reoffend. ProPublica concluded that COMPAS was inherently biased against those of African descent, finding that its false positive rate for those defendants was roughly twice that for white defendants (Angwin et al., 2016). Interestingly, a study done at Dartmouth showed that random volunteers, given the same information as the COMPAS algorithm, predicted recidivism with surprising accuracy (Dressel & Farid, 2018). The question remains: how do we ensure fairness and DEI for marginalized groups when prejudice and subjectivity evidently introduce bias at every step? It is not a problem we cannot solve; rather, we must collectively define what makes an algorithm fair.
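
The disparity ProPublica measured is a concrete, computable quantity. Below is a minimal sketch of that audit, with made-up predictions and group labels rather than the actual COMPAS data:

```python
import numpy as np

def false_positive_rate(y_true, y_pred):
    # Of the people who did NOT reoffend, what fraction were flagged high-risk?
    negatives = y_true == 0
    return float(np.mean(y_pred[negatives]))

# Hypothetical outcomes (1 = reoffended) and risk flags (1 = high risk),
# split across two demographic groups "a" and "b".
y_true = np.array([0, 0, 0, 0, 1, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 1, 1, 0, 1, 0, 0])
group = np.array(["a", "a", "a", "a", "a", "b", "b", "b", "b", "b"])

for g in ("a", "b"):
    mask = group == g
    print(g, false_positive_rate(y_true[mask], y_pred[mask]))
# A large gap between the two printed rates is exactly the kind of
# disparity ProPublica reported for COMPAS.
```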

Ackley, D. H., Hinton, G. E., & Sejnowski, T. J. (1985). A Learning Algorithm for Boltzmann Machines. Cognitive Science, 9(1), 147–169. https://doi.org/10.1207/s15516709cog0901_7
Angwin, J., Larson, J., Mattu, S., & Kirchner, L. (2016). How We Analyzed the COMPAS Recidivism Algorithm. ProPublica. https://www.propublica.org/article/how-we-analyzed-the-compas-recidivism-algorithm
Aristotle. (2009). Nicomachean Ethics (L. Brown, Ed.; W. D. Ross, Trans.). Oxford University Press.
BBC News. (2015). Google apologises for Photos app’s racist blunder. BBC News. https://www.bbc.com/news/technology-33347866
Carr, N. (2019). Thieves of Experience: How Google and Facebook Corrupted Capitalism. Los Angeles Review of Books. https://lareviewofbooks.org/article/thieves-of-experience-how-google-and-facebook-corrupted-capitalism/
Crawford, K. (2021). The Atlas of AI: Power, Politics, and the Planetary Costs of Artificial Intelligence. Yale University Press. http://www.jstor.org/stable/j.ctv1ghv45t
Dressel, J., & Farid, H. (2018). The accuracy, fairness, and limits of predicting recidivism. Science Advances, 4(1), eaao5580. https://doi.org/10.1126/sciadv.aao5580
Dreyfus, H. L. (1972). What Computers Can’t Do: A Critique of Artificial Reason (1st ed.). Harper & Row.
Dreyfus, H. L. (2008). Why Heideggerian AI Failed and How Fixing It Would Require Making It More Heideggerian. In The Mechanical Mind in History (pp. 331–362). MIT Press.
Hao, K., Kar, J., & Buolamwini, J. (2019). Can you make AI fairer than a judge? Play our courtroom algorithm game. MIT Technology Review. https://www.technologyreview.com/2019/10/17/75285/ai-fairer-than-judge-criminal-risk-assessment-algorithm/amp/
Haugeland, J. (1997). Mind Design II: Philosophy, Psychology, and Artificial Intelligence. The MIT Press. https://doi.org/10.7551/mitpress/4626.001.0001
Hendler, J. (2008). Avoiding Another AI Winter. IEEE Intelligent Systems, 23(2), 2–4. https://doi.org/10.1109/MIS.2008.20
Jackson, P. (1998). Introduction to Expert Systems (3rd ed., p. 542). Addison Wesley.
Jordan, M. I., & Mitchell, T. M. (2015). Machine learning: Trends, perspectives, and prospects. Science, 349(6245), 255–260.
Kaplan, J., McCandlish, S., Henighan, T., Brown, T. B., Chess, B., Child, R., Gray, S., Radford, A., Wu, J., & Amodei, D. (2020). Scaling Laws for Neural Language Models. https://arxiv.org/abs/2001.08361
Leibniz, G. W. (1951). Leibniz Selections (P. P. Wiener, Ed.; p. 606). Charles Scribner’s Sons.
McKinsey & Company. (2024). McKinsey technology trends outlook 2024. McKinsey Digital. https://www.mckinsey.com/capabilities/mckinsey-digital/our-insights/the-top-trends-in-tech
Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65(6), 386–408. https://doi.org/10.1037/h0042519
Rumelhart, D. E., McClelland, J. L., & the PDP Research Group. (1986). Parallel Distributed Processing, Volume 1: Explorations in the Microstructure of Cognition: Foundations. The MIT Press. https://doi.org/10.7551/mitpress/5236.001.0001
Silver, D., Hubert, T., Schrittwieser, J., Antonoglou, I., Lai, M., Guez, A., Lanctot, M., Sifre, L., Kumaran, D., Graepel, T., Lillicrap, T., Simonyan, K., & Hassabis, D. (2017). Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm. https://arxiv.org/abs/1712.01815
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: A Simple Way to Prevent Neural Networks from Overfitting. Journal of Machine Learning Research, 15(56), 1929–1958.
Turing, A. M. (1950). Computing machinery and intelligence. Mind, LIX(236), 433–460. https://doi.org/10.1093/mind/LIX.236.433
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., & Polosukhin, I. (2023). Attention Is All You Need. https://arxiv.org/abs/1706.03762
Wu, D., Wang, X., Su, J., Tang, B., & Wu, S. (2020). A Labeling Method for Financial Time Series Prediction Based on Trends. Entropy, 22(10). https://doi.org/10.3390/e22101162

Footnotes

  1. (Jordan & Mitchell, 2015) described the emerging trends within classical machine learning systems, focusing on recommendation systems. McKinsey’s 2024 technology trends outlook reported around 570 billion dollars of equity investment in the adoption of generative AI, notably the integration of LLMs into enterprise use cases (McKinsey & Company, 2024).

  2. An intelligent being learns from its experience, then applies that intuition to predict future events. How does one select the appropriate context (frame) for a given situation?
    Dreyfus’ argument is that machines are not yet able to represent humans’ reliance on many unconscious and subconscious processes (Dreyfus, 1972). A Heideggerian AI would exhibit Dasein (being-in-the-world).

  3. Their goal is to build “artificial superintelligence” (ASI) systems. This target is largely driven by a certain observer-expectancy effect around current AI systems.

  4. According to Plato, Socrates asked Euthyphro, a fellow Athenian who was about to turn in his own father for murder in the name of piety: “I want to know what is characteristic of piety which makes all actions pious. […] that I may have it to turn to, and to use as a standard whereby to judge your actions and those of other men.” This is Socrates’ version of what modern-day computer scientists would call an effective procedure.

  5. According to Plato, all knowledge must be universally applicable, with explicit definitions; in other words, intuition and feeling do not constitute knowing. Aristotle differed from Plato in holding that intuition is necessary for applying theory to practice (Aristotle, 2009, p. 8, book VI). For Plato, cooks, who proceed by taste and intuition, have no understanding because they have no knowledge; intuition is mere belief.

  6. Allen Newell and Herbert Simon’s work at RAND initially showed that computers can simulate important aspects of intelligence.

  7. Notable figures include John Hopfield and Geoffrey Hinton: “A Learning Algorithm for Boltzmann Machines” (Ackley et al., 1985) introduced the Boltzmann distribution into the training of neural networks, and Hinton later contributed to the popularization of the backpropagation algorithm.

  8. Some notable quotes:

    • “Unlike financial derivatives, which they in some ways resemble, these new data derivatives draw their value, parasite-like, from human experience.”
    • “[Facebook’s algorithm fine-tuning and data wrangling] is aimed at solving one problem: how and when to intervene in the state of play that is your daily life in order to modify your behavior and thus sharply increase the predictability of your actions now, soon, and later.”
  9. This is a simplification of the field; ML researchers also investigate other sub-fields, such as reinforcement learning.