References
- Adamic et al. (2005). The political blogosphere and the 2004 US election: divided they blog. Proceedings of the 3rd international workshop on Link discovery, pp. 36–43
- Alaboudi et al. (2021). An exploratory study of debugging episodes. arXiv preprint arXiv:2105.02162
- Argyle et al. (2023). Out of one, many: Using language models to simulate human samples. Political Analysis, vol. 31, no. 3, pp. 337–351
- Baria et al. (2021). The brain is a computer is a brain: neuroscience's internal debate and the social significance of the Computational Metaphor. arXiv preprint arXiv:2107.14042
- Beisel et al. (2002). [RETRACTED] Histone methylation by the Drosophila epigenetic transcriptional regulator Ash1. Nature, vol. 419, no. 6909, pp. 857–862
- Bender et al. (2020). Climbing towards NLU: On meaning, form, and understanding in the age of data. Proceedings of the 58th annual meeting of the association for computational linguistics, pp. 5185–5198
- Berglund et al. (2023). The reversal curse: LLMs trained on "A is B" fail to learn "B is A". arXiv preprint arXiv:2309.12288
- Brown et al. (2020). Language models are few-shot learners. Advances in neural information processing systems, vol. 33, pp. 1877–1901
- Cabanac et al. (2021). Tortured phrases: A dubious writing style emerging in science. Evidence of critical issues affecting established journals. arXiv preprint arXiv:2107.06751
- Callaham et al. (2002). Journal prestige, publication bias, and other characteristics associated with citation of published studies in peer-reviewed journals. JAMA, vol. 287, no. 21, pp. 2847–2850
- Castro Torres et al. (2022). North and South: Naming practices and the hidden dimension of global disparities in knowledge production. Proceedings of the National Academy of Sciences, vol. 119, no. 10, pp. e2119373119
- Chollet (2019). On the measure of intelligence. arXiv preprint arXiv:1911.01547
- Deshpande et al. (2023). Toxicity in ChatGPT: Analyzing persona-assigned language models. arXiv preprint arXiv:2304.05335
- Doshi et al. (2024). Generative AI enhances individual creativity but reduces the collective diversity of novel content. Science Advances, vol. 10, no. 28, pp. eadn5290
- Drori et al. (2022). A neural network solves, explains, and generates university math problems by program synthesis and few-shot learning at human level. Proceedings of the National Academy of Sciences, vol. 119, no. 32, pp. e2123433119
- Durmus et al. (2023). Towards measuring the representation of subjective global opinions in language models. arXiv preprint arXiv:2306.16388
- Dziri et al. (2024). Faith and fate: Limits of transformers on compositionality. Advances in Neural Information Processing Systems, vol. 36
- Garcia et al. (2024). Artificial intelligence–generated draft replies to patient inbox messages. JAMA Network Open, vol. 7, no. 3, pp. e243201
- Girotra et al. (2023). Ideas are dimes a dozen: Large language models for idea generation in innovation. Available at SSRN 4526071
- Ha et al. (2024). Organic or diffused: Can we distinguish human art from AI-generated images? Proceedings of the 2024 ACM SIGSAC Conference on Computer and Communications Security, pp. 4822–4836
- Hilbert et al. (2011). The world's technological capacity to store, communicate, and compute information. Science, vol. 332, no. 6025, pp. 60–65
- Karras et al. (2019). A style-based generator architecture for generative adversarial networks. arXiv preprint arXiv:1812.04948
- Kong et al. (2023). Better zero-shot reasoning with role-play prompting. arXiv preprint arXiv:2308.07702
- Kotek et al. (2023). Gender bias and stereotypes in large language models. Proceedings of the ACM collective intelligence conference, pp. 12–24
- Kumar et al. (2016). Ask me anything: Dynamic memory networks for natural language processing. International conference on machine learning, pp. 1378–1387
- Köpf et al. (2024). OpenAssistant Conversations: Democratizing large language model alignment. Advances in Neural Information Processing Systems, vol. 36
- Li et al. (2024). Artificial Intelligence awarded two Nobel Prizes for innovations that will shape the future of medicine. NPJ Digital Medicine, vol. 7, no. 1, pp. 336
- Lightman et al. (2023). Let’s verify step by step. arXiv preprint arXiv:2305.20050
- Bran et al. (2024). Augmenting large language models with chemistry tools. Nature Machine Intelligence, pp. 1–11
- Matter et al. (2024). Close to Human-Level Agreement: Tracing Journeys of Violent Speech in Incel Posts with GPT-4-Enhanced Annotations. arXiv preprint arXiv:2401.02001
- McAleese et al. (2024). LLM critics help catch LLM bugs. arXiv preprint arXiv:2407.00215
- Morris et al. (2023). Levels of AGI for operationalizing progress on the path to AGI. arXiv preprint arXiv:2311.02462
- Nasr et al. (2023). Scalable extraction of training data from (production) language models. arXiv preprint arXiv:2311.17035
- Park et al. (2023). Generative agents: Interactive simulacra of human behavior. Proceedings of the 36th annual ACM symposium on user interface software and technology, pp. 1–22
- Perkins et al. (2024). GenAI detection tools, adversarial techniques and implications for inclusivity in higher education. arXiv preprint arXiv:2403.19148
- Porter et al. (2024). AI-generated poetry is indistinguishable from human-written poetry and is rated more favorably. Scientific Reports, vol. 14, no. 1, pp. 26133
- Radford et al. (2018). Improving language understanding by generative pre-training.
- Sharma et al. (2023). Towards understanding sycophancy in language models. arXiv preprint arXiv:2310.13548
- Sharma et al. (2024). Facilitating self-guided mental health interventions through human-language model interaction: A case study of cognitive restructuring. Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, pp. 1–29
- Si et al. (2024). Can LLMs generate novel research ideas? A large-scale human study with 100+ NLP researchers. arXiv preprint arXiv:2409.04109
- Sidorkin (2024). Embracing chatbots in higher education: the use of artificial intelligence in teaching, administration, and scholarship.
- Sivak et al. (2019). Parents mention sons more often than daughters on social media. Proceedings of the National Academy of Sciences, vol. 116, no. 6, pp. 2039–2041
- Stribling et al. (2005). Rooter: A methodology for the typical unification of access points and redundancy.
- Kobak et al. (2024). Delving into ChatGPT usage in academic writing through excess vocabulary. arXiv preprint arXiv:2406.07016
- Villalobos et al. (2024). Will we run out of data? Limits of LLM scaling based on human-generated data. arXiv preprint arXiv:2211.04325
- Vaswani et al. (2017). Attention is all you need. Advances in neural information processing systems, vol. 30
- Wei et al. (2022). Chain-of-thought prompting elicits reasoning in large language models. Advances in neural information processing systems, vol. 35, pp. 24824–24837
- West et al. (2023). The generative AI paradox: "What it can create, it may not understand". The Twelfth International Conference on Learning Representations
- Wu et al. (2023). Reasoning or reciting? Exploring the capabilities and limitations of language models through counterfactual tasks. arXiv preprint arXiv:2307.02477
- Wu et al. (2024). [RETRACTED] Assessment of the efficacy of alkaline water in conjunction with conventional medication for the treatment of chronic gouty arthritis: A randomized controlled study. Medicine, vol. 103, no. 14, pp. e37589
- Wuttke et al. (2024). AI Conversational Interviewing: Transforming Surveys with LLMs as Adaptive Interviewers. arXiv preprint arXiv:2410.01824
- Yin et al. (2024). Should We Respect LLMs? A Cross-Lingual Study on the Influence of Prompt Politeness on LLM Performance. arXiv preprint arXiv:2402.14531
- Zech et al. (2018). Confounding variables can degrade generalization performance of radiological deep learning models. arXiv preprint arXiv:1807.00431
- Zhang et al. (2024). [RETRACTED] The three-dimensional porous mesh structure of Cu-based metal-organic-framework-aramid cellulose separator enhances the electrochemical performance of lithium metal anode batteries. Surfaces and Interfaces, vol. 46, pp. 104081
- Zheng et al. (2023). Is "A Helpful Assistant" the Best Role for Large Language Models? A Systematic Evaluation of Social Roles in System Prompts. arXiv preprint arXiv:2311.10054