-
Mashup Score: 4
When Two LLMs Debate, Both Think They'll Win - 4 day(s) ago
Can LLMs accurately adjust their confidence when facing opposition? Building on previous studies measuring calibration on static fact-based question-answering tasks, we evaluate Large Language…
Source: arXiv.org
Categories: General Medicine News
-
Mashup Score: 4
Medical vision-language models often struggle with generating accurate quantitative measurements in radiology reports, leading to hallucinations that undermine clinical reliability. We introduce…
Source: arXiv.org
Categories: General Medicine News
-
Mashup Score: 1
We investigate the effectiveness of large language models (LLMs), including reasoning-based and non-reasoning models, in performing zero-shot financial sentiment analysis. Using the Financial…
Source: arXiv.org
Categories: General Medicine News
-
Mashup Score: 4
Google Scholar is manipulatable - 8 day(s) ago
Citations are widely considered in scientists’ evaluation. As such, scientists may be incentivized to inflate their citation counts. While previous literature has examined self-citations and…
Source: arXiv.org
Categories: General Medicine News
-
Mashup Score: 0
How much do language models memorize? - 11 day(s) ago
We propose a new method for estimating how much a model “knows” about a datapoint and use it to measure the capacity of modern language models. Prior studies of language model memorization have…
Source: arXiv.org
Categories: General Medicine News
-
Mashup Score: 3
We investigate the potential implications of large language models (LLMs), such as Generative Pre-trained Transformers (GPTs), on the U.S. labor market, focusing on the increased capabilities…
Source: arXiv.org
Categories: General Medicine News
-
Mashup Score: 15
Why Academics Are Leaving Twitter for Bluesky - 12 day(s) ago
We analyse the migration of 300,000 academic users from Twitter/X to Bluesky between 2023 and early 2025, combining rich bibliometric data, longitudinal social-media activity, and a novel…
Source: arXiv.org
Categories: General Medicine News
-
Mashup Score: 20
Today’s AI systems have human-designed, fixed architectures and cannot autonomously and continuously improve themselves. The advance of AI could itself be automated. If done safely, that would…
Source: arXiv.org
Categories: General Medicine News
-
Mashup Score: 2
Unlocking Non-Invasive Brain-to-Text - 17 day(s) ago
Despite major advances in surgical brain-to-text (B2T), i.e. transcribing speech from invasive brain recordings, non-invasive alternatives have yet to surpass even chance on standard metrics. This…
Source: arXiv.org
Categories: General Medicine News
-
Mashup Score: 27
Contemplative Wisdom for Superalignment - 17 day(s) ago
As artificial intelligence (AI) improves, traditional alignment strategies may falter in the face of unpredictable self-improvement, hidden subgoals, and the sheer complexity of intelligent…
Source: arXiv.org
Categories: General Medicine News
When Two LLMs Debate, Both Think They'll Win https://t.co/Q3nGJ0QjtB via @PradyuPrasad et al https://t.co/uy4yiOQY21