A new AI system is being developed to detect possible signs of cognitive decline in doctors’ notes about their patients. (Image credit: Getty Images)
The first signs of cognitive decline often show up not as a formal diagnosis but as subtle hints buried in clinicians’ notes.
A new study, published Jan. 7 in the journal npj Digital Medicine, suggests that artificial intelligence (AI) could help pinpoint these early signals, such as trouble with thinking and memory or changes in behavior, by scanning physicians’ notes for concerning patterns. Those patterns might include repeated mentions of a patient’s mental changes or confusion, as well as worries raised by family members who attend appointments with the person they care for.
“The intent isn’t to replace clinical judgment but to act as an assistive screening tool,” study co-author Dr. Lidia Moura, an associate professor of neurology at Massachusetts General Hospital, told Live Science. By flagging such patients, she said, the system could help clinicians decide who deserves a follow-up evaluation, especially in settings where specialists are scarce.
Whether that kind of screening actually helps patients depends on how it is used, said Julia Adler-Milstein, a health informatician at the University of California, San Francisco, who was not involved in the research. “If the flags are accurate, go to the right member of the care team, and are actionable, meaning they trigger a clear next step, then yes, they can fit well into clinic workflows,” she told Live Science in an email.
A team of AI agents, not just one
To build their new AI system, the researchers used what they describe as an “agentic” approach. The term refers to a coordinated set of AI programs, five in this case, each with a specific job while also checking one another’s output. Together, the cooperating agents iteratively refined how the system interpreted clinical notes, without human intervention.
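The paper describes this setup only at a high level, but the pattern can be sketched in a few lines of Python. The sketch below is an illustrative assumption, not the authors’ code: the agent roles, the prompts and the `llama_complete` helper are invented, and the helper is a placeholder for whatever Llama 3.1 inference client a reader might use.

```python
# Minimal sketch of an "agentic" labeling loop: role-specific agents label a
# clinical note, and a reviewer agent critiques the draft label until they
# agree. All names and prompts are illustrative, not the study's code.

from dataclasses import dataclass

def llama_complete(prompt: str) -> str:
    """Placeholder for a Llama 3.1 call; swap in your own inference client."""
    raise NotImplementedError

@dataclass
class Agent:
    role: str          # e.g. "extractor", "classifier", "reviewer"
    instructions: str  # role-specific system prompt

    def run(self, text: str) -> str:
        # One model call per step, with this agent's instructions prepended.
        return llama_complete(f"{self.instructions}\n\n{text}")

def label_note(note: str, max_rounds: int = 3) -> str:
    """Label one note 'concern' or 'no concern', letting a reviewer agent
    push back on the classifier's draft."""
    extractor = Agent("extractor",
        "List phrases in this note suggesting memory problems, confusion, "
        "or behavioral change, one per line. If none, say NONE.")
    classifier = Agent("classifier",
        "Given this evidence, answer exactly 'concern' or 'no concern', "
        "then give a one-line rationale.")
    reviewer = Agent("reviewer",
        "Check the label against the evidence. Answer 'agree' or "
        "'disagree: <reason>'.")

    evidence = extractor.run(note)
    label = classifier.run(evidence)
    for _ in range(max_rounds):
        verdict = reviewer.run(f"EVIDENCE:\n{evidence}\n\nLABEL:\n{label}")
        if verdict.strip().lower().startswith("agree"):
            break
        # Feed the objection back so the classifier can revise its answer.
        label = classifier.run(f"{evidence}\n\nOBJECTION:\n{verdict}")
    return label
```

The key design choice is that no agent’s answer is final until another agent has reviewed it, which is what lets such a system refine its own output without a human in the loop.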
The researchers built the system on top of Meta’s Llama 3.1 and fed it a set of physicians’ notes spanning three years, including clinic visits, progress notes and discharge summaries. The records came from a hospital database and had already been reviewed by clinicians, who noted in each patient’s file whether cognitive concerns were present.
The team first showed the AI a balanced mix of patient records, with equal numbers flagged for cognitive concerns and without, so the system could learn from its mistakes as it tried to match the labels clinicians had already assigned. By the end of this process, the system’s judgments agreed with the clinicians’ about 91% of the time.
The finished system was then tested on a separate subset of data it had never seen, drawn from the same three-year collection. This second set was meant to mimic real-world care, so only about one-third of the records had been flagged by clinicians as showing cognitive concern.
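As a rough illustration of that setup (the record counts and seed below are invented, not taken from the paper), a balanced development set and a prevalence-realistic test set could be drawn like this:

```python
# Sketch of the two data splits described above: a 50/50 balanced set for
# development and a realistic test set in which about one-third of records
# are flagged. All counts here are illustrative assumptions.

import random

random.seed(0)

# Pretend records: (record_id, clinician_flagged_concern).
flagged = [(i, True) for i in range(400)]
unflagged = [(i, False) for i in range(400, 1200)]

# Balanced development set: equal numbers with and without concerns.
dev = random.sample(flagged, 200) + random.sample(unflagged, 200)

# Held-out test set mimicking real care: roughly one-third flagged.
dev_ids = {rid for rid, _ in dev}
rest_flagged = [r for r in flagged if r[0] not in dev_ids]
rest_unflagged = [r for r in unflagged if r[0] not in dev_ids]
test = random.sample(rest_flagged, 100) + random.sample(rest_unflagged, 200)

print(sum(flag for _, flag in test) / len(test))  # ~0.33
```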
In that test, the system’s sensitivity dropped to about 62%, meaning it missed nearly four out of every 10 cases in which clinicians had flagged records as showing signs of cognitive decline.
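Sensitivity here means the share of clinician-flagged records that the system also flags. A quick check with made-up counts (not the study’s data) shows how 62% sensitivity translates into missing nearly four in 10 flagged cases:

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of clinician-flagged (positive) records the system caught."""
    return true_pos / (true_pos + false_neg)

# Suppose 300 of 900 test records were clinician-flagged (about one-third)
# and the system correctly flagged 186 of them. Illustrative numbers only.
tp = 186
fn = 300 - tp

s = sensitivity(tp, fn)
print(f"sensitivity = {s:.2f}")           # 0.62
print(f"missed positives = {1 - s:.0%}")  # 38%, nearly 4 in 10
```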
At first, the drop in performance looked like a failure, until the researchers examined the health records that the AI and the human reviewers had classified differently.
Medical experts adjudicated these cases by rereading the records directly, without knowing whether a given label came from the clinicians or the AI. In 44% of the cases, these reviewers ultimately sided with the system’s judgment over the physicians’ original chart review.
“That was one of the more surprising findings,” said study co-author Hossein Estiri, an associate professor of neurology at Massachusetts General Hospital.
In many of those cases, he explained, the AI applied the clinical definitions more conservatively than the physicians had, declining to flag concerns when notes didn’t explicitly describe memory problems, confusion or other changes in the patient’s thinking, even if a diagnosis of cognitive decline appeared elsewhere in the record. In essence, the AI was willing to surface mentions of possible cognitive concerns that doctors might not routinely mark as significant in the moment.
The results highlight the limits of doctors’ manual chart review, Moura noted. “When the signs are obvious, everyone catches them,” she said. “It’s when they’re subtle that people and computers diverge.”
Karin Verspoor, a researcher who studies AI and health technologies at RMIT University and was not involved in the study, pointed out that the system was evaluated on a carefully curated set of physicians’ notes that had been vetted by clinicians. Because the data came from a single hospital network, she cautioned, its accuracy may not carry over to settings with different documentation practices.
The system can only see what the documentation records, she noted, and that limitation can be addressed only by refining the system across a variety of clinical settings, she argued.
For now, Estiri said, the system is meant to run quietly in the background of routine doctor visits, surfacing potential concerns along with an explanation of how it reached them. It is not yet used in real-world clinical practice.
“The intent is not for physicians to be actively using AI tools,” he said, “but for the system to provide insight, the observations we’re making and the reasons behind them, as part of the clinical documentation.”
Article Sources
Tian, J., Fard, P., Cagan, C. et al. An autonomous agentic workflow for clinical detection of cognitive concerns using large language models. npj Digit. Med. 9, 51 (2026). https://doi.org/10.1038/s41746-025-02324-4

Anirban Mukhopadhyay, Live Science Contributor
Anirban Mukhopadhyay is a freelance science journalist. He holds a doctorate in genetics and a master’s degree in computational biology and drug design. He writes regularly for The Hindu and has contributed to The Wire Science, making complex biomedical research accessible to general readers. Outside science journalism, he enjoys writing and reading fiction that weaves mythology, memory and melancholy into dreamlike stories about grief, identity and the quiet beauty of self-discovery. In his free time, he enjoys long walks with his dog and motorcycle trips through the Himalayas.