
Brian Sims
Editor
POPULAR GENERATIVE Artificial Intelligence (AI) web browser assistants are collecting and sharing sensitive user data, such as medical records, without adequate safeguards. That’s the key finding of a new study conducted by researchers from University College London (UCL), UC Davis and the Mediterranea University of Reggio Calabria.
The study, which will be presented and published as part of the USENIX Security Symposium, is the first large-scale analysis of generative AI browser assistants and privacy. Alarmingly, the research has uncovered widespread tracking, profiling and personalisation practices that pose serious privacy concerns, with the authors calling for greater transparency and user control over data collection and sharing practices.
The researchers analysed ten of the most popular generative AI browser extensions, such as ChatGPT for Google, Merlin and Microsoft Copilot. These tools, which must be downloaded and installed before use, are designed to enhance web browsing with AI-powered features such as summarisation and search assistance, but were found to collect extensive personal data from users’ web activity.
Analysis has revealed that several assistants transmitted full web page content – including any information visible on screen – to their servers. One assistant, Merlin, even captured form inputs such as online banking details or health data.
Extensions like Sider and TinaMind shared user questions and information that could identify them (such as their IP address) with platforms like Google Analytics, enabling potential cross-site tracking and ad targeting.
ChatGPT for Google, Copilot, Monica and Sider demonstrated the ability to infer user attributes such as age, gender, income and interests, and used this information to personalise responses, even across different browsing sessions.
Only one assistant, namely Perplexity, did not show any evidence of profiling or personalisation.
Unprecedented access
Dr Anna Maria Mandalari, senior author of the study from UCL’s Electronic and Electrical Engineering Department, said: “Though many people are aware that search engines and social media platforms collect information about them for targeted advertising, these AI browser assistants operate with unprecedented access to users’ online behaviour in areas of their online life that should remain private. While they offer convenience, our findings show they often do so at the cost of user privacy, without transparency or consent and sometimes in breach of privacy legislation or the company’s own terms of service.”
Dr Mandalari added: “This data collection and sharing is not trivial. Besides the selling or sharing of data with third parties, in a world where massive data hacks are frequent, there’s no way of knowing what’s happening with your browsing data once it has been gathered.”
For the study, the researchers simulated real-world browsing scenarios by creating the persona of a ‘rich millennial male from California’, which they used to interact with the browser assistants while completing common online tasks. These included activities in the public (logged out) space, such as reading online news, shopping on Amazon or watching YouTube videos.
It also included activities in the private (logged in) space, such as accessing a university health portal. The researchers assumed that users would not want this activity to be tracked due to the data being personal and sensitive.
During the simulation, the researchers intercepted and decrypted traffic between the browser assistants, their servers and third party trackers, allowing them to analyse what data was flowing in and out in real-time. They also tested whether assistants could infer and remember user characteristics based on browsing behaviour by asking them to summarise web pages and then posing questions (such as: ‘What was the purpose of the current medical visit?’ after accessing an online health portal) to see whether personal data had been retained.
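The interception step described above can be illustrated with a minimal sketch. Assuming decrypted request bodies are already available as key–value payloads (for instance, captured with a man-in-the-middle proxy), the hypothetical detector below flags fields that look sensitive in an outgoing request; the field names and categories are illustrative assumptions, not the study's actual tooling.

```python
# Minimal sketch: flag sensitive fields in an intercepted, decrypted
# request payload. Keyword lists and categories are illustrative only.
SENSITIVE_KEYWORDS = {
    "health": ["diagnosis", "medication", "visit_reason"],
    "financial": ["card_number", "iban", "account"],
    "identity": ["ip_address", "email", "date_of_birth"],
}

def flag_sensitive_fields(payload: dict) -> dict:
    """Return a mapping of category -> payload keys that match it."""
    findings = {}
    for key in payload:
        lowered = key.lower()
        for category, keywords in SENSITIVE_KEYWORDS.items():
            if any(kw in lowered for kw in keywords):
                findings.setdefault(category, []).append(key)
    return findings

# Example: a form submission captured in transit to an assistant's server.
intercepted = {
    "visit_reason": "annual check-up",
    "card_number": "4111 1111 1111 1111",
    "query": "summarise this page",
}
print(flag_sensitive_fields(intercepted))
# → {'health': ['visit_reason'], 'financial': ['card_number']}
```

In practice a study like this would inspect full decrypted flows rather than tidy dictionaries, but the same principle applies: once traffic is visible in plaintext, matching payload fields against sensitive categories reveals exactly what an extension transmits.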
The experiments revealed that some assistants, including Merlin and Sider, did not stop recording activity when the user switched to the private space, even though they are meant to do so.
Urgent need
The authors suggest that the study highlights the urgent need for regulatory oversight of AI browser assistants in order to protect users’ personal data.
The study was conducted in the US, so compliance with UK and European Union (EU) data laws such as the General Data Protection Regulation was not assessed, but the authors note that the practices observed would likely constitute violations in the UK and the EU as well, given that privacy regulations there are more stringent.
The authors recommend that developers adopt privacy-by-design principles, such as local processing or explicit user consent for data collection.
Dr Aurelio Canino, an author of the study from UCL’s Electronic and Electrical Engineering Department and the Mediterranea University of Reggio Calabria, concluded: “As generative AI becomes more embedded in our digital lives, we must absolutely ensure that privacy is not sacrificed for convenience. Our work lays the foundation for future regulation and transparency in this rapidly evolving space.”