Single Sign-On (SSO) with OAuth 2.0 and OpenID Connect 1.0 is essential for user authentication and authorization on the Internet. Billions of users rely on SSO services provided by Google, Facebook, and Apple. For large-scale measurements on the security of SSO, researchers need to reliably detect SSO implementations in the wild.
In this paper, we survey the current state of 36 SSO measurement tools from prior work and discover gaps leading to blind spots in the SSO landscape that hinder the community from improving large-scale research. We observe unreliable measurements and a lack of reproducibility, making comparisons between studies difficult, if not impossible. We fill these gaps with SSO-Monitor, our open-source, modular, and highly extensible framework for large-scale SSO landscape and security measurements. SSO-Monitor achieves a high accuracy of 93% and, compared to previous tools, significantly improves the reliability of SSO measurements by 19%. It continuously takes snapshots of SSO implementations on the top 1M websites to compose an SSO-Archive that is reproducible by design. Therefore, it passively monitors the SSO flows and provides an extensive set of landscape and security insights on sso-monitor.me. Our SSO-Archive allows researchers to perform comprehensive measurements over time and even beyond the scope of SSO.
We use the data in our SSO-Archive to measure the security of 89k SSO authentication flows on the top 1M websites. Thereby, we discover 33k violations of OAuth Security Best Current Practices and 339 severe security vulnerabilities. They include 30 username and password leaks and 28 token leaks that allow full account takeovers.
This website provides access to the SSO-Archive and the SSO-Monitor tool. The SSO-Archive is a central collection and long-term storage of Single Sign-On logins that we recorded on millions of websites. SSO-Monitor is the tool that generates the data that is fed into our SSO-Archive. You can think of our SSO-Archive as the Tranco Top Sites Ranking and the Internet Archive's Wayback Machine but applied to Single Sign-On.
The SSO-Archive page lists all archived SSO login recordings in a paginated table. You can filter and search the SSO-Archive, for example, to select only domains with SSO, domains of a specific scan, or all domains matching a custom database query for more fine-grained filtering.
The Statistics page computes several statistics on our SSO-Archive. You can load the latest scan, a specific scan, or a ground truth to show aggregated statistics of the data, including the number of SSO buttons, login pages, and the position of SSO buttons on the browser canvas.
The Tranco+SSO List lets you download our SSO-Archive as JSON files. Since our API uses pagination and does not provide all data at once, we provide a large JSON file holding all SSO login recordings instead. The file allows you to apply your own parsing and queries for individual filtering.
Single Sign-On (SSO for short) is a user authentication method that involves two parties: an Identity Provider (IdP) and a Service Provider (SP). IdPs, such as Google, Facebook, and Apple, manage and verify user identities. SPs are the applications or websites that require user authentication. With SSO, users can log in to multiple SPs by authenticating through a single IdP, streamlining the login process and reducing the need to remember multiple credentials. When a user attempts to log in on a website, the SP redirects the user to the chosen IdP. The IdP verifies the user's identity and, if successful, sends an authentication token back to the SP. The SP uses the authentication token to sign in the user. A simple example of SSO is the "Sign in with Facebook" feature on Pinterest. This feature allows users to log in to Pinterest with their Facebook accounts.
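To make the redirect from the SP to the IdP more concrete, the following is a minimal sketch of how an SP might build an OAuth 2.0 authorization request for the authorization code flow. The endpoint, client ID, and redirect URI are hypothetical placeholders, and real IdPs may require additional parameters.

```python
import secrets
from urllib.parse import urlencode, urlsplit, parse_qs


def build_authorization_request(authorization_endpoint, client_id, redirect_uri, scope="openid"):
    """Return (url, state) for an OAuth 2.0 authorization code request."""
    # Random value that lets the SP tie the IdP's response to this browser session.
    state = secrets.token_urlsafe(16)
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": scope,
        "state": state,
    }
    return f"{authorization_endpoint}?{urlencode(params)}", state


# Hypothetical values for illustration only.
url, state = build_authorization_request(
    "https://idp.example/authorize",
    "example-client-id",
    "https://sp.example/callback",
)
```

The SP would redirect the browser to `url` and later compare the `state` value returned by the IdP with the one it stored.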
SSO-Monitor is an open-source tool that continuously iterates over millions of websites to monitor the SSO landscape across the web. To do so, it regularly visits websites, determines their login pages, and checks whether these pages support SSO. Currently, it can detect SSO logins with 10 of the most commonly used IdPs, such as Google, Facebook, Apple, Twitter, LinkedIn, and GitHub.
The SSO-Archive is our central collection and long-term storage of artifacts and data, including every component of this website that lets you explore or download our archived data. SSO-Monitor is the tool that generates the data that is fed into the SSO-Archive. You can think of our SSO-Archive as the Tranco Top Sites Ranking and the Internet Archive's Wayback Machine but for SSO research.
SSO-Monitor executes the following steps to determine login pages of websites:
Test common paths like shop.com/login or subdomains like login.shop.com
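Probing paths such as shop.com/login and subdomains such as login.shop.com can be sketched as a simple candidate generator. The path and subdomain lists below are assumptions chosen for illustration, not the lists that SSO-Monitor actually uses.

```python
# Hypothetical example lists; the real tool may use different or longer lists.
COMMON_PATHS = ["/login", "/signin", "/account/login"]
COMMON_SUBDOMAINS = ["login", "auth", "accounts"]


def login_page_candidates(domain):
    """Return candidate login page URLs for a domain such as 'shop.com'."""
    candidates = [f"https://{domain}{path}" for path in COMMON_PATHS]
    candidates += [f"https://{sub}.{domain}/" for sub in COMMON_SUBDOMAINS]
    return candidates
```

Each candidate would still have to be visited to check that it is reachable and actually shows a login page.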
SSO-Monitor executes the following steps to detect SSO buttons on login pages:
Search for keywords like Sign in with Google or more generic terms like google in the DOM and accessibility tree
Hook the navigator.credentials browser API to detect if a website requests a PasswordCredential for password authentication, a FederatedCredential or IdentityCredential for SSO authentication, or a PublicKeyCredential for WebAuthn or passkey authentication
SSO-Monitor currently detects the following IdPs: Apple, Facebook, Google, Twitter, LinkedIn, Microsoft, Baidu, GitHub, QQ, Sina Weibo, and WeChat. Further IdPs can be easily integrated by extending the regular expressions detecting the login requests that are issued to the IdPs and the IdP logos.
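As a rough illustration of detecting IdP login requests with regular expressions, the sketch below maps request URLs to IdP names. The Google pattern is derived from the example login request shown in the data overview on this page; the second entry is a hypothetical IdP, and the actual patterns in SSO-Monitor may look different.

```python
import re

# Assumed patterns for illustration; extend this mapping to support further IdPs.
IDP_LOGIN_REQUEST_PATTERNS = {
    "GOOGLE": re.compile(r"^https://accounts\.google\.com/o/oauth2/(v2/)?auth\?"),
    "EXAMPLE_IDP": re.compile(r"^https://idp\.example/authorize\?"),  # hypothetical IdP
}


def match_idp(request_url):
    """Return the name of the IdP whose pattern matches the URL, or None."""
    for idp_name, pattern in IDP_LOGIN_REQUEST_PATTERNS.items():
        if pattern.search(request_url):
            return idp_name
    return None
```

Adding an IdP then amounts to adding one entry to the mapping.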
Yes, we already detect username and password logins by integrating the LastPass password manager.
Password managers already use sophisticated algorithms to find username and password fields.
These algorithms go beyond checking the type attributes of <input> fields.
LastPass is the most downloaded password manager with over 10 million users in the Chrome Web Store and has been extensively studied in academic research.
The extension injects a uniquely identifiable icon into all username and password fields, allowing us to identify all fields.
We have limited support for WebAuthn and passkey detection.
SSO-Monitor hooks the navigator.credentials browser API to detect if a website requests a PublicKeyCredential for WebAuthn or passkey authentication.
Thereby, it can detect if a website starts the WebAuthn or passkey authentication.
However, many websites require some form of user interaction before starting WebAuthn or passkey authentication.
For instance, websites require users to submit their usernames before the authentication process is started.
Currently, we do not explicitly scan for WebAuthn or passkey authentication, but we plan to add full support for WebAuthn and passkey detection in the coming months.
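Once the credential types requested through navigator.credentials have been recorded, they can be mapped to authentication methods. The sketch below assumes the recorded requests are available as a plain list of credential type names, which is a hypothetical input format and not necessarily how the archive stores them.

```python
# Mapping from credential type to authentication method, as described above.
CREDENTIAL_TYPE_TO_METHOD = {
    "PasswordCredential": "password",
    "FederatedCredential": "sso",
    "IdentityCredential": "sso",
    "PublicKeyCredential": "webauthn_or_passkey",
}


def classify_credential_requests(requested_types):
    """Return the sorted, de-duplicated authentication methods for the given types."""
    methods = {
        CREDENTIAL_TYPE_TO_METHOD[name]
        for name in requested_types
        if name in CREDENTIAL_TYPE_TO_METHOD  # ignore unknown type names
    }
    return sorted(methods)
```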
We provide a wide range of data on this website. In fact, this website is built on our API, so everything you can see on this website is fetched from our API. We further provide a Tranco+SSO list in JSON format that you can download. In the following, you can see a commented overview of some of the data that we provide.
[
{
"domain": "shop.com", // domain in the tranco list
"rank": 1337, // ranking of the domain in the tranco list
"task_id": "1f8083d1-6d38-4de3-98f3-b1d2987a6555", // unique id of the snapshot task
"task_timestamp_response_received": 1689367754.3676727, // timestamp of the snapshot task
"resolved": {...}, // see "Resolved" section
"timings": {...}, // see "Timings" section
"login_page_candidates": [...], // see "Login Pages" section
"recognized_idps": [...], // see "Idps" section
"recognized_lastpass_icons": [...], // see "Lastpass Icons" section
"recognized_navcreds": [...], // see "Navigator Credentials" section
"metadata_available": {...}, // see "Metadata" section
"metadata_data": {...} // see "Metadata" section
},
...
]
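A minimal sketch of working with the downloaded list: it loads the records and selects the domains for which at least one IdP was recognized. The file name is a hypothetical placeholder, and the sketch assumes the downloaded file is plain JSON without the explanatory comments shown above.

```python
import json


def load_records(path):
    """Load the list of snapshot records from a downloaded JSON file."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def domains_with_sso(records):
    """Return (rank, domain) pairs, sorted by rank, for records with recognized IdPs."""
    hits = [(r["rank"], r["domain"]) for r in records if r.get("recognized_idps")]
    return sorted(hits)


# Usage (hypothetical file name):
# records = load_records("tranco_sso.json")
# print(domains_with_sso(records)[:10])
```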
{
"reachable": true, // whether the domain is reachable (no dns error, no timeout, valid status code, ...)
"url": "https://www.shop.com/index.html", // fully resolved url, i.e., https://shop.com redirects to https://www.shop.com/index.html
"domain": "www.shop.com", // domain of the fully resolved url
"error_msg": "...", // short reason why not reachable
"error": "..." // detailed reason why not reachable
}
{
"resolve_duration_seconds": 13.37, // time in seconds to resolve the domain
"login_page_detection_paths_duration_seconds": 13.37, // time in seconds to test common paths and subdomains for login pages
"login_page_detection_crawling_duration_seconds": 13.37, // time in seconds to crawl the homepage for login pages
"login_page_detection_metasearch_duration_seconds": 13.37, // time in seconds to query the metasearch engine for login pages
"login_page_detection_sitemap_duration_seconds": 13.37, // time in seconds to analyze the sitemap for login pages
"login_page_detection_homepage_duration_seconds": 13.37, // time in seconds to add the homepage to the set of login pages
"login_page_detection_duration_seconds": 13.37, // total time in seconds to determine the login pages
"login_page_analysis_duration_seconds": 13.37, // total time in seconds to analyze the login pages, i.e., for password field detection
"sso_button_detection_duration_seconds": 13.37, // total time in seconds to detect the sso buttons
"sdk_detection_duration_seconds": 13.37, // total time in seconds to determine sso sdks
"total_duration_seconds": 13.37 // total analysis time in seconds
}
[
{
"login_page_candidate": "https://www.shop.com/login", // url of the login page
"login_page_strategy": "PATHS|CRAWLING|METASEARCH|SITEMAP|HOMEPAGE|ROBOTS|MANUAL", // strategy used to find this login page
"login_page_locator_mode": "ANCHOR|ELEMENT", // only for CRAWLING, ANCHOR is <a> element, ELEMENT is any element
"login_page_priority": { // priority of this login page
"regex": "/(show|users?|web|sso)*[_\-\s]*(log|sign)[_\-\s]*(in|up|on)(/.*|\?.*|\#.*|\s*)$", // highest prioritized regex matching this login page
"priority": 99 // priority of regex / login page
},
"resolved": { // see "Resolved" section
"reachable": true, // whether the login page is reachable
"url": "https://www.shop.com/login.html", // fully resolved url of login page
"domain": "www.shop.com", // domain of fully resolved url
"title": "Login | Shop.com" // title of login page
},
"content_type": "text/html", // content type of the login page
"content_analyzable": { // whether the login page is analyzable (valid status code, content type, etc.)
"valid": true, // whether the login page is analyzable
"error": "..." // reason why not valid
},
"login_page_candidate_screenshot": { // reference to screenshot of login page candidate
"type": "reference",
"data": {
"bucket_name": "...",
"object_name": "...",
"extension": "png" // screenshot is png file
}
},
"login_page_info": { // only for some strategies
"x": 13.37, // only for CRAWLING, x coordinate of element that leads to login page
"y": 13.37, // only for CRAWLING, y coordinate of element that leads to login page
"width": 13.37, // only for CRAWLING, width of element that leads to login page
"height": 13.37, // only for CRAWLING, height of element that leads to login page
"inner_text": "Login", // only for CRAWLING, inner text of element that leads to login page
"outer_html": "<a href=\"/login\">Login</a>", // only for CRAWLING, outer html of element that leads to login page
"href_attribute": "/login", // only for CRAWLING+ANCHOR, href attribute of <a%gt; element that leads to login page
"href_absolute": "https://www.shop.com/login", // only for CRAWLING+ANCHOR, absolute href of <a%gt; element that leads to login page
"login_page_frame": "TOPMOST|POPUP", // only for CRAWLING+ELEMENT, clicking the element may open the login page in a new popup or overwrite the topmost window
"element_tree": ["SPAN", "A", ..., "BODY", "HTML"], // only for CRAWLING+ELEMENT, element tree of element that leads to login page
"result_hit": 1, // only for METASEARCH, index of search result that leads to login page
"result_engines": ["GOOGLE", "BING", "YAHOO"], // only for METASEARCH, engines that returned this search result
"result_raw": { // only for METASEARCH, raw result from SearXNG API, see https://docs.searxng.org/
"url": "https://www.shop.com/login",
"title": "Login | Shop.com",
"content": "...",
"engine": "bing",
"parsed_url": ["https", "www.shop.com", "/login", "", "", ""],
"template": "default.html",
"engines": ["google", "bing", "yahoo"],
"positions": [2, 1, 2],
"score": 6,
"category": "general",
"pretty_url": "https://www.shop.com/login",
"open_group": true
},
"change_frequency": "monthly", // only for SITEMAP, see https://www.sitemaps.org/protocol.html
"last_modified": "2005-01-01", // only for SITEMAP, see https://www.sitemaps.org/protocol.html
"news_story": {...}, // only for SITEMAP, see https://developers.google.com/search/docs/crawling-indexing/sitemaps/news-sitemap
"priority": 0.5 // only for SITEMAP, see https://www.sitemaps.org/protocol.html
}
},
...
]
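The priority regex shown in the login page record can be tried out directly. The sketch below uses the example pattern from the "regex" field with JSON escaping removed; whether the tool matches case-sensitively or against the full URL is an assumption here.

```python
import re

# Example pattern copied from the "login_page_priority" record above.
LOGIN_PATH_REGEX = re.compile(
    r"/(show|users?|web|sso)*[_\-\s]*(log|sign)[_\-\s]*(in|up|on)(/.*|\?.*|\#.*|\s*)$"
)


def looks_like_login_url(url):
    """Return True if the example priority regex matches somewhere in the URL."""
    return LOGIN_PATH_REGEX.search(url) is not None
```

For instance, `looks_like_login_url("https://www.shop.com/login")` matches, while a URL ending in `/about` does not.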
[
{
"recognition_strategy": "KEYWORD|LOGO", // strategy used to detect this sso button
"element_coordinates_x": 13.37, // coordinates and dimensions of the element
"element_coordinates_y": 13.37,
"element_width": 13.37,
"element_height": 13.37,
"element_validity": "HIGH|LOW", // only KEYWORD, high ^= "Sign in with Apple", low ^= "apple"
"element_inner_text": "Continue with Google", // text content of the element
"element_outer_html": "<button>....<button>", // html markup of the element
"element_tree": ["BUTTON", ..., "BODY", "HTML"], // list of tag names from the element to the root
"element_tree_markup": { // reference to file containing entire html markup of the element tree
"type": "reference",
"data": {
"bucket_name": "...",
"object_name": "...",
"extension": "json"
}
},
"login_page_url": "https://www.shop.com/login", // url of the login page that contains this sso button
"idp_name": "GOOGLE", // name of the idp
"idp_integration": "CUSTOM|SIGN_IN_WITH_APPLE|GOOGLE_ONE_TAP|FACEBOOK_LOGIN|...", // api or sdk integration
"idp_frame": "TOPMOST|POPUP|IFRAME", // frame in which the idp is called
"idp_login_request": "https://accounts.google.com/o/oauth2/v2/auth?client_id=...", // url of the login request
"idp_har": { // reference to HAR file containing http traffic of sso flow, see https://en.wikipedia.org/wiki/HAR_(file_format)
"type": "reference",
"data": {
"bucket_name": "...",
"object_name": "...",
"extension": "har"
}
},
"idp_screenshot": { // reference to screenshot of the idp
"type": "reference",
"data": {
"bucket_name": "...",
"object_name": "...",
"extension": "png"
}
},
"keyword_recognition_candidates": 3, // only KEYWORD, number of candidates that were found with the keywords
"keyword_recognition_hit_number_clicks": 1, // only KEYWORD, number of clicks on the candidates until sso was started
"keyword_recognition_hit_keyword": "Continue with Google", // only KEYWORD, keyword in the element that started sso
"keyword_recognition_duration_seconds": 13.37, // only KEYWORD, total time between page load and sso start
"keyword_recognition_locator_mode": "CSS|XPATH|ACCESSIBILITY", // only KEYWORD, locator mode that was used to find the element
"keyword_recognition_screenshot": { // only KEYWORD, reference to screenshot of the element that started sso
"type": "reference",
"data": {
"bucket_name": "...",
"object_name": "...",
"extension": "png"
}
},
"logo_recognition_candidates": 3, // only LOGO, number of candidates that were found with the logo
"logo_recognition_hit_number_clicks": 1, // only LOGO, number of clicks on the candidates until sso was started
"logo_recognition_duration_seconds": 13.37, // only LOGO, total time between page load and sso start
"logo_recognition_matching_score": 0.80, // only LOGO, score of the best matching logo (0-1)
"logo_recognition_pattern_matching_duration_seconds": 13.37, // only LOGO, time it took to match the logo on the screenshot
"logo_recognition_pattern_checking_duration_seconds": 13.37, // only LOGO, time it took to verify if clicking the logo starts sso
"logo_recognition_screenshot_scale": 1, // only LOGO, scale of the screenshot (0-1, 1 = not scaled)
"logo_recognition_template_filename": "apple.png", // only LOGO, filename of the logo template
"logo_recognition_template_scale": 0.12, // only LOGO, scale of the logo template (0-1, 1 = not scaled)
"logo_recognition_screenshot": { // only LOGO, reference to screenshot of the element that started sso
"type": "reference",
"data": {
"bucket_name": "...",
"object_name": "...",
"extension": "png"
}
}
},
...
]
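The "idp_login_request" field can be parsed to inspect the parameters an SP sends to the IdP. The following sketch extracts a few standard OAuth 2.0 parameters; which parameters matter for a given analysis is left open, and the summary keys are names chosen for this example.

```python
from urllib.parse import urlsplit, parse_qs


def summarize_login_request(login_request_url):
    """Extract selected OAuth 2.0 parameters from an IdP login request URL."""
    parts = urlsplit(login_request_url)
    params = parse_qs(parts.query)
    return {
        "idp_host": parts.hostname,
        "client_id": params.get("client_id", [None])[0],
        "response_type": params.get("response_type", [None])[0],
        "has_state": "state" in params,          # state parameter present
        "has_pkce": "code_challenge" in params,  # PKCE challenge present
    }
```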
[
{
"recognition_strategy": "LASTPASS_ICON", // following username and password fields were found with the lastpass extension
"login_page_url": "https://www.shop.com/login", // the url of the login page
"lastpass_icon_frame": "TOPMOST|IFRAME", // the username/password fields are on topmost window or in iframe
"lastpass_icon_frame_index": 0, // the index of the frame in the page (0 = topmost window, 1/2/... = iframe)
"lastpass_icon_frame_name": "...", // the name of the frame
"lastpass_icon_frame_title": "Login | Shop.com", // the title of the frame
"lastpass_icon_frame_url": "https://www.shop.com/login", // the url of the frame that contains the username/password fields (if topmost, then this is the login page url, otherwise it is the url of the iframe)
"lastpass_icon_frames_length": 1, // the number of frames in the page
"lastpass_icon_elements": [ // all identified username and password fields (typically contains 2, one for username, one for password)
{
"element_coordinates_x": 13.37, // coordinates and dimensions of the input fields
"element_coordinates_y": 13.37,
"element_width": 13.37,
"element_height": 13.37,
"element_inner_text": "", // inner text of the input field
"element_outer_html": "<input name=\"username|password|...\" style=\"background-image: url();\">", // outer html of the input field, the background image is the lastpass icon that is injected into the field by the extension
"element_tree": ["INPUT", ..., "BODY", "HTML"], // the tree of the input field
"element_tree_markup": { // reference to the markup of the tree of the input field
"type": "reference",
"data": {
"bucket_name": "...",
"object_name": "...",
"extension": "json"
}
}
},
...
]
},
...
]
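As a small example of post-processing these records, the sketch below reads the type attribute of each recorded input field from its "element_outer_html". The type attribute is only a rough hint, since the fields were identified by the LastPass extension and not by their type.

```python
from html.parser import HTMLParser


class _InputTypeParser(HTMLParser):
    """Collect the type attribute of every <input> tag that is fed in."""

    def __init__(self):
        super().__init__()
        self.input_types = []

    def handle_starttag(self, tag, attrs):
        if tag == "input":
            # A missing or empty type attribute defaults to "text" in HTML.
            self.input_types.append((dict(attrs).get("type") or "text").lower())


def input_types(record):
    """Return the input types of all fields in one LastPass icon record."""
    types = []
    for element in record.get("lastpass_icon_elements", []):
        parser = _InputTypeParser()
        parser.feed(element.get("element_outer_html", ""))
        parser.close()
        types.extend(parser.input_types)
    return types
```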
{
"metadata_available": { // whether metadata is available at the well known endpoints of the domain
"apple_app_site_association": false, // https://developer.apple.com/documentation/xcode/supporting-associated-domains#Add-the-associated-domain-file-to-your-website
"assetlinks": false, // https://developers.google.com/digital-asset-links/v1/create-statement
"browserid": false, // https://mozilla.github.io/id-specs/docs/formats/well-known/
"fido2_configuration": false, // not yet standardized
"fido_2fa_configuration": false, // not yet standardized
"fido_configuration": false, // not yet standardized
"jwks": false, // https://auth0.com/docs/secure/tokens/json-web-tokens/json-web-key-sets
"oauth_authorization_server": false, // https://datatracker.ietf.org/doc/html/rfc8414#section-3
"oauth_client": false, // https://datatracker.ietf.org/doc/html/draft-looker-oauth-client-discovery-01#section-3
"openid_configuration": false, // https://openid.net/specs/openid-connect-discovery-1_0.html#ProviderConfig
"robots_txt": true, // https://datatracker.ietf.org/doc/html/rfc9309#name-access-method
"security_txt": true, // https://datatracker.ietf.org/doc/html/rfc9116#name-location-of-the-securitytxt
"uma2_configuration": false, // https://backstage.forgerock.com/docs/am/7/uma-guide/configure-uma-discovery.html
"web_identity": false, // https://developer.chrome.com/docs/privacy-sandbox/fedcm/#well-known-file
"webfinger": false // https://datatracker.ietf.org/doc/html/rfc7033#section-4
},
"metadata_data": { // reference to json file containing all metadata
"type": "reference",
"data": {
"bucket_name": "...",
"object_name": "...",
"extension": "json"
}
}
}
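For a subset of the keys in "metadata_available", the corresponding well-known locations are standardized, so the URLs can be rebuilt from a record. The mapping below covers only keys whose paths are defined in the linked specifications; that SSO-Monitor probes exactly these paths is an assumption.

```python
# Well-known locations for a subset of the metadata keys listed above.
WELL_KNOWN_PATHS = {
    "apple_app_site_association": "/.well-known/apple-app-site-association",
    "assetlinks": "/.well-known/assetlinks.json",
    "oauth_authorization_server": "/.well-known/oauth-authorization-server",
    "openid_configuration": "/.well-known/openid-configuration",
    "robots_txt": "/robots.txt",
    "security_txt": "/.well-known/security.txt",
    "web_identity": "/.well-known/web-identity",
    "webfinger": "/.well-known/webfinger",
}


def available_metadata_urls(domain, metadata_available):
    """Return URLs of the metadata files flagged as available for a domain."""
    return [
        f"https://{domain}{path}"
        for key, path in WELL_KNOWN_PATHS.items()
        if metadata_available.get(key)
    ]
```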
A lot of web security research has focused on the unauthenticated web. Large-scale measurements on the authenticated web are hard as account logins and registrations have to be automated. Recent research has already started semi-automated studies of the authenticated web. We see our publicly available SSO-Archive and open-source tool SSO-Monitor as a baseline for future measurements on the authenticated web. In fact, we already provide the HTTP traffic of login processes. As a result, post-login security measurements such as security attributes of session cookies and secure storage of tokens in the browser can already be conducted on our data. We also pave the way for active measurements on the authenticated web with SSO-Monitor's extensible architecture and automatic SSO login support.
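As an example of such a measurement, the sketch below scans a parsed HAR file for response cookies that lack the Secure or HttpOnly attribute. It relies only on the cookie fields defined by the HAR 1.2 format and assumes the HAR file has already been loaded into a Python dictionary.

```python
def cookies_missing_flags(har):
    """Return (cookie name, missing flags) for cookies lacking secure or httpOnly."""
    findings = []
    for entry in har.get("log", {}).get("entries", []):
        for cookie in entry.get("response", {}).get("cookies", []):
            # A flag counts as missing if it is absent or explicitly false.
            missing = [flag for flag in ("secure", "httpOnly") if not cookie.get(flag)]
            if missing:
                findings.append((cookie.get("name"), missing))
    return findings
```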
We create an individual task for each analysis of a domain. For instance, if we analyze the top 500k domains of the Tranco list, we create 500k tasks in total (one for each domain). Tasks are contained in a scan that defines the targeted domains that should be analyzed. For instance, we create a single scan for the Tranco top 500k domains that holds a total of 500k tasks. Scans can be further combined by tagging them. For instance, if we later decide to scan the lower 500k domains of the Tranco list in a new scan, we can add the same tag to both 500k scans. This allows us to gradually execute smaller scans and "combine" them at a later time.
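The relationship between tasks, scans, and tags can be pictured with a small data model. The class and field names below are hypothetical and only illustrate the idea; they do not reflect the tool's actual schema.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    domain: str  # one task per analyzed domain


@dataclass
class Scan:
    name: str
    tasks: list = field(default_factory=list)
    tags: set = field(default_factory=set)


def tasks_for_tag(scans, tag):
    """Combine the tasks of all scans that carry the given tag."""
    return [task for scan in scans if tag in scan.tags for task in scan.tasks]


# Two smaller scans that are combined afterwards through a shared tag.
upper = Scan("tranco-upper", [Task("a.example"), Task("b.example")], {"tranco-full"})
lower = Scan("tranco-lower", [Task("c.example")], {"tranco-full"})
combined = tasks_for_tag([upper, lower], "tranco-full")
```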
Yes, we follow an API-centric approach and provide a publicly accessible API for all our data. In fact, everything you can see on this website is fetched from our API. We also provide an OpenAPI file and an API documentation to make working with the API more convenient.
Yes, SSO-Monitor is entirely open-source and we plan to actively maintain it for research purposes. Our SSO-Archive contains all data and is provided throughout this website in various formats, i.e., as downloadable JSON files or via APIs that can be filtered with queries. If you have trouble downloading our large dataset, please contact us.
If you use our data or tooling for your research, please cite our publication:
@inproceedings{
sso-monitor,
title={SoK: SSO-Monitor - The Current State and Future Research Directions in Single Sign-On Security Measurements},
author={Jannett, Louis and Westers, Maximilian and Wich, Tobias and Mainka, Christian and Mayer, Andreas and Mladenov, Vladislav},
booktitle={2024 IEEE 9th European Symposium on Security and Privacy (EuroS&P)},
year={2024},
volume={},
number={},
pages={},
keywords={Single Sign-On;Authentication;Authorization;OAuth;OpenID Connect;Web Archive;SSO Archive},
doi={TBD}
}
L. Jannett, M. Westers, T. Wich, C. Mainka, A. Mayer and V. Mladenov, "SoK: SSO-Monitor - The Current State and Future Research Directions in Single Sign-On Security Measurements", 2024 IEEE 9th European Symposium on Security and Privacy (EuroS&P), Vienna, Austria, 2024, pp. TBD-TBD, doi: TBD. keywords: {Single Sign-On;Authentication;Authorization;OAuth;OpenID Connect;Web Archive;SSO Archive}
Feel free to reach out to us regarding this research, the artifacts, or the tooling. If you have any trouble with the tool, please open an issue on GitHub.
The SSO-Monitor.me logo is based on the OAuth logo created by Chris Messina. The logo is released under the Creative Commons Attribution ShareAlike 3.0 license.