In test automation, the tools and frameworks you choose can greatly affect the success and efficiency of your testing efforts. Selenium has long been the most popular choice for web application testing, but with the release of Selenium 4 (October 2021) and the emergence of non-Selenium frameworks such as Cypress and Playwright, testers now have several options to consider. In this article, we will analyze the architecture diagrams of Selenium 3, Selenium 4, Cypress, and Playwright.
A Selenium request originates in the Selenium Client component. The request is sent as a JSON Wire Protocol message over HTTP and is received by the Browser Driver.
The Browser Driver then delivers the command to a real browser, where the automation is performed. When the automation is complete, the response travels back through the Browser Driver and the JSON Wire Protocol to the Selenium Client.
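The request flow above can be sketched as plain HTTP requests. This is a minimal illustration that only builds the requests and never sends them. The driver address `http://localhost:9515`, the `SESSION_ID` placeholder, and the helper name `build_command` are assumptions made for the example; they are not taken from the Selenium client source.

```python
import json
from typing import Optional
from urllib.request import Request

# Assumption: a browser driver is listening locally on this address.
DRIVER_URL = "http://localhost:9515"


def build_command(method: str, path: str, body: Optional[dict] = None) -> Request:
    """Build (but do not send) one HTTP request carrying a JSON-encoded command."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return Request(
        url=DRIVER_URL + path,
        data=data,
        method=method,
        headers={"Content-Type": "application/json; charset=utf-8"},
    )


# JSON Wire Protocol style "new session" command.
new_session = build_command(
    "POST", "/session", {"desiredCapabilities": {"browserName": "chrome"}}
)

# Navigation command; SESSION_ID is a placeholder for the id the driver returns.
navigate = build_command(
    "POST", "/session/SESSION_ID/url", {"url": "https://example.com"}
)

for request in (new_session, navigate):
    print(request.get_method(), request.full_url, request.data.decode("utf-8"))
```

Each test step becomes one such request, and the driver answers each with a JSON response.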
Unlike Selenium 3, Selenium 4 has direct communication between the client and the server (the browser driver), because both sides use the W3C WebDriver protocol.
The W3C (World Wide Web Consortium) protocol was adopted because web browsers and browser drivers already followed W3C standards. To standardize the communication, the JSON Wire Protocol was replaced by the W3C protocol. This approach provides more reliable communication with browsers, better stability, and common code (i.e. no browser-specific code is required).
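One visible difference between the two protocols is the shape of the JSON payloads. The sketch below compares a "new session" request body and an element reference in each dialect. The function names are hypothetical helpers for illustration; the key names (`desiredCapabilities`, `capabilities`, `alwaysMatch`, `ELEMENT`, and the W3C element identifier) are the ones used by the respective protocols.

```python
# Key under which each dialect returns an element reference.
LEGACY_ELEMENT_KEY = "ELEMENT"
W3C_ELEMENT_KEY = "element-6066-11e4-a52e-4f735466cecf"


def legacy_new_session(browser_name: str) -> dict:
    """Request body for a new session in the JSON Wire Protocol dialect."""
    return {"desiredCapabilities": {"browserName": browser_name}}


def w3c_new_session(browser_name: str) -> dict:
    """Request body for a new session in the W3C WebDriver dialect."""
    return {"capabilities": {"alwaysMatch": {"browserName": browser_name}}}


def element_id(value: dict) -> str:
    """Read an element id from a response value in either dialect."""
    if W3C_ELEMENT_KEY in value:
        return value[W3C_ELEMENT_KEY]
    return value[LEGACY_ELEMENT_KEY]


print(legacy_new_session("firefox"))
print(w3c_new_session("firefox"))
print(element_id({W3C_ELEMENT_KEY: "abc-123"}))
```

When the client and the driver agree on a single dialect, no translation step of this kind is needed between them.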
The Cypress engine operates directly inside the browser. In other words, it is the browser that executes your test code.
Cypress can access both the front end and the back end of the application, which lets it respond to application events in real time and, at the same time, perform tasks outside the browser that require higher privileges.
The Node server and the browser communicate over a WebSocket, and execution starts once the proxy has been created. Cypress relays HTTP requests and responses from the Node server to the browser.
Cypress has control over all the commands that run inside and outside the browser. It talks directly to the operating system to capture screenshots, record videos, access the network layer, and perform file system operations.
Playwright communicates all requests between client and server through a single WebSocket connection, which stays in place until test execution is completed.
WebSocket VS HTTP Connection Diagram
In HTTP connection architectures, tools like Selenium send each command as a separate HTTP request and receive a JSON response. Every action (e.g. opening the browser or clicking an element) is therefore its own HTTP request. When a request completes, the connection between the server and the client is terminated and must be re-established for the next request. Because the connection is closed and recreated for each request, overall test execution may be slower and flakier.
On the other hand, in WebSocket architectures (e.g. the one used by Playwright), all requests between client and server travel through a single WebSocket connection, which stays in place until test execution is completed. This reduces the points of failure and allows commands to be sent quickly over a single connection.
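The trade-off described above can be made concrete with a toy cost model. This is a rough sketch, not a benchmark: the class and function names are hypothetical, and the millisecond figures are made-up assumptions chosen only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransportCost:
    """Toy cost model; the values used below are illustrative assumptions."""
    connect_ms: float  # time to open (or re-open) a connection
    command_ms: float  # time to send one command and receive its reply


def connection_per_command(cost: TransportCost, commands: int) -> float:
    """Every command pays the connection setup cost again."""
    return commands * (cost.connect_ms + cost.command_ms)


def single_connection(cost: TransportCost, commands: int) -> float:
    """The connection is opened once and reused for every command."""
    return cost.connect_ms + commands * cost.command_ms


cost = TransportCost(connect_ms=5.0, command_ms=20.0)
steps = 100

print(connection_per_command(cost, steps))  # 100 * (5 + 20) = 2500.0
print(single_connection(cost, steps))       # 5 + 100 * 20 = 2005.0
```

In this model the gap grows linearly with the number of commands. Real HTTP clients can often reuse connections through keep-alive, so the figures are best read as an upper bound on the effect rather than a measurement.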