
Automating for Internet Explorer

Posted August 6, 2018


In this article, I’ll describe some techniques for wrapping an Internet Explorer window so that you can automate the verification of a web application's behavior. Unit testing (part of extreme programming) is a great help in preserving the functionality of each specific block of code. However, as the application takes shape, it is also important to automate a few sets of end-user scenarios to make sure that your application doesn’t break over time. The proposed method is to use the Shell Document Object and Control Library (ShDocVw) to wrap and expose the functionality of the Internet Explorer window. A subsequent article will describe how to deal with the dialogs that IE may pop up (File Upload or Download, Security warnings, etc.).

 

There are multiple ways to create a new Internet Explorer window and find its handle so that you can manipulate it. For example, one could launch a new process and iterate through the desktop windows until the Explorer window shows up. However, this method is tricky to implement: Internet Explorer can take some time to load, other IE windows may already exist on the desktop, etc. The approach explained in this article uses a synchronization primitive together with the ShDocVw APIs. Once the window is found, the DOM can be accessed, and Win32 APIs can be used to calculate the positions of the elements that you want to click.

 

Pre-requisites

 

Although it is not strictly required, it is assumed that you know about the platform invocation framework (see this article). To use the code samples in this article, you’ll need to add references to the Microsoft Internet Controls library (ShDocVw.dll) and the Microsoft Html library (mshtml.dll), and import the System.Threading, System.Diagnostics and System.Runtime.InteropServices namespaces.

 

Creation of the Internet Explorer Window

 

The Internet Controls Library contains the “ShellWindowsClass”, which is basically a collection of all the shell windows (e.g. IE) open on the desktop. That component exposes an event called “WindowRegistered” that we are going to hook up to. Once the process has been launched, we wait until the corresponding window is registered, and then connect our Internet Explorer control to the shell window found. To determine whether the window is ours, we iterate through the registered windows and look for one whose handle matches the main window handle of the process we previously launched. We use the “ManualResetEvent” synchronization primitive to wait a limited amount of time for the window to be registered. The constructor of that class takes a parameter indicating whether the event starts out signaled or not. Here’s the code used:

  

namespace Browsers
{
    using System;
    using System.Threading;
    using System.Diagnostics;
    using System.Runtime.InteropServices;
    using InternetExplorerLibrary = SHDocVw;

    public class InternetExplorer
    {
        private int timeout = 10000; // 10 secs.
        private InternetExplorerLibrary.ShellWindows windows = null;
        private InternetExplorerLibrary.InternetExplorer IE = null;
        private Process process = null;
        private ManualResetEvent waitForRegister = null;

        public InternetExplorer( )
        {
            // We need to set this on the red light so that we block until the window is registered
            waitForRegister = new ManualResetEvent(false);

            // We use the shell to get notification when our window is created and registered
            windows = new InternetExplorerLibrary.ShellWindowsClass();
            InternetExplorerLibrary.DShellWindowsEvents_WindowRegisteredEventHandler registerHandler = new InternetExplorerLibrary.DShellWindowsEvents_WindowRegisteredEventHandler(windows_WindowRegistered);
            windows.WindowRegistered += registerHandler;

            // Launch IE
            process = Process.Start("IExplore","about:blank");
            waitForRegister.WaitOne(timeout, false); // We block here for at max 10 secs

            // See paragraph below...
            windows.WindowRegistered -= registerHandler;
            while(Marshal.ReleaseComObject(windows) > 0); // make sure we drop everything

            waitForRegister = null;

            if(IE == null)
                throw new Exception("Timeout while creating an IE Window");
        }

        private void windows_WindowRegistered(int lCookie)
        {
            if(process == null)
                return;  // This wasn't our window for sure

            for(int i = 0; i < windows.Count; i++)
            {
                InternetExplorerLibrary.InternetExplorer ShellWindow = windows.Item(i) as InternetExplorerLibrary.InternetExplorer;
                if( ShellWindow != null && (IntPtr)ShellWindow.HWND == process.MainWindowHandle)
                {
                    IE = ShellWindow;
                    waitForRegister.Set(); // Signal the constructor that it is safe to go on now.
                    return;
                }
            }
        }
    }
}

 

As you might have noticed, once the IE window is found, we unregister the handler and release every RCW related to the “ShellWindowsClass” component. An RCW (“Runtime Callable Wrapper”) is an object that wraps a COM object in the managed world. The problem with staying on an event-based approach is that when you break into the debugger, the Internet Explorer window hangs, just as if you were holding some objects that prevent it from pumping messages. I’m not sure why this happens, but I prefer to be able to look at my IE window even when I break in my scenario. Unfortunately, this means that from here on we have to poll the IE window instead of relying on events.
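The blocking pattern used in the constructor can be looked at in isolation. The following is a minimal sketch, not the article's code: the class name RegistrationWaitSketch, the method WaitForSignal and the background thread that stands in for the WindowRegistered callback are all hypothetical and exist only for illustration.

```csharp
using System;
using System.Threading;

public static class RegistrationWaitSketch
{
    // Returns true if the simulated notification arrived within timeoutMs,
    // false if the wait timed out first.
    public static bool WaitForSignal(int signalAfterMs, int timeoutMs)
    {
        // false = start non-signaled, so WaitOne blocks until Set() is called.
        // The event is deliberately not disposed here: after a timeout the
        // notifier thread may still call Set() on it later.
        ManualResetEvent registered = new ManualResetEvent(false);

        // Hypothetical stand-in for the COM event callback.
        Thread notifier = new Thread(delegate()
        {
            Thread.Sleep(signalAfterMs);
            registered.Set();
        });
        notifier.IsBackground = true; // do not keep the process alive
        notifier.Start();

        // Blocks until Set() is called or the timeout elapses.
        return registered.WaitOne(timeoutMs);
    }
}
```

Calling RegistrationWaitSketch.WaitForSignal(50, 2000) should return true, because the signal arrives well before the timeout. Swapping the two arguments should return false, which corresponds to the "Timeout while creating an IE Window" case in the constructor.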

 

Accessing the DOM

 

The “InternetExplorer” class found in the ShDocVw library provides a property to read the Document Object Model (DOM) of the page:

 

        public object Document
        {
            get { return IE.Document; }
        }

 

Accessing elements in the document

 

The Document property returns an object. You need to use the mshtml library to get strongly typed access to its members. Here’s a code sample accessing the document:

 

            mshtml.IHTMLDocument3 document = IE.Document as mshtml.IHTMLDocument3;
            mshtml.IHTMLElement2 foo = document.getElementById("foo") as mshtml.IHTMLElement2;

 

Refresh the Browser and Navigate

 

Again, the “InternetExplorer” class provides this functionality. However, since we are not using an event-based approach, we need to poll the state of the IE window to make sure all the processing is done before the call returns. Note that the same functionality can be implemented as asynchronous calls: you can still poll and invoke a callback once IE is ready. Here’s the code used:

 

        // Clean wrapper around the possible values of the InternetExplorer state property
        public enum ReadyState
        {
            Loading       = InternetExplorerLibrary.tagREADYSTATE.READYSTATE_LOADING,
            Loaded        = InternetExplorerLibrary.tagREADYSTATE.READYSTATE_LOADED,
            Complete      = InternetExplorerLibrary.tagREADYSTATE.READYSTATE_COMPLETE,
            Interactive   = InternetExplorerLibrary.tagREADYSTATE.READYSTATE_INTERACTIVE,
            Uninitialized = InternetExplorerLibrary.tagREADYSTATE.READYSTATE_UNINITIALIZED
        }

        public ReadyState State
        {
            get { return (ReadyState)IE.ReadyState; }
        }

        public void Refresh()
        {
            WaitForStableState();
            IE.Refresh();
            WaitForStableState();
        }

        public void Navigate(string URL)
        {
            // Allow only a single navigate
            lock(this)
            {
                // Hardcoded flags - you might want to make this more extensible...
                // 12 = navNoReadFromCache (4) | navNoWriteToCache (8)
                object flag = (int)12;
                object nullPointer = null;

                WaitForStableState();
                IE.Navigate(URL, ref flag, ref nullPointer, ref nullPointer, ref nullPointer);
                WaitForStableState();
            }
        }

        // You might want to use a smaller/bigger sleep delta and some timeout as well to throw an error if no response
        private void WaitForStableState()
        {
            // Block while the browser is still uninitialized or loading
            while(State == ReadyState.Uninitialized || State == ReadyState.Loading) Thread.Sleep(100);
        }
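The comment above WaitForStableState suggests adding a timeout. Here is one possible sketch of that idea, kept independent of the IE wrapper so it can be tried on its own. The names PollingSketch and WaitUntil and the choice of TimeoutException are assumptions made for illustration, not part of the original code.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

public static class PollingSketch
{
    // Polls isStable every pollIntervalMs until it returns true.
    // Throws TimeoutException if timeoutMs elapses before that happens.
    public static void WaitUntil(Func<bool> isStable, int timeoutMs, int pollIntervalMs)
    {
        Stopwatch elapsed = Stopwatch.StartNew();
        while (!isStable())
        {
            if (elapsed.ElapsedMilliseconds >= timeoutMs)
                throw new TimeoutException(
                    "No stable state reached within " + timeoutMs + " ms.");
            Thread.Sleep(pollIntervalMs);
        }
    }
}
```

Inside the wrapper class, WaitForStableState could then delegate to it, for example with PollingSketch.WaitUntil(() => State != ReadyState.Uninitialized && State != ReadyState.Loading, 30000, 100). The 30-second limit is an arbitrary value to tune for your pages.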

 

Calculate position on the Screen, get Width and Height

 

For other functionality, you should rely on P/Invoke (see this article). Some APIs that you might want to consider are GetWindowPlacement and ClientToScreen.
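As a starting point, the two APIs mentioned above could be declared roughly as follows. This is a sketch under assumptions: the class NativeWindowSketch and its helpers ToScreen and GetPlacement are hypothetical names, the struct layouts follow the Win32 definitions of POINT, RECT and WINDOWPLACEMENT, and the calls only work on Windows.

```csharp
using System;
using System.Runtime.InteropServices;

public static class NativeWindowSketch
{
    [StructLayout(LayoutKind.Sequential)]
    public struct POINT { public int X; public int Y; }

    [StructLayout(LayoutKind.Sequential)]
    public struct RECT { public int Left; public int Top; public int Right; public int Bottom; }

    [StructLayout(LayoutKind.Sequential)]
    public struct WINDOWPLACEMENT
    {
        public uint length;          // must be set to the struct size before the call
        public uint flags;
        public uint showCmd;
        public POINT ptMinPosition;
        public POINT ptMaxPosition;
        public RECT rcNormalPosition;
    }

    [DllImport("user32.dll")]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool ClientToScreen(IntPtr hWnd, ref POINT lpPoint);

    [DllImport("user32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    private static extern bool GetWindowPlacement(IntPtr hWnd, ref WINDOWPLACEMENT lpwndpl);

    // Converts a point in the window's client area to screen coordinates.
    public static POINT ToScreen(IntPtr hWnd, int clientX, int clientY)
    {
        POINT p = new POINT();
        p.X = clientX;
        p.Y = clientY;
        if (!ClientToScreen(hWnd, ref p))
            throw new InvalidOperationException("ClientToScreen failed.");
        return p;
    }

    // Reads the show state and the restored position of the window.
    public static WINDOWPLACEMENT GetPlacement(IntPtr hWnd)
    {
        WINDOWPLACEMENT placement = new WINDOWPLACEMENT();
        placement.length = (uint)Marshal.SizeOf(typeof(WINDOWPLACEMENT));
        if (!GetWindowPlacement(hWnd, ref placement))
            throw new InvalidOperationException(
                "GetWindowPlacement failed, error " + Marshal.GetLastWin32Error());
        return placement;
    }
}
```

With the wrapper from this article, the handle to pass in would be (IntPtr)IE.HWND. Note that the width and height derived from rcNormalPosition describe the restored window, not a maximized one.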

 

These articles cover the basics to help you ramp up on your automation work. They should benefit developers working on large applications who want to run test suites before checking in their code. The last article of this series will show how to deal with the IE dialogs that may pop up when executing a certain task. If you have any suggestions, feedback or comments, please let me know.
