
WebBrowser, invoking JavaScript, C# API, OuterHtml

September 15, 2013

Use the WebBrowser control to navigate to a page.

Once the page is loaded, you can call JavaScript functions defined in the document through webBrowser.Document.InvokeScript.
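
As a rough illustration of the call, here is a minimal sketch. The form, the inline page, and the script function greet are all made up for this example; the only calls taken from the framework are HtmlDocument.InvokeScript and the DocumentCompleted event.

```csharp
using System;
using System.Windows.Forms;

public class ScriptDemoForm : Form
{
    private readonly WebBrowser webBrowser1 = new WebBrowser { Dock = DockStyle.Fill };

    public ScriptDemoForm()
    {
        Controls.Add(webBrowser1);
        // Script functions are only callable after the document has loaded.
        webBrowser1.DocumentCompleted += OnDocumentCompleted;
        // Hypothetical page that defines a greet(name) function.
        webBrowser1.DocumentText =
            "<html><head><script>function greet(name) { return 'Hello, ' + name; }" +
            "</script></head><body></body></html>";
    }

    private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        // InvokeScript returns the script's return value as an object,
        // or null if the function does not exist.
        object result = webBrowser1.Document.InvokeScript("greet", new object[] { "world" });
        Text = Convert.ToString(result); // show whatever came back in the title bar
    }

    [STAThread]
    public static void Main()
    {
        Application.Run(new ScriptDemoForm());
    }
}
```

Calling InvokeScript from the DocumentCompleted handler avoids invoking a function before the page has defined it.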

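My goal was to save the page to disk. At first I used the document's ExecCommand method. It accepts many command names that correspond to the browser's menu commands, such as Save As and Print.
These names are called command identifiers. The full list is documented on the page below, and I have also appended it at the end of this post.
(http://www.tutor.nsu.ru/library/default.asp?url=/workshop/author/dhtml/reference/commandids.asp)
I saved the page with ExecCommand("SaveAs", true, path).
According to the documentation for SaveAs, the second argument is a Boolean: true shows the Save As dialog, false suppresses it.
In practice the dialog appears whether the argument is true or false. Another article claims this is deliberate, to prevent bulk copying of web pages and theft of other people's data.
Funny, really. Can one small parameter carry that much responsibility?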
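
For reference, the call looks roughly like this in WinForms. This is a sketch; the class and method names are invented for illustration, and only HtmlDocument.ExecCommand(command, showUI, value) comes from the framework.

```csharp
using System.Windows.Forms;

public static class SaveAsDemo
{
    // Asks the hosted browser to run its "SaveAs" command for the loaded page.
    public static void SaveWithDialog(WebBrowser browser, string path)
    {
        if (browser.Document == null)
        {
            return; // nothing has been loaded yet
        }

        // showUI is passed as false here, but as described above the
        // Save As dialog was observed to appear regardless of this value.
        browser.Document.ExecCommand("SaveAs", false, path);
    }
}
```

If showing the dialog is acceptable, the control's own WebBrowser.ShowSaveAsDialog() method is a simpler way to get the same prompt.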

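Since the dialog always appears, how do you get past it? My idea was to have the program simulate the user's click. First find the Save As dialog by its title (other attributes would work too),
then locate the Save button inside it. FindWindow finds the top-level dialog, and FindWindowEx finds the child button. Then send the button a click message with SendMessage. The message is BM_CLICK (0x00F5), which some samples name WM_CLICK.
I was working in C#, and FindWindow and SendMessage are not managed code, so how can they be called from C#?
The following article covers this.
Calling the Windows API from C# to communicate with other processes (http://www.cnblogs.com/index/archive/2005/01/16/92651.html)
The key is to import the unmanaged functions, which the following declarations do.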
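    // Requires: using System; using System.Runtime.InteropServices;
    #region Dll Import

    // Finds a top-level window by class name and/or title.
    [DllImport("user32.dll", EntryPoint = "FindWindow", CharSet = CharSet.Auto)]
    private static extern IntPtr FindWindow(string lpClassName, string lpWindowName);

    // Finds a child window (for example a button) inside hwndParent.
    [DllImport("user32.dll", EntryPoint = "FindWindowEx", CharSet = CharSet.Auto)]
    private static extern IntPtr FindWindowEx(IntPtr hwndParent,
        IntPtr hwndChildAfter, string lpszClass, string lpszWindow);

    // Sends a message and waits for it to be processed. LRESULT is pointer-sized.
    [DllImport("user32.dll", EntryPoint = "SendMessage", CharSet = CharSet.Auto)]
    private static extern IntPtr SendMessage(IntPtr hWnd,
        int Msg, IntPtr wParam, string lParam);

    #endregion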

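I put the code that finds the dialog and the Save button and sends the click message into a separate thread function, and started a new thread to run it.
The thread starts when the program starts and keeps watching for the dialog to appear.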
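
A watcher of this kind might be sketched as follows. This is not the original code, which the post does not show. The class name, the 200 ms polling interval, the 30 second timeout, and the dialog title and button caption passed in are all assumptions, and the caption text depends on the Windows language. FindWindowEx searches only direct children, so on systems where the button is nested deeper this sketch will not find it.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Threading;

public static class SaveDialogWatcher
{
    private const int BM_CLICK = 0x00F5; // button-click message

    [DllImport("user32.dll", CharSet = CharSet.Auto)]
    private static extern IntPtr FindWindow(string lpClassName, string lpWindowName);

    [DllImport("user32.dll", CharSet = CharSet.Auto)]
    private static extern IntPtr FindWindowEx(IntPtr hwndParent, IntPtr hwndChildAfter,
        string lpszClass, string lpszWindow);

    [DllImport("user32.dll", CharSet = CharSet.Auto)]
    private static extern IntPtr SendMessage(IntPtr hWnd, int msg, IntPtr wParam, IntPtr lParam);

    // Polls until the dialog appears or the timeout elapses.
    // Returns true if a matching button was found and clicked.
    public static bool ClickSaveWhenShown(string dialogTitle, string buttonText, TimeSpan timeout)
    {
        DateTime deadline = DateTime.UtcNow + timeout;
        while (DateTime.UtcNow < deadline)
        {
            // Match the top-level dialog by title only (class name left open).
            IntPtr dialog = FindWindow(null, dialogTitle);
            if (dialog != IntPtr.Zero)
            {
                // Look for a direct child of class "Button" with the given caption.
                IntPtr button = FindWindowEx(dialog, IntPtr.Zero, "Button", buttonText);
                if (button != IntPtr.Zero)
                {
                    SendMessage(button, BM_CLICK, IntPtr.Zero, IntPtr.Zero);
                    return true;
                }
            }
            Thread.Sleep(200); // avoid spinning the CPU between checks
        }
        return false;
    }

    // Runs the polling loop on a background thread so the UI thread stays free.
    public static Thread StartWatcher(string dialogTitle, string buttonText)
    {
        var thread = new Thread(() =>
            ClickSaveWhenShown(dialogTitle, buttonText, TimeSpan.FromSeconds(30)));
        thread.IsBackground = true; // do not keep the process alive
        thread.Start();
        return thread;
    }
}
```

The watcher thread only touches window handles through user32 and never accesses the WebBrowser control itself, which must stay on the UI thread.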

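This approach worked poorly. It was slow, and because I am not yet comfortable with threading, it kept failing with this error:
Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
I never managed to fix it.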

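Eventually I found another way: skip the Save As dialog entirely and write the page's HTML straight to a file.
The idea is obvious and I had considered it early on, but at first I tried to do it with document.write(), and I did not know which property of the document object holds the page's HTML,
so I could not make it work.
The code below shows how to save the page this way. It is very simple and direct.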
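// Requires: using System.IO; using System.Windows.Forms;
HtmlDocument doc = webBrowser1.Document;
HtmlElementCollection elems = doc.GetElementsByTagName("HTML");
if (elems.Count == 1)
{
    HtmlElement elem = elems[0];
    // 'using' flushes and closes the file; without it the output may be truncated.
    using (StreamWriter wrt = new StreamWriter(path))
    {
        wrt.Write(elem.OuterHtml); // OuterHtml is already a string
    }
}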
With that, my task was done.
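
One practical detail: OuterHtml only reflects the whole page once loading has finished. The sketch below wraps the same idea in a helper that waits for DocumentCompleted. The class and method names are invented for illustration, the UTF-8 encoding is an assumption, and the URL comparison is a common heuristic for skipping frame events rather than a guaranteed test.

```csharp
using System.IO;
using System.Text;
using System.Windows.Forms;

public static class PageSaver
{
    // Writes the markup to disk as UTF-8; 'using' flushes and closes the file.
    public static void WriteHtml(string html, string path)
    {
        using (var writer = new StreamWriter(path, false, Encoding.UTF8))
        {
            writer.Write(html);
        }
    }

    // Saves the page once, the first time the top-level document finishes loading.
    public static void SaveWhenLoaded(WebBrowser browser, string path)
    {
        WebBrowserDocumentCompletedEventHandler handler = null;
        handler = (sender, e) =>
        {
            // Frames raise DocumentCompleted too; wait for the main document.
            if (e.Url != browser.Url)
            {
                return;
            }
            browser.DocumentCompleted -= handler; // save only once

            HtmlElementCollection roots = browser.Document.GetElementsByTagName("HTML");
            if (roots.Count == 1)
            {
                WriteHtml(roots[0].OuterHtml, path);
            }
        };
        browser.DocumentCompleted += handler;
    }
}
```

Call SaveWhenLoaded(webBrowser1, path) before calling Navigate, so the handler is in place when the page finishes loading.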

 

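Below are the values of some messages used with SendMessage. They are defined in the Windows API.

const int WM_GETTEXT = 0x000D;
const int WM_SETTEXT = 0x000C;
const int BM_CLICK = 0x00F5; // button click; sometimes mislabeled WM_CLICK

const int WM_CLOSE = 0x0010;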
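
A SendMessage declaration whose lParam is a string can send text (WM_SETTEXT) but cannot receive it, so WM_GETTEXT needs an overload with a writable buffer. A minimal sketch, assuming a valid window handle is already available; the class and method names are made up, and WM_GETTEXTLENGTH (0x000E) is an extra Windows message not in the list above.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

public static class WindowText
{
    private const int WM_GETTEXT = 0x000D;
    private const int WM_GETTEXTLENGTH = 0x000E;
    private const int WM_CLOSE = 0x0010;

    // Overload with a writable buffer, used for WM_GETTEXT.
    [DllImport("user32.dll", CharSet = CharSet.Auto)]
    private static extern IntPtr SendMessage(IntPtr hWnd, int msg, IntPtr wParam, StringBuilder lParam);

    // Overload for messages that carry no string payload.
    [DllImport("user32.dll", CharSet = CharSet.Auto)]
    private static extern IntPtr SendMessage(IntPtr hWnd, int msg, IntPtr wParam, IntPtr lParam);

    // Reads the caption or text of the given window.
    public static string Read(IntPtr hWnd)
    {
        int length = SendMessage(hWnd, WM_GETTEXTLENGTH, IntPtr.Zero, IntPtr.Zero).ToInt32();
        var buffer = new StringBuilder(length + 1); // room for the terminating null
        // wParam is the buffer size in characters, including the terminator.
        SendMessage(hWnd, WM_GETTEXT, new IntPtr(buffer.Capacity), buffer);
        return buffer.ToString();
    }

    // Asks the window to close, as if the user had clicked its close button.
    public static void Close(IntPtr hWnd)
    {
        SendMessage(hWnd, WM_CLOSE, IntPtr.Zero, IntPtr.Zero);
    }
}
```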

 

Appendix: command identifiers
Command identifiers specify an action to take on the given object. Use them with the following methods:

execCommand
queryCommandEnabled
queryCommandIndeterm
queryCommandState
queryCommandSupported
queryCommandValue

Command Identifiers
2D-Position
Allows absolutely positioned elements to be moved by dragging.

AbsolutePosition
Sets an element's position property to "absolute."

BackColor
Sets or retrieves the background color of the current selection.

BlockDirLTR
Not currently supported.

BlockDirRTL
Not currently supported.

Bold
Toggles the current selection between bold and nonbold.

BrowseMode
Not currently supported.

Copy
Copies the current selection to the clipboard.

CreateBookmark
Creates a bookmark anchor or retrieves the name of a bookmark anchor for the current selection or insertion point.

CreateLink
Inserts a hyperlink on the current selection, or displays a dialog box enabling the user to specify a URL to insert as a hyperlink on the current selection.

Cut
Copies the current selection to the clipboard and then deletes it.

Delete
Deletes the current selection.

DirLTR
Not currently supported.

DirRTL
Not currently supported.

EditMode
Not currently supported.

FontName
Sets or retrieves the font for the current selection.

FontSize
Sets or retrieves the font size for the current selection.

ForeColor
Sets or retrieves the foreground (text) color of the current selection.

FormatBlock
Sets the current block format tag.

Indent
Increases the indent of the selected text by one indentation increment.

InlineDirLTR
Not currently supported.

InlineDirRTL
Not currently supported.

InsertButton
Overwrites a button control on the text selection.

InsertFieldset
Overwrites a box on the text selection.

InsertHorizontalRule
Overwrites a horizontal line on the text selection.

InsertIFrame
Overwrites an inline frame on the text selection.

InsertImage
Overwrites an image on the text selection.

InsertInputButton
Overwrites a button control on the text selection.

InsertInputCheckbox
Overwrites a check box control on the text selection.

InsertInputFileUpload
Overwrites a file upload control on the text selection.

InsertInputHidden
Inserts a hidden control on the text selection.

InsertInputImage
Overwrites an image control on the text selection.

InsertInputPassword
Overwrites a password control on the text selection.

InsertInputRadio
Overwrites a radio control on the text selection.

InsertInputReset
Overwrites a reset control on the text selection.

InsertInputSubmit
Overwrites a submit control on the text selection.

InsertInputText
Overwrites a text control on the text selection.

InsertMarquee
Overwrites an empty marquee on the text selection.

InsertOrderedList
Toggles the text selection between an ordered list and a normal format block.

InsertParagraph
Overwrites a line break on the text selection.

InsertSelectDropdown
Overwrites a drop-down selection control on the text selection.

InsertSelectListbox
Overwrites a list box selection control on the text selection.

InsertTextArea
Overwrites a multiline text input control on the text selection.

InsertUnorderedList
Toggles the text selection between an unordered list and a normal format block.

Italic
Toggles the current selection between italic and nonitalic.

JustifyCenter
Centers the format block in which the current selection is located.

JustifyFull
Not currently supported.

JustifyLeft
Left-justifies the format block in which the current selection is located.

JustifyNone
Not currently supported.

JustifyRight
Right-justifies the format block in which the current selection is located.

LiveResize
Causes the MSHTML Editor to update an element's appearance continuously during a resizing or moving operation, rather than updating only at the completion of the move or resize.

MultipleSelection
Allows for the selection of more than one site selectable element at a time when the user holds down the SHIFT or CTRL keys.

Open
Not currently supported.

Outdent
Decreases by one increment the indentation of the format block in which the current selection is located.

OverWrite
Toggles the text-entry mode between insert and overwrite.

Paste
Overwrites the contents of the clipboard on the current selection.

PlayImage
Not currently supported.

Print
Opens the print dialog box so the user can print the current page.

Redo
Not currently supported.

Refresh
Refreshes the current document.

RemoveFormat
Removes the formatting tags from the current selection.

RemoveParaFormat
Not currently supported.

SaveAs
Saves the current Web page to a file.

SelectAll
Selects the entire document.

SizeToControl
Not currently supported.

SizeToControlHeight
Not currently supported.

SizeToControlWidth
Not currently supported.

Stop
Not currently supported.

StopImage
Not currently supported.

StrikeThrough
Not currently supported.

Subscript
Not currently supported.

Superscript
Not currently supported.

UnBookmark
Removes any bookmark from the current selection.

Underline
Toggles the current selection between underlined and not underlined.

Undo
Not currently supported.

Unlink
Removes any hyperlink from the current selection.

Unselect
Clears the current selection.

 

 
