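First, the purpose of the SpiderDone class:
A large spider program runs many concurrent threads, so it is hard to tell when the spider as a whole has finished. The SpiderDone class solves exactly this problem by counting the worker threads that are currently active.
The two most important methods:
synchronized public void workerBegin(); // each call increments the active-thread count by one
synchronized public void workerEnd(); // each call decrements the active-thread count by one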
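Because the two calls only make sense as a pair, a worker would normally wrap its work so that workerEnd() runs even when the work throws. The following is a minimal sketch of that pairing; ActiveCounter, getActive() and runTracked() are hypothetical names made up for this illustration and are not part of the original class.

```java
// Hypothetical stand-in for the two counting methods; not the original class.
class ActiveCounter {
    private int activeThreads = 0;

    public synchronized void workerBegin() { activeThreads++; }

    public synchronized void workerEnd() { activeThreads--; }

    // Illustrative accessor so the count can be inspected.
    public synchronized int getActive() { return activeThreads; }

    // Runs one unit of work, keeping the two calls paired even if the work throws.
    public void runTracked(Runnable work) {
        workerBegin();
        try {
            work.run();
        } finally {
            workerEnd();
        }
    }
}

class ActiveCounterDemo {
    public static void main(String[] args) {
        ActiveCounter counter = new ActiveCounter();
        counter.workerBegin();
        counter.workerBegin();
        System.out.println(counter.getActive()); // prints 2
        counter.workerEnd();
        counter.workerEnd();
        try {
            counter.runTracked(() -> {
                throw new IllegalStateException("simulated failure");
            });
        } catch (IllegalStateException expected) {
            // The finally block in runTracked has already decremented the count.
        }
        System.out.println(counter.getActive()); // prints 0
    }
}
```

If a worker skipped workerEnd() after a failure, the count would never return to zero and anything waiting for completion would block forever, which is why the sketch uses try/finally.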
The full source of the class is as follows:
    package com.heaton.bot;

    /**
     * This is a very simple object that
     * allows the spider to determine when
     * it is done. This object implements
     * a simple lock that the spider class
     * can wait on to determine completion.
     * Done is defined as the spider having
     * no more work to complete.
     */
    class SpiderDone {
      /**
       * The number of SpiderWorker object
       * threads that are currently working
       * on something.
       */
      private int activeThreads = 0;

      /**
       * This boolean keeps track of whether
       * the very first thread has started
       * or not. This prevents this object
       * from falsely reporting that the spider
       * is done, just because the first thread
       * has not yet started.
       */
      private boolean started = false;

      /**
       * This method can be called to block
       * the current thread until the spider
       * is done.
       */
      synchronized public void waitDone()
      {
        try {
          while ( activeThreads>0 ) {
            wait();
          }
        } catch ( InterruptedException e ) {
          // Restore the interrupt flag so the caller can detect the interruption.
          Thread.currentThread().interrupt();
        }
      }

      /**
       * Called to wait for the first thread to
       * start. Once this method returns the
       * spidering process has begun.
       */
      synchronized public void waitBegin()
      {
        try {
          while ( !started ) {
            wait();
          }
        } catch ( InterruptedException e ) {
          // Restore the interrupt flag so the caller can detect the interruption.
          Thread.currentThread().interrupt();
        }
      }

      /**
       * Called by a SpiderWorker object
       * to indicate that it has begun
       * working on a workload.
       */
      synchronized public void workerBegin()
      {
        activeThreads++;
        started = true;
        // notifyAll() wakes every waiter, so threads blocked in
        // waitBegin() and waitDone() both re-check their condition.
        notifyAll();
      }

      /**
       * Called by a SpiderWorker object to
       * indicate that it has completed a
       * workload.
       */
      synchronized public void workerEnd()
      {
        activeThreads--;
        notifyAll();
      }

      /**
       * Called to reset this object to
       * its initial state. Only the thread
       * count is cleared; the started flag
       * is left unchanged.
       */
      synchronized public void reset()
      {
        activeThreads = 0;
      }
    }
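To see how the four methods fit together, here is a small, self-contained sketch of a coordinating thread that launches workers and then waits for them. DoneTracker is a hypothetical stand-in written for this example (SpiderDone itself is package-private inside com.heaton.bot), and DoneTrackerDemo and runWorkers() are likewise made-up names. The sketch also assumes that workerBegin() is called by the launching thread before each worker starts, so the count cannot drop to zero between workers; the real spider may register its workers differently.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in with the same counting and waiting logic as the listing.
class DoneTracker {
    private int activeThreads = 0;
    private boolean started = false;

    public synchronized void workerBegin() {
        activeThreads++;
        started = true;
        notifyAll();
    }

    public synchronized void workerEnd() {
        activeThreads--;
        notifyAll();
    }

    public synchronized void waitBegin() {
        try {
            while (!started) {
                wait();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public synchronized void waitDone() {
        try {
            while (activeThreads > 0) {
                wait();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}

class DoneTrackerDemo {
    // Starts `workers` threads and returns how many had finished when waitDone() returned.
    static int runWorkers(int workers) {
        DoneTracker done = new DoneTracker();
        AtomicInteger finished = new AtomicInteger();
        for (int i = 0; i < workers; i++) {
            // Registered before start() so the count cannot reach zero between workers.
            done.workerBegin();
            new Thread(() -> {
                try {
                    finished.incrementAndGet(); // placeholder for real work
                } finally {
                    done.workerEnd();
                }
            }).start();
        }
        done.waitBegin(); // returns once at least one worker has begun
        done.waitDone();  // returns once the active count is back to zero
        return finished.get();
    }

    public static void main(String[] args) {
        System.out.println("finished workers: " + runWorkers(4)); // prints "finished workers: 4"
    }
}
```

Calling waitBegin() before waitDone() matters when workers register themselves from their own threads: without it, waitDone() could see a count of zero and return before any worker has started. In this sketch, calling runWorkers(0) would block in waitBegin() forever, since no worker ever begins.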
As before, the remaining methods are not used very often. Come back and read them when you need them!
That’s OK!