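I noticed this problem quite early on but never pinned down the exact cause. This time, when I ran a query right after a fresh news crawl, the familiar "no doc found!" showed up again. A careful check turned up no bug in the program itself, so I went back to the JE Javadoc and guessed that Berkeley DB was still holding the newly crawled news in its in-memory cache and had not yet written it to disk.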

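After I restarted Tomcat, the new records that could not be found before miraculously appeared, so this did seem to be the problem. I then looked for a way to sync Berkeley DB and found that Environment has a sync() method that flushes the records held in memory to disk. With that call added, queries after a fresh crawl work correctly without restarting the web server.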

Original text: When a write operation is performed in JE, the modified data is written to a leaf node contained in the in-memory cache. If your JE writes are performed without transactions, then the in-memory cache is the only location guaranteed to receive a database modification without further intervention on the part of the application developer.

For some class of applications, this lack of a guaranteed write to disk is ideal. By not writing these modifications to the on-disk logs, the application can avoid most of the overhead caused by disk I/O.

However, if the application requires its data to persist at a specific point in time, then the developer must manually sync database modifications to the on-disk log files (again, this is only necessary for non-transactional applications). This is done using Environment.sync().

Note that syncing the cache causes JE to write all modified objects in the cache to disk. This is probably the most expensive operation that you can perform in JE.
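
To make the behaviour described in the quoted passage concrete, here is a minimal sketch of a write-back cache in plain Java. It is a toy model, not JE code: the class names WriteBackCacheSketch and ToyStore, the tab-separated log format, and the "news-1" key are all hypothetical and only illustrate the idea that a non-transactional write lands in memory first and reaches disk only after an explicit sync. In JE itself the corresponding call is Environment.sync(), as the documentation above says.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class WriteBackCacheSketch {

    /** Hypothetical toy store: writes go to memory and reach disk only on sync(). */
    static final class ToyStore {
        private final Path logFile;
        // Modified entries that have not been written to the log file yet.
        private final Map<String, String> dirty = new LinkedHashMap<>();

        ToyStore(Path logFile) {
            this.logFile = logFile;
        }

        /** Non-transactional write: only the in-memory cache is updated. */
        void put(String key, String value) {
            dirty.put(key, value);
        }

        /** Appends every dirty entry to the on-disk log, then clears the dirty set. */
        void sync() throws IOException {
            List<String> lines = new ArrayList<>();
            for (Map.Entry<String, String> e : dirty.entrySet()) {
                lines.add(e.getKey() + "\t" + e.getValue());
            }
            Files.write(logFile, lines, StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
            dirty.clear();
        }

        /** What a reader that only looks at the disk (not this cache) would see. */
        static Map<String, String> readDisk(Path logFile) throws IOException {
            Map<String, String> onDisk = new LinkedHashMap<>();
            if (!Files.exists(logFile)) {
                return onDisk;
            }
            for (String line : Files.readAllLines(logFile, StandardCharsets.UTF_8)) {
                int tab = line.indexOf('\t');
                if (tab >= 0) {
                    onDisk.put(line.substring(0, tab), line.substring(tab + 1));
                }
            }
            return onDisk;
        }
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("toy-store", ".log");
        ToyStore store = new ToyStore(log);

        store.put("news-1", "freshly crawled headline");
        System.out.println("on disk before sync: "
                + ToyStore.readDisk(log).containsKey("news-1"));

        store.sync();
        System.out.println("on disk after sync: "
                + ToyStore.readDisk(log).containsKey("news-1"));

        Files.deleteIfExists(log);
    }
}
```

Running main reports that the record is not on disk before sync() and is there afterwards, which mirrors the "no doc found!" symptom above. Since the documentation warns that syncing is expensive, it probably makes sense to call it once at the end of a crawl rather than after every single record.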
