Beagle
- Slow (3-4 seconds) when searching
- All documents must be force-reindexed regularly, otherwise Beagle's accuracy decreases. I created a script that forces Beagle to reindex at least once a week to maintain its accuracy
+ Can be manually forced to use all CPU for indexing, so indexing documents is very fast. In my case (around 10 GB of data), indexing finished within 3 hours (an estimate only)
+ Can find text inside any Microsoft Office document
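A weekly forced reindex like the one mentioned in the list above could be sketched roughly as follows. This is an illustration only, not the actual script referred to there. It assumes that Beagle keeps its index under ~/.beagle/Indexes, that the daemon is stopped with beagle-shutdown and started with beagled, and that setting the BEAGLE_EXERCISE_THE_DOG environment variable makes the daemon index at full speed instead of throttling itself. Check all of these against your Beagle version before relying on them. The helper names (reindex_plan, force_reindex) are made up for this example.

```python
import os
import shutil
import subprocess
from pathlib import Path

# Assumption: Beagle stores its index here; verify for your installation.
INDEX_DIR = Path.home() / ".beagle" / "Indexes"


def reindex_plan(index_dir=INDEX_DIR):
    """Return the ordered steps of a forced reindex without running them."""
    return [
        ("run", ["beagle-shutdown"]),   # stop the daemon and wait for it
        ("remove", str(index_dir)),     # throw away the old index
        ("start", ["beagled"]),         # restart; the index is rebuilt from scratch
    ]


def force_reindex(index_dir=INDEX_DIR, dry_run=True):
    """Execute the plan, or only print it when dry_run is True."""
    # Assumption: this variable tells beagled to use all available CPU.
    env = dict(os.environ, BEAGLE_EXERCISE_THE_DOG="1")
    for action, target in reindex_plan(index_dir):
        if dry_run:
            print(action, target)
        elif action == "remove":
            shutil.rmtree(target, ignore_errors=True)
        elif action == "run":
            # Raises FileNotFoundError if the command is not installed.
            subprocess.run(target, env=env, check=False)
        else:
            # Start the daemon without waiting for it to exit.
            subprocess.Popen(target, env=env)


if __name__ == "__main__":
    # Dry run by default, so nothing is deleted until you opt in.
    force_reindex(dry_run=True)
```

With dry_run=True the script only prints the three steps. Switching it to False and scheduling the script once a week (for example from cron) would match the routine described above.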
GDLinux (beta)
+ Very fast when searching something
- Cannot be manually forced to use all CPU for indexing; the index is built only during idle time. In my case, building the index for 10 GB of data took around 5 hours of idle time, and in reality the index was only finished after about a day (around 15 hours)
- Cannot find text inside Microsoft Office documents; it looks like this version of GDLinux only indexes the titles of Microsoft Office documents
Recommendation
So far, I think the best solution is to use Beagle only for searching Microsoft Office documents, and GDLinux for searching everything else.
Hi,
"Must be forced reindex all documents regularly, otherwise Beagle accuracy will be decreased. I created a script to force index Beagle at least once a week to maintain Beagle accuracy"
I'm the maintainer of Beagle -- this shouldn't be necessary. Can you give me a better idea of why you need to do this? Feel free to email me at joe@joeshaw.org.
Thanks,
Joe