Beagle
- Slow when searching: each search takes 3-4 seconds
- All documents must be force-reindexed regularly, otherwise Beagle's accuracy decreases. I created a script that forces Beagle to reindex at least once a week to maintain its accuracy
+ Can be manually forced to use all of the CPU for indexing, so indexing documents is very fast. In my case (around 10 GB of data), indexing finished within about 3 hours (an estimate only)
+ Can find text inside any Microsoft Office document
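The weekly reindex script mentioned above is not shown in this post. The following is only a rough sketch of what such a script could look like, not the actual one. It assumes that the daemon is stopped with `beagle-shutdown`, started with `beagled`, and that the index lives under `~/.beagle/Indexes`; check all three against your own installation. The stamp file `~/.beagle-last-reindex` and the `--apply` switch are hypothetical names made up for this example.

```python
import shutil
import subprocess
import sys
import time
from pathlib import Path

# Assumptions (not taken from the post): command names and index location.
SHUTDOWN_CMD = ["beagle-shutdown"]
START_CMD = ["beagled"]
INDEX_DIR = Path.home() / ".beagle" / "Indexes"

# Hypothetical helper file that records when the last forced reindex ran.
STAMP_FILE = Path.home() / ".beagle-last-reindex"
MAX_AGE_SECONDS = 7 * 24 * 3600  # "at least once a week"


def reindex_due(stamp: Path, max_age: float, now: float) -> bool:
    """Return True if there is no stamp yet or the last reindex is too old."""
    if not stamp.exists():
        return True
    return now - stamp.stat().st_mtime >= max_age


def force_reindex() -> None:
    """Stop the daemon, delete the index, restart so everything is reindexed."""
    subprocess.run(SHUTDOWN_CMD, check=False)  # the daemon may not be running
    shutil.rmtree(INDEX_DIR, ignore_errors=True)
    subprocess.run(START_CMD, check=True)
    STAMP_FILE.touch()  # remember when this reindex happened


if __name__ == "__main__":
    due = reindex_due(STAMP_FILE, MAX_AGE_SECONDS, time.time())
    if due and "--apply" in sys.argv:
        force_reindex()
    else:
        # Without --apply the script only reports, it changes nothing.
        print("reindex due:", due)
```

Run daily from cron with `--apply`, this would rebuild the index roughly once a week. Without `--apply` it only reports whether a reindex is due, which is a safer way to try it first.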
GDLinux (beta)
+ Very fast when searching
- Cannot be manually forced to use all of the CPU for indexing. The index is built only during idle time. In my case, building the index for 10 GB of data took around 5 hours of idle time, and in reality the index was finished only after one day (around 15 hours)
- Cannot find text inside Microsoft Office documents. It looks like this version of GDLinux indexes only the titles of Microsoft Office documents
Recommendation
So far, I think the best solution is to use Beagle only for searching Microsoft Office documents, and GDLinux for searching everything else.
1 comment:
Hi,
Must be forced reindex all documents regularly, otherwise Beagle accuracy will be decreased. I created a script to force index Beagle at least once a week to maintain Beagle accuracy
I'm the maintainer of Beagle -- this shouldn't be necessary. Can you give me a better idea of why you need to do this? Feel free to email me at joe@joeshaw.org.
Thanks,
Joe