Thread: How to crawl image files like jpeg, gif, ...
Started 1 year, 10 months ago by develoWorks
Hello,
I want to index and find image files.
I've already included the entries '.jpg' and '.jpeg' in the crawl options menu of the relevant data source.
Furthermore, I've changed the file '/WEB-INF/config.properties'.
I added the file type name 'jpg' with the file type extension 'jpg image/jpeg'.
Nonetheless, the files can't be found when using the enterprise search, ...
OmniFind doesn't index image files, and I honestly have no idea why you would want to do so in an enterprise search arena. What would you search on? The only place where I can see any applicability for indexing an "image" file is scanned document images. You could then use OCR (optical character recognition) to extract the text from the image. Other than that, the only other way it makes sense is if the images are in a...
It was planned to search the image files according to their metadata.
It's a pity that OmniFind is not able to index such files.
But anyway, thank you for your quick answer.
Message was edited by: develoWorks
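To make the idea of searching image files by their metadata concrete, here is a minimal sketch in plain Python. It does not use OmniFind at all; the helper names `describe` and `build_index`, the sample paths, and the sizes are hypothetical and only illustrate indexing file-level metadata (name, extension, MIME type) rather than image content.

```python
import mimetypes
from pathlib import PurePath


def describe(path, size_bytes):
    """Collect simple file-level metadata for one image file (illustrative only)."""
    p = PurePath(path)
    mime_type, _ = mimetypes.guess_type(p.name)
    return {
        "path": str(path),
        "name": p.stem.lower(),                       # file name without extension
        "extension": p.suffix.lstrip(".").lower(),    # e.g. "jpg"
        "mime_type": mime_type,                       # e.g. "image/jpeg", or None
        "size_bytes": size_bytes,
    }


def build_index(records):
    """Map each searchable metadata term to the paths that carry it."""
    index = {}
    for rec in records:
        for term in (rec["name"], rec["extension"], rec["mime_type"]):
            if term:
                index.setdefault(term, []).append(rec["path"])
    return index


# Hypothetical sample data, not taken from the thread.
records = [
    describe("/images/logo.jpg", 20480),
    describe("/images/banner.gif", 51200),
]
index = build_index(records)
```

With this sketch, a lookup such as `index["jpg"]` or `index["logo"]` returns the matching paths, because both the extension and the file name are indexed as separate terms.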
I am surprised when you say that OmniFind doesn't crawl image files. In spite of my MIME types being set to exclude image/tiff, such files are still being picked up. Is it doing so because it is trying to crawl based on the metadata associated with those images? I have tried every possible scenario to limit the crawl space so that the image files don't get crawled, and yet they are. I have posted a couple of threads on this but got no response. Do you have any ideas?
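For reference, this is roughly how a MIME-type exclusion rule would be expected to behave. It is a sketch in plain Python using the standard `mimetypes` module, not OmniFind's actual crawler logic; `EXCLUDED_MIME_TYPES`, `should_crawl`, and the sample file names are hypothetical.

```python
import mimetypes

# Hypothetical exclusion list; this is not an OmniFind setting.
EXCLUDED_MIME_TYPES = {"image/tiff"}


def should_crawl(path, excluded=EXCLUDED_MIME_TYPES):
    """Return True unless the MIME type guessed from the file name is excluded."""
    mime_type, _ = mimetypes.guess_type(path)
    # Files whose type cannot be guessed (mime_type is None) pass through here.
    return mime_type not in excluded


# Hypothetical candidate files.
candidates = ["report.pdf", "scan.tiff", "scan.tif", "photo.jpg"]
crawl_list = [p for p in candidates if should_crawl(p)]
```

Note that the rule only works if every extension in use (both `.tif` and `.tiff` here) resolves to the excluded type. A file whose type cannot be determined slips through in this sketch, which may be one thing worth checking when excluded files still get crawled.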
Hello Dmorris,
As you stated in your post, "the only other way it makes sense is if the images are in a content repository and have metadata associated with them in which case you would index the metadata". Is that doable or not? If I have a content repository (FileNet P8 4.5) with image files uploaded to it, can I index these files using OmniFind or not?...
Hey there mauriziog,
I tried the approach that you specified and it worked fine (with some abnormalities: if you search by file name it doesn't return any results, but if you search for the extension jpg it returns them all).
Next I will try it with FileNet (if we can ever connect the IICE to FileNet :S).
Thanks a lot for the fast reply.
Thread profile page for "How to crawl image files like jpeg, gif, ..." on http://www.ibm.com/developerworks/db2/.