This is an easy one, but it takes a little work to get right. SharePoint uses a feature called Index Server to search documents, but it doesn’t search within PDFs by default. Searching inside PDF documents requires an iFilter from Adobe, which Adobe designed so that 3rd party systems can read the PDF file format. Adobe includes this filter with Adobe Reader, or you can download the iFilter separately from Adobe’s site if you don’t want Reader installed on your SharePoint servers.
http://www.adobe.com/products/reader – Latest version of Adobe Reader
or
http://www.adobe.com/support/downloads/detail.jsp?ftpID=2611 – x86 iFilter
http://www.adobe.com/support/downloads/detail.jsp?ftpID=4025 – x64 iFilter
CENTRAL ADMINISTRATION
Now in SharePoint itself, you need to configure the search service to index files with the .pdf extension:
1. Go to Central Administration (CA) and open up the Shared Service under Shared Services Administration.
2. Click Search Administration under the Search section.
3. Click File Types in the left nav bar and then click New File Type.
4. Enter “pdf” and click OK.
ICONS
You will also want to display the PDF icon next to PDF Documents in SharePoint. You can download the icon from here:
http://www.adobe.com/images/pdficon_small.gif
and copy it into the 12 hive folder here:
C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\TEMPLATE\IMAGES
Then open up this XML template file:
C:\Program Files\Common Files\Microsoft Shared\Web Server Extensions\12\TEMPLATE\XML\DOCICON.XML
and add this line in the ByExtension section if it isn’t there already:
<Mapping Key="pdf" Value="pdficon_small.gif" />
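If you have several servers to update, the DOCICON.XML edit can be scripted. Here is a minimal Python sketch, assuming the file has a ByExtension element holding Mapping entries with Key and Value attributes. The function name `add_pdf_mapping` and the sample XML are made up for illustration, not something SharePoint ships.

```python
import xml.etree.ElementTree as ET


def add_pdf_mapping(docicon_xml: str, icon_file: str = "pdficon_small.gif") -> str:
    """Return DOCICON.XML text with a pdf mapping under <ByExtension>."""
    root = ET.fromstring(docicon_xml)
    by_extension = root.find("ByExtension")
    if by_extension is None:
        raise ValueError("no <ByExtension> section found")
    # Leave the text untouched if a pdf mapping is already present.
    for mapping in by_extension.findall("Mapping"):
        if mapping.get("Key", "").lower() == "pdf":
            return docicon_xml
    ET.SubElement(by_extension, "Mapping", {"Key": "pdf", "Value": icon_file})
    return ET.tostring(root, encoding="unicode")


# Tiny stand-in for the real file (assumption: same element names).
sample = (
    "<DocIcons><ByProgID/><ByExtension>"
    '<Mapping Key="doc" Value="icdoc.gif"/>'
    "</ByExtension><Default/></DocIcons>"
)
patched = add_pdf_mapping(sample)
print(patched)
```

ElementTree drops comments and the XML declaration when it writes the document back out, so keep a backup copy of the original file before replacing it with the result.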
REGISTRY
Now on to the registry changes you need to make on each index server. Make sure to back up your registry before making any changes. These two changes will register the Adobe PDF iFilter with the Office Search service. The values that need to be changed are:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Setup\ContentIndexCommon\Filters\Extension\.pdf
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Setup\ContentIndexCommon\Filters\Extension\.pdf
Both values should be changed to:
{E8978DA6-047F-4E3D-9C78-CDBE46041603}
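If you would rather not click through regedit on every index server, one option is to generate a .reg file and import it. The Python sketch below only builds the file text. It assumes the GUID belongs in the (Default) value of each key, which is what "@" means in a .reg file, and the function name `build_reg_file` is just an illustrative choice.

```python
PDF_IFILTER_CLSID = "{E8978DA6-047F-4E3D-9C78-CDBE46041603}"

FILTER_KEYS = [
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office Server\12.0\Search\Setup\ContentIndexCommon\Filters\Extension\.pdf",
    r"HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Setup\ContentIndexCommon\Filters\Extension\.pdf",
]


def build_reg_file(keys=FILTER_KEYS, clsid=PDF_IFILTER_CLSID) -> str:
    """Build .reg file text that sets each key's default value to the CLSID."""
    lines = ["Windows Registry Editor Version 5.00", ""]
    for key in keys:
        lines.append(f"[{key}]")
        # "@" addresses the key's (Default) value (assumption, see above).
        lines.append(f'@="{clsid}"')
        lines.append("")
    return "\r\n".join(lines)


print(build_reg_file())
```

Save the output as a .reg file and read it over before importing it, the same way you would double check a manual edit.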
Then go to:
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Shared Tools\Web Server Extensions\12.0\Search\Applications\{Random GUID}\Gather\Search\Extensions\ExtensionList
and add “pdf” to this list. You will have to create a new String Value for this. Name it with the next number in the list, which should be 38 on most SharePoint installs.
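Since the next number is not guaranteed to be 38, it is safer to work it out from what is already in the list. This is a small Python sketch of that logic only. It assumes you have already read the ExtensionList entries into a dict of value name to extension, and `next_extension_entry` is a hypothetical helper name.

```python
def next_extension_entry(existing: dict, extension: str = "pdf"):
    """Return (name, extension) for the new String Value, or None if present.

    `existing` maps value names such as "1", "2" to extensions such as "doc".
    """
    # Nothing to add if the extension is already listed (case-insensitive).
    if extension.lower() in (value.lower() for value in existing.values()):
        return None
    numbers = [int(name) for name in existing if name.isdigit()]
    return str(max(numbers, default=0) + 1), extension


print(next_extension_entry({"1": "doc", "2": "xls"}))
```

The function does not touch the registry itself, so you can try it against an exported copy of the list first.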
SYSTEM PATH
Now you need to add the Adobe install directory to the system Path environment variable so that the search service can find the DLL that provides the iFilter service:
1. Right click My Computer
2. Click Properties
3. Click Advanced
4. Click Environment Variables
5. In the bottom half of the window, find the Path variable and double click it.
6. At the end of the value, add:
;C:\Program Files\Adobe\Reader 9.0\Reader
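When you repeat this on several servers it is easy to end up with the same directory in Path twice. The Python sketch below shows one way to append the entry only when it is missing. It works on the Path string alone and does not write the environment variable, and `append_to_path` is an illustrative name.

```python
READER_DIR = r"C:\Program Files\Adobe\Reader 9.0\Reader"


def append_to_path(path_value: str, new_dir: str = READER_DIR) -> str:
    """Return path_value with new_dir appended, unless it is already listed."""
    entries = [entry for entry in path_value.split(";") if entry.strip()]
    # Compare case-insensitively and ignore trailing backslashes.
    normalized = {entry.strip().rstrip("\\").lower() for entry in entries}
    if new_dir.rstrip("\\").lower() in normalized:
        return path_value
    return ";".join(entries + [new_dir])


print(append_to_path(r"C:\Windows;C:\Windows\System32"))
```

Adjust `READER_DIR` to match the Reader or iFilter version you actually installed.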
RESTART SEARCH SERVICES
Now you need to restart the Office Search service so that all changes are reflected. Open a command prompt and type:
sc stop osearch [press enter]
sc start osearch [press enter]
Or just restart it via the Services MMC.
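If you are scripting the rest of the setup, the restart can go in the same script. Here is a minimal Python sketch that runs the same two sc commands. The `run` parameter is only there so the function can be tried without touching a real service, and `restart_service` is a made-up name.

```python
import subprocess


def restart_service(name: str = "osearch", run=subprocess.run):
    """Stop and then start a Windows service using the sc command."""
    commands = [["sc", "stop", name], ["sc", "start", name]]
    for command in commands:
        # check=True raises an error if sc exits with a non-zero code.
        run(command, check=True)
    return commands
```

Keep in mind that sc stop can return before the service has finished stopping, so a real script may need to wait or retry before the start command succeeds.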
If you already have PDF documents in SharePoint that you want to search inside, you have to “Reset all crawled content” in Search Settings and then begin a new “Full Crawl” under Content Sources.
UPDATE 9/20/2010: Installing SP2 or cumulative updates on your SharePoint farm may sometimes reset your registry changes. Specifically, the {E8978DA6-047F-4E3D-9C78-CDBE46041603} registry value may be reset to the old {4C904448-74A9-11D0-AF6E-00C04FD8DC02} value, or it might end up containing both GUIDs. This will cause your PDF indexing to stop. Just edit the registry values above and put the correct value back in, restart the search services and IIS, then run a full crawl. Your PDFs will begin indexing correctly again.
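A quick check after each update can catch this before users notice. The Python sketch below covers only the comparison. It assumes you have already read the current registry value into a string (for example with the standard winreg module on Windows), and `filter_value_needs_repair` is a hypothetical name.

```python
ADOBE_CLSID = "{E8978DA6-047F-4E3D-9C78-CDBE46041603}"
OLD_CLSID = "{4C904448-74A9-11D0-AF6E-00C04FD8DC02}"


def filter_value_needs_repair(value: str) -> bool:
    """True unless the value is exactly the Adobe iFilter GUID."""
    # Anything else counts as broken, including the old GUID on its own
    # or a value that contains both GUIDs.
    return value.strip().upper() != ADOBE_CLSID


print(filter_value_needs_repair(OLD_CLSID))
```

If it reports that a repair is needed, put the correct value back and follow the restart and full crawl steps described above.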
Jason Samuel is a visionary product leader and trusted advisor with a proven track record of shaping strategy and driving technology innovation. With extensive expertise in enterprise end-user computing, security, cloud, automation, and virtualization technologies, Jason has become a globally recognized authority in the IT industry. His career spans consulting for hundreds of Fortune 500 enterprises across diverse business sectors worldwide, delivering cutting-edge digital solutions from Citrix, Microsoft, VMware, Amazon, Google, and NVIDIA that seamlessly balance security with exceptional user experiences.
Jason’s leadership is amplified by his dedication to knowledge-sharing as an author, speaker, podcaster, and mentor within the global IT and technology community. Recognized with numerous prestigious awards, Jason’s contributions underscore his commitment to advancing technology and empowering organizations to achieve transformative results.