I index office documents (MS Office 2003, 2007) in my application. When I load an IFilter for a specific file, everything is fine and the filter works as it should:

 HRESULT hr_f = LoadIFilter(filename, 0, (void **)&pFilter); 

However, initializing it from a buffer through a stream:

 HRESULT hr_ss = BindIFilterFromStream(spStream, 0, (void **)&pFilter); 

returns E_FAIL, and pFilter, of course, does not work. spStream is my own class derived from IStream, with its methods implemented. I suspect that the main thing needed to select the right filter is produced in the Stat method:

 HRESULT StreamFilter::Stat(STATSTG * pstatstg, DWORD grfStatFlag)
 {
     // Microsoft Office IFilter from Windows Registry
     // {f07f3920-7b8c-11cf-9be8-00aa004b9986}
     const IID CLSID_IFilter = {
         0xf07f3920, 0x7b8c, 0x11cf,
         { 0x9b, 0xe8, 0x00, 0xaa, 0x00, 0x4b, 0x99, 0x86 }
     };

     LARGE_INTEGER pSize;
     int fl = GetFileSizeEx(_hFile, &pSize);

     memset(pstatstg, 0, sizeof(STATSTG));
     pstatstg->clsid = CLSID_IFilter;
     pstatstg->type = STGTY_STREAM;
     pstatstg->cbSize.QuadPart = pSize.QuadPart;
     return S_OK;
 }

I tried different ways of initializing the pstatstg structure, all to no avail. After this method is called, judging by the call stack, control goes into query.dll, and from there I get E_FAIL. I cannot imagine what else might be needed.

There is a similar question, Using IFilter in C#, and the method described there for *.pdf really does work in C++. Unfortunately, it is not suitable for MS Office.

    1 answer

    After a long search, and inspired by the solution to this question, I put together a workaround of my own that works the way I need.

    First, let the system select the right handler from the extension of the file being processed:

     HRESULT hr = LoadIFilter(L".doc", 0, (void **)&pFilter); 
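    The call above hardcodes the extension. As a minimal sketch, assuming the document path is available as a std::wstring, the extension could be derived from the path instead. ExtensionOf is a hypothetical helper name, not part of the original answer or of any Windows API:

```cpp
#include <string>

// Hypothetical helper (an assumption, not from the original answer):
// returns the extension of `path` including the leading dot, or an empty
// string if the last path component has no extension.
std::wstring ExtensionOf(const std::wstring& path) {
    // Only the last path component counts; accept both separator styles.
    const std::size_t slash = path.find_last_of(L"\\/");
    const std::size_t nameStart = (slash == std::wstring::npos) ? 0 : slash + 1;
    const std::size_t dot = path.find_last_of(L'.');
    // No dot at all, a dot inside a directory name, or a leading dot
    // (as in ".hidden") all mean "no extension".
    if (dot == std::wstring::npos || dot <= nameStart) {
        return std::wstring();
    }
    return path.substr(dot);
}
```

    The result could then be passed on, for example as LoadIFilter(ExtensionOf(path).c_str(), 0, (void **)&pFilter), after checking that it is not empty.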

    Then read the document into memory, wrap it in an IStream*, and load it into the filter through IPersistStream:

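     // Ask the filter for IPersistStream so it can be loaded from a stream.
     IPersistStream *stream;
     HRESULT hr_qi = pFilter->QueryInterface(&stream);

     // Read the whole document into memory.
     std::ifstream ifs(filename, std::ios::binary);
     std::string content((std::istreambuf_iterator<char>(ifs)),
                         (std::istreambuf_iterator<char>()));

     // Copy the bytes into a movable global memory block.
     IStream *comStream;
     HGLOBAL hMem = ::GlobalAlloc(GMEM_MOVEABLE, content.size());
     LPVOID pDoc = ::GlobalLock(hMem);
     memcpy(pDoc, content.c_str(), content.size());
     ::GlobalUnlock(hMem);

     // TRUE: the stream frees hMem when its last reference is released.
     HRESULT hr_mem = ::CreateStreamOnHGlobal(hMem, TRUE, &comStream);
     HRESULT hr_stream_load = stream->Load(comStream);
     // stream and comStream should be released once the filter is no longer needed.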
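    One detail of the file-reading step: if the file cannot be opened, the std::string simply ends up empty and the failure only shows up later. A minimal sketch of a stricter read, assuming a narrow-character path; ReadAllBytes is a hypothetical helper name, and treating an empty file as an error is my own assumption:

```cpp
#include <fstream>
#include <iterator>
#include <stdexcept>
#include <string>

// Hypothetical helper (an assumption, not from the original answer):
// reads the whole file as raw bytes and throws instead of silently
// returning an empty buffer.
std::string ReadAllBytes(const std::string& path) {
    std::ifstream ifs(path, std::ios::binary);
    if (!ifs) {
        throw std::runtime_error("cannot open " + path);
    }
    std::string content((std::istreambuf_iterator<char>(ifs)),
                        std::istreambuf_iterator<char>());
    // Assumption: an empty document is useless to a filter, so reject it here.
    if (content.empty()) {
        throw std::runtime_error("file is empty: " + path);
    }
    return content;
}
```

    The returned buffer can be copied into the HGLOBAL block exactly as before, using content.data() and content.size().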

    And then we work with the filter, as in the examples from MSDN or GitHub:

     if (SUCCEEDED(hr))
     {
         DWORD flags = 0;
         hr = pFilter->Init(IFILTER_INIT_INDEXING_ONLY |
                            IFILTER_INIT_APPLY_INDEX_ATTRIBUTES |
                            IFILTER_INIT_APPLY_CRAWL_ATTRIBUTES |
                            IFILTER_INIT_FILTER_OWNED_VALUE_OK |
                            IFILTER_INIT_APPLY_OTHER_ATTRIBUTES,
                            0, 0, &flags);
         if (FAILED(hr))
         {
             pFilter->Release();
             throw exception("IFilter::Init() failed");
         }

         Start();

         // GetChunk() returns a failure code once the chunks run out, which ends the loop.
         STAT_CHUNK stat;
         while (SUCCEEDED(hr = pFilter->GetChunk(&stat)))
         {
             if ((stat.flags & CHUNK_TEXT) != 0)
                 ProcessTextChunk(pFilter, stat);
             if ((stat.flags & CHUNK_VALUE) != 0)
                 ProcessValueChunk(pFilter, stat);
         }

         Finish();
         pFilter->Release();
     }
     else
     {
         throw exception("LoadIFilter() failed");
     }

    Note that in this situation there is NO need to implement your own version of IStream, unless you are writing a custom filter plugin for Windows Search.