Can anyone recommend a C++ library (preferably a portable one) for checking whether a file is in doc, docx, xls, xlsx, or pdf format? Ideally it should be free.

    1 answer

    There is a Linux utility called file. It makes a rough guess at what type a file is. You can find its source code and study how it does that.

    For the other formats, it is easier to link against the MS Office or OpenOffice libraries and let them do the analysis.

    PS: A PDF is easy to identify by the %PDF signature at the beginning of the file (see the format specification).
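
    The signature check can be sketched in portable C++ along these lines. This is only a rough illustration, not a full validator: the names Container, classify_header and classify_file are made up for the example. It assumes that legacy doc/xls files are OLE2 compound files and that docx/xlsx are ZIP archives, so for those it only identifies the container. Telling docx from xlsx would still mean looking inside the archive.

```cpp
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical names for illustration; not taken from any library.
enum class Container { Pdf, Ole2, Zip, Unknown };

// True when `data` begins with the `len` bytes pointed to by `sig`.
static bool starts_with(const std::string& data, const char* sig, std::size_t len) {
    return data.size() >= len && data.compare(0, len, sig, len) == 0;
}

// Classify a file by the first few bytes of its content.
Container classify_header(const std::string& header) {
    // PDF files start with "%PDF-" followed by the version.
    if (starts_with(header, "%PDF-", 5)) return Container::Pdf;
    // OLE2 compound file: assumed container for legacy doc/xls.
    if (starts_with(header, "\xD0\xCF\x11\xE0\xA1\xB1\x1A\xE1", 8)) return Container::Ole2;
    // ZIP local file header: assumed container for docx/xlsx
    // (and for any other ZIP archive, so this is not conclusive).
    if (starts_with(header, "PK\x03\x04", 4)) return Container::Zip;
    return Container::Unknown;
}

// Read up to 8 bytes from `path` and classify them.
// Returns Unknown if the file cannot be opened.
Container classify_file(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) return Container::Unknown;
    char buf[8] = {};
    in.read(buf, sizeof buf);
    return classify_header(std::string(buf, static_cast<std::size_t>(in.gcount())));
}
```

    Calling classify_file("report.pdf") on a real PDF should give Container::Pdf. A Zip or Ole2 result only narrows things down, and a stricter check would still have to open the container and inspect what is inside.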