The task you're asking about is fairly involved, so let's break it into parts.
Part 1. Finding hyphenation points.
For starters, there are dictionaries in which every word is listed with all of its possible hyphenation points. Compiling such dictionaries is, of course, laborious. Besides the word lists, these dictionaries usually also store tables of affixes and the rules for attaching them to stems: storing all six case forms of every word would be wasteful.
Then there is Franklin Liang's classic hyphenation algorithm, which Donald Knuth used in his TeX system. It is based on patterns, each assigned a particular weight, and it is designed so that the table of exceptions stays minimal (the exception table for English contains only 14 entries). You can find pattern tables for many languages, along with their exception tables, in any TeX distribution (and hence in many Linux distributions).
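To make the pattern mechanics concrete, here is a rough Python sketch of Liang's scheme: digits interleaved in a pattern assign weights to inter-letter positions, the maximum weight wins at each position, and odd weights mark allowed break points. The function names are my own, and the pattern list is a tiny subset of the real English TeX patterns (which number in the thousands):

```python
import re

def make_patterns(raw):
    """Parse Liang-style patterns like 'hy3ph' into (letters -> weights)."""
    patterns = {}
    for pat in raw:
        chars = re.sub(r"\d", "", pat)
        weights = [0] * (len(chars) + 1)  # one slot per inter-letter gap
        i = 0
        for ch in pat:
            if ch.isdigit():
                weights[i] = int(ch)
            else:
                i += 1
        patterns[chars] = weights
    return patterns

def hyphenate(word, patterns):
    """Insert '-' wherever the maximum pattern weight is odd."""
    w = "." + word.lower() + "."          # '.' marks word boundaries
    weights = [0] * (len(w) + 1)
    # Try every substring of the dotted word against the pattern table
    # (O(n^2) lookups; real implementations use a trie).
    for start in range(len(w)):
        for end in range(start + 1, len(w) + 1):
            pat = patterns.get(w[start:end])
            if pat:
                for k, v in enumerate(pat):
                    weights[start + k] = max(weights[start + k], v)
    out = []
    for i, ch in enumerate(word):
        # weights[i + 1] is the gap before word[i] (offset by the leading '.');
        # also keep at least two letters on each side of a break.
        if 1 < i < len(word) - 1 and weights[i + 1] % 2 == 1:
            out.append("-")
        out.append(ch)
    return "".join(out)

patterns = make_patterns(["hy3ph", "he2n", "hena4", "hen5at",
                          "1na", "n2at", "1tio", "2io", "o2n"])
print(hyphenate("hyphenation", patterns))  # -> hy-phen-ation
```

Note that the patterns alone give hy-phen-ation rather than the full hy-phen-a-tion; choosing patterns so that such near-misses stay safe (never breaking where a break is forbidden) is exactly the point of the weighting scheme.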
The same algorithm, with minor modifications, is used in many popular open-source spell-checking systems. For example, hunspell, which is used in OS X, OpenOffice, Firefox, Chrome, Opera, Eclipse, and plenty of other programs, uses Liang's algorithm.
There are also proprietary, closed hyphenation algorithms; I can't say anything about them.
Exactly how hyphenation is arranged in any particular e-reader, only its developers can say. Still, the list of programs that use TeX-style hyphenation is impressive, isn't it?
Part 2. Breaking the text into lines.
Here, again, there is a simple, artless approach: knowing the possible hyphenation points, we stuff pieces into the current line as long as they fit, then move on to the next line, and so on to the end. As far as I can tell, Microsoft Word still works this way.
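The greedy approach fits in a dozen lines; here is a sketch (my own function name, fixed-width characters assumed):

```python
def greedy_wrap(words, width):
    """Naive first-fit wrapping: pack words into the current line
    while they fit, then start a new line."""
    lines, current, length = [], [], 0
    for word in words:
        # +1 for the space before the word if the line is not empty
        extra = len(word) + (1 if current else 0)
        if current and length + extra > width:
            lines.append(" ".join(current))
            current, length = [], 0
            extra = len(word)
        # A word longer than the whole line still lands on its own
        # line and simply overflows -- the special case to watch for.
        current.append(word)
        length += extra
    if current:
        lines.append(" ".join(current))
    return lines

print(greedy_wrap("aaa bb cc ddddd".split(), 6))
# -> ['aaa bb', 'cc', 'ddddd']
```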
The algorithm used in TeX is much more elegant: its goal is to maximize the total "quality" of all the lines. For each candidate split into lines, the algorithm estimates how much each line has to be stretched or compressed; deviation from the "ideal" spacing lowers a line's quality, and other factors, such as ending the line with a hyphen, lower it further. The algorithm then picks the best split in quadratic time using dynamic programming. The resulting layout is noticeably better than the naive algorithm's, and the advantage grows with paragraph length; the algorithm itself, however, is not very fast.
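A stripped-down version of that idea is the classic least-raggedness DP. This sketch is not the full Knuth–Plass algorithm (no stretchable glue, no hyphenation penalties, fixed-width characters, and it assumes no single word exceeds the line width); the function names are mine:

```python
def badness(words, i, j, width):
    """Cost of putting words[i:j] on one line: cube of the leftover
    space, a rough stand-in for TeX's stretch/shrink penalty."""
    length = sum(len(w) for w in words[i:j]) + (j - i - 1)  # spaces
    if length > width:
        return float("inf")
    return (width - length) ** 3

def optimal_wrap(words, width):
    """DP over break positions: best[j] = minimal total badness of
    wrapping words[:j].  O(n^2), like the Knuth-Plass approach."""
    n, INF = len(words), float("inf")
    best = [0.0] + [INF] * n
    back = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            b = badness(words, i, j, width)
            if j == n and b < INF:
                b = 0.0  # a short last line carries no penalty
            if best[i] + b < best[j]:
                best[j], back[j] = best[i] + b, i
    # Recover the lines from the back-pointers.
    lines, j = [], n
    while j > 0:
        i = back[j]
        lines.append(" ".join(words[i:j]))
        j = i
    return lines[::-1]

print(optimal_wrap("aaa bb cc ddddd".split(), 6))
# -> ['aaa', 'bb cc', 'ddddd']
```

On this input the greedy breaker would produce "aaa bb" / "cc" / "ddddd", leaving the middle line very loose; the DP trades a slightly looser first line for a much better second one, which is exactly the global-versus-local difference described above.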
Again, which line-breaking algorithm any particular reader uses, only its developers can say.
Part 3. What to do?
First, talk to your managers and find out what they actually want. Factor the pagination out into a separate module and, to start, spend half a day (with coffee and rolls) sketching a "greedy" line breaker without hyphenation. (Mind the special case where a single word is longer than the whole line.)
Second, for hyphenation, check whether hunspell is compatible with your license and bolt it on. This won't take long and immediately gives you line breaking no worse than OpenOffice's, which is not bad already. Budget a couple of days of work here.
Third, either implement the TeX line-breaking algorithm yourself or find a ready-made implementation (even one under an incompatible license). Test its speed; that can be critical for your program. If the speed is acceptable, look for an implementation whose license is compatible with your program (or write your own; the information is out there on Google, albeit in English). If not, perhaps this quality of layout at that price in speed is not what you need?