📜 ⬆️ ⬇️

Julia. Report Generators and Documentation


One of the most pressing problems at all times is the problem of preparing reports. Since Julia is a language whose users are directly connected with the tasks of data analysis, the preparation of articles and beautiful presentations with the results of calculations and reports, this topic simply cannot be ignored.


Initially, this article planned a set of recipes for generating reports, but next to the reports is the topic of documentation, with which report generators have many intersections. Therefore, it includes tools by the criterion of the possibility of embedding an executable code on Julia into a template with some markup. Finally, we note that the review includes report generators, both implemented on Julia itself, and tools written in other programming languages. And, of course, some key points of the Julia language itself were not ignored, without which it may not be clear in what cases and what means should be used.


Jupyter notebook


This tool, perhaps, should be attributed to the most popular among those who are engaged in data analysis. Due to the ability to connect various computational cores, it is popular with researchers and mathematicians who are used to their specific programming languages, one of which is Julia. The corresponding modules for the Julia language are implemented for Jupyter Notebook in full. And that is why Notebook is mentioned here.
The installation process of Jupyter Notebook is not complicated. See https://github.com/JuliaLang/IJulia.jl for the order. If Jupyter Notebook is already installed, you just need to install the Ijulia package and register the corresponding computational kernel.


Since the Jupyter Notebook product is sufficiently well known not to paint it in detail, we only mention a couple of points. Notepad (we will use notepad terminology) in Jupyter Notebook consists of blocks, each of which can contain either code or markup in its various forms (for example, Markdown). The result of processing is either visualization of the markup (text, formulas, etc.), or the result of the last operation. If at the end of the line with the code is a semicolon, the result will not be displayed.


Examples Notepad before execution is shown in the following figure:



The result of its execution is shown in the following figure.



Notepad contains graphics and some text. Note that to output the matrix, it is possible to use the DataFrame type, for which the result is displayed in the form of an html-table with explicit borders and a scroller, if needed.


Jupyter notebook can export the current notepad to a file in html format. With installed conversion tools, it can convert to pdf.


To build reports on a certain schedule, you can use the nbconvert module and the following command, which is called in the background on a schedule:
jupyter nbconvert --to html --execute julia_filename.ipynb


When performing lengthy calculations, it is advisable to add an option indicating the timeout - --ExecutePreprocessor.timeout=180


An html report formed from this file will appear in the current directory. The option --execute here means a forced start of the recalculation.


For a complete set of options for the nbconvert module nbconvert see
https://nbconvert.readthedocs.io/en/latest/usage.html


The result of the conversion to html is almost completely consistent with the previous figure, except that there is no menu bar and buttons.


Jupytext


Quite an interesting utility that allows you to convert already created ipynb notes into Markdown text or Julia code.


We can convert the previously considered example using the command
jupytext --to julia julia_filename.ipynb


As a result, we get the file julia_filename.jl with the code for Julia and a special markup in the form of comments.


 # --- # jupyter: # jupytext: # text_representation: # extension: .jl # format_name: light # format_version: '1.3' # jupytext_version: 0.8.6 # kernelspec: # display_name: Julia 1.0.3 # language: julia # name: julia-1.0 # --- # # Report example using Plots, DataFrames # ### Drawing # Good time to show some plot plot(rand(5,5), linewidth=2, title="My Plot", size = (500, 200)) # ## Some computational results rand(2, 3) DataFrame(rand(2, 3)) 

Note block delimiters are just a double line feed.


We can do the inverse transform using the command:
jupytext --to notebook julia_filename.jl


As a result, an ipynb file will be generated, which, in turn, can be processed and converted into pdf or html.


See https://github.com/mwouts/jupytext for details.


A common drawback of jupytext and jupyter notebook is that the “beautiful” report is limited by the capabilities of these tools.


HTML self-generation


If for some reason we believe that Jupyter Notebook is too heavy a product, requiring the installation of numerous third-party packages that are not needed for Julia, or not flexible enough to build the required report form, then an alternative way is to generate the HTML page manually. However, here you have to immerse yourself in the features of image formation.


For Julia, using the Base.write function is the typical way to output something to the output stream, and Base.show(io, mime, x) for decorating. Moreover, for various requested mime-output methods there can be various display options. For example, a DataFrame displayed as text with a pseudo-table.


 julia> show(stdout, MIME"text/plain"(), DataFrame(rand(3, 2))) 3×2 DataFrame │ Row │ x1 │ x2 │ │ │ Float64 │ Float64 │ ├─────┼──────────┼───────────┤ │ 1 │ 0.321698 │ 0.939474 │ │ 2 │ 0.933878 │ 0.0745969 │ │ 3 │ 0.497315 │ 0.0167594 │ 

If mime is specified as text/html , then the result is HTML markup.


 julia> show(stdout, MIME"text/html"(), DataFrame(rand(3, 2))) <table class="data-frame"> <thead> <tr><th></th><th>x1</th><th>x2</th></tr> <tr><th></th><th>Float64</th><th>Float64</th></tr> </thead> <tbody><p>3 rows × 2 columns</p> <tr><th>1</th><td>0.640151</td><td>0.219299</td></tr> <tr><th>2</th><td>0.463402</td><td>0.764952</td></tr> <tr><th>3</th><td>0.806543</td><td>0.300902</td></tr> </tbody> </table> 

That is, using the methods of the show function defined for the corresponding data type (third argument) and the corresponding output format, you can ensure the formation of the file in any desired data format.


The situation with images is more complicated. If we need to create a single html file, the image must be embedded in the page code.


Consider an example in which this is implemented. Output to the file will be performed by the function Base.write , for which we define the appropriate methods. So, the code:


 #!/usr/bin/env julia using Plots using Base64 using DataFrames # сформируем изображение и запомним его в переменной p = plot(rand(5,5), linewidth=2, title="My Plot", size = (500, 200)) # распечатаем тип, чтобы видеть, кто формирует изображение @show typeof(p) # => typeof(p) = Plots.Plot{Plots.GRBackend} # Определим три абстрактных типа, чтобы сделать 3 разных # метода функции преобразования изображений abstract type Png end abstract type Svg end abstract type Svg2 end # Функция Base.write используется для записи в поток # Определим свои методы этой функции с разными типами # Первый вариант — выводим растровое изображение, перекодировав # его в Base64-формат. # Используем HTML разметку img src="data:image/png;base64,..." function Base.write(file::IO, ::Type{Png}, p::Plots.Plot) local io = IOBuffer() local iob64_encode = Base64EncodePipe(io); show(iob64_encode, MIME"image/png"(), p) close(iob64_encode); write(file, string("<img src=\"data:image/png;base64, ", String(take!(io)), "\" alt=\"fig.png\"/>\n")) end # Два метода для вывода Svg function Base.write(file::IO, ::Type{Svg}, p::Plots.Plot) local io = IOBuffer() show(io, MIME"image/svg+xml"(), p) write(file, replace(String(take!(io)), r"<\?xml.*\?>" => "" )) end # выводим в поток XML-документ без изменений, содержащий SVG Base.write(file::IO, ::Type{Svg2}, p::Plots.Plot) = show(file, MIME"image/svg+xml"(), p) # Определим метод функции для DataFrame Base.write(file::IO, df::DataFrame) = show(file, MIME"text/html"(), df) # Пишем файл out.html простейший каркас HTML open("out.html", "w") do file write(file, """ <!DOCTYPE html> <html> <head><title>Test report</title></head> <body> <h1>Test html</h1> """) write(file, Png, p) write(file, "<br/>") write(file, Svg, p) write(file, "<br/>") write(file, Svg2, p) write(file, DataFrame(rand(2, 3))) write(file, """ </body> </html> """) end 

For image generation, the default engine is Plots.GRBackend , which can perform raster or vector image output. Depending on what type is specified in the mime argument of the show function, the corresponding result is generated. MIME"image/png"() generates an image in png format. MIME"image/svg+xml"() generates an svg image. However, in the second case, you should pay attention to the fact that a completely independent xml-document is being formed, which can be written as a separate file. At the same time, our goal is to insert an image into an HTML page, which in HTML5 can be done by simply inserting SVG markup. That is why the Base.write(file::IO, ::Type{Svg}, p::Plots.Plot) cuts the xml header, which, otherwise, will violate the structure of the HTML document. Although most browsers can correctly display an image even in this case.


Regarding the method for raster images Base.write(file::IO, ::Type{Png}, p::Plots.Plot) , the implementation feature here is that we can insert the binary data into the HTML only in Base64-format. We do this with the <img src="data:image/png;base64,"/> construct. And for transcoding use Base64EncodePipe .


The Base.write(file::IO, df::DataFrame) method Base.write(file::IO, df::DataFrame) provides output in the html-table format of the DataFrame object.


The final page looks like this:



In the image, all three pictures look about the same, but remember that one of them is incorrectly inserted from the HTML point of view (extra xml header). One is raster, which means it cannot be increased without loss of detail. And only one of them is inserted as a valid svg fragment inside the HTML markup. And it can also be easily scaled without loss of detail.


Naturally, the page was very simple. But any visual improvements are possible with CSS.


This way of generating reports is useful, for example, when the number of output tables is determined by real data, and not by a template. For example, you need to group data by some field. And for each group to form separate blocks. Since the formation of the page results in the number of calls to Base.write , it is obvious that there is no problem to wrap the necessary block in a loop, make the output dependent on data, etc.


Code example:


 using DataFrames # Химические элементы и их аггрегатное состояние ptable = DataFrame( Symbol = ["H", "He", "C", "O", "Fe" ], Room = [:Gas, :Gas, :Solid, :Gas, :Solid] ) res = groupby(ptable, [:Room]) # А теперь выведем группы раздельно open("out2.html", "w") do f for df in (groupby(ptable, [:Room])) write(f, "<h2>$(df[1, :Room])</h2>\n") show(f, MIME"text/html"(), DataFrame(df)) write(f, "\n") end end 

The result of this script is a fragment of the HTML page.



Please note that everything that does not require decorating / formatting is output directly via the Base.write function. At the same time, everything that requires conversion is displayed via Base.show .


Weave.jl


Weave is a science report generator that is implemented on Julia. Uses ideas from Pweave, Knitr, rmarkdown, Sweave generators. Its main task is to export the original markup in any of the proposed languages ​​(Noweb, Markdown, script format) to LaTex, Pandoc, Github markdown, MultiMarkdown, Asciidoc, reStructuredText markup. And, even in iJulia Notebooks and back. In part of the latter, it looks like Jupytext.


That is, Weave is a tool that allows you to write templates that contain Julia-code in different markup languages, and on the output to have markup in another language (but already with the results of Julia-code execution). And this is a very useful tool for researchers. For example, you can prepare an article on Latex, which will have inserts on Julia with automatic calculation of the result and its substitution. Weave will generate a file for the final article.


There is support for the Atom editor with the appropriate plugin https://atom.io/packages/language-weave . This allows you to develop and debug Julia-embedded scripts on the markup, and then generate the target file.


The basic principle in Weave, as already mentioned, is the parsing of a template containing markup with text (formulas, etc.) and pasting the Julia code. The result of the code can be displayed in the final report. The output of text, code, output of results, the output of graphs - all this can be individually configured.


To process the templates, you need to run an external script that will collect everything into a single document and convert it to the desired output format. That is, templates are separate, handlers are separate.


An example of such a processing script:


 # Обработать файлы и сгенерировать отчёты: # Markdown weave("w_example.jmd", doctype="pandoc" out_path=:pwd) # HTML weave("w_example.jmd", out_path=:pwd, doctype = "md2html") # pdf weave("w_example.jmd", out_path=:pwd, doctype = "md2pdf") 

jmd in file names is Julia Markdown.


Take the same example that we used in the previous tools. However, we’ll insert a title with information about the author that Weave understands.


 --- title : Intro to Weave.jl with Plots author : Anonymous date : 06th Feb 2019 --- # Intro ## Plot ` ``{julia;} using Plots, DataFrames plot(rand(5,5), linewidth=2, title="My Plot", size = (500, 200)) ` `` ## Some computational results ` ``julia rand(2, 3) ` `` ` ``julia DataFrame(rand(2, 3)) ` `` 

This fragment, being converted to pdf, looks something like this:



Fonts and design are well recognizable for Latex users.


For each piece of embedded code, you can determine how this code will be processed and what will be displayed in the end.


For example:



About noweb and script formats, and more about this tool, see http://weavejl.mpastell.com/stable/


Literate.jl


The authors of this package to the question why Literate, refer to the paradigm of the Literate Programming Donald Knutt. The task of this tool is to generate documents based on the Julia code containing comments in the markdown format. Unlike the previously reviewed Weave tool, it cannot do documents with the results of its execution. However, the tool is lightweight and, above all, focused on documenting code. For example, help in writing beautiful examples that can be placed on any markdown platform. Often used in a chain of other documentation tools, for example with Documenter.jl .


There are three possible output formats - markdown, notebook and script (pure Julia code). In none of them the implementation of the embedded code will not be carried out.


An example of a source file with Markdown comments (after the first # character):


 #!/usr/bin/env julia using Literate Literate.markdown(@__FILE__, pwd()) # documenter=true # # Intro # ## Plot using Plots, DataFrames plot(rand(5,5), linewidth=2, title="My Plot", size = (500, 200)) # ## Some computational results rand(2, 3) DataFrame(rand(2, 3)) 

The result of his work will be a Markdown document and directives for the Documenter , if their generation has not been explicitly disabled.


 ` ``@meta EditURL = "https://github.com/TRAVIS_REPO_SLUG/blob/master/" ` `` ` ``@example literate_example #!/usr/bin/env julia using Literate Literate.markdown(@__FILE__, pwd(), documenter=true) ` `` # Intro ## Plot ` ``@example literate_example using Plots, DataFrames plot(rand(5,5), linewidth=2, title="My Plot", size = (500, 200)) ` `` ## Some computational results ` ``@example literate_example rand(2, 3) DataFrame(rand(2, 3)) ` `` *This page was generated using [Literate.jl](https://github.com/fredrikekre/Literate.jl).* 

The code inserts inside the markdown here are intentionally set with a space between the first and subsequent apostrophes, so as not to deteriorate when the article is published.


See https://fredrikekre.imtqy.com/Literate.jl/stable/ for details.


Documenter.jl


Generator documentation. Its main purpose is to form readable documentation packages written on Julia. Documenter converts to html or pdf as examples with Markdown-markup and embedded Julia-code, and the source files of the modules, extracting Julia-docstrings (Julia's own comments).


An example of a typical paperwork:



In this article we will not dwell on the principles of documentation, because, in an amicable way, this should be done in a separate article on the development of modules. However, some moments of the Documenter, we consider here.


First of all, you should pay attention to the fact that the screen is divided into two parts - the left side contains an interactive table of contents. The right side is actually the text of the documentation.


A typical directory structure with examples and documentation is as follows:


  docs/ src/ make.jl src/ Example.jl ... 

The docs/src directory is the markdown documentation. And the examples are located somewhere in the src source directory.


The key file for the Docuementer is docs/make.jl The contents of this file for the documentation itself:


 using Documenter, DocumenterTools makedocs( modules = [Documenter, DocumenterTools], format = Documenter.HTML( # Use clean URLs, unless built as a "local" build prettyurls = !("local" in ARGS), canonical = "https://juliadocs.imtqy.com/Documenter.jl/stable/", ), clean = false, assets = ["assets/favicon.ico"], sitename = "Documenter.jl", authors = "Michael Hatherly, Morten Piibeleht, and contributors.", analytics = "UA-89508993-1", linkcheck = !("skiplinks" in ARGS), pages = [ "Home" => "index.md", "Manual" => Any[ "Guide" => "man/guide.md", "man/examples.md", "man/syntax.md", "man/doctests.md", "man/latex.md", hide("man/hosting.md", [ "man/hosting/walkthrough.md" ]), "man/other-formats.md", ], "Library" => Any[ "Public" => "lib/public.md", hide("Internals" => "lib/internals.md", Any[ "lib/internals/anchors.md", "lib/internals/builder.md", "lib/internals/cross-references.md", "lib/internals/docchecks.md", "lib/internals/docsystem.md", "lib/internals/doctests.md", "lib/internals/documenter.md", "lib/internals/documentertools.md", "lib/internals/documents.md", "lib/internals/dom.md", "lib/internals/expanders.md", "lib/internals/mdflatten.md", "lib/internals/selectors.md", "lib/internals/textdiff.md", "lib/internals/utilities.md", "lib/internals/writers.md", ]) ], "contributing.md", ], ) deploydocs( repo = "github.com/JuliaDocs/Documenter.jl.git", target = "build", ) 

As you can see, the key methods here are makedocs and deploydocs , which determine the structure of future documentation and the place for its placement. makedocs provides for the formation of markdown markup from all specified files, which includes both the implementation of embedded code and the extraction of docstrings comments.


Documenter supports a number of directives to insert code. Their format is `` ` @something



The presence of @example, @repl, @eval , in fact, determined whether to include the Documenter in this review or not. Moreover, the previously mentioned Literate.jl can automatically generate such markup, which was demonstrated earlier. That is, there are no fundamental limitations to using the documentation generator as a report generator.


For more information about Documenter.jl, see https://juliadocs.imtqy.com/Documenter.jl/stable/


Conclusion


Despite the youth of the Julia language, already developed packages and tools for it, we can speak about full use in high-loaded services, and not just about the implementation of pilot projects. As you can see, the ability to generate various documents and reports, including the results of code execution both in text and in graphical form, is already provided. Moreover, depending on the complexity of the report, we can choose between the simplicity of creating a template and the flexibility of generating reports.


The article does not consider the Flax generator from the Genie.jl package. Genie.jl is an attempt to implement Julia on Rails, and Flax is a kind of analogue of eRubis with Julia code inserts. However, Flax is not provided as a separate package, and Genie is not included in the main package repository, so it was not included in this review.


I would like to separately mention the Makie.jl and Luxor.jl packages , which provide the formation of complex vector visualizations. The result of their work can also be used as part of the reports, but this should also be a separate article.


Links




Source: https://habr.com/ru/post/439442/