Tell me, please: with Parallel::ForkManager, how is a variable shared between processes? Sample script:

    use Parallel::ForkManager;

    our (@Print);
    my $fileLog = 'test.txt';
    $pm = new Parallel::ForkManager(5);

    sub myFunc {
        my ($s) = @_;
        push(@Print, $s);
    }

    open(my $file, '<:encoding(UTF-8)', $fileLog);
    while (my $row = <$file>) {
        my $pid = $pm->start and next;
        myFunc($row);
        $pm->finish;
    }
    close $file;
    $pm->wait_all_children;
    print @Print;

1 answer

Shared memory would help, yes, but it is fraught with problems. For example, if the script crashes, or is simply stopped under a debugger, the allocated segment is left behind, which can eventually render the system as a whole inoperable.

Options:

  1. Use the module's standard mechanism for passing data from the children back to the parent process. For examples, see the module documentation: Parallel::ForkManager → EXAMPLES → "Data structure retrieval".

  2. Set up the data exchange between the parent and the children yourself, for example via sockets, pipes, or some other IPC mechanism.

  3. Give up forks entirely. There are many other ways to parallelize work: Mojo::IOLoop, AnyEvent, Coro...
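Option 1 requires the least code: Parallel::ForkManager can serialize a data structure passed as the second argument of `finish()` in the child and hand a reference to it to the parent's `run_on_finish` callback. A minimal sketch of the script from the question rewritten this way (the file name `test.txt` and the `@Print` array are taken from the question):

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Parallel::ForkManager;

my $fileLog = 'test.txt';
my @Print;

my $pm = Parallel::ForkManager->new(5);

# The callback runs in the parent after each child exits;
# the last argument of finish() arrives here as $dataref:
$pm->run_on_finish( sub {
    my ($pid, $exit_code, $ident, $exit_signal, $core_dump, $dataref) = @_;
    push @Print, $$dataref if defined $dataref;
} );

open(my $file, '<:encoding(UTF-8)', $fileLog) or die $!;
while (my $row = <$file>) {
    my $pid = $pm->start and next;
    # The child passes its result back to the parent;
    # the module serializes it behind the scenes:
    $pm->finish(0, \$row);
}
close $file;
$pm->wait_all_children;

# @Print now holds the rows collected from all children:
print @Print;
```

Note that the serialization happens via a temporary file per child, so this is convenient for modest amounts of data, not for a high-volume stream.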

An example of how to do without shared memory, taking into account the answer to a related question:

    #!/usr/bin/env perl
    use Modern::Perl;
    use Parallel::ForkManager;
    use Socket;
    use IO::Select;
    use Fcntl;
    use Data::Printer;

    use constant CHILDREN => 3;

    # here we accumulate lines in the children:
    my @child_lines;
    # and here the parent collects the result:
    my @parent_lines;
    # channels for writing to the children:
    my @writers;
    my $current_writer = 0;

    my $pm = new Parallel::ForkManager( CHILDREN );

    # this callback is invoked when a child finishes:
    $pm->run_on_finish( sub {
        my ($pid, $exit_code, $ident, $exit_signal, $core_dump, $dataref) = @_;
        push @parent_lines, @{$dataref};
    } );

    $SIG{'PIPE'} = 'IGNORE';

    for( 1 .. CHILDREN ) {
        my ( $writer, $reader );
        # create anonymous sockets for communication:
        socketpair( $reader, $writer, AF_UNIX, SOCK_STREAM, PF_UNSPEC );
        # in principle these sockets could also be used to get the
        # results back from the children (it would even be MORE CORRECT
        # to do it that way), but this is just a demo
        my $pid = $pm->start;
        if( $pid ) {
            close $reader;
            _set_opt( $writer );
            push @writers, $writer;
            next;
        }
        close $writer;
        _set_opt( $reader );
        # while lines keep arriving, just store them:
        while( my $line = <$reader> ) {
            chomp $line;
            push @child_lines, [ $$, $line ];
        }
        close $reader;
        # when the child exits, run_on_finish is called with a
        # reference to the lines accumulated in the child:
        $pm->finish(0, \@child_lines );
    }

    while( my $line = <STDIN> ) {
        chomp $line;
        next unless $line;
        # cycle through the children's channels and write
        # the input line to each of them in turn:
        say {$writers[$current_writer]} $line;
        $current_writer++;
        $current_writer = 0 if $current_writer > $#writers;
    }

    close $_ for @writers;
    $pm->wait_all_children;

    # now @parent_lines holds all the processed lines:
    p @parent_lines;

    sub _set_opt {
        my ( $sock ) = @_;
        $sock->autoflush(1);
        my $flags = fcntl( $sock, F_GETFL, 0 );
        fcntl( $sock, F_SETFL, $flags | O_NONBLOCK );
    }

And how much simpler the same thing can be done with Mojo::IOLoop:

    #!/usr/bin/env perl
    use Modern::Perl;
    use Mojo::IOLoop;
    use Data::Lock qw/dlock dunlock/;
    use Data::Printer;

    use constant WORKERS => 3;

    my @result;
    my $worker;
    my $active_workers = 0;

    $worker = sub {
        while( 1 ) {
            my $line = <STDIN>;
            unless ($line) {
                return $active_workers ? undef : Mojo::IOLoop->stop();
            }
            chomp $line;
            if( $line ) {
                $active_workers++;
                # do some work and store the result:
                dlock @result;
                push @result, $line;
                dunlock @result;
                $active_workers--;
            }
        }
    };

    $worker->() for 1 .. WORKERS;
    Mojo::IOLoop->start();

    p @result;
  • Thank you very much for your answer! I reworked my script along the lines of your version (Mojo::IOLoop) and it became 40 times faster than the fork-based variant. - Firsim
  • I would recommend option number 2. - Eugen Konkov